Jailbreaking Lead, Red Team Job, Berkeley area, California, USA, IT/Tech

FAR.AI is seeking a Jailbreaking Lead (Red Team) whose personal mission and obsession is to jailbreak the world's leading frontier AI models. You will sit at the tip of the spear of one of the world's leading AI red teams, with a single, critical focus: find the universal jailbreaks that no one else can find, in the models used by hundreds of millions of people, and make sure they get fixed.

This is primarily a senior IC role with some management responsibilities, ideally for candidates who want to build and lead a jailbreaking team over time. An IC-only track is also available. Either way, you will spend the majority of your time hands‑on, building attacks and breaking frontier models, and setting the technical bar for what a world‑class jailbreak looks like.

About the Role

Jailbreaking is the core technical engine of the red team. As Jailbreaking Lead, you own that engine. You are the person who personally breaks the hardest targets, sets the bar the rest of the team pushes toward, and makes sure we keep discovering the highest‑severity universal vulnerabilities – the most important vulnerabilities to fix – in the most heavily defended frontier models on the planet, faster than anyone else.

We expect you to spend roughly 50-70% of your time hands‑on across 2026: breaking models, chaining novel attack classes through defense‑in‑depth stacks, helping to invent new techniques when existing ones fail, and setting the standard for what constitutes a significant vulnerability and a credible mitigation. The remaining time will go to managing and mentoring ICs, helping to shape the jailbreaking research agenda with Kellin, and making sure our findings land with frontier labs, governments, and the broader field.

The rest of the red team will empower your work, whether through direct collaboration and support, novel research and red‑teaming infrastructure, or toolkits and agent build‑outs.

This is a senior IC role by default, intended to attract a world‑class jailbreaker whose personal mission is to find critical jailbreaks in the most heavily defended domains of the leading frontier AI models, and who has a track record of repeatedly doing so. We are open to a management track for candidates who want to hire and lead a jailbreaking team over time.

We will not water down the IC bar to support the management track: both versions of this role require you to be, or be on a clear trajectory to being, one of the best jailbreakers in the world.

In practice, this role spans:

Lead jailbreaking on the highest‑stakes engagements:
- Personally develop universal and near‑universal jailbreaks against frontier closed‑ and open‑weight models, in CBRNE, cyber, agentic security, extreme persuasion, and emerging risk domains;
- Systematically dismantle defense‑in‑depth stacks (input filters, model‑level refusal and safe completion, reasoning monitors, output filters, account‑level moderation), chaining novel and established techniques;
- Escalate initial vulnerabilities to expose their most severe form, maximising universality, success rate, and capability of elicited output;
- Own the technical bar for vulnerability severity and generality on every major engagement.
Push the frontier of jailbreaking techniques:
- Invent new attack classes when existing techniques fail (e.g., we have recently shipped novel attacks against Constitutional Classifiers and fine‑tuning APIs);
- Monitor and rapidly incorporate state‑of‑the‑art methods from the literature, and build our own proprietary portfolio;
- Shape the jailbreaking research agenda in partnership with Kellin, ensuring our toolkit stays ahead as defences evolve;
- Stress‑test novel affordances (innovations in agents, tool use, long context, multimodal, reasoning, etc.) as frontier systems evolve.
Raise the technical bar across the team:
- Set the standard for rigour, creativity, and precision in jailbreaking across the red team;
- Mentor ICs on attack craft through pairing sessions, post‑engagement retros, and internal writeups that turn your craft into team capability;
- Review major red‑teaming deliverables for technical quality, severity judgment, and clarity;
- If on the management track: hire, manage, and…
