AI Content Red Team Analyst - Trust and Safety Job, San Jose area, California, USA, IT/Tech

Location: San Jose

Employment Type: Regular

Job Code: A163404

Responsibilities

The Trust & Safety (T&S) GenAI & Emerging Product team's mission is to empower the development of GenAI models and applications. We do this by building a world‑class safety, testing, and risk management system that ensures GenAI innovations are launched responsibly.

We probe models and product experiences across modalities, use cases, and abuse patterns to identify failure modes, stress‑test safeguards, and help teams improve safety before and after launch. We work closely with Trust & Safety teams (policy, product, engineering, data science, operations) and business teams across global markets. Success on this team requires strong judgment, creativity, analytical rigor, and the ability to translate ambiguous findings into actionable recommendations.

Conduct structured adversarial testing on AI models, features, and policies to identify vulnerabilities and emerging risks.
Explore product behavior across contexts and user journeys to identify model failure modes that may not be captured in standard evaluations.
Investigate jailbreaks, evasions, prompt‑based attacks, and other adversarial techniques relevant to content safety.
Document findings clearly and consistently, including risk descriptions, reproduction steps, severity assessments, and mitigation recommendations.
Partner with cross‑functional stakeholders (policy, product, business teams) to ensure mitigation validation and root cause closure.
Support development of testing playbooks, taxonomies, and internal knowledge bases.
Stay updated on emerging adversarial trends (e.g., deepfakes, multimodal manipulation, coordinated abuse) and shifts in the external risk landscape.

Qualifications

Minimum Qualification(s):

Minimum 3 years of experience in Trust & Safety, cybersecurity, risk/adversarial testing, or related fields.
Experience with prompt testing, jailbreak analysis, LLM evaluation, or adversarial QA.
Familiarity with AI safety risks (jailbreaks, hallucinations, bias, misuse patterns).
Strong interest in GenAI safety and the ways AI systems can be compromised under adversarial conditions.
Demonstrated ability to independently investigate ambiguous problems, identify non‑obvious failure modes and abuse patterns, and produce clear, evidence‑based conclusions.
Ability to manage multiple priorities and collaborate effectively with cross‑functional teams.

Preferred Qualification(s):

Experience working with agentic AI tools to scale your impact, including building/operating AI tools to make processes efficient and effective.

Job Information

The base salary range for this position in the selected city is $108,800 - $288,000 annually.

Compensation may vary outside of this range depending on a number of factors, including a candidate’s qualifications, skills, competencies, experience, and location. Base pay is one part of the Total Package that is provided to compensate and recognize employees for their work, and this role may be eligible for additional discretionary bonuses/incentives and restricted stock units.

Benefits may vary depending on the nature of employment and the country of the work location. Employees have day‑one access to medical, dental, and vision insurance, a 401(k) savings plan with company match, paid parental leave, short‑term and long‑term disability coverage, life insurance, and wellbeing benefits, among others. Employees also receive 10 paid holidays per year, 10 paid sick days per year, and 17 days of Paid Personal Time (prorated upon hire with increasing accruals by tenure).

The Company reserves the right to modify or change these benefits programs at any time, with or without notice.

For Los Angeles County (unincorporated) Candidates:

Interacting and occasionally having unsupervised contact with internal/external clients and/or colleagues;

Appropriately handling and managing confidential information including proprietary and trade secret information and access to information technology systems;

Exercising sound judgment.

Trust & Safety

Content that this role interacts with includes images, video, and text related to everyday life, but it can also include (but is not limited to) bullying, hate speech, child safety, depictions of harm to self and others, and harm to animals. Hence, it is possible that this role will be exposed to harmful content on a daily basis.

We are committed to policies and controls that keep our platform safe for the TikTok communities. The role may involve facing emotionally demanding content; we provide comprehensive support to promote physical and mental wellbeing throughout each employee’s journey with us.
