AI Safety & Evaluation Engineer

Remote / Online - Candidates ideally in
Campbell, Santa Clara County, California, 95011, USA
Listing for: DeWinter Group
Contract, Remote/Work from Home position
Listed on 2026-05-10
Job specializations:
  • IT/Tech
    AI Engineer, Systems Engineer
  • Engineering
    AI Engineer, Systems Engineer
Salary/Wage Range or Industry Benchmark: 50 - 175 USD Hourly
Job Description & How to Apply Below

Title: AI Safety and Evaluations Engineer
Job Type: Contract
Contract Length: 12 Months
Pay Range: $50/hr – $175/hr
Start Date: ASAP
Location: Remote

About the Opportunity: Our client, a leader in AI testing and Generative AI solutions, is looking for a skilled AI Safety and Evaluations Engineer to join their team for a 12-month engagement. This project involves designing and building rigorous evaluation frameworks to measure model bias, hallucinations, and toxicity, ensuring models are safe and compliant before deployment. This is a high-impact role that requires a self-motivated professional who can hit the ground running and deliver results quickly.

Key Responsibilities & Deliverables
  • Designing and building rigorous evaluation frameworks to measure model bias, hallucinations, and toxicity.
  • Creating automated "Eval" datasets to benchmark new models before they are promoted to production.
  • Developing metrics for "Grounding" and "Faithfulness" in RAG-based systems.
  • Building monitoring tools that flag harmful or non-compliant AI outputs in real time.
  • Partnering with legal and ethics teams to translate policy into technical safety constraints.

Required Skills & Experience
  • 3+ years of experience in AI Research or Quality Engineering.
  • Deep expertise in model evaluation techniques and NLP metrics (ROUGE, BLEU, BERTScore). This is not a learning role; you need to be a subject matter expert.
  • Demonstrated ability to work autonomously and manage your own time effectively to meet project goals.
  • Experience with Python, data analysis tools, and LLM-as-a-Judge frameworks.
  • Strong communication skills to provide clear and concise status updates to the project team.