AI Quality Engineer

Job in Atlanta, Fulton County, Georgia, 30301, USA
Listing for: Abila
Full Time position
Listed on 2026-06-02
Job specializations:
  • Software Development
    AI Engineer, Machine Learning/ML Engineer, Data Scientist
Job Description

Key Responsibilities

* Design and implement evaluation frameworks (evals) to assess LLM and agentic AI system quality, including accuracy, consistency, safety, and task completion rates.

* Build and maintain automated test pipelines for AI features, covering unit, integration, and end-to-end scenarios across agentic workflows.

* Develop tooling to detect regressions in model behavior, prompt outputs, and agent decision-making across releases.

* Define and track quality metrics for AI systems (e.g., hallucination rates, tool-use accuracy, latency, failure recovery) and surface findings clearly to stakeholders.

* Collaborate with engineers and product managers to identify edge cases, adversarial inputs, and failure modes specific to multi-step agentic pipelines.

* Contribute to prompt evaluation strategies, including red-teaming, adversarial testing, and bias/fairness assessments.

* Participate in design and code reviews with a quality-focused lens, raising concerns about testability and reliability early.

* Help define and document quality standards and best practices for AI/ML features across the team.

* Other duties as assigned.

Qualifications

Required

* Bachelor's degree in Computer Science, Engineering, or equivalent practical experience.

* 3-5 years of professional software engineering or quality engineering experience.

* Hands-on experience working with LLMs or agentic AI systems (e.g., GPT-4, Claude, Gemini, or open-source models).

* Proficiency in Python for scripting, test automation, and data analysis.

* Experience designing and running evaluations (evals) for generative AI or LLM-powered features.

* Solid understanding of software testing principles: unit, integration, regression, and end-to-end testing.

* Familiarity with agentic frameworks and concepts (e.g., tool use, multi-step reasoning, retrieval-augmented generation, memory).

* Experience with CI/CD pipelines and integrating automated tests into development workflows.

* Strong analytical skills, including the ability to interpret probabilistic outputs and distinguish meaningful regressions from expected variance.

* Strong written and verbal communication skills; ability to clearly document findings and present quality data to non-technical stakeholders.

* Detail-oriented, with a structured approach to exploring edge cases and failure scenarios.

* Ability to work in a fast-paced environment and manage multiple priorities effectively.

Nice to Have

* Experience with prompt engineering and systematic prompt evaluation methodologies.

* Familiarity with AI safety, alignment, or responsible AI concepts (e.g., hallucination mitigation, bias detection, guardrails).

* Exposure to agentic orchestration frameworks (e.g., LangChain, LangGraph, AutoGen, CrewAI, or similar).

* Experience with vector databases or RAG pipelines (e.g., Pinecone, Weaviate, pgvector).

* Knowledge of observability and monitoring tools for AI systems (e.g., LangSmith, Weights & Biases, Arize).

* Background in data science or ML experimentation practices.

* Experience with version control systems (Git) and defect-tracking tools (e.g., Jira).

* Exposure to cloud platforms (e.g., AWS, Azure, GCP) in the context of deploying or testing AI services.

What Success Looks Like

* Builds robust eval frameworks that catch meaningful regressions in AI behavior before they reach production.

* Reduces time-to-detection for quality issues in agentic workflows through effective automation and monitoring.

* Contributes clear, actionable quality signals that help the team make confident release decisions.

* Grows into a trusted voice on AI quality standards, influencing engineering practices across the team.

About Us

Momentive Software amplifies the impact of over 20,000 purpose-driven organizations in over 30 countries, with over $11 billion raised and 55 million members served to date. Mission-driven nonprofits and associations rely on Momentive's cloud-based software and services to address their most pressing challenges - from engaging their communities to simplifying operations and growing revenue. Designed to help organizations connect more, manage more, and ultimately expect more,…