US_East | Software Developer - Testing Tools/Automation/Performance _L2

Job in Plano, Collin County, Texas, 75086, USA
Listing for: Expedite Talent Solutions
Full Time position
Listed on 2026-05-09
Job specializations:
  • IT/Tech
    AI Engineer, Machine Learning/ML Engineer
Salary/Wage Range or Industry Benchmark: 80000 - 100000 USD Yearly
Job Description & How to Apply Below
Position: US_East | Software Developer - Testing Tools/Automation/Performance _L2

V&V Engineer – AI-Driven Testing & Validation

Location: Plano, TX

Key Responsibilities
  • Lead end-to-end quality engineering for enterprise AI applications, including LLM-powered products, RAG pipelines, and agentic workflows.
  • Design and execute prompt validation strategies, evaluating LLM responses for accuracy, semantic relevance, hallucination risk, and safety compliance.
  • Build automated evaluation pipelines for AI model outputs using metrics such as BLEU, ROUGE, embedding-based similarity, precision, recall, and F1-score.
  • Validate agentic systems (tool use, multi-step reasoning, planner-executor workflows) for correctness, determinism, and failure mode handling.
  • Architect and maintain Python-based automation frameworks for AI/ML model evaluation, regression testing, and continuous model quality monitoring.
  • Integrate AI testing into CI/CD pipelines, enabling automated evaluation of model updates, prompt changes, and dataset revisions before release.
  • Develop reusable test harnesses for prompt regression, golden-set evaluation, A/B comparison of model versions, and human-in-the-loop review workflows.
  • Perform AI data validation across training and inference pipelines using exploratory data analysis (EDA), schema validation, and cross-validation techniques.
  • Conduct bias detection and fairness analysis across demographic and contextual slices to ensure responsible AI outcomes.
  • Drive model robustness testing, including adversarial inputs, distribution shift detection, and stress testing under edge cases.
  • Establish regression testing standards for retraining and fine-tuning cycles to prevent quality drift after model updates.
  • Partner with client AI engineers to validate solutions built using TensorFlow, PyTorch, LangChain, LangGraph, and LlamaIndex.
  • Define quality KPIs and acceptance criteria for AI features, and report quality posture to engineering and product leadership.
  • Mentor QA engineers on AI evaluation methodologies, ML fundamentals, and modern test automation practices.
  • Champion responsible AI practices, including safety, transparency, explainability, and compliance with evolving AI governance standards.
Required Qualifications
  • 10+ years of professional experience in Quality Engineering and Test Automation, validating complex enterprise applications.
  • Proficient in validating AI/ML systems, including Generative AI and LLM-based applications.
  • Strong proficiency in Python and experience building automation frameworks from the ground up.
  • Practical experience with prompt validation, agentic workflow testing, and AI model evaluation.
  • Working knowledge of evaluation metrics: BLEU, ROUGE, embedding similarity, precision, recall, F1-score, and human-evaluation methodologies.
  • Experience with AI/ML frameworks and ecosystems: TensorFlow, PyTorch, LangChain, LangGraph, and LlamaIndex.
  • Solid understanding of data validation techniques: EDA, schema validation, cross-validation, and statistical analysis.
  • Experience integrating automated testing into CI/CD pipelines (e.g., GitHub Actions, Jenkins, GitLab CI, Azure DevOps).
  • Familiarity with bias detection, fairness assessment, and AI safety evaluation techniques.
  • Bachelor's or Master's degree in Computer Science, Data Science, or a related technical field.
Preferred Qualifications
  • Experience with vector databases, retrieval-augmented generation (RAG), and embedding pipelines.
  • Background in MLOps tooling such as MLflow, Weights & Biases, or similar experiment tracking platforms.
  • Exposure to LLM observability and evaluation tools (e.g., LangSmith, Ragas, DeepEval, TruLens).
  • Familiarity with cloud AI services on AWS, Azure, or GCP (Bedrock, Azure OpenAI, Vertex AI).
  • Knowledge of AI governance frameworks, model cards, and emerging AI regulatory standards.