
AI Data Engineer

Remote / Online - Candidates ideally in
Portland, Cumberland County, Maine, 04122, USA
Listing for: Veeva Systems
Remote/Work from Home position
Listed on 2025-12-22
Job specializations:
  • IT/Tech
    Data Analyst, AI Engineer, Data Scientist, Data Engineer
Salary/Wage Range or Industry Benchmark: USD 60,000 - 80,000 per year
Job Description


Veeva Systems is a mission-driven organization and pioneer in industry cloud, helping life sciences companies bring therapies to patients faster. As one of the fastest-growing SaaS companies in history, we surpassed $2B in revenue in our last fiscal year with extensive growth potential ahead.

At the heart of Veeva are our values:
Do the Right Thing, Customer Success, Employee Success, and Speed. We're not just any public company – we made history in 2021 by becoming a public benefit corporation (PBC), legally bound to balance the interests of customers, employees, society, and investors.

As a Work Anywhere company, we support your flexibility to work from home or in the office, so you can thrive in your ideal environment.

Join us in transforming the life sciences industry and making a positive impact on our customers, employees, and communities.

The Role

This role is responsible for ensuring the reliability, accuracy, and safety of our Veeva AI Agents through rigorous evaluation and systematic validation methodologies. We're looking for experienced candidates with:

  • A meticulous, critical, and curious mindset with a dedication to product quality in a rapidly evolving technological domain
  • Exceptional analytical and systematic problem-solving capabilities
  • Excellent ability to communicate technical findings to both engineering and product management audiences
  • Ability to learn application areas quickly

Thrive in our Work Anywhere environment:
We support your flexibility to work remotely or in the office within Canada or the US, ensuring seamless collaboration within your product team's time zone.

What You’ll Do
  • Evaluation Strategy & Planning:
    Define and establish comprehensive evaluation strategies for new AI Agents. Prioritize the integrity and coverage of test data sets to reflect real-world usage and potential failure modes
  • LLM Output Integrity Assessment:
    Programmatically and manually evaluate the quality of LLM-generated content against predefined metrics (e.g., factual accuracy, contextual relevance, coherence, and safety standards)
  • Creating High-Fidelity Datasets:
    Design, curate, and generate diverse, high-quality test data sets, including challenging prompts and scenarios. Evaluate LLM outputs to proactively identify system biases, unsafe content, hallucinations, and critical edge cases
  • Automation of Evaluation Pipelines:
    Develop, implement, and maintain scalable automated evaluations to ensure efficient, continuous validation of agent behavior and prevent regressions with new features and model updates
  • Root Cause Analysis:
    Understand model behaviors and assist in the trace and root-cause analysis of identified defects or performance degradations
  • Reporting & Performance Metrics:
    Clearly document, track, and communicate performance metrics, validation results, and bug status to the broader development and product teams
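To make the evaluation-pipeline responsibilities above concrete, here is a minimal, illustrative sketch of an automated evaluation harness. It is not Veeva's actual tooling: `EvalCase`, `fake_agent`, and `run_eval` are hypothetical names, and the canned-response agent stands in for a real LLM or agent call.

```python
from dataclasses import dataclass

@dataclass
class EvalCase:
    prompt: str
    expected: str  # reference string the agent's output should contain

def fake_agent(prompt: str) -> str:
    """Stand-in for a real AI Agent call (hypothetical canned responses)."""
    canned = {
        "What is the capital of France?": "The capital of France is Paris.",
        "List one LLM failure mode.": "Hallucination is a common failure mode.",
    }
    return canned.get(prompt, "I don't know.")

def run_eval(cases, agent) -> dict:
    """Run every case through the agent; pass if the expected string
    appears (case-insensitively) in the output. Returns summary metrics."""
    results = [(c, agent(c.prompt)) for c in cases]
    passed = sum(1 for c, out in results if c.expected.lower() in out.lower())
    return {
        "total": len(cases),
        "passed": passed,
        "pass_rate": passed / len(cases) if cases else 0.0,
    }

cases = [
    EvalCase("What is the capital of France?", "Paris"),
    EvalCase("List one LLM failure mode.", "hallucination"),
    EvalCase("Name a Veeva value.", "Customer Success"),  # stub cannot answer this
]
report = run_eval(cases, fake_agent)
```

In a real pipeline, the substring check would be replaced by the richer metrics the role describes (factual accuracy, coherence, safety), and the harness would run continuously against new model and feature releases to catch regressions.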
Requirements
  • Data Integrity & Validation:
    A strong, specialized understanding of data quality principles, including methods for validating datasets against bias, integrity concerns, and quality standards. Ability to craft diverse and adversarial test data to uncover AI edge cases
  • Prompt Engineering & Model Expertise:
    Demonstrated skill in advanced prompt engineering techniques to create evaluation scenarios that test the AI's reasoning, action planning, and adherence to system instructions. Deep knowledge of common LLM failure modes (hallucination, incoherence, jailbreaking)
  • Automated Evaluation Implementation:
    5+ years of experience designing and deploying automated evaluation pipelines to assess complex, agentic AI behaviors. Familiarity with quality metrics such as task success rate, semantic similarity, and sentiment analysis for output measurement
  • Debugging Agentic Systems:
    Must be comfortable with the specific challenges of debugging agentic systems, including tracing and interpreting an agent's internal reasoning, tool use, and action sequence to pinpoint failure points
  • Programming & Frameworks:
    5+ years of experience using Python to develop custom evaluation frameworks, write scripts, and integrate pipelines…
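As a toy illustration of one quality metric named above, the sketch below scores a candidate output against a reference answer using token-overlap (Jaccard) similarity. This is a deliberately simple stand-in for semantic similarity; production evaluations would typically use embedding-based measures. All names here are hypothetical.

```python
import re

def tokenize(text: str) -> set:
    """Lowercase word tokens; a deliberately simple tokenizer."""
    return set(re.findall(r"[a-z0-9']+", text.lower()))

def jaccard_similarity(a: str, b: str) -> float:
    """Token-overlap score in [0, 1]; 1.0 means identical token sets."""
    ta, tb = tokenize(a), tokenize(b)
    if not ta and not tb:
        return 1.0
    return len(ta & tb) / len(ta | tb)

reference = "Paris is the capital of France"
candidate = "The capital of France is Paris"
score = jaccard_similarity(reference, candidate)  # same token set -> 1.0
```

A threshold on such a score (e.g., flag outputs below 0.5 against the reference) is one cheap way to surface candidate regressions for manual review.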