
Machine Learning Engineer - LLM Evaluation & Automation

Remote / Online - Candidates ideally in
Culver City, Los Angeles County, California, 90232, USA
Listing for: TEKsystems
Part Time, Contract, Remote/Work from Home position
Listed on 2026-05-26
Job specializations:
  • IT/Tech
    AI Engineer, Machine Learning/ ML Engineer
Salary/Wage Range or Industry Benchmark: 60 - 70 USD Hourly
Job Description & How to Apply Below

Overview

We are seeking a Machine Learning Engineer to join a high-impact team focused on advancing LLM evaluation, NLP, and AI-driven automation. This role centers on designing scalable evaluation frameworks, optimizing prompt strategies, and building systems that ensure high-quality, consistent model outputs across product domains. You will partner closely with product, engineering, and research teams to drive measurable improvements in AI performance.

This is a hands‑on role with a strong emphasis on LLM evaluation systems, prompt engineering, and data‑driven model optimization.

Job Details

Location: Culver City, CA (Hybrid with 3 days a week onsite)

Pay Rate: $60-70/hr (W2)

Job Type: Contract

Contract Length: 6 months

Experience Level: Mid-level to Senior

Key Responsibilities
  • Design and build LLM‑based evaluation frameworks, including automated scoring pipelines and rubric‑based grading systems
  • Build and maintain data pipelines for evaluation datasets using Python, SQL, and scalable processing tools
  • Translate complex evaluation results into clear, actionable insights for technical and non-technical stakeholders
  • Implement automation workflows and agentic evaluation systems to improve efficiency and reduce manual effort
  • Develop prompt engineering strategies to evaluate output quality, accuracy, and consistency
  • Create and maintain metrics, KPIs, and dashboards to track and communicate model performance
  • Conduct error analysis, root‑cause investigations, and quality deep dives to guide model improvements
  • Partner cross‑functionally to define evaluation methodologies and integrate them into production workflows
Must‑Have Qualifications
  • 5+ years of experience in ML engineering, NLP, or AI/ML automation
  • Strong programming skills in Python and SQL
  • Deep understanding of machine learning concepts with a focus on NLP and advanced LLM capabilities (e.g., Chain‑of‑Thought, agentic workflows)
  • Experience working with large‑scale datasets and data pipelines
  • Strong experience with LLM evaluation, prompt engineering, or auto‑grading systems
  • Experience developing metrics and KPIs to measure model output quality and consistency
Nice‑to‑Have
  • Experience with LLM‑as‑judge systems or human + model evaluation frameworks
  • Background in inter‑rater reliability, evaluation calibration, or judged systems design
  • Experience with PySpark or distributed data processing tools
  • Exposure to building dashboards or visualization tools for model performance tracking
Technical Skills

Python, SQL, NLP, LLM Evaluation, Prompt Engineering, Machine Learning, Data Pipelines, Automation Systems

NOTE

This posting is for an existing vacancy.

Pay and Benefits
  • $60.00 - $70.00/hr

Benefits (subject to eligibility and employment length) may include:

  • Medical, dental & vision
  • Critical Illness, Accident, and Hospital
  • 401(k) Retirement Plan – Pre‑tax and Roth post‑tax contributions available
  • Life Insurance (Voluntary Life & AD&D for employee and dependents)
  • Short and long‑term disability
  • Health Spending Account (HSA)
  • Transportation benefits
  • Employee Assistance Program
  • Time Off/Leave (PTO, vacation or sick leave)
Workplace Type

This is a fully remote position.

Final date to receive applications

Anticipated to close on Jun 3, 2026.

Equal Opportunity Employer

The company is an equal opportunity employer and will consider all applications without regard to race, sex, age, color, religion, national origin, veteran status, disability, sexual orientation, gender identity, genetic information or any characteristic protected by law.

San Francisco Fair Chance Ordinance

For all positions located in the city and county of San Francisco, we will consider qualified applicants with arrest and conviction records.

Massachusetts Lie Detector

It is unlawful in Massachusetts to require or administer a lie detector test as a condition of employment or continued employment. An employer who violates this law shall be subject to criminal penalties and civil liability.

Use of Artificial Intelligence (AI)

We may use Artificial Intelligence (AI) to support parts of our hiring process, including sourcing, screening, and evaluating candidates. AI helps assess applications and qualifications, but final decisions are made by our hiring team. By applying, you acknowledge and agree that your application may be reviewed using AI tools.
