ML Engineer, Dataset Curation Product

Job in Seattle, King County, Washington, 98127, USA
Listing for: Hyperparam Blog
Full Time position
Listed on 2025-12-08
Job specializations:
  • IT/Tech
    Data Scientist, AI Engineer, Machine Learning/ ML Engineer, Data Analyst
Salary/Wage Range or Industry Benchmark: 90000 - 130000 USD Yearly
Job Description & How to Apply Below
Hyperparam
ML Engineer – Dataset Curation Product
Seattle, WA · Full time

Work on research and development of tools and techniques for ML dataset curation

About Hyperparam

What is the key to building the most advanced AI models? Data quality.

This is a hybrid, in-person role in Seattle at a seed-stage startup. You would be one of the very first employees, working side-by-side with an experienced team building a new kind of dataset curation tool. The role requires the intense work ethic, dedication, creativity, and independence that an early-stage startup demands. For the right candidate, this is a unique opportunity to help build a company from the earliest idea stage to a product used by real customers.

Responsibilities:

  • Dataset Curation:
    Analyze, process, and clean large-scale datasets to ensure they meet quality and usability standards for machine learning applications.
  • Heuristic Development:
    Identify and design robust heuristics to filter, rank, and enhance datasets based on specific requirements.
  • Agent Development:
    Create and deploy intelligent agents that autonomously perform data cleaning, labeling, and curation tasks.
  • Quality Metrics:
    Define, track, and continuously improve dataset quality metrics, working towards tangible improvements in ML model performance.
  • Product Feedback:
    Collaborate closely with product and engineering teams, using the curation product to provide actionable feedback and prioritize enhancements.

You might be a great fit if you have:

  • Deep experience building products with LLMs. You should have experience using various APIs from Anthropic, OpenAI, etc., and be familiar with tool calls and other advanced API features.
  • Experience with Data:
    Strong proficiency in working with structured and unstructured data; hands-on experience with cleaning, processing, transforming, and evaluating ML datasets.
  • Algorithm Development:
    Proven ability to design and implement effective heuristics for data-related challenges.
  • Quality-Driven Mindset:
    Strong attention to detail and an obsession with "making the quality number go up" through iteration and experimentation.
  • Deep experience working with LLMs (e.g., GPT, Claude). If you aren’t using OpenAI and/or Anthropic models almost daily, it’s probably not a good fit.
  • Agent Creation:
    Iterative development of agentic systems to perform tasks. Experience with agentic frameworks like LangGraph, AutoGen, etc. is a plus.
  • Familiarity with active learning, synthetic data generation, or semi-supervised learning techniques.
  • Excellent problem-solving abilities and attention to detail.
  • Ability to operate independently in a small startup environment.
  • Passion for staying current with emerging technologies and best practices.

What We Offer:

  • Get in on the ground level of a funded seed-stage startup.
  • Work side-by-side with experienced entrepreneurs who care deeply about advancing AI.
  • A collaborative and close-knit work environment with a small team of highly motivated engineers, located in-person in Seattle.
  • Competitive salary, equity, and comprehensive benefits package.
  • Opportunity to work on groundbreaking projects in the ML and data visualization space.

The ideal candidate is deeply passionate about accelerating AI progress. You're excited by the potential of using LLMs as tools to improve the quality and efficiency of dataset curation, seeing this as a key lever for advancing ML capabilities. You think critically about dataset quality and how it impacts model performance, and you're motivated by the challenge of building automated systems that can help create better training data. You likely have hands-on experience with ML models and understand firsthand how dataset quality influences model behavior.

You don’t need to know frontend development, but you should be excited about the prospect of a more interactive, frontend-centric ML data platform. Most importantly, you're eager to work on systems that could help unlock the next generation of more capable AI models through better training data.
