Applied Artificial Intelligence Researcher (Benchmarking)

Job in New York, New York County, New York, 10261, USA
Listing for: Distyl
Full Time position
Listed on 2026-05-29
Job specializations:
  • Research/Development
    Data Scientist, Research Scientist
Salary/Wage Range or Industry Benchmark: 60000 USD Yearly
Job Description & How to Apply Below
Position: Applied Artificial Intelligence Researcher (Benchmarking)
Location: New York

Requirements

  • At Distyl we’re pushing the envelope of AI utilization in enterprise. This requires creative researchers who don’t just want to drive incremental improvements on benchmarks or optimize an existing process but instead are looking to creatively redefine how software is used
  • Experience Designing and Running Evaluations:
    You’ve built or maintained benchmarks, test suites, or experimental frameworks to measure model or system performance
  • Statistical and Analytical Rigor:
    You design fair, reproducible experiments and can extract signal from noisy empirical results
  • Experience Building with Models, Not Just Building Models:
    We develop intelligent systems using models rather than training or fine-tuning them. Ideal candidates have expertise in compound AI systems, agentic collaboration, and associated techniques (ensembling, ReAct, graph-of-thoughts, etc.)
  • Proven Track Record of Research Results:
    Whether you’ve published in top journals, posted amazing work on Twitter, or shared it somewhere else, we want to see what you've done
  • Uses AI Every Day:
    Before you can revolutionize someone else’s workflow, you need to revolutionize yours. You should be using tools like ChatGPT, Cursor, and Perplexity to accelerate your workflow
  • Strong Programming and Data Analysis Skills:
    While you might not consider yourself a software engineer, you need to be able to build prototypes of your ideas and then run the experiments that prove their effectiveness to an F500 Head of AI
  • Bias Towards Showing vs Telling:
    Our customers want to see the power of AI today vs discuss the most elegant idea that will take 5 years to realize
What the job involves
  • Our researchers come from many academic backgrounds but have strong research track records, operate in an AI-native way, and would be bored staying on the rails of a traditional research org
  • The Benchmarking team defines how progress is measured. Researchers design evaluation frameworks that capture reasoning depth, interaction quality, reliability, and operational impact
  • They construct benchmarks that reflect real-world complexity. Their systems become the standard by which new architectures, techniques, and releases are judged
  • Researchers in Benchmarking explore new paradigms for evaluating intelligent systems: adversarial robustness testing, longitudinal performance tracking, and human-in-the-loop assessment
  • They investigate how metrics shape model behavior and establish rigorous methodologies for quantifying emergent capability. Their insights drive both Distyl’s internal research priorities and industry-wide standards