Academic Researcher
Listed on 2026-06-03
Software Development
Data Scientist, AI Engineer
Join a leading AI lab's cutting-edge GenAI team to be at the core of the AI revolution, where your expertise fuels the development of the most advanced Large Language Models.
This is a W2 employment position with Cincinnatus LLC, with the opportunity to be placed at a leading AI lab as part of its extended workforce.
1 Overview
We are seeking professors and PhD students across all academic disciplines, including STEM (ML, Coding, Data Science, CS, Physics, Mathematics, Engineering, Statistics) as well as professional and quantitative domains (Finance, Accounting, Economics, Law, Business), to contribute to a frontier-model evaluation project focused on coding and agentic workflows.
You’ll design and validate challenging benchmark tasks to help surface and diagnose reasoning and problem-solving gaps in a target model. The work centers on building robust, real-world tasks with executable Python tests and then analyzing model/agent behavior. All applicants are expected to have working proficiency in Python.
2 Key Responsibilities
- Design challenging, real-world, domain-specific problems drawn from your area of expertise (e.g., financial modeling, legal reasoning, econometrics, ML, coding, scientific computation) that serve as the foundation for agentic tasks. Each problem should target specific core-capability failures identified in a frontier AI model.
- Integrate the problems into an agentic development environment, preparing all necessary components using Python.
- Evaluate the target model’s performance on the tasks.
- Identify tasks where the target model fails to pass all tests, specifically classifying the failure as a logical reasoning failure.
3 Qualifications
- Current or retired professor, OR PhD student, in any of the following areas:
- STEM: ML, Coding, Data Science, CS, Physics, Mathematics, Engineering, Statistics, Biology, Chemistry
- Professional / Quantitative: Finance, Accounting, Economics, Law, Business
- Degree (or PhD in progress) from a top university in your field.
- Working proficiency in Python, applied in research, industry, GitHub projects, or coursework (not just theoretical familiarity).
- Ability to commit reliably to at least 30 hours/week on weekdays (i.e., at least 6 hours/day).
- Past experience in AI training, model evaluation, or data annotation is preferred.
- Basic ability to work independently and manage one’s time.
- Verbal and written communication skills, problem-solving skills, and interpersonal skills.
We do not discriminate based upon race, religion, color, national origin, sex (including pregnancy, childbirth, reproductive health decisions, or related medical conditions), sexual orientation, gender identity, gender expression, age, status as a protected veteran, status as an individual with a disability, genetic information, political views or activity, or any other legally protected characteristic. We are committed to providing reasonable accommodations for qualified individuals with disabilities and disabled veterans throughout the job application process.