
Terminal Bench Expert

Remote / Online - Candidates ideally in 400001, Mumbai, Maharashtra, India
Listing for: MillionLogics
Full Time, Remote/Work from Home position
Listed on 2026-06-06
Job specializations:
  • IT/Tech
    Systems Engineer, AI Engineer
Salary/Wage Range or Industry Benchmark: INR 200000 per month
Job Description & How to Apply Below
Company Description
Million Logics, a trusted Oracle Partner, is a global IT solutions leader with a presence in London, UK, and a development hub in Hyderabad, India. Specializing in transformative technologies, the company empowers organizations through Data & AI services, Cloud migrations, and enterprise application optimization, with a strong focus on Oracle Cloud and database technologies. With a dedicated team of over 55 AI experts, Million Logics tailors cutting-edge IT solutions to drive tangible outcomes for clients.

Guided by a commitment to innovation and excellence, Million Logics delivers strategic IT consulting, custom application development, and security architecture solutions, among other offerings, to help businesses unlock their full potential. Discover more about their team and services at:

Role Description
This is a contract-based remote position for a Terminal Bench Expert. We are looking for highly analytical engineers, researchers, and domain specialists to contribute benchmark tasks for AI agent evaluation systems (e.g., Terminal-Bench). The role involves designing realistic, technically deep tasks that simulate real-world scenarios such as debugging, data corruption, infrastructure failures, and complex workflows.

Offer Details:
Mode of work: Fully Remote
Pay: INR 1.25 to INR 2 lakhs per month (net/take-home)
Duration: 12 months (likely to be extended)
Experience: 3-10 years
Number of positions: 28
Evaluations: 1 round of technical interview

What does the day-to-day look like?
Design high-quality Terminal-Bench task ideas and specifications.
Develop complex tasks requiring reasoning, investigation, and debugging.
Write clear task descriptions, solution approaches, and verification logic.
Define deterministic, outcome-based evaluation criteria.
Identify realistic failure modes, edge cases, and operational constraints.
Create tasks that challenge AI systems while remaining solvable by experts.
Collaborate with reviewers to refine task quality and difficulty.
Contribute expertise across one or more specialized domains.

Required Skills:

3–10 years of experience in software engineering or relevant domains.
Strong debugging, reasoning, and analytical skills.
Good understanding of system design, workflows, and dependencies.
Ability to analyze complex systems across multiple layers.
Experience with production systems, pipelines, or large-scale workflows.
Strong technical writing and documentation skills.
Exposure to LLMs, agentic systems, or AI evaluation frameworks.
Experience reviewing technical specifications or designing validation logic.

Additional Details:
Commitments Required: 40 hours per week, with an overlap of 4 hours with PST
Employment type: Contractor assignment (no medical/paid leave)

How to apply?

Please send us your updated CV to
with email subject: TERMINAL BENCH