Senior Software Engineer - AI Interaction Evaluator (Codex / Claude Code)

Job in Miami, Miami-Dade County, Florida, 33222, USA
Listing for: G2i
Full Time position
Listed on 2026-06-08
Job specializations:
  • Software Development
    Software Engineer, AI Engineer (Applied/Software)
Salary/Wage Range or Industry Benchmark: 200.00 USD per hour
Job Description & How to Apply Below
Position: Senior Software Engineer - AI Interaction Evaluator (Codex / Claude Code, up to $200/hr)
Senior AI Interaction Evaluator (Codex / Claude Code)

Contract | $50-200/hr | 10+ hrs/week | Project-based

These roles are currently filled, but we hire on a rolling basis as new projects open up. Apply now to join our talent bench; we will contact qualified candidates directly when a matching role becomes available. Expect 40+ hrs once a project starts; timing depends on availability, but we move people in at the earliest genuine opportunity.

Check out this Loom video for more details!

We're looking for highly experienced software engineers (senior level or above) to help evaluate the quality of interactions with modern coding agents such as OpenAI Codex and Claude Code.

This is not a traditional engineering role.

You won't be writing production code.
You'll be evaluating something harder: whether the model thinks like a great engineer.

What This Role Actually Is

You will assess how AI coding agents behave in real-world scenarios - focusing on:
  • Whether the response makes sense
  • Whether the preamble and reasoning are useful
  • Whether the output reflects strong engineering judgment
  • Whether the interaction feels right to an experienced developer
This role is about engineering taste - not syntax correctness.

What You'll Be Doing
  • Evaluate AI-generated coding interactions end-to-end
  • Judge whether outputs are:
    • Useful
    • Correct (at a high level)
    • Aligned with how a strong engineer would think
  • Assess the quality of explanations and reasoning, not just code
  • Distinguish between different levels of response quality (e.g. what makes something a 2 vs. a 4)
  • Provide clear, opinionated feedback on:
    • What worked
    • What didn't
    • What felt "off" or misleading
  • Help define what great looks like when interacting with tools like Cursor
What We Mean by "Taste"

We're specifically looking for engineers who can answer questions like:
  • Does this feel like something a strong engineer would actually say?
  • Is this explanation helpful, or just technically correct?
  • Is the model guiding the user well, or just dumping output?
  • Would this interaction build or erode trust?
You should be comfortable making subjective but rigorous judgments.

Who You Are
  • Staff / Principal-level engineer (or equivalent experience)
  • Strong background in one of the following:
    • TypeScript / JavaScript
    • Python
  • Hands-on experience using:
    • OpenAI Codex
    • Claude Code
    • Cursor
  • Deep familiarity with modern AI-assisted dev workflows
  • Able to evaluate code without needing to fully execute or deeply review every line
  • Comfortable giving direct, opinionated feedback
  • High bar for what "good engineering" looks like
Nice to Have
  • Experience with tools like Cursor or similar AI-first IDEs
  • Prior exposure to prompt design or evaluation workflows
  • Experience mentoring senior engineers or defining engineering standards
Engagement Details
  • US and Canada up to $200/hr
  • EU and Latam up to $150/hr
  • Other locations up to $100/hr
  • Hours: ~10-20 hours/week
  • Duration: Through early May (with possible extension)
  • Start: ASAP
  • Process:
    • Take-home evaluation exercise
    • One behavioral interview
Position Requirements
10+ years of work experience