Generative AI Evaluator | Remote
Remote / Online - Candidates ideally in Tempe, Maricopa County, Arizona, 85280, USA
Listed on 2026-06-07
Listing for:
Crossing Hurdles
Remote/Work from Home position
Job specializations:
- IT/Tech: Data Scientist, Data Analyst
- Quality Assurance - QA/QC: Data Analyst
Job Description & How to Apply Below
Type: Hourly contract
Compensation: $20–$30/hour
Location: Remote
Commitment: 10–40 hours/week
Role Responsibilities
- Evaluate outputs from large language models and autonomous agent systems using defined rubrics and quality standards.
- Review multi-step agent workflows, including screenshots and reasoning traces, to assess accuracy and completeness.
- Apply benchmarking criteria consistently while identifying edge cases and recurring failure patterns.
- Provide structured, actionable feedback to support model refinement and product improvements.
- Participate in calibration sessions to ensure consistent evaluation alignment across reviewers.
- Adapt to evolving guidelines and ambiguous scenarios with sound judgment.
- Document findings clearly and communicate insights to relevant stakeholders.
Requirements
- Strong experience in LLM evaluation, AI output analysis, QA/testing, UX research, or similar analytical roles.
- Proficiency in rubric-based scoring, benchmarking frameworks, and AI quality assessment.
- Excellent attention to detail with strong decision-making skills in ambiguous cases.
- Proficient English communication skills (written and verbal).
- Ability to work independently in a remote environment.
- Comfortable committing to structured evaluation workflows and evolving guidelines.