Agent Evals Specialist (Knowledge Graph Review)

Prox

CONTRACTOR · Remote · US · San Francisco, CA, US · USD 1 / month · Posted: 2026-05-11 · Until: 2026-07-10


Job Description

A big part of Prox is AI agents that process complex technical documents into structured knowledge. The agents are right most of the time. When they're wrong, we need you to catch it.

You'll work inside a review platform we built. Each task shows you the source material, what the agent produced, and the steps it took to get there. You compare them and grade the agent's work.

What You'll Juggle

- Read the source and the agent's output side by side. Verify the content was captured accurately.
- Review what the agent did: what it created, changed, or left out.
- Score a short rubric covering accuracy, coverage, organization, and rule adherence. Full rubric provided at onboarding.
- Write detailed feedback about any mistakes you find. This is the most important thing you produce, since we use it to improve the agent.
- Submit. Move to the next task.

Conditions

- Subject matter shifts over time. You don't need prior knowledge of the subjects. You need to be able to compare two documents carefully and spot where they disagree.
- Rate is fixed for the engagement. If it changes, it goes up, and we tell you before your next task.
- Work product is owned by Prox (work-for-hire).
- Standard NDA at offer stage.

Skills

Required

- Strong written English
- Can read dense technical content for hours without losing focus
- Consistent scoring and clear, specific feedback
- Reliable on committed hours

Preferred

- Prior AI trainer/evaluator experience (Outlier, DataAnnotation, xAI, Surge, Mercor, Invisible, Toloka)
- Technical writing, editing, QA, translation, paralegal, or research background