Verify request & input roles

A POST /v1/verify body is a rubric_id plus a submission, optionally with options such as wait_ms. A submission takes one of three shapes: {inline} text, a {uri} reference, or the typed roles object described below. The typed shape is purely additive: it adds roles and does not change or replace the {inline} and {uri} shapes.

In the typed shape, output is the only required role; every other role is optional. The judge receives each role labeled, so it knows which blob is the work, which is the authoritative source, and which is merely supporting.
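
The three shapes and the required-output rule can be sketched as a client-side sanity check. This is only an illustration: submission_shape is a hypothetical helper, not part of the API, and the order in which it tests the keys is an assumption, since the text above does not say how the service treats a submission that mixes shapes.

```python
LEGACY_ALIASES = ("content", "text")  # legacy aliases for the output role


def submission_shape(submission: dict) -> str:
    """Return 'inline', 'uri', or 'typed' for a submission dict.

    Hypothetical client-side check; the precedence between shapes is an
    assumption, not documented service behaviour.
    """
    if "inline" in submission:
        return "inline"
    if "uri" in submission:
        return "uri"
    # Typed shape: output (or a legacy alias) is the only required role.
    if "output" in submission or any(k in submission for k in LEGACY_ALIASES):
        return "typed"
    raise ValueError("typed submission is missing the required output role")
```

For example, submission_shape({"output": "x", "context": []}) returns "typed", while a dict holding only context raises ValueError.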

Faithfulness (the headline): output + context

Send the work as output and its source as context. The judge checks the output against the context and flags claims that are not grounded in it — the core use case for summaries, RAG answers, and extractions.

The roles

output · required

The thing being judged — the work under verification. Every typed submission needs output (or its legacy alias, content or text).

curl -s -X POST "$API/v1/verify" \
  -H "Authorization: Bearer $KEY" -H "Content-Type: application/json" \
  -d '{"rubric_id":"'"$RUBRIC"'","submission":{"output":"The answer under verification."}}'

context

The AUTHORITATIVE source the output is judged AGAINST. Faithfulness/grounding criteria treat claims that are ungrounded in, or contradicted by, the context as violations. The judge is told which blob is the source of truth.

curl -s -X POST "$API/v1/verify" \
  -H "Authorization: Bearer $KEY" -H "Content-Type: application/json" \
  -H "Idempotency-Key: roles-faithfulness-1" \
  -d '{"rubric_id":"'"$RUBRIC"'","submission":{"output":"Q2 revenue rose 12% to $4.2M and churn fell to 1.8%.","context":[{"label":"source","value":"Q2 results: revenue grew 12% quarter-over-quarter to $4.2M. Monthly churn improved from 2.4% to 1.8%."}]},"options":{"wait_ms":45000}}'
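
When the output or source text contains quotes or newlines, hand-writing the JSON inside a shell string gets fragile. The same faithfulness request can be assembled programmatically; the following is a minimal sketch using only the Python standard library. The function name build_verify_request and its parameters are hypothetical, and the request is built but not sent.

```python
import json
import urllib.request


def build_verify_request(api_base, api_key, rubric_id, output, source_text,
                         idempotency_key=None, wait_ms=None):
    """Build (but do not send) a POST /v1/verify faithfulness request."""
    body = {
        "rubric_id": rubric_id,
        "submission": {
            "output": output,
            # A single context item, labeled "source" as in the curl example.
            "context": [{"label": "source", "value": source_text}],
        },
    }
    if wait_ms is not None:
        body["options"] = {"wait_ms": wait_ms}
    headers = {
        "Authorization": f"Bearer {api_key}",
        "Content-Type": "application/json",
    }
    if idempotency_key is not None:
        headers["Idempotency-Key"] = idempotency_key
    return urllib.request.Request(
        f"{api_base}/v1/verify",
        data=json.dumps(body).encode("utf-8"),  # json.dumps handles escaping
        headers=headers,
        method="POST",
    )
```

Passing the returned object to urllib.request.urlopen would send it; the shape of the response is not covered in this section, so it is not parsed here.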

input

What was ASKED — the question, instruction, or task the output answered. Lets a criterion judge appropriateness/completeness against the request.

curl -s -X POST "$API/v1/verify" \
  -H "Authorization: Bearer $KEY" -H "Content-Type: application/json" \
  -d '{"rubric_id":"'"$RUBRIC"'","submission":{"input":"What is the refund window, and do disputes affect it?","output":"Refunds are accepted within 30 days of delivery; an open dispute pauses that clock until resolution."}}'

reference

The EXPECTED / gold answer to compare against, when you have one.

curl -s -X POST "$API/v1/verify" \
  -H "Authorization: Bearer $KEY" -H "Content-Type: application/json" \
  -d '{"rubric_id":"'"$RUBRIC"'","submission":{"output":"The capital of France is Paris.","reference":"Paris"}}'

evidence

Other supporting material. Rendered NON-authoritative — the judge weighs it but does not treat it as ground truth (that is what context is for).

curl -s -X POST "$API/v1/verify" \
  -H "Authorization: Bearer $KEY" -H "Content-Type: application/json" \
  -d '{"rubric_id":"'"$RUBRIC"'","submission":{"output":"The customer was refunded in full on 2026-05-14.","evidence":[{"label":"support-ticket","value":"Ticket #8821: refund of $42.00 issued 2026-05-14."}]}}'

values

Structured fields for deterministic numeric/JSON checks (the rubric’s deterministic criteria read these).

curl -s -X POST "$API/v1/verify" \
  -H "Authorization: Bearer $KEY" -H "Content-Type: application/json" \
  -d '{"rubric_id":"'"$RUBRIC"'","submission":{"output":"{\"invoice_number\":\"INV-48213\",\"total\":1250.5}","values":{"total":1250.5,"currency":"USD"}}}'
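
One way to keep output and values from drifting apart is to derive both from the same record on the client. The sketch below assumes that pattern; invoice_submission and checked_fields are hypothetical names, and nothing here implies the service requires values to mirror output.

```python
import json


def invoice_submission(record: dict, checked_fields: list[str]) -> dict:
    """Build a typed submission whose values are copied from the same record
    that is serialized into output (a client-side convention, assumed here)."""
    missing = [f for f in checked_fields if f not in record]
    if missing:
        raise KeyError(f"record has no field(s): {missing}")
    return {
        # output carries the JSON document as a string, as in the curl example.
        "output": json.dumps(record),
        # values carries the structured fields the deterministic criteria read.
        "values": {f: record[f] for f in checked_fields},
    }
```

Serializing with json.dumps also avoids the hand-escaped \" sequences needed when the JSON string is written inline.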

Backward compatibility

The roles object is additive. The pre-existing submission shapes keep working exactly as before, and the typed shape requires nothing beyond output or its alias.

{inline} — unchanged

A plain text submission. Exactly as before.

-d '{"rubric_id":"'"$RUBRIC"'","submission":{"inline":"INV-4821 total $1,250.50"}}'

legacy content / text → output

If output is absent but the legacy content (or text) field is present, it is treated AS output. Existing callers need no change.

-d '{"rubric_id":"'"$RUBRIC"'","submission":{"content":"The answer text, via the legacy content field."}}'
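
The fallback rule can be pictured as a simple lookup. This is a sketch of the described behaviour, not the service's code: resolve_output is a hypothetical name, and checking content before text is an assumption, since the text above does not say which alias wins when both are present.

```python
def resolve_output(submission: dict):
    """Return the value treated as output, or None if no candidate is present.

    output wins when present; otherwise a legacy alias is used. The
    content-before-text order is an assumption made for this sketch.
    """
    for field in ("output", "content", "text"):
        if field in submission:
            return submission[field]
    return None
```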

{uri} — unchanged

A stored-by-reference submission (https:// or r2://). Never fetched on the request path.

-d '{"rubric_id":"'"$RUBRIC"'","submission":{"uri":"https://example.com/artifacts/answer.txt"}}'
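
Because the URI is not fetched on the request path, a malformed reference may not surface immediately. A caller can check the scheme before submitting; the following is a minimal sketch using the standard library. is_supported_uri is a hypothetical helper, and requiring a non-empty host or bucket part is an assumption beyond the two schemes named above.

```python
from urllib.parse import urlparse

ALLOWED_SCHEMES = {"https", "r2"}  # the two schemes named in this section


def is_supported_uri(uri: str) -> bool:
    """Return True if uri uses https:// or r2:// and names a host or bucket."""
    parsed = urlparse(uri)
    return parsed.scheme in ALLOWED_SCHEMES and bool(parsed.netloc)
```

The r2:// path layout (bucket, then key) is not described here, so the check looks only at the scheme and the first component.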

When the judge is unsure

Each criterion returns a calibrated confidence. When that confidence falls below the acceptance threshold, the judge does not guess: the criterion is returned flagged as needs_review, and the caller decides what to do with it. A missing, insufficient, or contradictory source yields an honest abstention rather than a confident verdict. The signed proof captures exactly what was judged and the context it was judged against, so any verdict can be re-verified independently.
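
On the caller's side, handling abstentions usually means separating decided criteria from those that need a human. The sketch below shows that triage under loud assumptions: the response format is not specified in this section, so the per-criterion dicts and the "flag" key are hypothetical and should be replaced with the real field names.

```python
def split_for_review(criteria: list[dict], flag_key: str = "flag"):
    """Partition criterion results into (decided, needs_review).

    The dict layout and the default flag_key are hypothetical; only the
    needs_review marker itself comes from the text above.
    """
    decided, needs_review = [], []
    for result in criteria:
        if result.get(flag_key) == "needs_review":
            needs_review.append(result)
        else:
            decided.append(result)
    return decided, needs_review
```

A caller might route the second list to a review queue and act automatically only on the first.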