LLM as a Judge, prompt generator
Build a judge that actually agrees with humans.
Paste a trace and get a research-grounded judge (evaluator prompt) with drop-in code and a 3-judge stress test. Free, no signup.
Paste an existing trace, span, or system prompt.
We pre-fill the wizard below from what you paste. You review and edit.
or build manually
What are you evaluating?
Pointwise scores one response. Pairwise compares two, which is only useful if you A/B test.
Mode
Template
This judge will read {{input}} and {{output}}. Progress auto-detects these on import. No variable mapping needed.
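For readers wiring the template up by hand, here is a minimal sketch of how the {{input}} and {{output}} placeholders could be filled. The `render` helper and the sample values are hypothetical and not part of this tool. It assumes placeholders use the double-brace form shown above.

```python
import re

# Matches {{name}} with optional inner whitespace (assumed placeholder syntax).
PLACEHOLDER = re.compile(r"\{\{\s*(\w+)\s*\}\}")

def render(template: str, variables: dict[str, str]) -> str:
    """Replace each {{name}} with variables[name]; raise if a name is missing."""
    def substitute(match: re.Match) -> str:
        name = match.group(1)
        if name not in variables:
            raise KeyError(f"missing template variable: {name}")
        return variables[name]
    return PLACEHOLDER.sub(substitute, template)

# Hypothetical example values, for illustration only.
template = "Input:\n{{input}}\nOutput:\n{{output}}"
prompt = render(template, {"input": "What is 2 + 2?", "output": "4"})
```

Raising on a missing variable, rather than leaving the placeholder in place, keeps a half-filled prompt from reaching the judge unnoticed.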
Live preview
Evaluator Prompt
You are an impartial evaluator. Decide whether a single response satisfies one criterion.
Criterion: faithfulness.
pass: Every factual claim in the response is supported by the reference. No invented entities, numbers, or quotes.
fail: The response contains at least one claim that is not supported by, or directly contradicts, the reference.
Procedure:
1. Enumerate every factual claim in the response: named entities, numbers, dates, quotes, attributions, and definitive statements about the world.
2. For each claim, locate the specific span in the reference that supports it.
3. If a claim has no supporting span, mark it unsupported. If the reference contradicts a claim, mark it contradicted.
4. Direct paraphrases of explicit reference content are acceptable. Speculative leaps and additions are not.
5. Pass only if every claim is either directly supported or a faithful paraphrase. A single unsupported or contradicted claim fails.
6. Decide pass or fail.
Ground your verdict in the reference, not in your own world knowledge. If the reference is silent on a claim, treat it as unsupported. Do not fill gaps from what you already know.
Ignore length, formatting, and self-identification cues. Do not reward verbosity.
Now evaluate the following:
Input:
{{input}}
Output:
{{output}}
Reference:
{{reference}}
Briefly explain your reasoning in 2 to 5 sentences, then state your verdict.
Score Range Prompt
Use one word only: pass or fail.
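Because the judge is asked to explain its reasoning and then give a one-word verdict, downstream code has to pull that word out of free text. Here is a minimal sketch of one way to do so. The `parse_verdict` helper is hypothetical and not this tool's drop-in code. It assumes the verdict is the last standalone "pass" or "fail" in the reply.

```python
import re

def parse_verdict(judge_reply: str) -> str:
    """Return 'pass' or 'fail' from the last standalone verdict word in the reply."""
    # \b keeps words such as "fails" or "passes" in the reasoning from matching.
    words = re.findall(r"\b(pass|fail)\b", judge_reply.lower())
    if not words:
        raise ValueError("no pass/fail verdict found in judge reply")
    # The reasoning comes first, so the final match is taken as the verdict.
    return words[-1]

# Hypothetical judge reply, for illustration only.
reply = (
    "The response cites a date the reference never mentions. "
    "That claim is unsupported.\nFail"
)
verdict = parse_verdict(reply)
```

Taking the last match rather than the first is a design choice: the explanation may mention "pass" while arguing for a fail, and the verdict is requested after the reasoning.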