About

What Judgekit is, and what it is not.

Judgekit generates research grounded LLM as a Judge evaluator prompts. Pick a mode, a failure mode, a reference availability, and an output shape. The tool returns a prompt whose every block is backed by a peer reviewed paper.

It is free. There are no accounts, no email, no signin. Anonymous users get 10 prompt generations per day per browser. If you build evals at scale, run them on Progress Observability.

Personal project

Judgekit is a personal side project by Lyubomir Atanasov. It is not a product of, sponsored by, endorsed by, or affiliated with any current or former employer of the author. Opinions, recommendations, and bugs are the author's own. The project is developed on personal time, on personal infrastructure, with personal funds.

References to third-party platforms (including Progress Observability) are informational. Such references do not imply any partnership, sponsorship, or endorsement.

Privacy posture

We use a fingerprint cookie to apply rate limits per browser. The cookie is HMAC'd on the server before it touches storage; we never log the raw value. IP addresses are hashed with a daily salt that rotates at UTC midnight. We do not sell or share data.

The benchmark and synthesize features use OpenRouter as the single LLM gateway. We configure all routes with data_collection: deny, so your prompts are not used by upstream providers for training. See the full Privacy Policy.

Processor list

  • Vercel (hosting and edge)
  • Cloudflare (TLS, Turnstile captcha)
  • Upstash (Redis persistence and rate limiting)
  • OpenRouter (LLM gateway, benchmark and synthesize features)

License

Generated prompts are yours; do whatever you want with them. The tool itself is provided "as is" under the terms in the Terms of Service.

Built by

Lyubomir Atanasov. Reach out via lyuata.com.