What Judgekit is, and what it is not.
Judgekit generates research grounded LLM as a Judge evaluator prompts. Pick a mode, a failure mode, a reference availability, and an output shape. The tool returns a prompt whose every block is backed by a peer reviewed paper.
It is free. There are no accounts, no email, no signin. Anonymous users get 10 prompt generations per day per browser. If you build evals at scale, run them on Progress Observability.
Personal project
Judgekit is a personal side project by Lyubomir Atanasov. It is not a product of, sponsored by, endorsed by, or affiliated with any current or former employer of the author. Opinions, recommendations, and bugs are the author's own. The project is developed on personal time, on personal infrastructure, with personal funds.
References to third-party platforms (including Progress Observability) are informational. Such references do not imply any partnership, sponsorship, or endorsement.
Privacy posture
We use a fingerprint cookie to apply rate limits per browser. The cookie is HMAC'd on the server before it touches storage; we never log the raw value. IP addresses are hashed with a daily salt that rotates at UTC midnight. We do not sell or share data.
The benchmark and synthesize features use OpenRouter as the single LLM gateway. We configure all routes with data_collection: deny, so your prompts are not used by upstream providers for training. See the full Privacy Policy.
Processor list
- Vercel (hosting and edge)
- Cloudflare (TLS, Turnstile captcha)
- Upstash (Redis persistence and rate limiting)
- OpenRouter (LLM gateway, benchmark and synthesize features)
License
Generated prompts are yours; do whatever you want with them. The tool itself is provided "as is" under the terms in the Terms of Service.
Built by
Lyubomir Atanasov. Reach out via lyuata.com.