Rubrkit vs Promptfoo
Promptfoo is an open-source CLI for evaluating and red-teaming LLM apps: you define prompts and test cases in config files, then run assertions and security scans from your repo. Rubrkit grades instruction quality against a rubric, explains each mark, and produces a proof report that stakeholders can read. It covers agents, skills, and workflows, and it stays model-neutral. Choose Promptfoo for config-driven assertions and security red-teaming; choose Rubrkit for a falsifiable quality verdict you can hand to a non-engineer.
How Rubrkit and Promptfoo compare
| Dimension | Rubrkit | Promptfoo |
|---|---|---|
| Primary job | Grade instruction quality and prove the improvement | Run assertion-based evals and red-team tests from the CLI |
| Artifact types | Prompts, agents, skills, commands, workflows, and rubr_flow | Prompts and test cases defined in config |
| Quality model | Rubric score 0–5 per dimension with the evidence behind each mark | Pass/fail assertions you author per test case |
| Security red-teaming | Not a red-teaming tool | Built-in scans for prompt injection, PII, and jailbreaks |
| Stakeholder output | A shareable proof report a non-engineer can read | CLI output and reports aimed at developers |
| Ease of first signal | Grade an artifact against a ready rubric, with no config to write | Write a config and test cases before you get a result |
| CLI / CI | `npx rubrkit` plus CI quality gates | CLI-first and CI-friendly by design |
| Ownership / neutrality | Independent and model-neutral | OpenAI-owned since March 2026; core stays MIT and model-agnostic |
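
The "Quality model" row is the core difference, and a small sketch may make it concrete. The Python below is purely illustrative: `Mark`, `passes_assertion`, and `rubric_summary` are hypothetical names invented for this example and are not part of Rubrkit's or Promptfoo's actual API or config format. It contrasts a pass/fail assertion on one output with a 0–5 rubric mark per dimension that carries its evidence.

```python
from dataclasses import dataclass


@dataclass
class Mark:
    """One rubric mark (hypothetical shape, not Rubrkit's real schema)."""
    dimension: str
    score: int      # 0-5, as in the "Quality model" row
    evidence: str   # the reason behind the mark


def passes_assertion(output: str, must_contain: str) -> bool:
    """Assertion-style check: a single pass/fail answer per test case."""
    return must_contain.lower() in output.lower()


def rubric_summary(marks: list[Mark]) -> str:
    """Rubric-style result: one scored, explained line per dimension."""
    lines = [f"{m.dimension}: {m.score}/5 ({m.evidence})" for m in marks]
    average = sum(m.score for m in marks) / len(marks)
    lines.append(f"average: {average:.1f}/5")
    return "\n".join(lines)


marks = [
    Mark("clarity", 4, "states the task in the first sentence"),
    Mark("constraints", 2, "no output format specified"),
]

print(passes_assertion("The capital is Paris.", "paris"))  # True
print(rubric_summary(marks))
```

The assertion answers whether one output met one condition you wrote; the rubric summary answers how good the instruction is on each dimension and why, which is the part a non-engineer can read.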
Pick the tool that fits the job
Choose Rubrkit when
Your team wants a rubric-backed quality verdict and a readable proof report across prompts, agents, and skills, without writing a test config first.
Choose Promptfoo when
Your engineers want config-driven, repo-resident assertions and built-in security red-teaming, run from the CLI.
Promptfoo’s security red-teaming — prompt-injection, PII, and jailbreak scanning — is genuinely stronger than anything Rubrkit offers. If adversarial testing is your goal, Promptfoo is purpose-built for it and Rubrkit is not.