Rubrkit
AI instruction grading instrument

Grade, rewrite, and test your AI instructions.

Grade prompts, agents, skills, commands, and workflows against a clear rubric — then turn the weak ones into testable, version-tracked instructions.
Clear objective
Bounded behavior
rubr_flow ready

Specimen RBR-082

Grade 82
EDITOR MARK

Undefined success criteria

The model can answer, but it has no reliable way to know what good means.


Objective clarity

4/5

Output specification

3/5

Evaluation criteria

2/5
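
On the specimen card, the editor mark ("Undefined success criteria") lines up with the lowest of the three marks shown. A minimal sketch of that triage step, assuming each mark is a score out of a maximum and that the weakest ratio is reviewed first; the function name and data shape are illustrative, not Rubrkit's actual scoring logic:

```python
from fractions import Fraction

# Marks copied from the specimen card; each value is (score, maximum).
specimen_marks = {
    "Objective clarity": (4, 5),
    "Output specification": (3, 5),
    "Evaluation criteria": (2, 5),
}


def weakest_dimension(marks):
    """Return the dimension with the lowest score-to-maximum ratio.

    Hypothetical helper: picking the lowest ratio is an assumption about
    how a reviewer might prioritize, not a documented Rubrkit rule.
    """
    return min(marks, key=lambda name: Fraction(*marks[name]))


flagged = weakest_dimension(specimen_marks)
```

Here `flagged` names the dimension to rewrite first. The card shows only three of the rubric's dimensions, so this says nothing about how the overall grade is computed.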

9-dimension rubric

Every audit scores nine dimensions, including objective, context, constraints, output, and verification.

Web, CLI & MCP

Run the same audits from the app, the rubrkit npm CLI, or any MCP client.

rubr_flow

An open, documented format for bounded, testable agent procedures.

Free at launch

Credit-metered so every run shows why it costs what it does.

How it works

An editorial loop for instructions that need to hold.

Paste an instruction

Add a prompt, command, skill, agent spec, workflow, or rubr_flow block.

Get a scored critique

Rubrkit grades clarity, context, constraints, output shape, and evaluation criteria.

Rewrite with checks

Get a stronger version plus simple evals that prove whether it works.

What Rubrkit reviews

One grading system for the instructions AI teams actually reuse.

Rubrkit detects the artifact type and applies the rubric that matches how it should behave.

Prompt
Agent
Command
Skill
Workflow

Generating more text is not the point. Rubrkit shows you what is weak, why it matters, how to fix it, and whether the fix can survive a real eval.

Example before / after

From vague request to testable instruction.

Before · 46/100

Write a professional launch email for my new course and make it engaging.

After · 84/100

Write a launch email for [TARGET AUDIENCE] that drives [PRIMARY GOAL]. Use a clear subject line, three short sections, one CTA, and avoid claims that are not supported by [CONTEXT].
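
The bracketed slots in the rewritten instruction only help if every one is filled before the instruction runs. A minimal sketch of that check, assuming placeholders are uppercase words in square brackets as in the example; the `fill` helper and the sample values are hypothetical:

```python
import re

# Matches slots such as [TARGET AUDIENCE]; the pattern is an assumption
# based on the three placeholders in the example above.
PLACEHOLDER = re.compile(r"\[([A-Z][A-Z ]*)\]")

TEMPLATE = (
    "Write a launch email for [TARGET AUDIENCE] that drives [PRIMARY GOAL]. "
    "Use a clear subject line, three short sections, one CTA, and avoid "
    "claims that are not supported by [CONTEXT]."
)


def fill(template, values):
    """Substitute every placeholder, refusing to run with any left unfilled."""
    needed = {match.group(1) for match in PLACEHOLDER.finditer(template)}
    missing = sorted(needed - set(values))
    if missing:
        raise KeyError("unfilled placeholders: " + ", ".join(missing))
    return PLACEHOLDER.sub(lambda match: values[match.group(1)], template)


# Illustrative values only.
instruction = fill(
    TEMPLATE,
    {
        "TARGET AUDIENCE": "first-time course creators",
        "PRIMARY GOAL": "waitlist signups",
        "CONTEXT": "the attached course outline",
    },
)
```

Failing loudly on a missing slot keeps a half-filled template from reaching the model.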

rubr_flow

A native format for instructions an agent can actually follow.

For teams that need stricter control, rubr_flow turns loose intent into a compact procedure an agent can follow.

Explicit work order
Bounded edits
Pass/fail verification

Before

Intent without machinery

Review our onboarding flow and fix anything confusing.

After

rubr_flow procedure

TASK "Improve onboarding completion"
CONTEXT
  user is new to [PRODUCT]
  primary action is [TARGET ACTION]
INPUTS
  current_flow = app screens
  analytics_notes = drop-off data
  support_themes = user confusion reports
RULES
  change only copy and step order
  preserve required legal text
ON missing_context
  ASK user "Which onboarding detail is missing?" -> missing_detail
FLOW
  REVIEW each screen -> friction_notes
  RANK issues by user impact
  EDIT the highest-impact issue
OUTPUT
  changed_copy: final text
  rationale: why this improves completion
  risk_notes: constraints preserved
VERIFY
  PASS WHEN user can identify the next action in one pass

This is not another prompt type. It is the control surface for turning intent into bounded agent work.
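
To make the structure of the procedure above concrete, here is a rough sketch that groups a rubr_flow block into sections. It assumes only what the sample shows: unindented lines open a section and indented lines belong to it. This is not the official rubr_flow parser, and `split_sections` is a hypothetical name:

```python
# Excerpt of the procedure shown above.
EXCERPT = """\
TASK "Improve onboarding completion"
RULES
  change only copy and step order
  preserve required legal text
VERIFY
  PASS WHEN user can identify the next action in one pass
"""


def split_sections(block):
    """Group a rubr_flow block into {header line: [body lines]}.

    Assumption: indentation alone separates headers from bodies.
    """
    sections = {}
    current = None
    for raw in block.splitlines():
        if not raw.strip():
            continue  # skip blank lines
        if raw[0].isspace():
            if current is None:
                raise ValueError("indented line before any section header")
            sections[current].append(raw.strip())
        else:
            current = raw.strip()
            sections[current] = []
    return sections


sections = split_sections(EXCERPT)
has_verify = any(header.split()[0] == "VERIFY" for header in sections)
```

A check like `has_verify` is one way a team might reject procedures that lack a pass/fail condition before an agent ever runs them.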

Run an audit

Developers

Bring Rubrkit into your toolchain.

Sync graded artifacts into the repos where your agents run, or connect any MCP client to the same audits, bundles, and rubr_flow tools the web app uses.

CLI
rubrkit on npm

Pull approved artifact bundles into local projects and place them where Codex, Claude, or generic agents expect their instructions.

# Pull approved artifacts into your project
npx rubrkit pull

# Place them where your agent expects them
npx rubrkit pull all --agent claude --yes

MCP server
Public endpoint

Point any MCP client at Rubrkit and call the same artifact bundle, audit, and rubr_flow tools, authenticated with your Rubrkit API key.

{
  "mcpServers": {
    "rubrkit": {
      "url": "https://rubrkit.com/api/v1/mcp",
      "headers": {
        "Authorization": "Bearer <your-rubrkit-api-key>"
      }
    }
  }
}
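
Because the config above carries a bearer token, it may be safer to generate it than to commit it. A small sketch, assuming the key lives in an environment variable; the name `RUBRKIT_API_KEY` and the helper are hypothetical, while the URL and header shape are taken from the config above:

```python
import json
import os


def build_mcp_config(api_key):
    """Return the MCP client config shown above with the key filled in."""
    if not api_key:
        raise ValueError("missing Rubrkit API key")
    return {
        "mcpServers": {
            "rubrkit": {
                "url": "https://rubrkit.com/api/v1/mcp",
                "headers": {"Authorization": f"Bearer {api_key}"},
            }
        }
    }


def render_from_env(variable="RUBRKIT_API_KEY"):
    """Read the key from the environment (assumed variable name) and render JSON."""
    return json.dumps(build_mcp_config(os.environ.get(variable, "")), indent=2)
```

Calling `render_from_env()` returns JSON ready to paste into an MCP client's settings; where that file lives depends on the client.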
Pricing

Free now, built for serious review loops.

Pro and Team workflows are in preview while the Rubrkit review loop is being tuned.

Free

$0

For quick checks and first rewrites.

Limited audits

Basic score

Top 3 issues

One rewrite

Pro

Preview

For builders who reuse and test instructions.

Full rubric

Advanced rewrites

Eval kit generation

Version comparison

Saved library

Exports

Team

Preview

For shared standards and review workflows.

Team library

Shared rubric library

Admin controls

Private examples

Review reports

FAQ

Questions before you grade your first instruction.

Rubrkit is not a black-box rewriter. Every score points to a rubric dimension with evidence. You get a marked-up critique, a stronger version, and eval checks that test whether the fix actually holds.

Know which instructions are ready to run.

Grade a specimen, read the marks, ship the rewrite that survives an eval.

Newsletter

Follow the review loop as it ships.

Notes on AI artifact testing, rubr_flow conversion, evals, and proof reports.