Billable trust
Show clients your AI work got measurably better.
Before-and-after proof reports with scores, evidence, and version hashes you can hand to a stakeholder. Make "we improved it" a document, not a claim.
PROOF REPORT · sample
code-review skill
v3 · #a1f community
54
→
88
Reproducible evidence, not a claim.
A proof report is the artifact you hand over: score delta by dimension, the evidence behind each mark, and a version hash anyone can re-run. It sells reproducible evidence — never a certification. Sample shown below.
code-review skill
hash #a1f · re-runnable
Objective clarity
2/5
→
5/5
Output specification
2/5
→
4/5
Evaluation criteria
1/5
→
4/5
Evidence: v1 left “good” undefined; v2 adds an output contract and a stop rule. The eval case that v1 failed now passes.
A document, not a claim.
A proof report is the artifact you hand a stakeholder. It stands on its own and survives scrutiny because anyone can re-run it — the value is reproducible evidence, never a badge of authority.
Score delta by dimension
Before and after on each rubric dimension, so "better" is specific rather than a single headline number.
Evidence behind every mark
The finding that drove each score and the eval case that demonstrates the behaviour change.
A version hash
The exact artifact is identifiable, so the report points at something reproducible, not a moving target.
Re-runnable, not certified
We sell reproducible evidence — a client can run the same check and get the same result. We deliberately do not call it "certified."
Answers before you start.
A before-and-after for one instruction: the rubric score delta broken down by dimension, the evidence behind each finding, the eval case that demonstrates the behaviour change, and a version hash so the exact artifact is identifiable. A document a stakeholder can read, not a claim.
Know which instructions are ready to run.
Generate a proof reportFollow the review loop as it ships.
Notes on AI artifact testing, rubr_flow conversion, evals, and proof reports.