Blog

Notes from the review loop.

How we think about grading AI instructions against a rubric, writing rewrites you can test, and versioning prompts, agents, and skills like the production code they are.

June 18, 2026 · 4 min read

Stop grading prompts on vibes

A prompt that "feels good" is not a prompt you can trust. Here is why we grade instructions against a rubric instead of a gut feeling.

Rubrics

Methodology

June 12, 2026 · 5 min read

Anatomy of a testable rewrite

A weak instruction, the rubric dimensions it fails, the rewrite that fixes them, and the eval that proves the rewrite holds.

Rewrites

Evals

June 5, 2026 · 4 min read

Version your prompts like code

Prompts, agents, and skills are production instructions. They deserve history, provenance, and a single source of truth — not a paste buried in a chat log.

Registry

Workflow