Validate Outcomes
Did you deliver what you said you would?
The gap between intention and reality is the honest error signal. The builder knows what they intended. The commissioner checks what actually shipped. These are never the same person.
Commissioning closes the feedback loop. Without it, the pipeline from pain to spec produces artifacts nobody verifies. Specs accumulate. Confidence erodes. The loop runs open.
Three credibility loops run through this pipeline. The inner loop (L1-L3) proves the code works. The story loop (L3 bridge) proves your predictions match your results. The market loop (L4) proves others validate with their behavior. Market credibility is the greatest force, but it can only land on a foundation where the inner and story loops are tight. The SPEC-MAP is the shared traceability artifact that keeps those two loops honest.
Fidelity Levels
| Common Language | Our Term | What It Means |
|---|---|---|
| Prototype | L0-L1 | Idea captured, code exists but unverified |
| Alpha | L2 | Core flow works, engineer verified |
| Beta | L3 | Independent commissioner verified against spec |
| Production | L4 | External users validate with their behavior |
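The mapping in the table can be written down as an ordered type, so that comparisons such as "is this at least Beta?" become explicit. This is only a sketch: the `Fidelity` enum and `common_name` helper are hypothetical names, not part of any existing tooling.

```python
from enum import IntEnum


class Fidelity(IntEnum):
    """L-levels from the table above (hypothetical encoding)."""
    L0 = 0  # idea captured
    L1 = 1  # code exists but unverified
    L2 = 2  # core flow works, engineer verified
    L3 = 3  # independent commissioner verified against spec
    L4 = 4  # external users validate with their behavior


# Common-language label for each level, as listed in the table.
COMMON_NAME = {
    Fidelity.L0: "Prototype",
    Fidelity.L1: "Prototype",
    Fidelity.L2: "Alpha",
    Fidelity.L3: "Beta",
    Fidelity.L4: "Production",
}


def common_name(level: Fidelity) -> str:
    """Translate an L-level into the common-language term."""
    return COMMON_NAME[level]
```

Because `IntEnum` members are ordered, a check like `level >= Fidelity.L3` reads directly as "independently commissioned or better".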
Dig Deeper
- Validate Internal Standards: Engineering checklist covering types, tests, performance gates, and security. Does the code meet its own contracts?
- Validate Results: L0-L4 commissioning protocol. Does the deployed capability match the PRD spec? Independent verification with evidence.
Commissioning Workflow
The commissioner reads. Then walks. Then records.
| Step | Action | Output |
|---|---|---|
| 1 | Read PRD and SPEC-MAP | Know what was promised |
| 2 | Open the deployed capability | See what exists |
| 3 | Walk every feature row | Happy path, error path, edge cases |
| 4 | Record evidence | Screenshot, GIF, console output, measurement |
| 5 | Update SPEC-MAP L-level | Gap, L0, L1, L2, L3, or L4 |
| 6 | Mark L4 if all features pass | Or document the gap and route it back |
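Steps 4 to 6 can be sketched as a small data model. The SPEC-MAP format is not defined here, so `FeatureRow`, `record`, and `gaps_below` are hypothetical, and treating "Gap" as the lowest rung of the scale is an assumption. The one rule the sketch enforces is that an L-level never changes without evidence attached.

```python
from dataclasses import dataclass, field

# Ordered scale from step 5; placing "Gap" below L0 is an assumption.
LEVELS = ("Gap", "L0", "L1", "L2", "L3", "L4")


@dataclass
class FeatureRow:
    """One hypothetical SPEC-MAP row."""
    feature: str
    level: str = "Gap"
    evidence: list = field(default_factory=list)


def record(row: FeatureRow, level: str, evidence: list) -> FeatureRow:
    """Steps 4-5: attach evidence, then update the row's L-level."""
    if level not in LEVELS:
        raise ValueError(f"unknown level: {level}")
    if not evidence:
        raise ValueError("an L-level change needs at least one piece of evidence")
    row.evidence.extend(evidence)
    row.level = level
    return row


def gaps_below(rows: list, target: str) -> list:
    """Step 6: features still below the target level, to be routed back."""
    floor = LEVELS.index(target)
    return [r.feature for r in rows if LEVELS.index(r.level) < floor]
```

For example, after `record(row, "L3", ["login-happy-path.gif"])`, calling `gaps_below(rows, "L3")` lists only the features that still need work.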
Authority
The commissioner is never the builder. Three powers:
| Power | When | Effect |
|---|---|---|
| HOLD | Feature fails spec | Blocks promotion until gap is closed |
| Re-spec | Spec was wrong, not build | Routes signal back to Dream Team |
| Kill | Effort exceeds value | Recommends PRD status change to STOP |
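A minimal sketch of the independence rule and the three powers follows. The names `Power`, `check_independence`, and `choose_power` are hypothetical. The precedence used when several conditions hold at once (Kill, then Re-spec, then HOLD) is an assumption, since the table does not rank the powers.

```python
from enum import Enum
from typing import Optional


class Power(Enum):
    HOLD = "HOLD"        # blocks promotion until the gap is closed
    RESPEC = "Re-spec"   # routes the signal back to the Dream Team
    KILL = "Kill"        # recommends PRD status change to STOP


def check_independence(builder: str, commissioner: str) -> None:
    """Reject a commissioning run where builder and commissioner coincide."""
    if builder.strip().lower() == commissioner.strip().lower():
        raise ValueError("the commissioner is never the builder")


def choose_power(fails_spec: bool, spec_was_wrong: bool,
                 effort_exceeds_value: bool) -> Optional[Power]:
    """Pick a power from the 'When' column; precedence is an assumption."""
    if effort_exceeds_value:
        return Power.KILL
    if spec_was_wrong:
        return Power.RESPEC
    if fails_spec:
        return Power.HOLD
    return None  # nothing to exercise: the feature passes
```

Comparing names as strings is a stand-in for whatever identity system a real pipeline would use.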
Cadence
| Trigger | Action |
|---|---|
| Feature reaches L3 | Commission in same cycle |
| L4 verification | Next cycle (independent commissioner) |
| 5 PRDs reach L2+ | Run integral calibration |
| Quarterly minimum | Full scoring recalibration regardless of volume |
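The two calibration triggers can be combined into one check. This sketch makes assumptions the table does not state: the PRD count is measured since the last calibration, a quarter is taken as 91 days, and the quarterly recalibration wins when both triggers fire. `calibration_due` is a hypothetical helper.

```python
from typing import Optional


def calibration_due(prds_at_l2_plus: int, days_since_last: int,
                    quarter_days: int = 91) -> Optional[str]:
    """Return which calibration to run, or None if neither trigger fired.

    prds_at_l2_plus: PRDs that reached L2 or higher since the last
        calibration (assumed counting window).
    days_since_last: days since the last full scoring recalibration.
    """
    if days_since_last >= quarter_days:
        # Quarterly minimum applies regardless of volume.
        return "full scoring recalibration"
    if prds_at_l2_plus >= 5:
        return "integral calibration"
    return None
```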
Return Signal
Commission finds gap → SPEC-MAP updated → Dream reads gap → spec evolves → next build cycle. This IS the VVFL Reflect station. Without it, the loop runs open.
| Gap Type | Owner | Action |
|---|---|---|
| Build gap (code doesn't match spec) | Engineering | Fix and re-deploy |
| Spec gap (spec was wrong) | Dream Team | Re-spec, re-score, possibly kill |
| Instrument gap (can't verify) | Platform | Build the MCP or tool needed |
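The routing table above is small enough to encode directly. The sketch below uses hypothetical names (`GapType`, `ROUTES`, `route`), with owners and actions copied from the table.

```python
from enum import Enum


class GapType(Enum):
    BUILD = "build"            # code doesn't match spec
    SPEC = "spec"              # spec was wrong
    INSTRUMENT = "instrument"  # can't verify


# (owner, action) for each gap type, as listed in the table.
ROUTES = {
    GapType.BUILD: ("Engineering", "Fix and re-deploy"),
    GapType.SPEC: ("Dream Team", "Re-spec, re-score, possibly kill"),
    GapType.INSTRUMENT: ("Platform", "Build the MCP or tool needed"),
}


def route(gap: GapType) -> tuple:
    """Return the (owner, action) pair for a commissioning gap."""
    return ROUTES[gap]
```

Keeping the mapping as data rather than branching logic means a new gap type is a one-line addition.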
Context
- VVFL — The loop this validates — commissioning is the Reflect station made mechanical
- Feature Matrix — Live commissioning status for every capability
- Flow Engineering — The build process this validates
- Create PRD Stories — The spec this verifies against
- Commissioning — The principle: why independent verification matters
- Verifiable Intent — L4 commissioning IS verifiable intent for software
- Credibility — Commissioning evidence feeds all three credibility loops
- Predictions — Gap between predicted and actual IS the learning signal
Questions
- When the gap between spec and reality is large, is the spec wrong or the build wrong, and how do you tell?
- At what maturity level does a capability start generating value — is L4 necessary for first customers?
- What's the cost of the builder commissioning their own work — and how often does it happen without anyone noticing?
- When commissioning reveals a spec gap (not a build gap), how does that signal flow back to the Dream Team?
