PRD Commissioning Protocol
When is a capability ready to ship — and how do you prove it?
The team that builds a system is never the team that commissions it. The builder knows what they intended. The commissioner checks what actually shipped.
Maturity Levels
Every Mycelium capability is scored on a 5-level maturity scale:
| Level | Meaning | Evidence Required |
|---|---|---|
| L0 | Spec only | PRD written, no build |
| L1 | Schema + API | Backend exists, no interface |
| L2 | UI connected | Users can interact |
| L3 | Tested | Automated verification + intent spec passes |
| L4 | Commissioned | Independent verification against PRD criteria |
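The ladder is cumulative: a level only counts if every level below it also holds. A minimal sketch in TypeScript (type and field names are illustrative, not a real Mycelium API):

```typescript
// Evidence flags for each rung of the ladder. Illustrative names only.
type Evidence = {
  prdWritten: boolean;   // L0: spec only
  backendLive: boolean;  // L1: schema + API
  uiConnected: boolean;  // L2: users can interact
  testsPass: boolean;    // L3: automated + intent verification
  commissioned: boolean; // L4: independent verification
};

// Highest level reached without a gap below it; -1 means no spec yet.
function maturityLevel(e: Evidence): number {
  const ladder = [e.prdWritten, e.backendLive, e.uiConnected, e.testsPass, e.commissioned];
  let level = -1;
  for (const ok of ladder) {
    if (!ok) break;
    level += 1;
  }
  return level;
}
```

Under this rule a capability with passing tests but no connected UI still scores L1: skipped rungs do not count.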
Status Vocabulary
Exact definitions. No synonyms. If engineering and dream team use different words for the same state, the report lies.
| Status | Meaning | Evidence | NOT the same as |
|---|---|---|---|
| Gap | Identified need, no PRD | Mentioned in another PRD or index | Spec |
| Spec | PRD written, features undefined or unscored | PRD index.md exists | Spec draft |
| Spec draft | PRD exists, features listed but incomplete | Feature table present, gaps noted | L0 |
| Spec complete | PRD fully specified, ready for engineering | All sections filled, scored | L0 |
| L0 | Features assessed, all scored as Gap | Feature table with Gap/Done per feature | Spec |
| L1 | Schema + API deployed | Backend responds, no UI | L0 |
| L2 | UI connected, users can interact | Pages render, forms submit | L1 |
| L3 | Tested, automated + intent verification | E2E tests pass, intent spec verified, evidence captured | L2 |
| L4 | Commissioned by independent verification | Commissioner grants after browser walkthrough | L3 |
The critical distinction: "Spec" means features haven't been individually assessed. "L0" means they HAVE been assessed and scored as Gap. A PRD at L0 has more structure than one at Spec — it knows exactly what's missing.
Feature vs capability maturity: Feature commissioning (Install → Test → Operational → Optimize) tracks individual features within a capability. Capability maturity (L0-L4) tracks the aggregate. A capability at L2 may have features at Install, Test, and Gap simultaneously.
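To make the two axes concrete, here is a sketch of a capability record holding both, with the kind of breakdown helper a dashboard row might use (names are illustrative; the protocol does not prescribe an aggregation rule):

```typescript
// Per-feature commissioning stages within one capability.
type FeatureStage = "Gap" | "Install" | "Test" | "Operational" | "Optimize";

// Capability maturity (L0-L4) and feature stages are independent axes,
// so mixed stages are valid at any level.
interface Capability {
  level: 0 | 1 | 2 | 3 | 4;
  features: Record<string, FeatureStage>;
}

// Count features per stage, e.g. for a commissioning dashboard row.
function stageBreakdown(c: Capability): Record<FeatureStage, number> {
  const counts: Record<FeatureStage, number> = {
    Gap: 0, Install: 0, Test: 0, Operational: 0, Optimize: 0,
  };
  for (const stage of Object.values(c.features)) counts[stage] += 1;
  return counts;
}
```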
The Process
How a capability moves from L0 to L4:
- L0 (Spec only): PRD written, features defined, success criteria set, kill signal set
- L1 (Schema + API): backend exists, schema deployed, API endpoints live, integration tests
- L2 (UI connected): users can interact, CRUD works, workflows complete, manual QA passes
- L3 (Tested): automated tests, E2E suite passes, performance gates, regression suite
- L4 (Commissioned): independent verification; commissioner reads the spec, opens a browser, checks each feature, records Pass / Fail with evidence
The Protocol
- Commissioner reads the PRD (not the code)
- Commissioner opens the live application
- For each row in the Feature / Function / Outcome table:
  - Can the feature be found?
  - Does the function work as specified?
  - Does the outcome match the success criteria?
- Record Pass / Fail with evidence (screenshot, recording, measurement)
- Update the PRD commissioning table
- If all critical features Pass: capability is L4 Commissioned
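The roll-up in the final step can be sketched as follows; field names are assumptions, not the real PRD table schema:

```typescript
// One row of a commissioning walkthrough. Field names are assumptions.
interface FeatureCheck {
  name: string;
  critical: boolean;
  found: boolean;          // can the feature be found?
  works: boolean;          // does the function work as specified?
  outcomeMatches: boolean; // does the outcome match the success criteria?
  evidence: string;        // screenshot / recording / measurement reference
}

// L4 requires every critical feature to pass all three questions
// with evidence attached; non-critical failures do not block.
function isCommissioned(checks: FeatureCheck[]): boolean {
  return checks
    .filter(c => c.critical)
    .every(c => c.found && c.works && c.outcomeMatches && c.evidence.length > 0);
}
```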
The commissioner is never the builder. The builder knows what they intended. The commissioner checks what actually shipped.
The Loop
Read PRD commissioning table (what should pass)
-> Navigate to deployed URL
-> Walk each feature row
-> Verify pass/fail with evidence (screenshot, GIF, console, network)
-> Update commissioning dashboard with findings
-> Gap between spec and reality drives next priority
Per-Feature Checklist
For each row in a PRD's commissioning table:
- Navigate — Can you reach the feature from the expected entry point?
- Happy path — Does the primary workflow complete successfully?
- Output correct — Does the result match the PRD's stated outcome?
- Error handling — Does a bad input produce a clear error, not a crash?
- Intent verified — If agentic: agent action stayed within declared scope (constraints, budget, permissions)
- Evidence captured — GIF or screenshot proving the above
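The checklist maps naturally to a record per feature row. A sketch (names illustrative; the intent check is nullable because it only applies to agentic features):

```typescript
// One feature row's checklist results. Illustrative field names.
interface ChecklistResult {
  navigate: boolean;
  happyPath: boolean;
  outputCorrect: boolean;
  errorHandling: boolean;
  intentVerified: boolean | null; // null = not an agentic feature
  evidenceUrl: string | null;     // GIF or screenshot reference
}

// A feature passes only when every applicable check holds and
// evidence was captured.
function featurePasses(r: ChecklistResult): boolean {
  const intentOk = r.intentVerified === null || r.intentVerified;
  return r.navigate && r.happyPath && r.outputCorrect && r.errorHandling
    && intentOk && r.evidenceUrl !== null;
}
```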
Verification Channels
Each channel gets validated differently:
| Channel | What to Verify |
|---|---|
| Web UI | Features work as specified in PRD |
| API routes | Endpoints return correct data, response shape + status codes |
| A2A protocol | Agent Card discoverable, Task Cards accepted, task lifecycle responds correctly |
| Console health | No errors, no warnings in critical paths |
| Agent intent | Agent action matches declared scope — constraints, budget, permissions |
For which browser tool to use per channel, see the tool selection guide.
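For the API-routes channel, "correct data, response shape + status codes" means checking both together. A hedged sketch, with a hypothetical response shape:

```typescript
// Verify an API-route response: status code AND shape, never just 200 OK.
// The expected shape (id + items) is hypothetical.
function verifyApiResponse(status: number, body: unknown): boolean {
  if (status !== 200) return false;
  if (typeof body !== "object" || body === null) return false;
  const b = body as Record<string, unknown>;
  return typeof b.id === "string" && Array.isArray(b.items);
}
```

A 200 with the wrong shape is still a Fail: the commissioning evidence is the shape, not the status code alone.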
Flight Readiness
Before any capability ships to production, it must pass eight gates. Adapted from factory pre-flight inspection.
| Gate | Criteria | Test | Applies To |
|---|---|---|---|
| G1: Config | Version locked, zero uncommitted changes | git status clean on deploy branch | All |
| G2: Types | Zero TypeScript errors, strict mode | pnpm nx typecheck [app] | All |
| G3: Security | Auth + rate limits + CSP configured | Action validation audit | All |
| G4: Tests | Pass rate above threshold, documented skips | pnpm nx test [app] | All |
| G5: Performance | P95 response time within budget | Latency measurement under load | All with UI |
| G6: Observability | Four Golden Signals monitored | Latency, Traffic, Errors, Saturation | Production apps |
| G7: AI Safety | Prompt injection mitigated, hallucination bounded | Validation layer audit | AI capabilities |
| G8: Ops Ready | Rollback tested, runbook exists | Deployment verification | Production apps |
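One way to run the gates is to filter by applicability first, then require every remaining check to pass. A sketch (gate predicates are stand-ins for the real commands in the table):

```typescript
// Which capability kinds a gate applies to, mirroring "Applies To".
type Applicability = "all" | "ui" | "production" | "ai";

interface Gate {
  id: string;            // e.g. "G1: Config"
  appliesTo: Applicability;
  check: () => boolean;  // stand-in for git status / pnpm nx test / audits
}

// A capability is flight-ready when every applicable gate passes.
function flightReady(gates: Gate[], kinds: Applicability[]): boolean {
  return gates
    .filter(g => g.appliesTo === "all" || kinds.includes(g.appliesTo))
    .every(g => g.check());
}
```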
Golden Signals
Four signals for G6 observability:
| Signal | Metric | Threshold |
|---|---|---|
| Latency | P95 response time | Under 3s API, under 10s AI |
| Traffic | Concurrent users | Over 50 supported |
| Errors | Error rate | Under 5% |
| Saturation | Fraction of function timeout consumed | Under 80% |
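The four thresholds collapse into one predicate; units and field names below are assumptions (seconds, counts, and ratios):

```typescript
// Measured values for the four Golden Signals.
interface Signals {
  p95LatencySec: number;           // P95 response time in seconds
  supportedConcurrentUsers: number;
  errorRate: number;               // 0.05 = 5%
  timeoutUtilization: number;      // fraction of the function timeout consumed
  isAiPath: boolean;               // AI paths get the 10s latency budget
}

function withinGoldenThresholds(s: Signals): boolean {
  const latencyBudget = s.isAiPath ? 10 : 3;
  return s.p95LatencySec < latencyBudget
    && s.supportedConcurrentUsers > 50
    && s.errorRate < 0.05
    && s.timeoutUtilization < 0.8;
}
```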
Phase to Level
How the venture algorithm maps to engineering maturity:
| Algorithm Phase | Typical L-Level | What's Happening |
|---|---|---|
| SCAN-DISCOVER | -- | No build. Exploring. |
| VALIDATE | L0 | Spec written, scored, kill signals identified |
| MODEL-FINANCE | L0-L1 | Business model selected, financial models built |
| STRATEGY | L1 | Positioning defined, GTM planned |
| PITCH-SELL | L1-L2 | Persuasion assets created, users can interact |
| MEASURE | L2+ | Feedback loop operational, scorecard active |
Verifiable Intent
Commissioning IS verifiable intent applied to software delivery. The delegation chain maps directly:
| VI Layer | Commissioning | What It Proves |
|---|---|---|
| L1 Identity | PRD author | Who specified the capability |
| L2 Intent | PRD spec + success criteria | What was authorized to be built |
| L3 Action | Engineering build | What was actually shipped |
| Verification | Commissioner walkthrough | Did action match intent? |
The builder (agent) acts within the PRD (intent). The commissioner (verifier) checks the delegation chain: the spec matched the build, and the build matched the outcome. When agents build features, the same three-layer proof applies — L2 intent constraints become machine-verifiable acceptance criteria. A capability without an intent spec cannot reach L3.
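A minimal sketch of checking L3 actions against L2 intent for the agent case; the scope and budget fields are illustrative constraints, not a defined schema:

```typescript
// Declared intent: what the agent was authorized to do.
interface Intent {
  scope: string[];   // permitted action kinds
  budgetUsd: number; // spend ceiling
}

// What the agent actually did.
interface Action {
  kind: string;
  costUsd: number;
}

// L3 actions verified against L2 intent: every action must be in
// scope and total cost must stay under budget.
function actionsMatchIntent(intent: Intent, actions: Action[]): boolean {
  const inScope = actions.every(a => intent.scope.includes(a.kind));
  const total = actions.reduce((sum, a) => sum + a.costUsd, 0);
  return inScope && total <= intent.budgetUsd;
}
```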
Context
- Verifiable Intent — The authorization proof protocol that commissioning implements
- Commissioning — The principle: why independent verification matters, across domains
- AI Browser Tools — Tool selection for browser-based commissioning
- Commissioning Dashboard — Live status of every capability
- Work Prioritisation — Scoring algorithm, rubrics, gates
- Business Factory Requirements — The capability catalogue
- PRDs — How to spec capabilities
- Benchmark Standards — Trigger-based benchmark protocol
- Flow Engineering — After L4, stories become maps that produce code artifacts
- Cost of Quality — Enforcement tier metrics
- Cost Escalation — The 10x multiplier
Questions
When is a capability ready to ship — and how do you prove it without building it yourself?
- At what maturity level does a capability start generating revenue — and is L4 even necessary for first customers?
- Should flight readiness gates differ by capability type (platform vs product vs agent)?
- What's the cost of skipping L3 (tested) and going straight from L2 (UI connected) to L4 (commissioned)?
- How do you commission an AI capability when its outputs are distributions, not deterministic?