# Commissioning Protocol
When is a capability ready to ship — and how do you prove it?
The team that builds a system is never the team that commissions it. The builder knows what they intended. The commissioner checks what actually shipped.
## Maturity Levels
Every Mycelium capability is scored on a 5-level maturity scale:
| Level | Meaning | Evidence Required |
|---|---|---|
| L0 | Spec only | PRD written, no build |
| L1 | Schema + API | Backend exists, no interface |
| L2 | UI connected | Users can interact |
| L3 | Tested | Automated verification passes |
| L4 | Commissioned | Independent verification against PRD criteria |
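The scale above is ordered, and a capability earns each level with evidence. A minimal sketch of that ordering, assuming levels advance one at a time (the enum names and the `canAdvance` helper are illustrative, not from the source):

```typescript
// Maturity levels as an ordered numeric scale (names illustrative).
enum Maturity {
  L0_SpecOnly = 0,
  L1_SchemaApi = 1,
  L2_UiConnected = 2,
  L3_Tested = 3,
  L4_Commissioned = 4,
}

// Promotion rule sketch: advance exactly one level at a time,
// and only when the evidence required for the next level exists.
function canAdvance(current: Maturity, target: Maturity, hasEvidence: boolean): boolean {
  return hasEvidence && target === current + 1;
}
```

Encoding the levels as numbers makes the "no skipping" rule a one-line check rather than a convention.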
## The Process
How a capability moves from L0 to L4:
```
L0: SPEC ONLY         L1: SCHEMA + API        L2: UI CONNECTED
PRD written        -> Backend exists       -> Users can interact
Features defined      Schema deployed         CRUD works
Success criteria      API endpoints live      Workflows complete
Kill signal set       Integration tests       Manual QA passes

L3: TESTED            L4: COMMISSIONED
Automated tests    -> Independent verification
E2E suite passes      Commissioner reads spec
Performance gates     Commissioner opens browser
Regression suite      Commissioner checks features
                      Pass / Fail + evidence
```
## The Protocol
- Commissioner reads the PRD (not the code)
- Commissioner opens the live application
- For each row in the Feature / Function / Outcome table:
  - Can the feature be found?
  - Does the function work as specified?
  - Does the outcome match the success criteria?
- Record Pass / Fail with evidence (screenshot, recording, measurement)
- Update the PRD commissioning table
- If all critical features Pass: capability is L4 Commissioned
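The protocol's pass condition can be sketched as a small data shape plus one predicate. This is a hypothetical encoding, not the source's schema; the field names mirror the three questions above:

```typescript
// Hypothetical shape of one commissioning check per feature row.
type FeatureCheck = {
  feature: string;
  found: boolean;    // can the feature be found?
  works: boolean;    // does the function work as specified?
  matches: boolean;  // does the outcome match the success criteria?
  evidence: string;  // screenshot, recording, or measurement reference
  critical: boolean;
};

// A capability reaches L4 only when every critical feature passes.
function isCommissioned(checks: FeatureCheck[]): boolean {
  return checks
    .filter((c) => c.critical)
    .every((c) => c.found && c.works && c.matches);
}
```

Note that non-critical failures are recorded but do not block L4 — they feed the next-priority queue instead.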
The commissioner is never the builder.
## The Loop
```
Read PRD commissioning table (what should pass)
-> Navigate to deployed URL
-> Walk each feature row
-> Verify pass/fail with evidence (screenshot, GIF, console, network)
-> Update commissioning dashboard with findings
-> Gap between spec and reality drives next priority
```
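The loop's closing step — gap drives priority — is just a filter over the walk results. A sketch, with illustrative names:

```typescript
// One row of the commissioning walk (hypothetical shape).
type RowResult = { feature: string; pass: boolean };

// Failing rows are the gap between spec and reality;
// that gap becomes the next work queue.
function nextPriorities(results: RowResult[]): string[] {
  return results.filter((r) => !r.pass).map((r) => r.feature);
}
```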
## Per-Feature Checklist
For each row in a PRD's commissioning table:
- Navigate — Can you reach the feature from the expected entry point?
- Happy path — Does the primary workflow complete successfully?
- Output correct — Does the result match the PRD's stated outcome?
- Error handling — Does a bad input produce a clear error, not a crash?
- Evidence captured — GIF or screenshot proving the above
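The checklist is ordered — there is no point checking outputs for a feature you cannot reach. A sketch of that walk order (step identifiers are assumptions):

```typescript
// The five checklist items, in walk order (identifiers illustrative).
type Step = "navigate" | "happyPath" | "outputCorrect" | "errorHandling" | "evidenceCaptured";
const walkOrder: Step[] = ["navigate", "happyPath", "outputCorrect", "errorHandling", "evidenceCaptured"];

// Walk the checklist in order; report the first failing step,
// or null if every step passed.
function firstFailure(results: Record<Step, boolean>): Step | null {
  for (const step of walkOrder) {
    if (!results[step]) return step;
  }
  return null;
}
```

Reporting the first failure, rather than all of them, matches how a commissioner actually walks a feature: later steps are meaningless once an earlier one breaks.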
## Verification Channels
Each channel gets validated differently:
| Channel | What to Verify |
|---|---|
| Web UI | Features work as specified in PRD |
| API routes | Endpoints return correct data, response shape + status codes |
| A2A protocol | Agent Card discoverable, Task Cards accepted, task lifecycle responses returned |
| Console health | No errors, no warnings in critical paths |
For which browser tool to use per channel, see the tool selection guide.
## Flight Readiness
Before any capability ships to production, it must pass eight gates, adapted from factory pre-flight inspection.
| Gate | Criteria | Test | Applies To |
|---|---|---|---|
| G1: Config | Version locked, zero uncommitted changes | git status clean on deploy branch | All |
| G2: Types | Zero TypeScript errors, strict mode | pnpm nx typecheck [app] | All |
| G3: Security | Auth + rate limits + CSP configured | Action validation audit | All |
| G4: Tests | Pass rate above threshold, documented skips | pnpm nx test [app] | All |
| G5: Performance | P95 response time within budget | Latency measurement under load | All with UI |
| G6: Observability | Four Golden Signals monitored | Latency, Traffic, Errors, Saturation | Production apps |
| G7: AI Safety | Prompt injection mitigated, hallucination bounded | Validation layer audit | AI capabilities |
| G8: Ops Ready | Rollback tested, runbook exists | Deployment verification | Production apps |
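The "Applies To" column means the gate set differs per capability. A sketch of that selection logic, under the assumption that scope can be reduced to three capability flags (the encoding below is illustrative):

```typescript
// Hypothetical encoding of the eight gates and their "Applies To" scope.
type Scope = "all" | "ui" | "production" | "ai";
type Capability = { hasUi: boolean; inProduction: boolean; usesAi: boolean };

const gates: { id: string; scope: Scope }[] = [
  { id: "G1:Config", scope: "all" },
  { id: "G2:Types", scope: "all" },
  { id: "G3:Security", scope: "all" },
  { id: "G4:Tests", scope: "all" },
  { id: "G5:Performance", scope: "ui" },
  { id: "G6:Observability", scope: "production" },
  { id: "G7:AI Safety", scope: "ai" },
  { id: "G8:Ops Ready", scope: "production" },
];

// A capability must pass every gate whose scope applies to it.
function applicableGates(cap: Capability): string[] {
  return gates
    .filter(
      (g) =>
        g.scope === "all" ||
        (g.scope === "ui" && cap.hasUi) ||
        (g.scope === "production" && cap.inProduction) ||
        (g.scope === "ai" && cap.usesAi)
    )
    .map((g) => g.id);
}
```

A headless internal library faces only the four universal gates; a production AI app faces all eight.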
## Golden Signals
Four signals for G6 observability:
| Signal | Metric | Threshold |
|---|---|---|
| Latency | P95 response time | Under 3s API, under 10s AI |
| Traffic | Concurrent users | Over 50 supported |
| Errors | Error rate | Under 5% |
| Saturation | Function timeout | Under 80% |
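The G6 check is a conjunction of the four thresholds above. A sketch using the table's limits — the units are assumptions (latency in milliseconds, error rate and saturation as fractions of 1):

```typescript
// Measured values for the four Golden Signals (shape illustrative).
type Signals = {
  p95LatencyMs: number;     // P95 response time
  concurrentUsers: number;  // sustained concurrency
  errorRate: number;        // 0.05 = 5%
  saturation: number;       // fraction of function timeout budget used
};

// All four thresholds must hold; AI paths get the looser 10s latency budget.
function meetsGoldenSignals(s: Signals, aiPath = false): boolean {
  const latencyBudgetMs = aiPath ? 10_000 : 3_000;
  return (
    s.p95LatencyMs < latencyBudgetMs &&
    s.concurrentUsers > 50 &&
    s.errorRate < 0.05 &&
    s.saturation < 0.8
  );
}
```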
## Phase to Level
How the venture algorithm maps to engineering maturity:
| Algorithm Phase | Typical L-Level | What's Happening |
|---|---|---|
| SCAN-DISCOVER | -- | No build. Exploring. |
| VALIDATE | L0 | Spec written, scored, kill signals identified |
| MODEL-FINANCE | L0-L1 | Business model selected, financial models built |
| STRATEGY | L1 | Positioning defined, GTM planned |
| PITCH-SELL | L1-L2 | Persuasion assets created, users can interact |
| MEASURE | L2+ | Feedback loop operational, scorecard active |
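The mapping above doubles as a consistency check: a capability's maturity level should sit inside the range its venture phase implies. A sketch, reading the table's "L2+" as 2 through 4 (an assumption) and the helper name as illustrative:

```typescript
// The phase table as a lookup: each phase maps to a typical
// [min, max] maturity range, or null where no build exists yet.
const phaseToLevel: Record<string, [number, number] | null> = {
  "SCAN-DISCOVER": null,
  VALIDATE: [0, 0],
  "MODEL-FINANCE": [0, 1],
  STRATEGY: [1, 1],
  "PITCH-SELL": [1, 2],
  MEASURE: [2, 4], // "L2+" read as L2 through L4
};

// Is a capability's current level consistent with its venture phase?
function levelMatchesPhase(phase: string, level: number): boolean {
  const range = phaseToLevel[phase];
  if (range == null) return false; // unknown phase, or no build expected
  const [min, max] = range;
  return level >= min && level <= max;
}
```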
## Context
- Commissioning — The principle: why independent verification matters, across domains
- AI Browser Tools — Tool selection for browser-based commissioning
- Commissioning Dashboard — Live status of every capability
- Work Prioritisation — Scoring algorithm, rubrics, gates
- Phygital Mycelium — The capability catalogue
- PRDs — How to spec capabilities
- Benchmark Standards — Trigger-based benchmark protocol
- Flow Engineering — Maps that produce code artifacts
- Cost of Quality — Enforcement tier metrics
- Cost Escalation — The 10x multiplier
## Questions
- When is a capability ready to ship — and how do you prove it without building it yourself?
- At what maturity level does a capability start generating revenue — and is L4 even necessary for first customers?
- Should flight readiness gates differ by capability type (platform vs product vs agent)?
- What's the cost of skipping L3 (tested) and going straight from L2 (UI connected) to L4 (commissioned)?
- How do you commission an AI capability when its outputs are distributions, not deterministic?