Commissioning Protocol

When is a capability ready to ship — and how do you prove it?

The team that builds a system is never the team that commissions it. The builder knows what they intended. The commissioner checks what actually shipped.

Maturity Levels

Every Mycelium capability is scored on a 5-level maturity scale:

| Level | Meaning | Evidence Required |
|-------|---------|-------------------|
| L0 | Spec only | PRD written, no build |
| L1 | Schema + API | Backend exists, no interface |
| L2 | UI connected | Users can interact |
| L3 | Tested | Automated verification passes |
| L4 | Commissioned | Independent verification against PRD criteria |
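
The maturity scale can be sketched as a typed lookup. This is a hypothetical illustration, not Mycelium's actual schema; the type and function names are assumptions.

```typescript
// Hypothetical sketch: the five maturity levels as a typed lookup.
type MaturityLevel = 0 | 1 | 2 | 3 | 4;

interface LevelSpec {
  meaning: string;
  evidence: string;
}

const MATURITY: Record<MaturityLevel, LevelSpec> = {
  0: { meaning: "Spec only", evidence: "PRD written, no build" },
  1: { meaning: "Schema + API", evidence: "Backend exists, no interface" },
  2: { meaning: "UI connected", evidence: "Users can interact" },
  3: { meaning: "Tested", evidence: "Automated verification passes" },
  4: { meaning: "Commissioned", evidence: "Independent verification against PRD criteria" },
};

// A capability advances one level at a time; skipping levels is not allowed.
function canPromote(from: MaturityLevel, to: MaturityLevel): boolean {
  return to === from + 1;
}
```

Encoding the levels as a closed union makes "what evidence does L3 require?" a lookup rather than tribal knowledge.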

The Process

How a capability moves from L0 to L4:

L0: SPEC ONLY -> L1: SCHEMA + API -> L2: UI CONNECTED -> L3: TESTED -> L4: COMMISSIONED

L0: Spec only
  • PRD written
  • Features defined
  • Success criteria
  • Kill signal set

L1: Schema + API
  • Backend exists
  • Schema deployed
  • API endpoints live
  • Integration tests

L2: UI connected
  • Users can interact
  • CRUD works
  • Workflows complete
  • Manual QA passes

L3: Tested
  • Automated tests
  • E2E suite passes
  • Performance gates
  • Regression suite

L4: Commissioned
  • Independent verification
  • Commissioner reads spec
  • Commissioner opens browser
  • Commissioner checks features
  • Pass / Fail + evidence

The Protocol

  1. Commissioner reads the PRD (not the code)
  2. Commissioner opens the live application
  3. For each row in the Feature / Function / Outcome table:
    • Can the feature be found?
    • Does the function work as specified?
    • Does the outcome match the success criteria?
  4. Record Pass / Fail with evidence (screenshot, recording, measurement)
  5. Update the PRD commissioning table
  6. If all critical features Pass: capability is L4 Commissioned
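
Steps 3 through 6 can be sketched as a small routine that walks the feature rows and decides whether the capability reaches L4. The row shape and names here are assumptions for illustration, not the real PRD table schema.

```typescript
// Hypothetical sketch of protocol steps 3-6: walk each Feature/Function/Outcome
// row, record Pass/Fail, and commission only if all critical rows pass.
interface FeatureRow {
  feature: string;
  critical: boolean;
  found: boolean;            // step 3: can the feature be found?
  worksAsSpecified: boolean; // step 3: does the function work as specified?
  outcomeMatches: boolean;   // step 3: does the outcome match success criteria?
  evidence: string;          // step 4: screenshot / recording / measurement ref
}

interface CommissioningResult {
  passes: string[];
  failures: string[];
  commissioned: boolean; // L4 iff every critical row passes
}

function commission(rows: FeatureRow[]): CommissioningResult {
  const passes: string[] = [];
  const failures: string[] = [];
  for (const row of rows) {
    const pass = row.found && row.worksAsSpecified && row.outcomeMatches;
    (pass ? passes : failures).push(row.feature);
  }
  const criticalFailed = rows.some(
    (r) => r.critical && failures.includes(r.feature)
  );
  return { passes, failures, commissioned: !criticalFailed };
}
```

Note that a non-critical failure is recorded as evidence but does not block L4; only critical rows gate commissioning.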

The commissioner is never the builder. The builder knows what they intended. The commissioner checks what actually shipped.

The Loop

Read PRD commissioning table (what should pass)
-> Navigate to deployed URL
-> Walk each feature row
-> Verify pass/fail with evidence (screenshot, GIF, console, network)
-> Update commissioning dashboard with findings
-> Gap between spec and reality drives next priority

Per-Feature Checklist

For each row in a PRD's commissioning table:

  • Navigate — Can you reach the feature from the expected entry point?
  • Happy path — Does the primary workflow complete successfully?
  • Output correct — Does the result match the PRD's stated outcome?
  • Error handling — Does a bad input produce a clear error, not a crash?
  • Evidence captured — GIF or screenshot proving the above
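
The checklist can be expressed as a closed set of items where a row passes only if every item is ticked. The item names below mirror the list above; the types are illustrative assumptions.

```typescript
// Hypothetical sketch: per-feature checklist where every item must be
// satisfied before the row is marked Pass.
const CHECKLIST_ITEMS = [
  "navigate",         // reachable from the expected entry point
  "happyPath",        // primary workflow completes
  "outputCorrect",    // result matches the PRD's stated outcome
  "errorHandling",    // bad input produces a clear error, not a crash
  "evidenceCaptured", // GIF or screenshot proving the above
] as const;

type ChecklistItem = (typeof CHECKLIST_ITEMS)[number];
type Checklist = Record<ChecklistItem, boolean>;

// A row passes only when every checklist item is satisfied.
function rowPasses(c: Checklist): boolean {
  return CHECKLIST_ITEMS.every((item) => c[item]);
}
```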

Verification Channels

Each channel gets validated differently:

| Channel | What to Verify |
|---------|----------------|
| Web UI | Features work as specified in PRD |
| API routes | Endpoints return correct data, response shape + status codes |
| A2A protocol | Agent Card discoverable, Task Cards accepted, task lifecycle response |
| Console health | No errors, no warnings in critical paths |
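
For the API routes channel, "correct data, response shape + status codes" can be checked with a small pure validator. The expected keys and shapes here are illustrative assumptions, not a real Mycelium route contract.

```typescript
// Hypothetical sketch: validate an API route's status code and response shape.
interface RouteCheck {
  status: number;
  body: unknown;
}

function verifyRoute(
  check: RouteCheck,
  expectedStatus: number,
  requiredKeys: string[]
): boolean {
  if (check.status !== expectedStatus) return false;
  if (typeof check.body !== "object" || check.body === null) return false;
  const body = check.body as Record<string, unknown>;
  return requiredKeys.every((k) => k in body);
}
```

In practice you would feed this from a live `fetch` against the deployed URL and capture the raw response as evidence.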

For guidance on which browser tool to use for each channel, see the tool selection guide.

Flight Readiness

Before any capability ships to production, it must pass eight gates. Adapted from factory pre-flight inspection.

| Gate | Criteria | Test | Applies To |
|------|----------|------|------------|
| G1: Config | Version locked, zero uncommitted changes | `git status` clean on deploy branch | All |
| G2: Types | Zero TypeScript errors, strict mode | `pnpm nx typecheck [app]` | All |
| G3: Security | Auth + rate limits + CSP configured | Action validation audit | All |
| G4: Tests | Pass rate above threshold, documented skips | `pnpm nx test [app]` | All |
| G5: Performance | P95 response time within budget | Latency measurement under load | All with UI |
| G6: Observability | Four Golden Signals monitored | Latency, Traffic, Errors, Saturation | Production apps |
| G7: AI Safety | Prompt injection mitigated, hallucination bounded | Validation layer audit | AI capabilities |
| G8: Ops Ready | Rollback tested, runbook exists | Deployment verification | Production apps |
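
The eight gates can be sketched as an ordered runner that stops at the first failure. The gate shape is an assumption; in practice each `check` would shell out to the tools in the table (git, pnpm, latency probes) rather than return a boolean directly.

```typescript
// Hypothetical sketch: run flight-readiness gates in order, stop on first failure.
interface Gate {
  id: string; // e.g. "G1: Config"
  check: () => boolean; // in practice: shell out to git / pnpm / probes
}

function flightReadiness(gates: Gate[]): { ready: boolean; failedAt?: string } {
  for (const gate of gates) {
    if (!gate.check()) return { ready: false, failedAt: gate.id };
  }
  return { ready: true };
}
```

Failing fast at the first gate keeps the feedback cheap: there is no point measuring P95 latency (G5) on a build that does not typecheck (G2).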

Golden Signals

Four signals for G6 observability:

| Signal | Metric | Threshold |
|--------|--------|-----------|
| Latency | P95 response time | Under 3s API, under 10s AI |
| Traffic | Concurrent users | Over 50 supported |
| Errors | Error rate | Under 5% |
| Saturation | Function timeout | Under 80% |
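
The G6 thresholds can be evaluated with a single predicate. Units and field names below are assumptions: latency in seconds, error rate and saturation as fractions of 1.

```typescript
// Hypothetical sketch: check the four Golden Signals against the G6 thresholds.
interface Signals {
  p95LatencySeconds: number; // P95 response time
  concurrentUsers: number;   // supported concurrent users
  errorRate: number;         // 0..1, threshold under 5%
  saturation: number;        // 0..1 of function timeout budget, under 80%
}

function meetsG6(s: Signals, isAI: boolean): boolean {
  const latencyBudget = isAI ? 10 : 3; // seconds: 3s API, 10s AI
  return (
    s.p95LatencySeconds < latencyBudget &&
    s.concurrentUsers > 50 &&
    s.errorRate < 0.05 &&
    s.saturation < 0.8
  );
}
```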

Phase to Level

How the venture algorithm maps to engineering maturity:

| Algorithm Phase | Typical L-Level | What's Happening |
|-----------------|-----------------|------------------|
| SCAN-DISCOVER | -- | No build. Exploring. |
| VALIDATE | L0 | Spec written, scored, kill signals identified |
| MODEL-FINANCE | L0-L1 | Business model selected, financial models built |
| STRATEGY | L1 | Positioning defined, GTM planned |
| PITCH-SELL | L1-L2 | Persuasion assets created, users can interact |
| MEASURE | L2+ | Feedback loop operational, scorecard active |

Context

Questions

When is a capability ready to ship — and how do you prove it without building it yourself?

  • At what maturity level does a capability start generating revenue — and is L4 even necessary for first customers?
  • Should flight readiness gates differ by capability type (platform vs product vs agent)?
  • What's the cost of skipping L3 (tested) and going straight from L2 (UI connected) to L4 (commissioned)?
  • How do you commission an AI capability when its outputs are probabilistic distributions rather than deterministic values?