
Automated Commissioning — Pictures

One map. It instruments the classification: the outcome map reveals what the spec doesn't already say.

  • Outcome Map — Binary success measures for each capability

Outcome Map

What does success look like? Every row is a binary test.

Instrument Outcome

| Outcome | Measure | Pass Condition | Current |
| --- | --- | --- | --- |
| Feature state is computed, not typed | feature-matrix.json updated by script, not human | Git blame shows script commit, not manual edit | FAIL: all states hand-edited |
| Same input produces same output | Two runs on same test results produce identical JSON | diff of two runs = empty | FAIL: no script exists |
| Missing mappings are visible | Features with no test file mapping show unmapped | Count of unmapped features in report | FAIL: unmapped features silently stay at L0 |
| Safety violations block L3 | Feature with failing Safety Test cannot reach L3 | Script enforces: safety_fail → min(state, L2) | FAIL: no safety check exists |
| Stale states detected | Feature whose tests now fail gets demoted | State moves DOWN when tests regress | FAIL: states only go up manually |
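Two of the pass conditions above (deterministic output, visible unmapped features) can be satisfied by a small report builder. This is a hypothetical sketch, since per the Current column no such script exists yet; the names `Mapping` and `buildReport` are illustrative, not the real tool:

```typescript
// Hypothetical report builder covering two outcome-map rows:
// - "Same input produces same output": sorted, stable JSON so a diff of
//   two runs on the same input is empty.
// - "Missing mappings are visible": unmapped features are counted and listed.
interface Mapping {
  featureId: string;
  testFiles: string[]; // feature_id -> test_file[] from the PRD
}

function buildReport(mappings: Mapping[]): string {
  const unmapped = mappings
    .filter((m) => m.testFiles.length === 0)
    .map((m) => m.featureId)
    .sort(); // stable ordering => identical JSON across runs

  const report = {
    unmappedCount: unmapped.length, // pass condition: count appears in report
    unmapped,                       // never silently left at L0
  };
  return JSON.stringify(report, null, 2);
}
```

Determinism here is a property of the code, not a promise: the output depends only on the input mappings, so two runs on the same input diff clean.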

Evidence Flow

PRD spec/index.md       →  FAVV parser extracts feature_id → test_file[]
Engineering repo tests  →  Vitest runs scoped to mapped files
JSON reporter output    →  L-level computer maps results to feature IDs
feature-matrix.json     ←  Writer updates state + updated columns
.ai/receipts/           ←  Report generator logs evidence trail
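The middle step of the flow, mapping per-file test results back onto feature IDs, could look like the sketch below. The `FileResult` shape is an assumption (a simplified, Jest-style per-file summary); verify it against the actual Vitest JSON reporter output before relying on it:

```typescript
// Hypothetical "L-level computer" step: given the PRD mapping and per-file
// test results, decide which features have all mapped tests passing.
interface FileResult {
  name: string;                  // test file path
  status: 'passed' | 'failed';   // simplified per-file summary (assumed shape)
}

function resultsByFeature(
  mapping: Record<string, string[]>, // feature_id -> test_file[]
  files: FileResult[],
): Record<string, boolean> {
  const passed = new Map<string, boolean>();
  for (const f of files) passed.set(f.name, f.status === 'passed');

  const out: Record<string, boolean> = {};
  for (const [featureId, testFiles] of Object.entries(mapping)) {
    // A feature passes only if it is mapped at all AND every mapped file
    // is present in the results and passing. Unmapped features never pass.
    out[featureId] =
      testFiles.length > 0 &&
      testFiles.every((t) => passed.get(t) === true);
  }
  return out;
}
```

Treating a missing result file as a failure (rather than skipping it) is what makes stale states demotable: if a mapped test disappears or regresses, the feature's pass flag flips off on the next run.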

L-Level Decision Tree

Has PRD with FAVV Build Contract?
  NO  → L0 (Spec only)
  YES → Has schema/type in engineering repo?
    NO  → L0
    YES → Has UI page/component using schema?
      NO  → L1 (Schema)
      YES → All mapped tests pass?
        NO  → L2 (UI exists, tests fail)
        YES → Any Safety Test violations?
          YES → L2 (safety blocks L3)
          NO  → Independent commissioner sign-off?
            NO  → L3 (Tested)
            YES → L4 (Commissioned)
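Because every node except the last is mechanical, the tree transcribes directly into a pure function. A sketch, with hypothetical field names; only `commissionerSignOff` comes from a human:

```typescript
// The decision tree above as a pure function. Every input except
// commissionerSignOff is computable from the repo and test results.
type Level = 'L0' | 'L1' | 'L2' | 'L3' | 'L4';

interface FeatureEvidence {
  hasFavvContract: boolean;    // PRD with FAVV Build Contract
  hasSchema: boolean;          // schema/type in engineering repo
  hasUi: boolean;              // UI page/component using the schema
  mappedTestsPass: boolean;    // all mapped tests pass
  safetyViolation: boolean;    // any Safety Test violation
  commissionerSignOff: boolean; // independent human sign-off
}

function decideLevel(f: FeatureEvidence): Level {
  if (!f.hasFavvContract) return 'L0';  // spec only
  if (!f.hasSchema) return 'L0';
  if (!f.hasUi) return 'L1';            // schema
  if (!f.mappedTestsPass) return 'L2';  // UI exists, tests fail
  if (f.safetyViolation) return 'L2';   // safety blocks L3
  if (!f.commissionerSignOff) return 'L3'; // tested
  return 'L4';                          // commissioned
}
```

Running this on every merge also gives demotion for free: the level is recomputed from current evidence, so a feature whose tests regress drops out of L3 without anyone editing the matrix.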

Key Finding

The decision tree is deterministic at every node except L4. L4 requires a human commissioner — the builder and commissioner are never the same person. Everything below L4 is computable from test results.

Questions

What happens when a feature's tests pass but the feature doesn't work?

  • If the Success Test is weak (passes with empty arrays), is L3 a lie?
  • Should the Safety Test column be the primary gate, not the Success Test?
  • At what scale does running all mapped tests on every merge become too slow?