How do agents orchestrate — and what instruments verify the work?
This diagram synthesises all four preceding maps: the outcome map defines what we measure, the value stream reveals where time dies, the dependency map shows what blocks fixes, and the capability map confirms we can execute.
AGENT & INSTRUMENT DIAGRAM: AGENT PLATFORM
════════════════════════════════════════════════════════════
AGENTS (Yang — who applies force)
─────────────────────────────────────────────────────────
Nav (Dream Agent) ··········· Set priorities, commission outcomes, audit enforcement
Engineering Agent (instance) · Build, test, report via drmg CLI
Orchestrator Agent ·········· Route work, resolve blocks, manage plans
VVFL Auditor Agent ·········· Measure enforcement health, extract patterns
Commissioning Agent ········· Walk deployed URLs, verify against PRD
Wik (Human) ················· Architecture decisions, graduation gates, go/no-go
INSTRUMENTS (Yin — what verifies and rewards)
─────────────────────────────────────────────────────────
Priority Table ·········· Build order instrument (top of table = build next)
Session Timer ··········· Recovery time measurement (<30s target, from O1/F1)
Block Signal ············ Broadcast instrument (<2min target, from F2)
VVFL Dashboard ·········· 8-dimension health score (8 auditors, from O4)
Commissioning Table ····· L0-L4 maturity per component (pass/fail per feature)
Outcome Dashboard ······· O1-O4 directly: recovery time, duplication, cycle time, findings
Plan Completion Rate ···· Template effectiveness (% plans that ship vs stall)
Memory Recall Accuracy ·· Did recalled context match what was needed? (from O1)
PROTOCOLS (data + value + decision flows)
─────────────────────────────────────────────────────────
THREE-CHANNEL ARCHITECTURE
┌────────────────────────────────────────────────────┐
│ │
│ Channel 1: FILESYSTEM (spec interface) │
│ ┌──────────┐ ┌──────────┐ │
│ │ Dream │ reads → │ Eng │ │
│ │ Repo │ ← reads │ Repo │ │
│ └──────────┘ └──────────┘ │
│ Priority table = build order │
│ PRD commissioning = pass/fail │
│ │
│ Channel 2: CONVEX MESSAGES (coordination) │
│ ┌──────────┐ 8 types ┌──────────┐ │
│ │ Dream │ ←─────→ │ Eng │ │
│ │ Agents │ status │ Agents │ │
│ └──────────┘ handoff └──────────┘ │
│ blocker │
│ decision │
│ complete │
│ question │
│ context │
│ system │
│ │
│ Channel 3: SUPABASE (measurement interface) │
│ ┌──────────┐ ┌──────────┐ │
│ │ Dream │ reads → │ drmg │ │
│ │ Agents │ │ CLI │ writes │
│ └──────────┘ └──────────┘ │
│ Measurements, plans, patterns, agent state │
│ │
└────────────────────────────────────────────────────┘
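The eight Convex message types can be sketched as a discriminated union. This is an illustrative sketch only: the field names (`from`, `to`, `body`, `sentAt`) and the routing helper are assumptions, not the actual Convex schema.

```typescript
// Sketch of the eight Channel 2 coordination message types.
// Field names are illustrative assumptions, not the real schema.
type MessageType =
  | "status" | "handoff" | "blocker" | "decision"
  | "complete" | "question" | "context" | "system";

interface AgentMessage {
  type: MessageType;
  from: string;   // sending agent id, e.g. "eng-1"
  to: string;     // receiving agent id
  body: string;   // free-text payload
  sentAt: number; // epoch ms, used by latency instruments
}

// Per Loop 2, blockers are the messages the orchestrator must act on.
function routesToOrchestrator(msg: AgentMessage): boolean {
  return msg.type === "blocker";
}
```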
AGENT SESSION LIFECYCLE (from value stream)
┌────────────────────────────────────────────────────┐
│ │
│ DISPATCH ──→ BOOTSTRAP ──→ PLAN ──→ │
│ │ │ │ │
│ Priority Session Plan │
│ Table Timer Completion │
│ │ │ │ │
│ ──→ EXECUTE ──→ REPORT ──→ COMMISSION ──→ │
│ │ │ │ │
│ Block Convex Commissioning │
│ Signal Messages Table │
│ │ │ │ │
│ └── FEEDBACK ──────────┘ │
│ │ │
│ Improves next cycle │
└────────────────────────────────────────────────────┘
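The lifecycle above can be read as a ring of stages, each observed by one instrument, with FEEDBACK closing the loop back to DISPATCH. A minimal sketch, assuming the stage/instrument pairing in the diagram (the data structure itself is illustrative):

```typescript
// Ordered session lifecycle stages with the instrument observing each.
// Stage and instrument names follow the diagram; the structure is a sketch.
const LIFECYCLE: { stage: string; instrument: string }[] = [
  { stage: "DISPATCH",   instrument: "Priority Table" },
  { stage: "BOOTSTRAP",  instrument: "Session Timer" },
  { stage: "PLAN",       instrument: "Plan Completion Rate" },
  { stage: "EXECUTE",    instrument: "Block Signal" },
  { stage: "REPORT",     instrument: "Convex Messages" },
  { stage: "COMMISSION", instrument: "Commissioning Table" },
];

// FEEDBACK closes the ring: the stage after COMMISSION is DISPATCH again.
function nextStage(current: string): string {
  const i = LIFECYCLE.findIndex((s) => s.stage === current);
  if (i < 0) throw new Error(`unknown stage: ${current}`);
  return LIFECYCLE[(i + 1) % LIFECYCLE.length].stage;
}
```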
FEEDBACK LOOPS
─────────────────────────────────────────────────────────
Loop 1: SESSION RECOVERY (per session start)
┌─ Bootstrap → Timer > 30s? → Diagnose memory load → Fix → Bootstrap ─┐
└────────────────────────────────────────────────────────────────────┘
Instrument: Session Timer. Measures: O1 (recovery -80%), F1 (<30s)
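Loop 1's decision rule can be sketched directly: a bootstrap over the 30-second F1 target triggers a diagnosis of memory load versus message load. The 2x dominance heuristic and function names below are assumptions, not the platform's actual diagnostic.

```typescript
// Loop 1 sketch: classify a slow bootstrap by its dominant cost.
const RECOVERY_TARGET_MS = 30_000; // F1: under 30 seconds

function diagnoseBootstrap(
  elapsedMs: number,
  memoryLoadMs: number,
  messageLoadMs: number
): "ok" | "memory" | "messages" | "both" {
  if (elapsedMs <= RECOVERY_TARGET_MS) return "ok";
  // Assumed heuristic: one side dominates if it costs over 2x the other.
  if (memoryLoadMs > messageLoadMs * 2) return "memory";
  if (messageLoadMs > memoryLoadMs * 2) return "messages";
  return "both";
}
```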
Loop 2: COORDINATION (per block event)
┌─ Execute → Blocked? → Signal → Orchestrator routes help → Execute ─┐
└────────────────────────────────────────────────────────────────────┘
Instrument: Block Signal. Measures: O2 (duplicate work -80%), F2 (<2min)
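Loop 2's check is equally mechanical: if help has not arrived within the two-minute F2 window, the orchestrator's routing is the suspect and the event escalates. The `BlockEvent` shape below is an illustrative assumption.

```typescript
// Loop 2 sketch: escalate when block-to-help latency breaches F2.
const BLOCK_RESPONSE_TARGET_MS = 120_000; // F2: under 2 minutes

interface BlockEvent {
  signalledAt: number;   // epoch ms the block was broadcast
  helpAt: number | null; // epoch ms help arrived, null if still waiting
}

function shouldEscalate(event: BlockEvent, now: number): boolean {
  const end = event.helpAt ?? now; // open blocks measured against now
  return end - event.signalledAt > BLOCK_RESPONSE_TARGET_MS;
}
```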
Loop 3: SHIPPING (per feature cycle)
┌─ Plan → Execute → Report → Commission → Gap? → Reprioritise → Plan ─┐
└──────────────────────────────────────────────────────────────────────┘
Instrument: Commissioning Table + Outcome Dashboard. Measures: O3 (cycle time 2x)
Loop 4: LEARNING (weekly/after 3+ runs)
┌─ VVFL Audit → Patterns? → Semantic Memory → Agent Recall → Audit ─┐
│ After 5 runs: zero patterns → improvements? Kill measurement. │
└────────────────────────────────────────────────────────────────────┘
Instrument: VVFL Dashboard + Memory Recall Accuracy. Measures: O4 (50% findings resolved)
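The kill rule inside Loop 4 — five runs with no patterns converted into improvements means the measurement dies — can be expressed as a one-liner over the run history. The `AuditRun` record shape is an assumption for illustration.

```typescript
// Loop 4 kill-rule sketch: a measurement earns its keep only if audit
// runs convert patterns into shipped improvements.
interface AuditRun {
  patternsExtracted: number;
  improvementsShipped: number;
}

function shouldKillMeasurement(runs: AuditRun[]): boolean {
  if (runs.length < 5) return false; // not enough evidence yet
  const shipped = runs.reduce((n, r) => n + r.improvementsShipped, 0);
  return shipped === 0;
}
```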
Loop 5: GRADUATION (quarterly)
┌─ CLI proven? → API routes → CLI as thin client → A2A wrappers ─┐
│ Kill signal: API doesn't reduce integration friction vs CLI? │
│ → Ship CLI as the product. Don't abstract. │
└────────────────────────────────────────────────────────────────┘
Instrument: Time-to-first-query comparison. Measures: F7 (API parity), F8 (Agent Card), F9 (A2A task)
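The Loop 5 gate reduces to one comparison: the API graduates only if it beats the CLI on time-to-first-query; otherwise the kill signal fires and the CLI ships as the product. A minimal sketch (function and parameter names are illustrative):

```typescript
// Loop 5 graduation gate sketch: lower time-to-first-query wins.
function graduationDecision(
  cliTimeToFirstQueryS: number,
  apiTimeToFirstQueryS: number
): "graduate-api" | "ship-cli" {
  // Kill signal: API that doesn't reduce friction loses to the CLI.
  return apiTimeToFirstQueryS < cliTimeToFirstQueryS
    ? "graduate-api"
    : "ship-cli";
}
```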
GRADUATION PATH
─────────────────────────────────────────────────────────
H1: Agent Platform (this PRD)
│ session data, memory patterns, VVFL measurements
▼
H2: BOaaS Operations (same platform, customer-facing)
│ API routes, multi-tenant, Agent Cards
▼
H3: Open Agent Mesh (A2A protocol)
any agent joins via Task Cards, no filesystem needed
════════════════════════════════════════════════════════════
Agent Roster
| Agent | Type | Force Applied | Verification |
|---|---|---|---|
| Nav (Dream Agent) | AI | Set priorities, commission outcomes, audit enforcement | Priority Table, Commissioning Table, Outcome Dashboard |
| Engineering Agent | AI (instance) | Build code, run tests, report status via drmg CLI | Plan Completion Rate, Block Signal |
| Orchestrator Agent | AI | Route work to teams, resolve blocks, manage plan lifecycle | Block Signal response time, Plan Completion Rate |
| VVFL Auditor Agent | AI | Measure 8 dimensions, extract patterns, write to memory | VVFL Dashboard, Memory Recall Accuracy |
| Commissioning Agent | AI | Walk deployed URLs, capture evidence against PRD features | Commissioning Table (L0-L4 per component) |
| Wik (Human) | Human | Architecture decisions, graduation gates, go/no-go, kill signals | All instruments (final authority) |
Instrument Registry
| Instrument | Measures | Threshold | Action on Fail | Outcome Map Link |
|---|---|---|---|---|
| Priority Table | Build order alignment | Top row = current work | Reorder if work doesn't match table | Guards all outcomes |
| Session Timer | Time from session start to productive | Under 30 seconds | Diagnose: memory load, message load, or both | O1 (-80% recovery) |
| Block Signal | Time from block to help arriving | Under 2 minutes | Escalate: orchestrator response broken | O2 (-80% duplication), F2 |
| VVFL Dashboard | 8-dimension enforcement health | All dimensions measured | Identify stale/broken: generator, template, rule, skill, agent, platform, virtue, pattern | O4 (50% findings resolved) |
| Commissioning Table | L0-L4 per component, per PRD | L2+ for MVP features | Block shipping until commission passes | O3 (cycle time 2x) |
| Outcome Dashboard | O1-O4 directly | O1≥80%, O2≥80%, O3≥2x, O4≥50% | Weekly review: which loop is broken? | All outcomes |
| Plan Completion Rate | % plans that ship vs stall | Above 70% completion | Template audit: wrong template = rework | O3 (cycle time 2x) |
| Memory Recall Accuracy | Did recalled context match need? | Above 80% relevant | Prune noisy memories, improve relevance scoring | O1 (-80% recovery) |
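The registry above is data, so the weekly review can be mechanised: each instrument carries a pass predicate over its latest reading, and every failing predicate names its action. Thresholds below follow the table; the registry shape and units (seconds for timers, fractions for rates) are assumptions.

```typescript
// Sketch: evaluate the instrument registry against latest readings.
interface Instrument {
  name: string;
  pass: (reading: number) => boolean; // threshold from the registry table
  actionOnFail: string;
}

const REGISTRY: Instrument[] = [
  { name: "Session Timer",          pass: (s) => s < 30,   actionOnFail: "diagnose memory/message load" },
  { name: "Block Signal",           pass: (s) => s < 120,  actionOnFail: "escalate: orchestrator response broken" },
  { name: "Plan Completion Rate",   pass: (p) => p > 0.7,  actionOnFail: "template audit" },
  { name: "Memory Recall Accuracy", pass: (p) => p > 0.8,  actionOnFail: "prune noisy memories" },
];

function failingActions(readings: Record<string, number>): string[] {
  return REGISTRY
    .filter((i) => !i.pass(readings[i.name]))
    .map((i) => i.actionOnFail);
}
```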
Feedback Loop Quality
| Loop | Frequency | Sensor | Actuator | Latency |
|---|---|---|---|---|
| Session Recovery | Per session | Session Timer | Memory/message loader tuning | Seconds |
| Coordination | Per block event | Block Signal | Orchestrator routing | Minutes |
| Shipping | Per feature cycle | Commissioning Table | Priority table reorder | Days |
| Learning | Weekly / 3+ runs | VVFL Dashboard | Pattern → semantic memory → recall | 1 week |
| Graduation | Quarterly | Integration friction comparison | Transport layer swap | Quarter |
Activation Sequence
From the Dependency Map — when does each agent come online?
| Phase | What Resolves | Agent State | Human Covers |
|---|---|---|---|
| Phase 0 (0.5d) | --path flag on ETL CLI | Engineering agents can load any profile | Human manually starts sessions, routes work |
| Phase 1 (4.5d) | Shared DB, thin router, VVFL auditors | VVFL Auditor online, drmg CLI usable | Human still dispatches, commissions manually |
| Phase 2 (2d) | CLI wrappers, agent module | Engineering agents use one entry point (drmg) | Human reviews plans, approves dispatches |
| Phase 3 (2d) | Priority dispatch, session bootstrap | Orchestrator online: dispatch automated, recovery under 30s | Human focuses on architecture and go/no-go |
| Phase 4 (2d) | Pattern extractor, memory writer | Learning engine online: cross-session patterns persist | Human reviews patterns, validates learning quality |
| Phase 5 (2d) | Commissioning dispatch, virtue auditor | Commissioning Agent online: automated verification | Human focuses on graduation decisions |
| Phase 6-7 (5d) | API routes, A2A wrappers | Full mesh: any agent joins via protocol | Human sets strategy, everything else is platform |
Each phase transfers one bottleneck from human to platform. The Capability Map's P1 gaps (shared DB, thin router, session bootstrap, priority dispatch, 8-dimension auditors, flexible path, session extract) resolve across the first three phases of these transfers.
Gate
Before executing:
- Every agent named — YES (6 agents: Nav, Engineering, Orchestrator, VVFL Auditor, Commissioning, Human)
- Every instrument named — YES (8 instruments with thresholds and outcome map links)
- Feedback loops explicit — YES (5 loops at different frequencies, each with sensor/actuator)
- Agent-to-agent handoffs documented — YES (three-channel architecture: filesystem, Convex, Supabase)
- Horizon data flows declared — YES (H1 → H2 → H3: platform → BOaaS → open mesh)
- Activation sequence links dependency resolution to agent capability — YES (7 phases, each transfers a bottleneck)
- Each loop instrument links to outcome map success measures — YES (O1-O4, F1, F2 mapped)
Context
- Capability Map — Previous: what can we do
- A&ID Template — The empty pattern
- Agent Platform PRD — Full depth: five concerns, 38 features, build sequence