Agent Platform
What makes an agent more than a prompt with memory?
Problem
A problem well-stated is a problem 80% solved
Situation: Four PRDs described one system from four angles — session boot, project management, settlement verification, and the platform itself. Engineering asked "where do I start?" Three CLIs, 23 templates, 16 agents — no coordination. Agents spend 5-10 minutes re-exploring what happened last session. Issues live in a markdown file that can't scale. Zero capabilities at L4 because no automated settlement verification exists.
Intention: One platform where every agent inherits identity, memory, comms, dispatch, issue routing, quality enforcement, and settlement verification — so engineering builds domain knowledge, not infrastructure, and the system audits itself.
Obstacle: 4.3% flow efficiency. 130 minutes of work in 1-4 days of wait time. Dispatch is manual. Context recovery starts from zero. 8 min per issue in markdown. 188 capabilities tracked, 0 at L4. The tool must build itself.
Priorities
- What makes an agent more than a prompt with memory?
- Where does time die in the agent session lifecycle?
- How does an agent know whether to fix, file, or build?
- Does what arrived match what was promised?
- Who coordinates — and what verifies the work?
Features
Phase 0: Session Boot — DONE
Consolidated from Nav Continuity Layer. All 3 features at L2.
- NAV-001: Briefing compiler (nav-briefing.md generated at session start)
- NAV-002: Memory ledger (MEMORY.md + WM-NAV.md + daily context)
- NAV-003: Boot hook (team-activation.sh loads context via nav-session-boot.mjs)
Zero manual re-briefing over last 10+ sessions. Kill signal passed: compiler runs in <10s, injects <2000 tokens.
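The kill signal above is checkable in code. A minimal sketch, assuming a ~4-characters-per-token heuristic (the function names here are illustrative, not the actual nav-session-boot.mjs API):

```javascript
// Phase 0 kill-signal check: a briefing passes only if it compiled within
// the 10s budget and injects fewer than 2000 tokens.
const BUDGET_MS = 10_000;
const BUDGET_TOKENS = 2000;

// Crude token estimate: ~4 characters per token is a common heuristic;
// a real check would use the model's tokenizer.
function estimateTokens(text) {
  return Math.ceil(text.length / 4);
}

function briefingWithinBudget(briefingText, compileMs) {
  return compileMs < BUDGET_MS && estimateTokens(briefingText) < BUDGET_TOKENS;
}
```

Running this check inside the boot hook turns the kill signal from a judgment call into a pass/fail gate.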
Phase 1: Unified CLI + VVFL MVP (3-4 sessions)
Engineering Spec. 8 dimensions: generator, template, rule, skill, agent, platform, virtue, pattern.
- Extract shared DB context from plan-cli pattern
- Build thin router (`drmg` dispatches to handlers)
- Seed context graph from filesystem
- Build 8-dimension auditors (one per enforcement dimension)
- Wire audit command with `--dry-run` flag
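The thin-router pattern above can be sketched in a few lines. A hypothetical shape, assuming `drmg <module> <command>` maps onto a handler table keyed by the eight dimensions (handler bodies are placeholders, not the shipped auditors):

```javascript
// Thin router: the CLI owns dispatch, the handlers own behavior.
const handlers = {
  generator: { audit: (flags) => ({ dimension: "generator", dryRun: !!flags["dry-run"] }) },
  template:  { audit: (flags) => ({ dimension: "template",  dryRun: !!flags["dry-run"] }) },
  // ...rule, skill, agent, platform, virtue, pattern follow the same shape
};

function dispatch(module, command, flags = {}) {
  const handler = handlers[module]?.[command];
  if (!handler) throw new Error(`drmg: unknown command ${module} ${command}`);
  return handler(flags);
}
```

Keeping the router this thin is what makes Phase 6 cheap later: the same handler table can back REST routes without touching the auditors.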
Phase 2: Issue Tracking + Plans UI (2-3 sessions)
Consolidated from Agent Project Management. Engineering Spec.
- PLAT-001: Issue tracking CLI (#1-7) — replace issues-log.md with queryable store
- PROJ-001: Plans dashboard UI (#8-12) — wire plan list/detail/analytics to DB
- Decision routing logic (fix / PRD / story)
- Priority dispatch (priority table change → Convex message)
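The decision routing item answers the "fix, file, or build" question from Priorities. A sketch of one plausible rule, assuming issues carry a size estimate and a scope flag (the thresholds and field names are assumptions, not the shipped logic):

```javascript
// Decision routing: where an incoming issue goes depends on whether it
// changes scope and how big it is.
function routeIssue({ estimateHours, changesScope }) {
  if (changesScope) return "PRD";        // scope changes need a spec, not a patch
  if (estimateHours <= 2) return "fix";  // small enough to fix inline
  return "story";                        // sized work goes onto a plan
}
```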
Phase 3: Quality Loop (2 sessions)
The Deming shift: from inspection (hooks catch violations) to prevention (generators make violations impossible). Measured by hook trigger rate trending toward zero.
- AGNT-003: Receipt writer (post-session: store receipt via agent-etl-cli)
- AGNT-004: Hook failure tracker (aggregate triggers by content type, pattern, agent)
- AGNT-005: Scaffold generators (content types born with correct frontmatter, heading skeleton, questions)
- AGNT-006: Skill maturity scorer (invocations x artifacts x gate pass rate)
- AGNT-007: Template improvement trigger (receipt pattern → template update → legacy principle)
- PLAT-005: DB-native plan template registry (schema enforces best_pattern_prompt NOT NULL — enables AGNT-007 to write back to DB)
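The AGNT-006 formula is stated inline above: invocations x artifacts x gate pass rate. A direct sketch, where only the normalization of the pass rate is an assumption:

```javascript
// Skill maturity score (AGNT-006): invocations x artifacts x gate pass rate.
// A skill that is invoked often, produces artifacts, and passes its gates
// scores high; a skill with no gate runs scores zero rather than dividing by zero.
function skillMaturity({ invocations, artifacts, gatePasses, gateRuns }) {
  const passRate = gateRuns === 0 ? 0 : gatePasses / gateRuns;
  return invocations * artifacts * passRate;
}
```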
Phase 4: Settlement Bridge (2-3 sessions)
Consolidated from Hemisphere Bridge. Engineering Spec. Pictures.
Two repositories. One dreams (this repo — specs, PRDs, scoreboard). One builds (stackmates — code, schemas, hooks). Every handoff is a settlement boundary where trust can break.
- PLAT-002: Settlement verification protocol (automated spec-vs-reality delta)
- PLAT-003: Trust ledger (settlement history in Supabase)
- PLAT-004: Autonomous loop control (`/loop` heartbeat)
- AGNT-008: Agent boundary enforcement hooks
- NAV-004: Scoreboard live readings (L2 — content graph in session boot)
Three bearings: Internal (hooks, mostly virtuous), Hemisphere (manual, vicious — this phase fixes it), External (no production users yet).
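The core of PLAT-002 is a diff across the settlement boundary: did what arrived match what was promised? A hypothetical shape, assuming specs list deliverables and receipts list artifacts (field names are illustrative, not the shipped protocol):

```javascript
// Settlement verification sketch: compare the spec's promises against the
// receipt's artifacts and report what never arrived.
function settlementDelta(spec, receipt) {
  const promised = new Set(spec.deliverables);
  const delivered = new Set(receipt.artifacts);
  const missing = [...promised].filter((d) => !delivered.has(d));
  return { settled: missing.length === 0, missing };
}
```

Each `{ settled, missing }` record is exactly what the PLAT-003 trust ledger would append per handoff.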
Phase 5: Learning Engine (2 sessions)
- Pattern extractor (cross-run trend detection)
- Memory writer (patterns → semantic, runs → episodic)
- Agent recall of VVFL patterns (shared semantic memory query)
- Action generator (critical findings → plan issues)
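The action generator closes the loop from analysis back to work. A minimal sketch, assuming findings carry a severity label (the schema is an assumption):

```javascript
// Action generator: critical findings become plan issues; everything else
// stays in the pattern store for the memory writer.
function generateActions(findings) {
  return findings
    .filter((f) => f.severity === "critical")
    .map((f) => ({ type: "plan-issue", title: f.summary, source: f.id }));
}
```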
Phase 6: API + A2A (3-4 sessions)
- API route per drmg module (REST endpoints wrapping CLI handlers)
- CLI as thin client (`drmg` calls API routes instead of direct DB)
- Auth + rate limiting (API keys, per-agent rate limits)
- AI-008: Agent Card definition (advertise capabilities per agent type)
- Task Card wrapper (map message types to A2A Task lifecycle)
- A2A discovery endpoint (`/.well-known/agent.json`)
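The discovery endpoint is small enough to sketch whole. The card fields below are illustrative placeholders (the A2A spec defines the actual Agent Card schema), and the handler is written as a pure function so it could back either a static file or an API route:

```javascript
// A2A discovery sketch: serve an Agent Card as JSON at /.well-known/agent.json.
const agentCard = {
  name: "drmg-platform",
  description: "Dispatch, audit, and settlement verification",
  capabilities: ["audit", "dispatch", "settlement"],
};

function agentCardResponse(url) {
  if (url === "/.well-known/agent.json") {
    return {
      status: 200,
      headers: { "Content-Type": "application/json" },
      body: JSON.stringify(agentCard),
    };
  }
  return { status: 404, headers: {}, body: "" };
}
```

Issue #10 below is this handler missing: the route currently falls through to the HTML 404 page instead of returning the card.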
Feature IDs
| ID | Feature | State | Phase |
|---|---|---|---|
| NAV-001 | Briefing compiler | L2 | 0 |
| NAV-002 | Memory ledger | L2 | 0 |
| NAV-003 | Boot hook | L2 | 0 |
| NAV-004 | Scoreboard live readings | L2 | 4 |
| AGNT-001 | Standards registry | L2 | 1 |
| AGNT-002 | SBO Scoreboard | L2 | 1 |
| AGNT-003 | Receipt schema | L0 | 3 |
| AGNT-004 | Hook failure tracker | L0 | 3 |
| AGNT-005 | Scaffold generators | L0 | 3 |
| AGNT-006 | Skill maturity scorer | L0 | 3 |
| AGNT-007 | Template improvement loop | L0 | 3 |
| AGNT-008 | Agent boundary enforcement | L0 | 4 |
| AI-008 | Multi-agent orchestration | L2 | 6 |
| PLAT-001 | Issue tracking CLI | L0 | 2 |
| PLAT-002 | Settlement verification | L0 | 4 |
| PLAT-003 | Trust ledger | L0 | 4 |
| PLAT-004 | Autonomous loop control | L0 | 4 |
| PLAT-005 | DB-native template registry | L0 | 3 |
| PROJ-001 | Plans dashboard UI | L0 | 2 |
7/19 features at L2+. 12 at L0.
Issues
| # | Severity | What Happens | Fix |
|---|---|---|---|
| 26 | HIGH | /plans shows 0 plans, no create button. DB has plans via CLI but UI reads nothing. | Wire plan list query to DB. Add create plan action. |
| 27 | HIGH | /plans/analytics shows hardcoded zeros. No charts, no trends. | Query plan data for metrics. Add trend chart. |
| 28 | MEDIUM | /calendar Team Calendar shows "No Plans Scheduled" despite plans in DB. | Wire calendar to plan date ranges. |
| 29 | MEDIUM | /plans table has columns but no data, no row click navigation. | Populate from DB, add row click to detail page. |
| 30 | LOW | /plans stat cards render but show static zeros. | Replace with live DB aggregates. |
| 17 | MEDIUM | /agents/capabilities returns 404. Dashboard card links to missing page. | Create the route or remove the link. |
| 10 | LOW | /.well-known/agent.json returns HTML instead of Agent Card JSON. | Add static agent.json to public directory or API route. |
Resolved
| # | Resolved | Evidence |
|---|---|---|
| 12 | 2026-03-07 | /agents loads with Registry, Workflows, Standards, Register Agent button. |
| 7 | 2026-03-07 | team-activation.sh uses git branch --show-current. No stale branch. |
| 8 | 2026-03-07 | Hook degrades gracefully. timeout 5 + fallback in place. |
| 9 | 2026-03-08 | Spec updated to match WM-NAV.md + working-memory.json. |
Progress
Scorecard
Priority Score: 1500 (Pain 5 x Demand 5 x Edge 4 x Trend 5 x Conversion 3)
| # | Priority (should we?) | Preparedness (can we?) |
|---|---|---|
| 1 | Pain: 5 — 5min re-briefing + 8min/issue + 0 at L4 | Principles: 4 — five concerns defined, consolidation done |
| 2 | Demand: 5 — every agent needs identity, memory, comms, dispatch, verification | Performance: 2 — 61% capabilities at zero |
| 3 | Edge: 4 — self-orchestrating platform with settlement verification | Platform: 4 — 3 CLIs built, Convex deployed, session boot functional |
| 4 | Trend: 5 — A2A protocol, autonomous agents, multi-repo orchestration | Protocols: 3 — 7 phases defined, Phase 0 complete |
| 5 | Conversion: 3 — internal infra that unblocks all ventures, indirect but mandatory | Players: 3 — 6 agents specified, 8 instruments named |
| Metric | Target | Now |
|---|---|---|
| Session recovery time | <30s | <10s (Phase 0 done) |
| Flow efficiency | >15% | 4.3% |
| Capabilities at L2+ | >60% | 37% (7/19) |
| Capabilities at L4 | >0 | 0 |
| Scope | Phases | What You Get |
|---|---|---|
| Phase 0 | 0 | Session boot — DONE |
| MVP | 0-1 | + drmg CLI + 8 VVFL auditors |
| V1 | 0-2 | + Issue CLI + Plans UI + priority dispatch |
| Quality | 0-3 | + receipts, hook tracker, scaffolds, maturity scorer |
| Settlement | 0-4 | + settlement verification, trust ledger, /loop |
| Learning | 0-5 | + pattern extraction, memory writing, action generation |
| API + A2A | 0-6 | + HTTP transport, Agent Cards, external mesh |
Kill date: none — mandatory infrastructure.
Kill signals:
- If agents with memory aren't measurably faster than agents without it, the memory is noise
- If compiler takes >10s or injects >2000 tokens, pare down scope
- If engineering still edits `issues-log.md` manually 2 sprints after the CLI ships
- If settlement verification adds >30s to each commit cycle, simplify scope
Blocks: All agent instances (Sales Dev, Content Amplifier, future ventures).
Context
- Nav Continuity Spec — Phase 0 engineering spec
- Agent Project Mgmt Spec — Phase 2 engineering spec (absorbed)
- Hemisphere Bridge Spec — Phase 4 engineering spec
- Phase 1 Engineering Spec — 8-dimension auditors and audit output schema
- Hemisphere Bridge Pictures — A&ID diagrams + gap matrices
- Scoreboard — Settlement integrity as primary structure
- Game Loops — Four nested loops: rendering, gameplay, core, meta
- Commissioning Dashboard — 188 capabilities tracked
- Issues Log — Current file-based implementation (replaced by Phase 2)
Questions
What is the minimum platform surface that makes every subsequent PRD faster to build?
- If Phase 0 eliminated re-briefing, which remaining friction is the next 80/20 win?
- At what point does settlement verification become more valuable than building new features?
- When an issue sits unrouted for a sprint, is that a tooling problem or a process problem?
- How do you measure whether the quality loop (Phase 3) is preventing defects or just catching them?