
Agent Platform

What makes an agent more than a prompt with memory?

Problem

A problem well-stated is a problem 80% solved

Situation: Four PRDs described one system from four angles — session boot, project management, settlement verification, and the platform itself. Engineering asked "where do I start?" Three CLIs, 23 templates, 16 agents — no coordination. Agents spend 5-10 minutes re-exploring what happened last session. Issues live in a markdown file that can't scale. Zero capabilities at L4 because no automated settlement verification exists.

Intention: One platform where every agent inherits identity, memory, comms, dispatch, issue routing, quality enforcement, and settlement verification — so engineering builds domain knowledge, not infrastructure, and the system audits itself.

Obstacle: 4.3% flow efficiency. 130 minutes of work in 1-4 days of wait time. Dispatch is manual. Context recovery starts from zero. 8 min per issue in markdown. 188 capabilities tracked, 0 at L4. The tool must build itself.
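The 4.3% is plain flow efficiency: active work time divided by total elapsed time. A quick sketch of the arithmetic — the ~50-hour denominator is inferred from the figures above, not stated anywhere:

```javascript
// Flow efficiency = active work time / total elapsed time.
// 130 minutes of work against roughly 50 hours of wall-clock wait
// (within the stated 1-4 day range) yields the 4.3% above.
function flowEfficiency(workMinutes, elapsedMinutes) {
  return workMinutes / elapsedMinutes;
}

const elapsed = 50 * 60; // ~50 hours in minutes (inferred, not from the doc)
console.log((flowEfficiency(130, elapsed) * 100).toFixed(1) + "%"); // 4.3%
```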

Priorities

  1. What makes an agent more than a prompt with memory?
  2. Where does time die in the agent session lifecycle?
  3. How does an agent know whether to fix, file, or build?
  4. Does what arrived match what was promised?
  5. Who coordinates — and what verifies the work?

Features

Phase 0: Session Boot — DONE

Consolidated from Nav Continuity Layer. All 3 features at L2.

  • NAV-001: Briefing compiler (nav-briefing.md generated at session start)
  • NAV-002: Memory ledger (MEMORY.md + WM-NAV.md + daily context)
  • NAV-003: Boot hook (team-activation.sh loads context via nav-session-boot.mjs)

Zero manual re-briefing over last 10+ sessions. Kill signal passed: compiler runs in <10s, injects <2000 tokens.
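The kill-signal check above (<10s, <2000 tokens) can be sketched as a guard the boot hook runs before injecting the briefing. All names here are hypothetical — this is not the actual nav-session-boot.mjs logic:

```javascript
// Sketch of the session-boot kill-signal guard (names hypothetical):
// the compiled briefing must come in under 10s and 2000 tokens,
// otherwise the boot hook should pare down or skip injection.
const BUDGET_MS = 10_000;
const BUDGET_TOKENS = 2_000;

function estimateTokens(text) {
  // Rough heuristic: ~4 characters per token.
  return Math.ceil(text.length / 4);
}

function checkBriefing(briefingText, compileMs) {
  const tokens = estimateTokens(briefingText);
  return {
    tokens,
    withinBudget: compileMs < BUDGET_MS && tokens < BUDGET_TOKENS,
  };
}

console.log(checkBriefing("# nav-briefing\n...", 4200));
```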

Phase 1: Unified CLI + VVFL MVP (3-4 sessions)

Engineering Spec. 8 dimensions: generator, template, rule, skill, agent, platform, virtue, pattern.

  • Extract shared DB context from plan-cli pattern
  • Build thin router (drmg dispatches to handlers)
  • Seed context graph from filesystem
  • Build 8-dimension auditors (one per enforcement dimension)
  • Wire audit command with --dry-run flag
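The "thin router" item above can be sketched as a command-to-handler map — the router owns no logic, it only forwards. Handler names and behavior here are assumptions, not the shipped drmg implementation:

```javascript
// Minimal thin-router sketch for a drmg-style CLI (handlers hypothetical).
// The router maps the first argument to a handler and forwards the rest,
// so each of the 8 dimensions stays an independently testable module.
const handlers = {
  audit: (args) => `audit ${args.includes("--dry-run") ? "(dry run)" : ""}`.trim(),
  seed: () => "seeding context graph from filesystem",
};

function dispatch(argv) {
  const [command, ...rest] = argv;
  const handler = handlers[command];
  if (!handler) throw new Error(`unknown command: ${command}`);
  return handler(rest);
}

console.log(dispatch(["audit", "--dry-run"])); // audit (dry run)
```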

Phase 2: Issue Tracking + Plans UI (2-3 sessions)

Consolidated from Agent Project Management. Engineering Spec.

  • PLAT-001: Issue tracking CLI (#1-7) — replace issues-log.md with queryable store
  • PROJ-001: Plans dashboard UI (#8-12) — wire plan list/detail/analytics to DB
  • Decision routing logic (fix / PRD / story)
  • Priority dispatch (priority table change → Convex message)
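The decision routing item (fix / PRD / story) is, at its core, a small classifier. A sketch under assumed thresholds — the real routing criteria are not specified here:

```javascript
// Sketch of fix / PRD / story routing (thresholds are assumptions).
// New surface area needs a spec before code; small known defects
// route straight to a fix; everything else becomes a story.
function routeIssue({ estimateHours, newCapability }) {
  if (newCapability) return "prd";      // new surface area: spec it first
  if (estimateHours <= 2) return "fix"; // small and well-understood: just fix
  return "story";                       // known scope, nontrivial effort
}

console.log(routeIssue({ estimateHours: 1, newCapability: false })); // fix
```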

Phase 3: Quality Loop (2 sessions)

The Deming shift: from inspection (hooks catch violations) to prevention (generators make violations impossible). Measured by hook trigger rate trending toward zero.

  • AGNT-003: Receipt writer (post-session: store receipt via agent-etl-cli)
  • AGNT-004: Hook failure tracker (aggregate triggers by content type, pattern, agent)
  • AGNT-005: Scaffold generators (content types born with correct frontmatter, heading skeleton, questions)
  • AGNT-006: Skill maturity scorer (invocations x artifacts x gate pass rate)
  • AGNT-007: Template improvement trigger (receipt pattern → template update → legacy principle)
  • PLAT-005: DB-native plan template registry (schema enforces best_pattern_prompt NOT NULL — enables AGNT-007 to write back to DB)
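AGNT-006 names its formula outright: invocations x artifacts x gate pass rate. One way that product might be computed — the log damping and 0-100-ish scale are assumptions, not part of the spec:

```javascript
// Skill maturity score sketch (AGNT-006): invocations x artifacts x
// gate pass rate. Log damping and the x10 scale are assumptions.
function maturityScore({ invocations, artifacts, gatePasses, gateRuns }) {
  const passRate = gateRuns > 0 ? gatePasses / gateRuns : 0;
  // Damp raw counts so one heavily-invoked skill doesn't dominate.
  const usage = Math.log1p(invocations) * Math.log1p(artifacts);
  return Math.round(usage * passRate * 10);
}

console.log(maturityScore({ invocations: 20, artifacts: 5, gatePasses: 9, gateRuns: 10 }));
```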

Phase 4: Settlement Bridge (2-3 sessions)

Consolidated from Hemisphere Bridge. Engineering Spec. Pictures.

Two repositories. One dreams (this repo — specs, PRDs, scoreboard). One builds (stackmates — code, schemas, hooks). Every handoff is a settlement boundary where trust can break.

  • PLAT-002: Settlement verification protocol (automated spec-vs-reality delta)
  • PLAT-003: Trust ledger (settlement history in Supabase)
  • PLAT-004: Autonomous loop control (/loop heartbeat)
  • AGNT-008: Agent boundary enforcement hooks
  • NAV-004: Scoreboard live readings (L2 — content graph in session boot)

Three bearings: Internal (hooks, mostly virtuous), Hemisphere (manual, vicious — this phase fixes it), External (no production users yet).
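The core of PLAT-002 — the spec-vs-reality delta — is a set comparison at the settlement boundary. A minimal sketch; the manifest shape is an assumption:

```javascript
// Settlement verification sketch (PLAT-002): compare what the spec
// promised against what the builder repo delivered, report the delta.
// The flat string-manifest shape is an assumption.
function settlementDelta(promised, delivered) {
  const deliveredSet = new Set(delivered);
  const missing = promised.filter((item) => !deliveredSet.has(item));
  const unplanned = delivered.filter((item) => !promised.includes(item));
  return { settled: missing.length === 0, missing, unplanned };
}

const delta = settlementDelta(
  ["issue-cli", "plans-ui", "priority-dispatch"],
  ["issue-cli", "plans-ui"]
);
console.log(delta); // settled: false, missing: ["priority-dispatch"]
```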

Phase 5: Learning Engine (2 sessions)

  • Pattern extractor (cross-run trend detection)
  • Memory writer (patterns → semantic, runs → episodic)
  • Agent recall of VVFL patterns (shared semantic memory query)
  • Action generator (critical findings → plan issues)
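The pattern extractor's cross-run trend detection could start as simple recurrence counting — a finding that shows up in enough runs gets promoted to a pattern. The window and threshold here are assumptions:

```javascript
// Cross-run trend detection sketch (Phase 5): a finding that recurs in
// at least minRuns runs is flagged as a pattern worth writing to
// semantic memory. The threshold of 3 is an assumption.
function detectTrends(runs, minRuns = 3) {
  const counts = new Map();
  for (const run of runs) {
    for (const finding of run.findings) {
      counts.set(finding, (counts.get(finding) ?? 0) + 1);
    }
  }
  return [...counts.entries()]
    .filter(([, count]) => count >= minRuns)
    .map(([finding, count]) => ({ finding, count }));
}

const trends = detectTrends([
  { findings: ["missing-frontmatter"] },
  { findings: ["missing-frontmatter", "stale-link"] },
  { findings: ["missing-frontmatter"] },
]);
console.log(trends); // [{ finding: "missing-frontmatter", count: 3 }]
```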

Phase 6: API + A2A (3-4 sessions)

  • API route per drmg module (REST endpoints wrapping CLI handlers)
  • CLI as thin client (drmg calls API routes instead of direct DB)
  • Auth + rate limiting (API keys, per-agent rate limits)
  • AI-008: Agent Card definition (advertise capabilities per agent type)
  • Task Card wrapper (map message types to A2A Task lifecycle)
  • A2A discovery endpoint (/.well-known/agent.json)
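The discovery endpoint reduces to returning a JSON Agent Card at a well-known path (which is also the fix for issue #10 below, where the route currently falls through to HTML). A sketch as a pure handler — the card fields are illustrative, not the full A2A Agent Card schema:

```javascript
// Sketch of the A2A discovery handler: serve a JSON Agent Card at
// /.well-known/agent.json instead of falling through to the HTML shell.
// Card fields are illustrative, not the complete A2A schema.
const agentCard = {
  name: "drmg-platform",
  description: "Self-orchestrating agent platform",
  capabilities: ["issue-routing", "settlement-verification"],
};

function handleDiscovery(url) {
  if (url === "/.well-known/agent.json") {
    return {
      status: 200,
      contentType: "application/json",
      body: JSON.stringify(agentCard),
    };
  }
  return { status: 404, contentType: "text/plain", body: "not found" };
}

console.log(handleDiscovery("/.well-known/agent.json").status); // 200
```

Wired into an HTTP route (or shipped as a static file in the public directory), this is the whole external discovery surface.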

Feature IDs

| ID | Feature | State | Phase |
|---|---|---|---|
| NAV-001 | Briefing compiler | L2 | 0 |
| NAV-002 | Memory ledger | L2 | 0 |
| NAV-003 | Boot hook | L2 | 0 |
| NAV-004 | Scoreboard live readings | L2 | 4 |
| AGNT-001 | Standards registry | L2 | 1 |
| AGNT-002 | SBO Scoreboard | L2 | 1 |
| AGNT-003 | Receipt schema | L0 | 3 |
| AGNT-004 | Hook failure tracker | L0 | 3 |
| AGNT-005 | Scaffold generators | L0 | 3 |
| AGNT-006 | Skill maturity scorer | L0 | 3 |
| AGNT-007 | Template improvement loop | L0 | 3 |
| AGNT-008 | Agent boundary enforcement | L0 | 4 |
| AI-008 | Multi-agent orchestration | L2 | 6 |
| PLAT-001 | Issue tracking CLI | L0 | 2 |
| PLAT-002 | Settlement verification | L0 | 4 |
| PLAT-003 | Trust ledger | L0 | 4 |
| PLAT-004 | Autonomous loop control | L0 | 4 |
| PLAT-005 | DB-native template registry | L0 | 3 |
| PROJ-001 | Plans dashboard UI | L0 | 2 |

7/19 features at L2+. 12 at L0.

Issues

| # | Severity | What Happens | Fix |
|---|---|---|---|
| 26 | HIGH | /plans shows 0 plans, no create button. DB has plans via CLI but UI reads nothing. | Wire plan list query to DB. Add create plan action. |
| 27 | HIGH | /plans/analytics shows hardcoded zeros. No charts, no trends. | Query plan data for metrics. Add trend chart. |
| 28 | MEDIUM | /calendar Team Calendar shows "No Plans Scheduled" despite plans in DB. | Wire calendar to plan date ranges. |
| 29 | MEDIUM | /plans table has columns but no data, no row click navigation. | Populate from DB, add row click to detail page. |
| 30 | LOW | /plans stat cards render but show static zeros. | Replace with live DB aggregates. |
| 17 | MEDIUM | /agents/capabilities returns 404. Dashboard card links to missing page. | Create the route or remove the link. |
| 10 | LOW | /.well-known/agent.json returns HTML instead of Agent Card JSON. | Add static agent.json to public directory or API route. |

Resolved

| # | Resolved | Evidence |
|---|---|---|
| 12 | 2026-03-07 | /agents loads with Registry, Workflows, Standards, Register Agent button. |
| 7 | 2026-03-07 | team-activation.sh uses git branch --show-current. No stale branch. |
| 8 | 2026-03-07 | Hook degrades gracefully. timeout 5 + fallback in place. |
| 9 | 2026-03-08 | Spec updated to match WM-NAV.md + working-memory.json. |

Progress

Scorecard

Priority Score: 1500 (Pain 5 x Demand 5 x Edge 4 x Trend 5 x Conversion 3)

| # | Priority (should we?) | Preparedness (can we?) |
|---|---|---|
| 1 | Pain: 5 — 5min re-briefing + 8min/issue + 0 at L4 | Principles: 4 — five concerns defined, consolidation done |
| 2 | Demand: 5 — every agent needs identity, memory, comms, dispatch, verification | Performance: 2 — 61% capabilities at zero |
| 3 | Edge: 4 — self-orchestrating platform with settlement verification | Platform: 4 — 3 CLIs built, Convex deployed, session boot functional |
| 4 | Trend: 5 — A2A protocol, autonomous agents, multi-repo orchestration | Protocols: 3 — 7 phases defined, Phase 0 complete |
| 5 | Conversion: 3 — internal infra that unblocks all ventures, indirect but mandatory | Players: 3 — 6 agents specified, 8 instruments named |

| Metric | Target | Now |
|---|---|---|
| Session recovery time | <30s | <10s (Phase 0 done) |
| Flow efficiency | >15% | 4.3% |
| Capabilities at L2+ | >60% | 39% (7/18) |
| Capabilities at L4 | >0 | 0 |
| Scope | Phases | What You Get |
|---|---|---|
| Phase 0 | 0 | Session boot — DONE |
| MVP | 0-1 | + drmg CLI + 8 VVFL auditors |
| V1 | 0-2 | + Issue CLI + Plans UI + priority dispatch |
| Quality | 0-3 | + receipts, hook tracker, scaffolds, maturity scorer |
| Settlement | 0-4 | + settlement verification, trust ledger, /loop |
| Learning | 0-5 | + pattern extraction, memory writing, action generation |
| API + A2A | 0-6 | + HTTP transport, Agent Cards, external mesh |

Kill date: none — mandatory infrastructure.

Kill signals:

  • If agents with memory aren't measurably faster than agents without it, the memory is noise
  • If the compiler takes >10s or injects >2000 tokens, pare down its scope
  • If engineering still edits issues-log.md manually two sprints after the CLI ships, the CLI didn't replace the workflow
  • If settlement verification adds >30s to each commit cycle, simplify its scope

Blocks: All agent instances (Sales Dev, Content Amplifier, future ventures).

Ledger

Context

Questions

What is the minimum platform surface that makes every subsequent PRD faster to build?

  • If Phase 0 eliminated re-briefing, which remaining friction is the next 80/20 win?
  • At what point does settlement verification become more valuable than building new features?
  • When an issue sits unrouted for a sprint, is that a tooling problem or a process problem?
  • How do you measure whether the quality loop (Phase 3) is preventing defects or just catching them?