L2 inner-loop

Agent Platform

When agents need identity, memory, scaffold generators, and boundary enforcement to operate safely and improve autonomously — the PUMP that powers the factory.

Priority Score: 1,200

Pain × Demand × Edge × Trend × Conversion

Customer Journey

Why should I care?

Five cards that sell the dream

1. Why

Agents repeat, humans correct.

What's the cost of agents that can't remember?

The friction: The platform has 67+ CLI commands, 542+ tests, and 8 auditor dimensions. Yet agents create files in the wrong directories, repeat the same mistakes across sessions, and can't recall what worked before.

The desire: Agents that know who they are, what they can touch, and what they learned last time. Identity + memory + boundaries = trust.

The proof: Every inner-loop PRD depends on agent quality. The factory runs on agents. If agents don't improve, the factory doesn't improve.

The stories behind agent friction

2. Evidence

Five issues, one page.

If the dashboard lies, can you trust the system?

The friction: The Plans dashboard math is wrong. The active count includes completed plans. There is no project grouping and no drill-down. Five issues are filed (#26-30). The surface doesn't reflect reality.

The desire: Dashboard matches plan-cli exactly. Click to drill down. Group by project. See which projects are progressing, which are stalled.

The proof: plan-cli dashboard returns correct data. The UI just needs to match what the CLI already knows.
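The counting problem described above can be sketched in a few lines. This is only an illustration: the `Plan` shape, the status values, and both function names are assumptions, since the source does not show plan-cli's data model.

```typescript
// Hypothetical plan record; the real plan-cli schema is not shown in the source.
type PlanStatus = "active" | "completed" | "stalled";

interface Plan {
  id: string;
  project: string;
  status: PlanStatus;
}

interface ProjectSummary {
  active: number;
  completed: number;
  stalled: number;
}

// Count only plans whose status is exactly "active", so completed plans
// can never inflate the active total.
function countActive(plans: Plan[]): number {
  return plans.filter((p) => p.status === "active").length;
}

// Group plans by project and tally each status separately.
function summarizeByProject(plans: Plan[]): Map<string, ProjectSummary> {
  const summary = new Map<string, ProjectSummary>();
  for (const plan of plans) {
    const row = summary.get(plan.project) ?? { active: 0, completed: 0, stalled: 0 };
    row[plan.status] += 1;
    summary.set(plan.project, row);
  }
  return summary;
}
```

If the UI and the CLI both derive their numbers from one function like `countActive`, the Phase 1 acceptance check reduces to comparing two outputs of the same query.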

The outcome map

3. Platform

Scaffold exists, CLI doesn't.

How many times do you copy-paste before you automate?

The friction: scaffold-generators.ts has the functions. No CLI surface. Creating a new skill: copy existing, rename, update 6 fields, fix imports. 15 minutes. Third time this week.

The desire: drmg scaffold skill my-skill creates a valid file. Content-type registry: skill, hook, agent, command, rule. Each with its own template.

The proof: The functions are built. The CLI binary exists. The wiring is one integration, not a new architecture.
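A content-type registry of the kind described above might look like the following sketch. The five type names come from the text; everything else (the `Template` signature, the output paths, the file contents, and the `scaffold` function) is hypothetical and stands in for whatever scaffold-generators.ts actually exports.

```typescript
// The five content types named in the text.
type ContentType = "skill" | "hook" | "agent" | "command" | "rule";

// Hypothetical template result: where the file goes and what it contains.
interface ScaffoldResult {
  path: string;
  contents: string;
}

type Template = (name: string) => ScaffoldResult;

// Placeholder templates. Paths and contents are invented for illustration;
// real templates would delegate to the existing generator functions.
const registry: Record<ContentType, Template> = {
  skill: (name) => ({ path: `skills/${name}.md`, contents: `# Skill: ${name}\n` }),
  hook: (name) => ({ path: `hooks/${name}.sh`, contents: `#!/bin/sh\n# Hook: ${name}\n` }),
  agent: (name) => ({ path: `agents/${name}.md`, contents: `# Agent: ${name}\n` }),
  command: (name) => ({ path: `commands/${name}.md`, contents: `# Command: ${name}\n` }),
  rule: (name) => ({ path: `rules/${name}.md`, contents: `# Rule: ${name}\n` }),
};

// Type guard so that an arbitrary CLI argument can be narrowed safely.
function isContentType(value: string): value is ContentType {
  return Object.prototype.hasOwnProperty.call(registry, value);
}

// Validate the arguments, then look up and run the matching template.
function scaffold(type: string, name: string): ScaffoldResult {
  if (!isContentType(type)) {
    throw new Error(`Unknown content type: ${type}`);
  }
  if (!/^[a-z0-9][a-z0-9-]*$/.test(name)) {
    throw new Error(`Invalid name: ${name}`);
  }
  return registry[type](name);
}
```

Under these assumptions, a CLI handler for `drmg scaffold skill my-skill` would call `scaffold("skill", "my-skill")` and write the returned contents to the returned path.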

The dependency map

4. Loop

Three phases, five features.

Can you fix the surface before you automate the depth?

The friction:
Phase 1 (Surface): Plans UI math is wrong, with no grouping and no drill-down.
Phase 2 (Prevention): scaffolds exist but have no CLI; boundaries are implicit, not declared.
Phase 3 (Learning): patterns are detected but don't compound; the memory schema exists but has no pipeline.

The desire: Fix what you can see first. Then prevent what you can predict. Then learn from what you couldn't predict.

The proof: Phase 1 acceptance: dashboard math matches CLI output. If it doesn't match, the query is wrong; fix it before expanding scope.

The build order

5. People

Agents declare, hooks enforce.

What if violations became the training data for better boundaries?

The friction: Agent edits a file outside its responsibility. Dream-team agent writes to engineering repo. No warning. Discovered days later. No log, no pattern.

The desire: Per-agent scope.json declares allowed paths. Pre-edit hook checks scope. Human always overrides. Violations compound into rules via pattern extractor.

The proof: src-post-edit.sh already enforces design system constraints. Same pattern, scoped per agent instead of per file type.
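The scope check a pre-edit hook would run can be sketched as a pure function. The source names scope.json but not its fields, so the `AgentScope` shape, the prefix-matching rule, and the `humanOverride` flag are all assumptions made for illustration.

```typescript
// Hypothetical shape of a per-agent scope.json.
interface AgentScope {
  agent: string;
  allowedPaths: string[]; // directory prefixes relative to the repo root, e.g. "docs/"
}

interface ScopeDecision {
  allowed: boolean;
  reason: string;
}

// Resolve "." and ".." segments so "docs/../engineering" cannot slip past a
// prefix check. Returns null if the path climbs above the root.
function normalize(path: string): string | null {
  const out: string[] = [];
  for (const part of path.split("/")) {
    if (part === "" || part === ".") continue;
    if (part === "..") {
      if (out.length === 0) return null;
      out.pop();
      continue;
    }
    out.push(part);
  }
  return out.join("/");
}

// Decide whether an agent may edit a file. A human override always wins.
function checkEdit(scope: AgentScope, filePath: string, humanOverride = false): ScopeDecision {
  if (humanOverride) return { allowed: true, reason: "human override" };
  const target = normalize(filePath);
  if (target === null) return { allowed: false, reason: "path escapes repository root" };
  for (const prefix of scope.allowedPaths) {
    const base = normalize(prefix);
    if (base === null) continue;
    // An empty base means the whole repository is in scope.
    if (base === "" || target === base || target.startsWith(base + "/")) {
      return { allowed: true, reason: `within ${prefix}` };
    }
  }
  return { allowed: false, reason: `${scope.agent} has no declared scope for ${filePath}` };
}
```

Matching on whole path segments (`base + "/"`) rather than raw string prefixes keeps a scope of `docs/` from accidentally covering `docs-old/`. Each denied decision is also the natural record to log for later pattern extraction.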

The boundary stories

Same five positions. Different seat. The operator asks "can I trust the dashboard?" The agent asks "what am I allowed to touch?"

Feature Dev Journey

How do we build this?

Five cards that sell the process

1. Job

Fix the surface first.

What already exists that just needs wiring?

11 build rows across 3 jobs. Plans UI fix (3 rows). Scaffold CLI + boundaries (5 rows). Pattern extraction + memory (3 rows). ~60% wiring, ~40% new.

The build order

Situation

67+ CLI commands, 8 auditor dimensions, 542+ tests. But agents create files in wrong directories, repeat mistakes across sessions, and can't recall what worked before. Plans dashboard exists but math is wrong. Scaffold generators exist as functions but have no CLI surface.

Intention

One platform where agents have identity (who am I), memory (what do I know), scaffolds (how do I create), and boundaries (what can't I touch). The PUMP that powers every other inner-loop PRD.

Obstacle

Agent capability is scattered across 4 repos, 7 skill files, and 3 database schemas. No unified surface. The boundary between 'agent can do this' and 'agent must not do this' is implicit, not declared.

Hardest Thing

Agent boundaries that are too tight prevent useful work. Too loose and agents break things. The boundary must be declared per-agent, enforced by hooks, and learnable from patterns.
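One way boundaries could become "learnable from patterns" is to treat repeated human overrides as evidence that a scope is too tight. The sketch below assumes a violation log exists; the `Violation` record, the top-level-directory grouping, and the threshold of three are all hypothetical choices, not something the source specifies.

```typescript
// Hypothetical violation log entry; the source describes no log format.
interface Violation {
  agent: string;
  path: string;
  overridden: boolean; // true when a human approved the blocked edit anyway
}

interface ScopeProposal {
  agent: string;
  directory: string;
  occurrences: number;
}

// First path segment, used here as a coarse grouping key.
function topDirectory(path: string): string {
  const idx = path.indexOf("/");
  return idx === -1 ? path : path.slice(0, idx);
}

// Propose widening an agent's scope wherever humans overrode the same
// agent/directory block at least `threshold` times.
function proposeScopeChanges(log: Violation[], threshold = 3): ScopeProposal[] {
  const counts = new Map<string, ScopeProposal>();
  for (const v of log) {
    if (!v.overridden) continue; // blocks that held are working as intended
    const directory = topDirectory(v.path);
    const key = `${v.agent}\u0000${directory}`;
    const entry = counts.get(key) ?? { agent: v.agent, directory, occurrences: 0 };
    entry.occurrences += 1;
    counts.set(key, entry);
  }
  return Array.from(counts.values()).filter((p) => p.occurrences >= threshold);
}
```

The output is a list of proposals for a human to review, which keeps the loop consistent with the rule that the human always has the final say.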

Priority (5P)

Pain: 5/5
Demand: 4/5
Edge: 4/5
Trend: 5/5
Convert: 3/5
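The Priority Score of 1,200 is the product of these five ratings, following the Pain × Demand × Edge × Trend × Conversion formula. A minimal sketch of the arithmetic, with the interface and function names chosen only for illustration:

```typescript
// One rating per 5P dimension, each on a 1 to 5 scale.
interface PriorityScores {
  pain: number;
  demand: number;
  edge: number;
  trend: number;
  conversion: number;
}

// Multiply the five ratings together.
function priorityScore(s: PriorityScores): number {
  return s.pain * s.demand * s.edge * s.trend * s.conversion;
}

// Ratings as listed in the Priority (5P) section.
const agentPlatform: PriorityScores = { pain: 5, demand: 4, edge: 4, trend: 5, conversion: 3 };

const score = priorityScore(agentPlatform); // 5 * 4 * 4 * 5 * 3 = 1200
const maxScore = Math.pow(5, 5); // 3125, when every dimension rates 5/5
```

Because the score is a product rather than a sum, one weak dimension pulls the total down sharply: lifting Convert from 3 to 5 alone would raise the score from 1,200 to 2,000.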

Readiness (5R)

Principles: 4/5
Performance: 2/5
Platform: 4/5
Process: 3/5
Players: 3/5

What Exists

Component	State	Gap
Plans dashboard UI	Stub	Page exists at /plans. Math wrong (5 issues). No drill-down. No project grouping.
Scaffold generator functions	Working	Functions exist in scaffold-generators.ts. Not wired to drmg CLI. No content-type registry.
Agent boundary hooks	Stub	One proof-of-concept (src-post-edit.sh). No per-agent scope declarations.
Virtue auditor (pattern tracking)	Working	8 dimensions track trends. No cross-run extraction. No prevention proposals.
Agent memory DB schema	Working	agent_memory_stores table with vector column exists. No write pipeline. No recall query.
DRMG CLI (67+ commands)	Working	Unified binary works. Scaffold namespace not yet added.
Agent config (.claude/agents/)	Working	Agent definitions exist. No scope declarations per agent.
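The missing recall query for agent memory could take the shape sketched below. It works on in-memory rows rather than the real agent_memory_stores table, and it assumes the vector column holds embeddings compared by cosine similarity. The `MemoryRow` fields and the per-agent filter are illustrative assumptions, not a decision on the per-agent versus shared question raised later.

```typescript
// Hypothetical in-memory stand-in for a row of agent_memory_stores.
interface MemoryRow {
  agent: string;
  content: string;
  embedding: number[];
}

// Cosine similarity between two equal-length vectors; 0 if either is all zeros.
function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length) throw new Error("Embedding dimensions differ");
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  if (normA === 0 || normB === 0) return 0;
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Return the k memories of one agent that are most similar to the query vector.
function recall(rows: MemoryRow[], agent: string, query: number[], k = 3): MemoryRow[] {
  return rows
    .filter((r) => r.agent === agent)
    .map((row) => ({ row, score: cosineSimilarity(row.embedding, query) }))
    .sort((x, y) => y.score - x.score)
    .slice(0, k)
    .map((x) => x.row);
}
```

In a real pipeline the ranking would normally happen in the database; the sketch only shows the contract a recall query needs to satisfy.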

Kill Signal

Boundary hooks block >30% of legitimate agent actions after 30 days. Agent task completion rate drops below current baseline.
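The kill signal can be made checkable with a small function. The sketch assumes that either condition alone triggers the signal and that someone labels blocked actions as legitimate or not. The source states neither point, and the `BoundaryStats` fields are hypothetical.

```typescript
// Hypothetical rollup of boundary-hook metrics.
interface BoundaryStats {
  daysSinceRollout: number;
  legitimateActions: number; // actions judged legitimate, blocked or not
  legitimateActionsBlocked: number;
  completionRate: number; // current task completion rate, 0 to 1
  baselineCompletionRate: number; // rate measured before boundaries shipped
}

interface KillSignal {
  triggered: boolean;
  reasons: string[];
}

// Evaluate both kill conditions; either one is treated as sufficient here.
function evaluateKillSignal(stats: BoundaryStats): KillSignal {
  const reasons: string[] = [];
  const blockRate =
    stats.legitimateActions > 0 ? stats.legitimateActionsBlocked / stats.legitimateActions : 0;

  // Condition 1: more than 30% of legitimate actions blocked, checked from day 30 on.
  if (stats.daysSinceRollout >= 30 && blockRate > 0.3) {
    reasons.push(`block rate ${(blockRate * 100).toFixed(1)}% exceeds 30%`);
  }
  // Condition 2: completion rate has fallen below the pre-rollout baseline.
  if (stats.completionRate < stats.baselineCompletionRate) {
    reasons.push("task completion rate is below baseline");
  }
  return { triggered: reasons.length > 0, reasons };
}
```

Recording the baseline completion rate before the hooks ship is what makes the second condition measurable at all.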

Questions

Where does automation end and human judgment begin for agent boundaries?

If agents can extract their own patterns, will they converge on the same rules humans would write?
Should memory be per-agent or shared across all agents in a session?
At what point does scaffold templating become over-engineering, and when should an agent just write the file directly?
Can boundary violations be the training signal for better scope declarations?