Agent Platform Phase 1 Spec

What do the 8 dimensions actually measure — and what does a healthy reading look like?

The 8 Dimensions

The VVFL Dashboard instrument reads enforcement health across 8 dimensions. Each dimension is an auditor — a function that takes evidence and returns findings.

| # | Dimension | Measures | Input Source | Healthy Signal | Unhealthy Signal |
| --- | --- | --- | --- | --- | --- |
| 1 | Generator | Generated code correctness | Generated files vs schema definitions | Zero incidents in generated code | Bugs trace to generator output |
| 2 | Template | Plan template sequence enforcement | plan.json ordering, prdRef, bookends | No phases skipped, prdRef populated | Tasks out of order, empty prdRef |
| 3 | Rule | Rules followed when loaded | .claude/rules/ coverage vs incidents | Rule-covered incidents near zero | Rules exist but violations recur |
| 4 | Skill | Skills invoked when relevant | Trigger conditions vs invocation count | Invocation matches triggers | Skills exist but never invoked |
| 5 | Agent | Agents stay within boundaries | Output vs declared autonomy scope | Zero out-of-scope changes | Edits outside blast radius |
| 6 | Platform | Infrastructure prevents violations | Hook fire count vs violations shipped | Hooks catch before commit | Violations reach CI |
| 7 | Virtue | Loop improves over time | Enforcement tier distribution trend | More caught by generators, less by expertise | Expertise catches staying flat |
| 8 | Pattern | Patterns extracted and codified | Repeated incidents vs new prevention | Every 2x pattern becomes prevention | Same error class appears 3+ times |
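The claim that each dimension is "an auditor — a function that takes evidence and returns findings" can be sketched as a shared function signature. This is an illustrative shape only (the `Auditor` type and `noopAuditor` are hypothetical names, not the actual implementation):

```typescript
// Hypothetical sketch: every dimension auditor shares one signature.
// "Evidence" is whatever raw input the dimension reads (diffs, plan.json,
// incident logs); the Finding shape follows the Audit Output Schema section.

type Dimension =
  | "generator" | "template" | "rule" | "skill"
  | "agent" | "platform" | "virtue" | "pattern";

interface Finding {
  dimension: Dimension;
  severity: "info" | "warning" | "critical";
  finding: string;
  evidence: Record<string, unknown>;
}

// An auditor takes evidence and returns findings. Keeping it pure makes
// each dimension independently testable.
type Auditor<E> = (evidence: E) => Finding[];

// Trivial example auditor: no evidence, no findings.
const noopAuditor: Auditor<unknown[]> = (evidence) =>
  evidence.length === 0
    ? []
    : [{ dimension: "rule", severity: "info", finding: "evidence present", evidence: {} }];

console.log(noopAuditor([]).length);        // 0
console.log(noopAuditor([1])[0].severity);  // "info"
```

Because all eight auditors share this signature, the audit command can run them as a simple list and concatenate their findings.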

Source: A&ID Instrument Registry — VVFL Dashboard row (line 151).

Dimension Detail

1. Generator

| Field | Value |
| --- | --- |
| Input | Diff of generated files against schema definitions |
| Measurement | Count of incidents where root cause is generator output |
| Output | `{ dimension: "generator", finding: "...", evidence: { file, line } }` |
| Routing | Fix the generator, not the generated code |
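The generator dimension ties directly to Story S4 (schema enum vs generated union drift). A minimal sketch of how such a check could work, assuming a pre-parsed input shape (`EnumPair`, `task_status`, and all field names here are illustrative):

```typescript
// Illustrative sketch of a generator auditor, not the real auditors/generator.ts:
// compare a pgEnum's declared values against the generated TypeScript union
// and report drift as a critical finding.

interface EnumPair {
  enumName: string;
  pgEnumValues: string[];   // values declared in the DB schema
  tsUnionValues: string[];  // values in the generated TS union
}

interface GeneratorFinding {
  dimension: "generator";
  severity: "critical";
  finding: string;
  evidence: { enumName: string; pgEnumCount: number; tsUnionCount: number };
}

function auditGenerator(pairs: EnumPair[]): GeneratorFinding[] {
  return pairs
    .filter((p) => p.pgEnumValues.length !== p.tsUnionValues.length)
    .map((p) => ({
      dimension: "generator",
      severity: "critical",
      finding: `Enum drift: ${p.enumName} has ${p.pgEnumValues.length} schema values but ${p.tsUnionValues.length} generated union members`,
      evidence: {
        enumName: p.enumName,
        pgEnumCount: p.pgEnumValues.length,
        tsUnionCount: p.tsUnionValues.length,
      },
    }));
}

// Mirrors Story S4: 3 schema values vs 5 generated union members.
const findings = auditGenerator([{
  enumName: "task_status",
  pgEnumValues: ["open", "active", "done"],
  tsUnionValues: ["open", "active", "done", "archived", "deleted"],
}]);
console.log(findings.length); // 1
```

The routing rule applies here too: a finding like this points at the generator, never at the generated file.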

2. Template

| Field | Value |
| --- | --- |
| Input | plan.json phase ordering, prdRef field, bookend presence |
| Measurement | Plans with skipped phases, empty prdRef, missing bookends |
| Output | `{ dimension: "template", finding: "...", evidence: { file, expected, actual } }` |
| Routing | Update plan template gates in template.json |
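The prdRef check from Story S3 is the simplest of these measurements and can be sketched in a few lines. The `Plan` shape below is an assumption for illustration; only the finding fields follow the spec:

```typescript
// Sketch of a template auditor (assumed shape, not the real auditors/template.ts):
// flag plans whose prdRef is missing, per Story S3.

interface Plan {
  file: string;
  prdRef: string | null;
}

interface TemplateFinding {
  dimension: "template";
  severity: "critical";
  finding: string;
  evidence: { file: string; field: "prdRef"; expected: string; actual: null };
}

function auditTemplate(plans: Plan[]): TemplateFinding[] {
  return plans
    .filter((p) => p.prdRef === null || p.prdRef === "")
    .map((p) => ({
      dimension: "template",
      severity: "critical", // a missing demand link is never merely info
      finding: `Plan ${p.file} has no prdRef; demand link is missing`,
      evidence: { file: p.file, field: "prdRef", expected: "a PRD reference", actual: null },
    }));
}

const found = auditTemplate([{ file: "plan.json", prdRef: null }]);
console.log(found[0].severity); // "critical"
```

Note the severity: the S3 FORBIDDEN clause rules out an info-only finding for a null prdRef.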

3. Rule

| Field | Value |
| --- | --- |
| Input | .claude/rules/ directory vs incidents in rule-covered areas |
| Measurement | Incidents where a rule exists but wasn't followed |
| Output | `{ dimension: "rule", finding: "...", evidence: { rule_file, incident } }` |
| Routing | If recurring: escalate to hook. If ambiguous: rewrite rule for clarity |

4. Skill

| Field | Value |
| --- | --- |
| Input | Skill trigger conditions vs actual invocation count |
| Measurement | Situations where a skill should have fired but didn't |
| Output | `{ dimension: "skill", finding: "...", evidence: { skill_name, trigger, missed } }` |
| Routing | Improve trigger visibility or convert to hook |

5. Agent

| Field | Value |
| --- | --- |
| Input | Agent output files vs declared autonomy scope in agent definition |
| Measurement | Edits to files outside the agent's declared blast radius |
| Output | `{ dimension: "agent", finding: "...", evidence: { agent, file, scope } }` |
| Routing | Tighten agent definition or expand scope with justification |

6. Platform

| Field | Value |
| --- | --- |
| Input | Hook fire count vs violations that reached commit or CI |
| Measurement | Ratio of caught-at-hook vs escaped-to-CI |
| Output | `{ dimension: "platform", finding: "...", evidence: { hook, violation } }` |
| Routing | Add or fix hook. Every CI-caught violation = missing hook |
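Story S1 requires the platform auditor to read real hook failure data rather than return stubs. A sketch of that step, assuming failure-log.jsonl entries have already been parsed into objects (the `HookFailure` fields are illustrative):

```typescript
// Sketch of a platform auditor (assumed shape, not the real auditors/platform.ts):
// surface each parsed entry from .claude/hooks/failure-log.jsonl as a finding,
// per Story S1. In the real test, the log file is seeded and read from disk.

interface HookFailure {
  hook: string;
  violation: string;
}

interface PlatformFinding {
  dimension: "platform";
  severity: "warning";
  finding: string;
  evidence: { source: string; hook: string; violation: string };
}

function auditPlatform(entries: HookFailure[]): PlatformFinding[] {
  return entries.map((e) => ({
    dimension: "platform",
    severity: "warning",
    finding: `Hook ${e.hook} logged a failure: ${e.violation}`,
    // evidence.source must reference the log, per the S1 assertion
    evidence: { source: ".claude/hooks/failure-log.jsonl", hook: e.hook, violation: e.violation },
  }));
}

const platformFindings = auditPlatform([
  { hook: "pre-commit-lint", violation: "unformatted file" },
]);
console.log(platformFindings.length); // 1
```

The S1 FORBIDDEN clause targets exactly the failure mode this sketch avoids: a stub that returns hardcoded findings without ever touching the log.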

7. Virtue

| Field | Value |
| --- | --- |
| Input | Historical enforcement tier distribution over time |
| Measurement | Trend: are more incidents caught by higher tiers (generator, template) vs lower (expertise)? |
| Output | `{ dimension: "virtue", finding: "...", evidence: { period, distribution } }` |
| Routing | If flat: enforcement push-up isn't working. Review retrospectives |

8. Pattern

| Field | Value |
| --- | --- |
| Input | Incident history — same error class appearing more than once |
| Measurement | Count of repeated error classes without structural prevention |
| Output | `{ dimension: "pattern", finding: "...", evidence: { error_class, count, prevention } }` |
| Routing | 2x = create prevention artifact. 3x = escalate to generator |
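The 2x/3x routing rule can be sketched as a counting pass over incident history. The `Incident` shape and the "enum-drift" class name are illustrative assumptions:

```typescript
// Sketch of a pattern auditor (assumed shape, not the real auditors/pattern.ts):
// count incidents per error class and route by repetition, per the 2x/3x rule.

interface Incident {
  errorClass: string;
}

interface PatternFinding {
  dimension: "pattern";
  severity: "warning" | "critical";
  finding: string;
  evidence: { error_class: string; count: number };
  routing: { action: "create" | "escalate" };
}

function auditPattern(incidents: Incident[]): PatternFinding[] {
  const counts = new Map<string, number>();
  for (const i of incidents) {
    counts.set(i.errorClass, (counts.get(i.errorClass) ?? 0) + 1);
  }
  const out: PatternFinding[] = [];
  counts.forEach((count, errorClass) => {
    if (count < 2) return; // a single occurrence is not yet a pattern
    out.push({
      dimension: "pattern",
      severity: count >= 3 ? "critical" : "warning",
      finding: `${errorClass} occurred ${count}x without structural prevention`,
      evidence: { error_class: errorClass, count },
      // 2x = create a prevention artifact; 3x = escalate to the generator tier
      routing: { action: count >= 3 ? "escalate" : "create" },
    });
  });
  return out;
}

const patternFindings = auditPattern([
  { errorClass: "enum-drift" },
  { errorClass: "enum-drift" },
  { errorClass: "enum-drift" },
]);
console.log(patternFindings[0].routing.action); // "escalate"
```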

Audit Output Schema

Every auditor produces findings in this shape. Forward-compatible with Phase 3 receipts.

{
  "dimension": "generator|template|rule|skill|agent|platform|virtue|pattern",
  "severity": "info|warning|critical",
  "finding": "Human-readable description",
  "evidence": {
    "file": "path/to/file",
    "line": 42,
    "expected": "what should be there",
    "actual": "what is there"
  },
  "routing": {
    "action": "fix|escalate|create",
    "target": "path/to/artifact",
    "owner": "role or team"
  },
  "gap_type": "gate-bypass|template-bloat|sequence-violation|interface-drift|demand-absence"
}
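The schema can be mirrored as TypeScript types for compile-time checking across auditors. This is a sketch; the field names follow the JSON above, but the type names, the `isFinding` helper, and the sample values are assumptions:

```typescript
// Sketch: the Audit Output Schema as TypeScript types. Field names mirror
// the JSON schema; nothing here is the actual implementation.

type Dimension =
  | "generator" | "template" | "rule" | "skill"
  | "agent" | "platform" | "virtue" | "pattern";

type GapType =
  | "gate-bypass" | "template-bloat" | "sequence-violation"
  | "interface-drift" | "demand-absence";

interface AuditFinding {
  dimension: Dimension;
  severity: "info" | "warning" | "critical";
  finding: string;
  evidence: {
    file?: string;
    line?: number;
    expected?: string;
    actual?: string;
    [key: string]: unknown; // dimensions attach extra evidence fields
  };
  routing: {
    action: "fix" | "escalate" | "create";
    target: string; // must resolve to a real artifact (Story S2)
    owner: string;  // must match a known team (Story S2)
  };
  gap_type?: GapType;
}

// Minimal structural check, in the spirit of the S2 FORBIDDEN clause:
// reject empty routing targets and unknown owners.
function isFinding(x: AuditFinding): boolean {
  return x.finding.length > 0 && x.routing.target.length > 0 && x.routing.owner !== "unknown";
}

const sample: AuditFinding = {
  dimension: "template",
  severity: "critical",
  finding: "plan.json missing prdRef",
  evidence: { file: "plan.json", expected: "a PRD reference", actual: "null" },
  routing: { action: "fix", target: "template.json", owner: "platform" },
  gap_type: "demand-absence",
};
console.log(isFinding(sample)); // true
```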

Gaps to Dimensions

The five engineering gaps map to dimensions that catch them.

| Gap Type | Primary Dimension | Secondary Dimension | Detection Method |
| --- | --- | --- | --- |
| Gate bypass | Template | Rule | Empty prdRef, missing bookends in plan.json |
| Template bloat | Generator | Template | Mechanical tasks consuming plan slots |
| Sequence violation | Generator | Template | E2E tests before UI, retrofitted testids |
| Interface drift | Generator | Pattern | Enum count mismatch across definition sites |
| Demand absence | Rule | Template | Plan created without prdRef or Tight Five ref |
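Because the mapping is static, it can live as a lookup table so an audit run routes each detected gap to its primary and secondary auditors. A sketch (the constant name `GAP_ROUTING` is an assumption):

```typescript
// Sketch: the gap-to-dimension mapping as a lookup table.

type GapType =
  | "gate-bypass" | "template-bloat" | "sequence-violation"
  | "interface-drift" | "demand-absence";

type Dimension =
  | "generator" | "template" | "rule" | "skill"
  | "agent" | "platform" | "virtue" | "pattern";

const GAP_ROUTING: Record<GapType, { primary: Dimension; secondary: Dimension }> = {
  "gate-bypass":        { primary: "template",  secondary: "rule" },
  "template-bloat":     { primary: "generator", secondary: "template" },
  "sequence-violation": { primary: "generator", secondary: "template" },
  "interface-drift":    { primary: "generator", secondary: "pattern" },
  "demand-absence":     { primary: "rule",      secondary: "template" },
};

console.log(GAP_ROUTING["interface-drift"].primary); // "generator"
```

Using `Record<GapType, ...>` means adding a sixth gap type without a routing entry becomes a compile error rather than a silent hole.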

Story Contract

Stories are test contracts. Each row is converted to ≥1 test file by engineering. Tests must be RED before implementation starts. Tests going GREEN = value delivered.

| # | WHEN (Trigger + Precondition) | THEN (Exact Assertion — names data source, field, threshold) | ARTIFACT (Test File) | Test Type | FORBIDDEN (Must not happen) | OUTCOME (Value Proven) |
| --- | --- | --- | --- | --- | --- | --- |
| S1 | `drmg audit --dry-run` runs against a repo where `.claude/hooks/failure-log.jsonl` contains ≥1 failure entry | Output JSON has `findings[]` where ≥1 entry has `dimension: "platform"` AND `evidence.source` references failure-log.jsonl AND severity is warning or critical | `drmg/__tests__/story-s1-platform.spec.ts` | integration | A stub returning hardcoded findings passes — test must seed a real failure-log.jsonl and read it | Platform auditor reads actual hook failure data, not empty stubs |
| S2 | An audit finding is produced with a routing field | `finding.routing.target` resolves to an existing file path on disk AND `finding.routing.owner` matches a value in the `AssignedTeam` enum | `drmg/__tests__/story-s2-routing.spec.ts` | unit | Finding with `routing.target: ""` or `routing.owner: "unknown"` is accepted as valid | Findings route to real artifacts, not phantom paths |
| S3 | A plan record exists in DB with `prdRef: null` | `auditTemplate([plan])` returns ≥1 finding where `severity: "critical"` AND `evidence.field: "prdRef"` AND `evidence.actual: null` | `drmg/__tests__/story-s3-template.spec.ts` | integration | A plan with null prdRef produces no finding, or produces an info finding only | Template auditor enforces prdRef is populated before plan runs |
| S4 | A pgEnum in the schema has 3 values; the corresponding TypeScript union in generated types has 5 values | `auditGenerator(input)` returns ≥1 finding where `evidence.pgEnumCount: 3` AND `evidence.tsUnionCount: 5` AND `evidence.enumName` names the specific enum | `drmg/__tests__/story-s4-generator.spec.ts` | unit | Mismatch exists but auditor returns empty `findings[]` | Generator auditor detects drift between schema and generated types |
| S5 | 3 prior audit result records exist in DB with increasing generator tier catches across runs 1→2→3 | `auditVirtue(runs)` returns finding where `evidence.trend: "improving"` AND `evidence.periods` contains data from all 3 runs | `drmg/__tests__/story-s5-virtue.spec.ts` | integration | Virtue auditor runs with 0 or 1 run in DB and returns a non-empty finding (must require ≥3 runs) | Virtue auditor proves the enforcement loop is improving over time |

Build Contract

Success Test = the test in the Story Contract that goes GREEN when this row is done. Safety Test = what must NOT happen.

| # | ID | Function | Artifact | Success Test (Story ref + specific assertion) | Safety Test (Forbidden Outcome) | Value | State |
| --- | --- | --- | --- | --- | --- | --- | --- |
| 1 | AGNT-001 | Shared DB context (reuse plan-cli pattern) | `db-context.ts` | `createDbContext()` returns object with typed `.db` property — getDiagnostics: 0 errors | Each drmg module defines its own DB connection | No per-module DB setup | Gap |
| 2 | AGNT-001 | Thin router dispatches to handlers | `drmg.ts` entry point | `drmg audit` invokes `commands/audit.ts`; `drmg plan` invokes `commands/plan.ts` — verified via integration test with mock handlers | Unknown subcommand silently does nothing instead of printing help + exit 1 | One CLI, many commands | Gap |
| 3 | AGNT-002 | Generator auditor | `auditors/generator.ts` | S4: seed pgEnum=3 values, TS union=5 values → `auditGenerator()` returns finding with `evidence.pgEnumCount: 3`, `tsUnionCount: 5` | Returns empty findings when enum mismatch exists | Catch generator bugs at source | Gap |
| 4 | AGNT-002 | Template auditor | `auditors/template.ts` | S3: plan with `prdRef: null` → `auditTemplate()` returns finding with `severity: "critical"`, `evidence.field: "prdRef"` | Plan with null prdRef produces no finding or produces only info severity | Enforce plan discipline | Gap |
| 5 | AGNT-002 | Rule auditor | `auditors/rule.ts` | Incident matching a rule in `.claude/rules/` → `auditRules()` returns finding with `evidence.rule_file` naming the covering rule | Incident in rule-covered area produces no finding | Measure rule effectiveness | Gap |
| 6 | AGNT-002 | Skill auditor | `auditors/skill.ts` | Session with trigger condition matched but skill not invoked → `auditSkills()` returns finding with `evidence.trigger` named | Returns empty findings when `totalSessions: 0` — must produce info-level data gap | Know what skills aren't pulling weight | Gap |
| 7 | AGNT-002 | Agent auditor | `auditors/agent.ts` | Changed file outside declared blast radius → `auditAgent()` returns finding with `evidence.file` and `evidence.scope` both named | Changed files that violate scope produce no finding | Enforce agent boundaries | Gap |
| 8 | AGNT-002 | Platform auditor | `auditors/platform.ts` | S1: seed failure-log.jsonl with ≥1 entry → `auditPlatform()` returns finding where `evidence.source` references failure-log.jsonl | Returns empty findings when failure-log.jsonl has entries — stub not wired | Every CI failure = missing hook | Gap |
| 9 | AGNT-002 | Virtue auditor | `auditors/virtue.ts` | S5: 3 audit runs in DB with improving generator tier → `auditVirtue()` returns finding with `evidence.trend: "improving"` | Runs with <3 records produce a non-empty finding (must require ≥3 runs) | Prove the loop improves | Gap |
| 10 | AGNT-002 | Pattern auditor | `auditors/pattern.ts` | 3 incidents of same error class → `auditPattern()` returns finding with `severity: "critical"` and `gap_type: "gate-bypass"` | Same error class at 3x count produces no finding, or produces `gap_type: "interface-drift"` | No class recurs without structure | Gap |
| 11 | AGNT-002 | Audit command with --dry-run | `commands/audit.ts` | S1+S3+S4: with seeded test data, `drmg audit --dry-run` outputs JSON with findings from platform, template, and generator dimensions | Outputs valid JSON with zero findings when test data is seeded | One command, full health picture | Gap |
| 12 | PLAT-005 | DB-native plan template tables | schema migration + seed | `SELECT COUNT(*) FROM planning_task_templates WHERE best_pattern_prompt IS NULL` = 0 AND `SELECT COUNT(*) FROM planning_plan_templates` = 33 | Template created with `best_pattern_prompt: null` is accepted by DB insert | Schema enforces prompt quality — AGNT-007 writes to DB not JSON | Gap |
| 13 | PLAT-005 | plan-cli.ts create reads from DB | `plan-cli.ts` | `plan-cli.ts create --template=a2a-api-intent-validation --dry-run` succeeds with no JSON file on disk — tasks include bestPatternPrompt from DB | Falls back to JSON when DB record exists | One source of truth for templates | Gap |

Context

Questions

  • When the virtue dimension shows a flat trend, is the problem the retrospectives or the routing?
  • If a generator auditor finds zero issues, does that mean the generator is perfect — or that the auditor's input source is wrong?
  • At what point does an 8-dimension audit become overhead rather than prevention?
  • Which dimension catches the most findings in the first 5 runs — and does that reveal the weakest enforcement tier?