

Agent Platform Phase 1 Spec

What do the 8 dimensions actually measure — and what does a healthy reading look like?

The 8 Dimensions

The VVFL Dashboard instrument reads enforcement health across 8 dimensions. Each dimension is an auditor — a function that takes evidence and returns findings.

| # | Dimension | Measures | Input Source | Healthy Signal | Unhealthy Signal |
| --- | --- | --- | --- | --- | --- |
| 1 | Generator | Generated code correctness | Generated files vs schema definitions | Zero incidents in generated code | Bugs trace to generator output |
| 2 | Template | Plan template sequence enforcement | plan.json ordering, prdRef, bookends | No phases skipped, prdRef populated | Tasks out of order, empty prdRef |
| 3 | Rule | Rules followed when loaded | .claude/rules/ coverage vs incidents | Rule-covered incidents near zero | Rules exist but violations recur |
| 4 | Skill | Skills invoked when relevant | Trigger conditions vs invocation count | Invocation matches triggers | Skills exist but never invoked |
| 5 | Agent | Agents stay within boundaries | Output vs declared autonomy scope | Zero out-of-scope changes | Edits outside blast radius |
| 6 | Platform | Infrastructure prevents violations | Hook fire count vs violations shipped | Hooks catch before commit | Violations reach CI |
| 7 | Virtue | Loop improves over time | Enforcement tier distribution trend | More caught by generators, less by expertise | Expertise catches staying flat |
| 8 | Pattern | Patterns extracted and codified | Repeated incidents vs new prevention | Every 2x pattern becomes prevention | Same error class appears 3+ times |

Source: A&ID Instrument Registry — VVFL Dashboard row (line 151).

Dimension Detail

1. Generator

| Field | Value |
| --- | --- |
| Input | Diff of generated files against schema definitions |
| Measurement | Count of incidents where root cause is generator output |
| Output | `{ dimension: "generator", finding: "...", evidence: { file, line } }` |
| Routing | Fix the generator, not the generated code |
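A minimal sketch of what this auditor could look like, assuming enum definitions have already been extracted into a simple input shape. The `EnumSite` type and field names here are illustrative assumptions, not the real drmg API:

```typescript
// Hypothetical input shape: one entry per enum, with its values at each definition site.
interface EnumSite {
  enumName: string;
  pgEnumValues: string[];  // values declared in the database schema enum
  tsUnionValues: string[]; // values in the generated TypeScript union
}

interface GeneratorFinding {
  dimension: "generator";
  severity: "critical";
  finding: string;
  evidence: { enumName: string; pgEnumCount: number; tsUnionCount: number };
}

// Flag every enum whose generated union has drifted from the schema definition.
function auditGenerator(sites: EnumSite[]): GeneratorFinding[] {
  return sites
    .filter((s) => s.pgEnumValues.length !== s.tsUnionValues.length)
    .map((s): GeneratorFinding => ({
      dimension: "generator",
      severity: "critical",
      finding: `Enum drift: ${s.enumName} has ${s.pgEnumValues.length} schema values but ${s.tsUnionValues.length} TS union members`,
      evidence: {
        enumName: s.enumName,
        pgEnumCount: s.pgEnumValues.length,
        tsUnionCount: s.tsUnionValues.length,
      },
    }));
}
```

The point of the shape is the routing rule in the table: the finding names the enum and both counts, so the fix lands on the generator, never on the generated file.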

2. Template

| Field | Value |
| --- | --- |
| Input | plan.json phase ordering, prdRef field, bookend presence |
| Measurement | Plans with skipped phases, empty prdRef, missing bookends |
| Output | `{ dimension: "template", finding: "...", evidence: { file, expected, actual } }` |
| Routing | Update plan template gates in template.json |

3. Rule

| Field | Value |
| --- | --- |
| Input | .claude/rules/ directory vs incidents in rule-covered areas |
| Measurement | Incidents where a rule exists but wasn't followed |
| Output | `{ dimension: "rule", finding: "...", evidence: { rule_file, incident } }` |
| Routing | If recurring: escalate to hook. If ambiguous: rewrite rule for clarity |

4. Skill

| Field | Value |
| --- | --- |
| Input | Skill trigger conditions vs actual invocation count |
| Measurement | Situations where a skill should have fired but didn't |
| Output | `{ dimension: "skill", finding: "...", evidence: { skill_name, trigger, missed } }` |
| Routing | Improve trigger visibility or convert to hook |

5. Agent

| Field | Value |
| --- | --- |
| Input | Agent output files vs declared autonomy scope in agent definition |
| Measurement | Edits to files outside the agent's declared blast radius |
| Output | `{ dimension: "agent", finding: "...", evidence: { agent, file, scope } }` |
| Routing | Tighten agent definition or expand scope with justification |

6. Platform

| Field | Value |
| --- | --- |
| Input | Hook fire count vs violations that reached commit or CI |
| Measurement | Ratio of caught-at-hook vs escaped-to-CI |
| Output | `{ dimension: "platform", finding: "...", evidence: { hook, violation } }` |
| Routing | Add or fix hook. Every CI-caught violation = missing hook |
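The core of this auditor might look like the sketch below, assuming the failure log is JSONL with `hook` and `violation` fields per entry (the entry shape is an assumption; the real log format may differ). Taking the log text as a parameter rather than reading the file directly keeps the function easy to test:

```typescript
// Hypothetical entry shape for .claude/hooks/failure-log.jsonl.
interface HookFailure {
  hook: string;
  violation: string;
}

interface PlatformFinding {
  dimension: "platform";
  severity: "warning" | "critical";
  finding: string;
  evidence: { source: string; hook: string; violation: string };
}

// Parse a JSONL failure log and emit one finding per recorded failure.
function auditPlatform(
  jsonlText: string,
  source = ".claude/hooks/failure-log.jsonl",
): PlatformFinding[] {
  return jsonlText
    .split("\n")
    .filter((line) => line.trim().length > 0)
    .map((line) => JSON.parse(line) as HookFailure)
    .map((entry): PlatformFinding => ({
      dimension: "platform",
      severity: "warning",
      finding: `Hook ${entry.hook} recorded a failure: ${entry.violation}`,
      evidence: { source, hook: entry.hook, violation: entry.violation },
    }));
}
```

Because `evidence.source` carries the log path, a stub that never reads the log cannot fake this output in the S1 story test.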

7. Virtue

| Field | Value |
| --- | --- |
| Input | Historical enforcement tier distribution over time |
| Measurement | Trend: are more incidents caught by higher tiers (generator, template) vs lower (expertise)? |
| Output | `{ dimension: "virtue", finding: "...", evidence: { period, distribution } }` |
| Routing | If flat: enforcement push-up isn't working. Review retrospectives |
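The trend classification could be sketched like this, under the assumption that each prior run is summarized as a count of generator-tier catches (the `RunSummary` shape is hypothetical). Note the guard requiring at least three runs, matching story S5:

```typescript
// One prior audit run, summarized. Field names are assumptions for illustration.
interface RunSummary {
  period: string;
  generatorCatches: number; // incidents caught at the generator tier this run
}

type Trend = "improving" | "flat" | "declining";

// Classify the generator-tier trend; refuse to claim anything on thin history.
function virtueTrend(runs: RunSummary[]): Trend | null {
  if (runs.length < 3) return null; // not enough history to call a trend
  const catches = runs.map((r) => r.generatorCatches);
  const strictlyUp = catches.every((c, i) => i === 0 || c > catches[i - 1]);
  const strictlyDown = catches.every((c, i) => i === 0 || c < catches[i - 1]);
  if (strictlyUp) return "improving";
  if (strictlyDown) return "declining";
  return "flat";
}
```

Returning `null` below three runs means the auditor emits no finding at all, which is exactly the forbidden-outcome line in S5.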

8. Pattern

| Field | Value |
| --- | --- |
| Input | Incident history — same error class appearing more than once |
| Measurement | Count of repeated error classes without structural prevention |
| Output | `{ dimension: "pattern", finding: "...", evidence: { error_class, count, prevention } }` |
| Routing | 2x = create prevention artifact. 3x = escalate to generator |
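The 2x/3x routing rule can be sketched directly, assuming incidents carry an `errorClass` label (a hypothetical shape, not the real incident record):

```typescript
// Hypothetical incident record: only the error class matters here.
interface Incident {
  errorClass: string;
}

interface PatternFinding {
  dimension: "pattern";
  severity: "warning" | "critical";
  finding: string;
  evidence: { error_class: string; count: number };
}

// Count repeated error classes and apply the escalation ladder:
// 2x = create a prevention artifact (warning), 3x+ = escalate to generator (critical).
function auditPattern(incidents: Incident[]): PatternFinding[] {
  const counts = new Map<string, number>();
  for (const i of incidents) {
    counts.set(i.errorClass, (counts.get(i.errorClass) ?? 0) + 1);
  }
  const findings: PatternFinding[] = [];
  for (const [errorClass, count] of counts) {
    if (count < 2) continue; // a single occurrence is not yet a pattern
    findings.push({
      dimension: "pattern",
      severity: count >= 3 ? "critical" : "warning",
      finding: `${errorClass} recurred ${count}x without structural prevention`,
      evidence: { error_class: errorClass, count },
    });
  }
  return findings;
}
```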

Audit Output Schema

Every auditor produces findings in this shape. Forward-compatible with Phase 3 receipts.

```json
{
  "dimension": "generator|template|rule|skill|agent|platform|virtue|pattern",
  "severity": "info|warning|critical",
  "finding": "Human-readable description",
  "evidence": {
    "file": "path/to/file",
    "line": 42,
    "expected": "what should be there",
    "actual": "what is there"
  },
  "routing": {
    "action": "fix|escalate|create",
    "target": "path/to/artifact",
    "owner": "role or team"
  },
  "gap_type": "gate-bypass|template-bloat|sequence-violation|interface-drift|demand-absence"
}
```
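The same schema can be mirrored as a TypeScript type so every auditor is checked at compile time. This is a sketch: the type names are assumptions, and `evidence` is left open for the dimension-specific fields shown in the detail tables:

```typescript
type Dimension =
  | "generator" | "template" | "rule" | "skill"
  | "agent" | "platform" | "virtue" | "pattern";
type Severity = "info" | "warning" | "critical";
type GapType =
  | "gate-bypass" | "template-bloat" | "sequence-violation"
  | "interface-drift" | "demand-absence";

interface Finding {
  dimension: Dimension;
  severity: Severity;
  finding: string; // human-readable description
  evidence: {
    file?: string;
    line?: number;
    expected?: string;
    actual?: string;
    [key: string]: unknown; // dimension-specific evidence fields
  };
  routing: {
    action: "fix" | "escalate" | "create";
    target: string; // path to the artifact to change
    owner: string;  // role or team
  };
  gap_type?: GapType;
}

// Every auditor shares one signature: evidence in, findings out.
type Auditor<Input> = (input: Input) => Finding[];
```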

Gaps to Dimensions

The five engineering gaps map to dimensions that catch them.

| Gap Type | Primary Dimension | Secondary Dimension | Detection Method |
| --- | --- | --- | --- |
| Gate bypass | Template | Rule | Empty prdRef, missing bookends in plan.json |
| Template bloat | Generator | Template | Mechanical tasks consuming plan slots |
| Sequence violation | Generator | Template | E2E tests before UI, retrofitted testids |
| Interface drift | Generator | Pattern | Enum count mismatch across definition sites |
| Demand absence | Rule | Template | Plan created without prdRef or Tight Five ref |
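The mapping above can be kept as data so routing code and spec stay in sync. A small sketch, with the gap and dimension names taken from the table:

```typescript
type GapType =
  | "gate-bypass" | "template-bloat" | "sequence-violation"
  | "interface-drift" | "demand-absence";

// Which dimensions catch each engineering gap, per the spec table.
const gapDimensions: Record<GapType, { primary: string; secondary: string }> = {
  "gate-bypass": { primary: "template", secondary: "rule" },
  "template-bloat": { primary: "generator", secondary: "template" },
  "sequence-violation": { primary: "generator", secondary: "template" },
  "interface-drift": { primary: "generator", secondary: "pattern" },
  "demand-absence": { primary: "rule", secondary: "template" },
};
```

Keeping it as a `Record` means adding a sixth gap type without a dimension mapping becomes a compile error rather than a silent hole.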

Story Contract

Stories are test contracts. Tests must be RED before implementation starts. GREEN = value delivered.


S1 — Platform auditor reads real hook failures

Trigger: drmg audit --dry-run runs against a repo where .claude/hooks/failure-log.jsonl contains ≥1 failure entry

Checklist:

  • Output JSON has findings[] with ≥1 entry where dimension: "platform"
  • evidence.source references failure-log.jsonl (not a hardcoded stub)
  • severity is warning or critical — not info
  • Test seeds a real failure-log.jsonl and reads it — hardcoded stub does not pass

Forbidden: Stub returning hardcoded findings passes. Auditor produces findings without reading failure-log.jsonl.

Evidence: integration — drmg/__tests__/story-s1-platform.spec.ts

Commission Result: ⬜ PASS / ⬜ FAIL Notes: (findings)


S2 — Routing fields resolve to real artifacts

Trigger: An audit finding is produced with a routing field

Checklist:

  • finding.routing.target resolves to an existing file path on disk
  • finding.routing.owner matches a value in the AssignedTeam enum
  • Finding with routing.target: "" or routing.owner: "unknown" is rejected as invalid

Forbidden: Finding with empty target or "unknown" owner accepted as valid.

Evidence: unit — drmg/__tests__/story-s2-routing.spec.ts

Commission Result: ⬜ PASS / ⬜ FAIL Notes: (findings)
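The S2 checks above could be sketched as a single validator, using Node's `fs.existsSync` for the on-disk test. The `ASSIGNED_TEAMS` values here are placeholders; the real AssignedTeam enum lives in the schema:

```typescript
import { existsSync } from "node:fs";

// Placeholder values — the real AssignedTeam enum is defined elsewhere.
const ASSIGNED_TEAMS = ["platform", "agents", "tooling"] as const;

interface Routing {
  action: string;
  target: string;
  owner: string;
}

// A finding's routing is valid only if the target resolves on disk and the
// owner is a known team; empty targets and unknown owners are rejected.
function routingIsValid(routing: Routing): boolean {
  if (!routing.target || !existsSync(routing.target)) return false;
  if (!(ASSIGNED_TEAMS as readonly string[]).includes(routing.owner)) return false;
  return true;
}
```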


S3 — Template auditor flags null prdRef as critical

Trigger: A plan record exists in DB with prdRef: null

Checklist:

  • auditTemplate([plan]) returns ≥1 finding where severity: "critical"
  • evidence.field: "prdRef" named in the finding
  • evidence.actual: null present
  • Plan with null prdRef does not produce zero findings or info-only findings

Forbidden: Plan with null prdRef produces no finding, or only info severity.

Evidence: integration — drmg/__tests__/story-s3-template.spec.ts

Commission Result: ⬜ PASS / ⬜ FAIL Notes: (findings)


S4 — Generator auditor detects enum drift

Trigger: pgEnum in schema has 3 values; TypeScript union in generated types has 5 values

Checklist:

  • auditGenerator(input) returns ≥1 finding with evidence.pgEnumCount: 3
  • evidence.tsUnionCount: 5 present in the finding
  • evidence.enumName names the specific enum that drifted
  • Mismatch does not produce empty findings[]

Forbidden: Mismatch exists but auditor returns empty findings[].

Evidence: unit — drmg/__tests__/story-s4-generator.spec.ts

Commission Result: ⬜ PASS / ⬜ FAIL Notes: (findings)


S5 — Virtue auditor proves the loop is improving

Trigger: 3 prior audit result records exist in DB with increasing generator tier catches across runs 1→2→3

Checklist:

  • auditVirtue(runs) returns finding where evidence.trend: "improving"
  • evidence.periods contains data from all 3 runs
  • Auditor with 0 or 1 run in DB produces no non-empty finding — requires ≥3 runs
  • Flat or declining trend does not return "improving"

Forbidden: Virtue auditor runs with 0 or 1 run and returns a non-empty finding.

Evidence: integration — drmg/__tests__/story-s5-virtue.spec.ts

Commission Result: ⬜ PASS / ⬜ FAIL Notes: (findings)

Build Contract

Success Test = the test in the Story Contract that goes GREEN when this row is done. Safety Test = what must NOT happen.

| # | ID | Function | Artifact | Success Test (Story ref + specific assertion) | Safety Test (Forbidden Outcome) | Value | State |
| --- | --- | --- | --- | --- | --- | --- | --- |
| 1 | AGNT-001 | Shared DB context (reuse plan-cli pattern) | db-context.ts | createDbContext() returns object with typed .db property — getDiagnostics: 0 errors | Each drmg module defines its own DB connection | No per-module DB setup | Gap |
| 2 | AGNT-001 | Thin router dispatches to handlers | drmg.ts entry point | drmg audit invokes commands/audit.ts; drmg plan invokes commands/plan.ts — verified via integration test with mock handlers | Unknown subcommand silently does nothing instead of printing help + exit 1 | One CLI, many commands | Gap |
| 3 | AGNT-002 | Generator auditor | auditors/generator.ts | S4: seed pgEnum=3 values, TS union=5 values → auditGenerator() returns finding with evidence.pgEnumCount:3, tsUnionCount:5 | Returns empty findings when enum mismatch exists | Catch generator bugs at source | Gap |
| 4 | AGNT-002 | Template auditor | auditors/template.ts | S3: plan with prdRef:null → auditTemplate() returns finding with severity:"critical", evidence.field:"prdRef" | Plan with null prdRef produces no finding or produces only info severity | Enforce plan discipline | Gap |
| 5 | AGNT-002 | Rule auditor | auditors/rule.ts | Incident matching a rule in .claude/rules/ → auditRules() returns finding with evidence.rule_file naming the covering rule | Incident in rule-covered area produces no finding | Measure rule effectiveness | Gap |
| 6 | AGNT-002 | Skill auditor | auditors/skill.ts | Session with trigger condition matched but skill not invoked → auditSkills() returns finding with evidence.trigger named | Returns empty findings when totalSessions: 0 — must produce info-level data gap | Know what skills aren't pulling weight | Gap |
| 7 | AGNT-002 | Agent auditor | auditors/agent.ts | Changed file outside declared blast radius → auditAgent() returns finding with evidence.file and evidence.scope both named | Changed files that violate scope produce no finding | Enforce agent boundaries | Gap |
| 8 | AGNT-002 | Platform auditor | auditors/platform.ts | S1: seed failure-log.jsonl with ≥1 entry → auditPlatform() returns finding where evidence.source references failure-log.jsonl | Returns empty findings when failure-log.jsonl has entries — stub not wired | Every CI failure = missing hook | Gap |
| 9 | AGNT-002 | Virtue auditor | auditors/virtue.ts | S5: 3 audit runs in DB with improving generator tier → auditVirtue() returns finding with evidence.trend:"improving" | Runs with <3 records produce a non-empty finding (must require ≥3 runs) | Prove the loop improves | Gap |
| 10 | AGNT-002 | Pattern auditor | auditors/pattern.ts | 3 incidents of same error class → auditPattern() returns finding with severity:"critical" and gap_type:"gate-bypass" | Same error class at 3x count produces no finding, or produces gap_type:"interface-drift" | No class recurs without structure | Gap |
| 11 | AGNT-002 | Audit command with --dry-run | commands/audit.ts | S1+S3+S4: with seeded test data, drmg audit --dry-run outputs JSON with findings from platform, template, and generator dimensions | Outputs valid JSON with zero findings when test data is seeded | One command, full health picture | Gap |
| 12 | PLAT-005 | DB-native plan template tables | schema migration + seed | SELECT COUNT(*) FROM planning_task_templates WHERE best_pattern_prompt IS NULL = 0 AND SELECT COUNT(*) FROM planning_plan_templates = 33 | Template created with best_pattern_prompt: null is accepted by DB insert | Schema enforces prompt quality — AGNT-007 writes to DB not JSON | Gap |
| 13 | PLAT-005 | plan-cli.ts create reads from DB | plan-cli.ts | plan-cli.ts create --template=a2a-api-intent-validation --dry-run succeeds with no JSON file on disk — tasks include bestPatternPrompt from DB | Falls back to JSON when DB record exists | One source of truth for templates | Gap |

Context

Questions

  • When the virtue dimension shows a flat trend, is the problem the retrospectives or the routing?

  • If a generator auditor finds zero issues, does that mean the generator is perfect — or that the auditor's input source is wrong?
  • At what point does an 8-dimension audit become overhead rather than prevention?
  • Which dimension catches the most findings in the first 5 runs — and does that reveal the weakest enforcement tier?