← Agentic Execution

Agentic Execution Spec

What does "the right capabilities, safely cleared, fully logged" look like as a test?

Data Model

Plan
  requiredSkills: string[]        # Claude Code skill names — only these load
  requiredMcpTools: string[]      # MCP server names — only these activate
  assignedAgents: AgentRef[]      # team roster for this plan
  phases[]
    assignedAgent: string         # agent for all tasks in this phase
    tasks[]
      assignedAgent: string       # agent override for this task
      bestPatternPrompt: string   # must reference /skill-name or mcp-tool or nx-generator explicitly
      tools: string[]
      skills: string[]

Issue (DB)
  id, planId, taskId, severity (low|medium|high|critical),
  ownerTeam, description, createdAt, resolvedAt?, resolution?

Decision (DB)
  id, planId, taskId, title, reasoning, alternatives: string[], createdAt

Snapshot (WM markdown)
  plan, task, timestamp
  failingTests: { file, testName, error }[]
  techDebt: { file, what, whyDeferred }[]
  openDecisions: string[]
  nextTaskIntent: string

Story Contract

Stories are test contracts. RED before implementation. GREEN = value delivered.

#	WHEN (Trigger + Precondition)	THEN (Exact Assertion)	ARTIFACT	Test Type	FORBIDDEN	OUTCOME
S1	`plan.json` has `requiredSkills: ["content-flow", "truth-seeker"]` and plan activation runs	Capability loader resolves exactly 2 skill definitions — no other skills present in loaded context	`exec/__tests__/story-s1-scoped-load.spec.ts`	unit	Activation loads all skills in registry when `requiredSkills` is set	Only declared skills consume context
S2	`plan-cli log-issue --severity=high --task=t02 --owner=intel --description="typecheck fails on schema"` runs	DB has 1 row in `planning_issues` where `severity='high'` AND `taskId='t02'` AND `description` matches	`exec/__tests__/story-s2-log-issue.spec.ts`	integration	Issue is written to markdown file instead of DB, or accepted with `severity: null`	Issues are queryable by task and severity
S3	`plan-cli log-decision --title="NOT NULL for bestPatternPrompt" --reasoning="nullable allowed missing prompts for weeks" --alternatives="optional field,documentation only"` runs	DB has 1 row in `planning_decisions` where `title` matches AND `alternatives` is a non-empty array with 2 entries	`exec/__tests__/story-s3-log-decision.spec.ts`	integration	Decision accepted without `reasoning` or with empty `alternatives: []`	Decisions have structured reasoning — not just a title
S4	Task save-state snapshot is written to WM markdown, then `/clear` runs, then `plan-cli load-snapshot` runs	Loaded snapshot object has `failingTests[]`, `techDebt[]`, `openDecisions[]`, `nextTaskIntent` — all fields present and matching what was saved	`exec/__tests__/story-s4-snapshot.spec.ts`	unit	Snapshot missing any of the 4 fields is accepted as valid, or load-snapshot silently ignores missing fields	/clear is always safe — full context recoverable
S5	Context token usage reaches 70% of budget	`contextBudgetGauge()` returns `{ level: "warning", percent: number, recommendation: "save-state" }` AND at 85% returns `{ level: "critical", recommendation: "save-and-yield" }`	`exec/__tests__/story-s5-budget.spec.ts`	unit	Budget gauge returns `{ level: "ok" }` when percent >= 70, or only returns one level not both thresholds	Agents know when to save before context is gone

Build Contract

#	ID	Function	Artifact	Success Test	Safety Test	Value	State
1	EXEC-001	Plan capability manifest schema	`plan.json` schema extension	`plan.json` with `requiredSkills: ["x"]` passes Zod validation; plan without it also validates (backward compat)	Plan with `requiredSkills: null` is accepted as valid	Manifest is the single source of required capabilities	Gap
2	EXEC-004	Issue log CLI command	`plan-cli.ts log-issue`	S2: `log-issue` with all fields → DB row with correct severity and taskId	Missing `--severity` silently inserts `severity: null`	Issues queryable by task, severity, team	Gap
3	EXEC-005	Decision log CLI command	`plan-cli.ts log-decision`	S3: `log-decision` with reasoning + 2 alternatives → DB row with array field populated	Decision accepted without `--reasoning` flag	Decisions have structured reasoning	Gap
4	EXEC-003	Task save-state schema	`types/snapshot.ts` + `plan-cli.ts save-snapshot`	S4: snapshot written with all 4 fields → `load-snapshot` returns typed object with all fields non-null	Snapshot accepted with any fields missing	/clear is always safe	Gap
5	EXEC-002	Scoped capability loader	`plan-activation.ts`	S1: plan with `requiredSkills: ["content-flow"]` → activation resolves exactly 1 skill, not all skills	Loader silently falls back to full skill set when manifest is set	Context budget protected per plan	Gap
6	EXEC-006	Context budget gauge	`context-budget.ts`	S5: gauge at 70% → `level: "warning"`, at 85% → `level: "critical"`, with correct `recommendation` field	Gauge returns `level: "ok"` when percent ≥ 70	Agents know when to yield	Gap
7	EXEC-007	Task boundary protocol in template	`CLAUDE.md` task boundary section + plan template update	Every plan template has step 0 (load snapshot if exists) and step 99 (save snapshot + log unresolved issues)	Template accepted without bookend steps	/clear between tasks is standard, not exceptional	Gap
8	EXEC-008	Explicit skill invocation directives	`bestPatternPrompt` convention + validator	`plan-cli validate-prompts` returns 0 errors on a plan where every task's `bestPatternPrompt` contains at least one `/skill-name` or `mcp:tool-name` reference	Prompt with no skill reference passes validation	Agents invoke skills, not rediscover them	Gap

Context

PRD Index — Agentic Execution

Task Boundary Protocol (Standard Steps)

Every plan template must include these as the first and last steps in every task:

Step 0 (always first):

LOAD CONTEXT: Check for snapshot from previous task (plan-cli load-snapshot --task=<id>).
If snapshot exists: read failingTests, techDebt, openDecisions, nextTaskIntent before starting.

Step 99 (always last):

SAVE STATE: Before finishing this task:
1. plan-cli log-issue for any unresolved issues (non-zero severity)
2. plan-cli log-decision for any architectural choices made
3. plan-cli save-snapshot --task=<id> with failingTests, techDebt, openDecisions, nextTaskIntent
/clear is now safe.

Skill Invocation Directive Convention

bestPatternPrompt must reference the mechanism explicitly — not by description:

Reference Type	Format	Example
Claude Code skill	`/skill-name`	`Use /content-flow to review this page`
MCP tool	`mcp:server-name:tool-name`	`Use mcp:context7:query-docs to fetch docs`
NX generator	`nx generate @scope/plugin:gen`	`Run nx generate @stackmates/schema:table`
CLI command	`npx tsx tools/scripts/...`	`Run npx tsx tools/scripts/planning/plan-cli.ts list-templates`

A bestPatternPrompt with no explicit reference is incomplete. The validator (plan-cli validate-prompts) enforces this.

Context

PRD Index — Agentic Execution

Questions

When a plan has 8 tasks and 3 of them need a skill the other 5 don't, does the manifest load it for all 8 — or does each task declare its own override?

If validate-prompts fails on 20 of 33 templates, is that a sprint blocker or a background cleanup task?
What happens to issues logged mid-task that turn out to be non-issues by task end — does the log need an auto-resolved state?
When the context budget gauge reaches critical, who decides whether to yield: the agent or the human?

Data Model​

Story Contract​

Build Contract​

Context​

Task Boundary Protocol (Standard Steps)​

Skill Invocation Directive Convention​

Context​

Questions​

Data Model

Story Contract

Build Contract

Context

Task Boundary Protocol (Standard Steps)

Skill Invocation Directive Convention

Context

Questions