PRD Handoff Protocol
How does a specification become a plan that becomes working software?
The gap between "what to build" and "how to build it" is where most projects fail. This protocol defines the contract: what the dream team writes, what engineering reads, and how the parser bridges them.
The Contract
| Team | Owns | Authoritative Resource | Output Format |
|---|---|---|---|
| Dream Team | What and Why | template.json (PRD creation workflow) | spec/index.md (FAVV v2.1) |
| Engineering | How | Template Plans (task sequence + reuse) | plan.json with prdRef back to spec |
Each side has one authoritative resource. Dream team follows template.json to produce specs. Engineering follows template plans to produce implementations. The spec is the shared contract — engineering never builds from Slack messages, verbal agreements, or tier checklists without stories.
Ownership Boundary
The dream team writes stories that state what must be true — not how to make it true. "Admin can toggle permissions per entity per role, change persists, unauthorized user denied" is a truth test. Engineering decides whether that's a server action, an API endpoint, or a database trigger.
| Dream Team Decides | Engineering Decides |
|---|---|
| User stories and acceptance criteria | Implementation architecture |
| Forbidden outcomes (safety tests) | Which layer handles enforcement |
| FeatureIDs and scope | Template selection and plan structure |
| Kill date and priority order | Sprint sequencing and task dependencies |
| What "done" looks like | How to get there |
Non-negotiable from both sides: Nx monorepo structure, hexagonal layering, generators for boilerplate, tests before implementation. These are the substrate that makes continuous improvement possible — not implementation choices.
PRD Structure
    prd-{name}/
        index.md        <- Decision surface (10s read)
        pictures/       <- Pre-flight maps (thinking instruments)
        prompt-deck/    <- Sales compression (2min read)
        spec/           <- Engineering spec (30min read)
            index.md    <- Intent Contract + Story Contract + Build Contract
Spec Anatomy
The project-from-prd parser reads three contracts from spec/index.md:
Intent Contract
Nine dimensions define the agent's autonomy boundaries. They are not parsed into tasks; they guide judgment when instructions run out.
| Dimension | Engineering Gets |
|---|---|
| Objective | Problem context for judgment calls |
| Outcomes | Observable state changes to verify |
| Health Metrics | What must NOT degrade (Goodhart guard) |
| Constraints | Hard stops vs steering guidance |
| Autonomy | Allowed / Escalate / Never boundaries |
| Stop Rules | When to stop building, when to halt |
| Counter-metrics | Numbers that must stay stable while optimizing |
| Blast Radius | What systems, data, or users this touches |
| Rollback | How to undo if things go wrong |
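
As a rough illustration of the nine dimensions above, the sketch below models an Intent Contract as a record and reports which dimensions are missing or blank. The field names and the `missingDimensions` helper are assumptions made for this example; the source does not describe how the parser actually stores the contract.

```typescript
// Hypothetical keys for the nine Intent Contract dimensions.
// These names are illustrative, not the parser's real schema.
const INTENT_DIMENSIONS = [
  "objective",
  "outcomes",
  "healthMetrics",
  "constraints",
  "autonomy",
  "stopRules",
  "counterMetrics",
  "blastRadius",
  "rollback",
] as const;

type IntentDimension = (typeof INTENT_DIMENSIONS)[number];
type IntentContract = Partial<Record<IntentDimension, string>>;

// Returns every dimension that is absent or contains only whitespace.
function missingDimensions(contract: IntentContract): IntentDimension[] {
  return INTENT_DIMENSIONS.filter((d) => (contract[d] ?? "").trim() === "");
}
```

A check like this could run before handoff so that a spec with an empty Rollback or Blast Radius is caught while the dream team still has context.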
Story Contract
Stories convert user intent into testable scenarios. Each becomes a Given/When/Then.
| Column | Engineering Reads As |
|---|---|
| Intention | User-facing state change — what the user experiences, not what the code does |
| Trigger | Observable event — becomes the test's "Given/When" |
| Observable Success | Binary or thresholded — verifiable without the builder present |
| Forbidden Outcome | What must NOT happen — feeds Safety Test in Build Contract |
| Evidence Type | unit / integration / e2e / eval / replay / monitor |
| Escalation | When the agent must stop and ask a human |
Every story must have at least one Forbidden Outcome. This is the counterfeit progress detector — it catches work that looks done but isn't safe.
Story rows map many-to-many to Build Contract rows. One story may need multiple features. One feature may serve multiple stories.
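
The Story Contract columns can be pictured as a typed row with a lint step that enforces the "at least one Forbidden Outcome" rule. This is a minimal sketch: the `Story` interface, `lintStory`, and `toScenario` are hypothetical names chosen for this example, and the scenario wording is an assumption about how a row might be rendered.

```typescript
// Evidence types are taken from the Story Contract table above.
type EvidenceType = "unit" | "integration" | "e2e" | "eval" | "replay" | "monitor";

// Hypothetical row shape; field names mirror the table columns.
interface Story {
  intention: string;
  trigger: string;
  observableSuccess: string;
  forbiddenOutcomes: string[];
  evidenceType: EvidenceType;
  escalation: string;
}

// Returns human-readable problems; an empty array means the row passes.
function lintStory(story: Story): string[] {
  const problems: string[] = [];
  const outcomes = story.forbiddenOutcomes.filter((o) => o.trim() !== "");
  if (outcomes.length === 0) problems.push("Story has no Forbidden Outcome");
  if (story.trigger.trim() === "") problems.push("Story has no Trigger");
  if (story.observableSuccess.trim() === "") problems.push("Story has no Observable Success");
  return problems;
}

// Renders a scenario skeleton: the trigger feeds Given/When,
// observable success feeds Then, forbidden outcomes become negative checks.
function toScenario(story: Story): string[] {
  const lines = [`Given/When: ${story.trigger}`, `Then: ${story.observableSuccess}`];
  for (const outcome of story.forbiddenOutcomes) {
    lines.push(`Never: ${outcome}`);
  }
  return lines;
}
```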
Build Contract (FAVV v2.1)
The deliverable. Every row has an acceptance test.
| Column | Engineering Reads As |
|---|---|
| FeatureID | Links to feature-matrix row — the RaaS catalog ID |
| Function | Feature name + behavior. Verbs generate test cases. "Browse, search, filter contacts" = 3 tests |
| Artifact | Concrete deliverable: TypeScript client, PostgreSQL migration, React component |
| Success Test | Happy-path acceptance criteria with thresholds — this IS the e2e test spec |
| Safety Test | What must NOT happen — populated from Story Contract Forbidden Outcomes |
| Regression Test | What existing capability must NOT degrade — names the specific capability + threshold |
| Value | Business outcome in one sentence |
| State | Enum: Live, Built, Dormant, Partial, Not verified, Gap, Stub, Broken |
Function column rule: Use verbs. "Browse, search, filter contacts" generates 3 test cases. "Contacts" generates a BLOCKER — the parser cannot derive tests from nouns.
Job groupings: H3 headings above FAVV table sections group rows by user job. Each heading includes the FeatureIDs that job advances.
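
To make the Function column rule concrete, here is a minimal sketch of how verbs might be counted. The source does not say how the parser recognizes verbs, so this example assumes a small hypothetical allow-list (`KNOWN_VERBS`) and treats the first word of each comma-separated clause as the candidate verb.

```typescript
// Assumption: a tiny allow-list stands in for real verb detection.
const KNOWN_VERBS = ["browse", "search", "filter", "toggle", "invite", "create", "update", "delete"];

type FunctionParse =
  | { kind: "tests"; verbs: string[] }
  | { kind: "blocker"; reason: string };

function parseFunctionColumn(fn: string): FunctionParse {
  // The first word of each comma-separated clause is the candidate verb.
  const candidates = fn
    .split(",")
    .map((clause) => (clause.trim().split(/\s+/)[0] ?? "").toLowerCase())
    .filter((word) => word !== "");
  const verbs = candidates.filter((word) => KNOWN_VERBS.indexOf(word) !== -1);
  if (verbs.length === 0) {
    return { kind: "blocker", reason: `No verbs found in "${fn}"` };
  }
  // One test case per verb.
  return { kind: "tests", verbs };
}
```

Under these assumptions, "Browse, search, filter contacts" yields three verbs and therefore three test cases, while "Contacts" yields a blocker.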
Frozen Scope
A PRD's scope is the set of FeatureIDs in its Build Contract at registration time. That set is frozen.
- Max 5 distinct FeatureIDs per PRD
- Adding a new FeatureID after registration requires a new PRD
- The Feature Matrix is the source of truth for which PRD advances which feature
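
The frozen-scope rules above can be sketched as two small checks: one that freezes the FeatureID set at registration and enforces the cap of five, and one that reports IDs added afterwards. The function names and the string representation of FeatureIDs are assumptions for illustration only.

```typescript
const MAX_FEATURE_IDS = 5;

// Removes duplicates while preserving first-seen order.
function distinctIds(ids: string[]): string[] {
  return ids.filter((id, i) => ids.indexOf(id) === i);
}

// Freezes the scope at registration time; rejects more than five distinct IDs.
function freezeScope(buildContractIds: string[]): string[] {
  const scope = distinctIds(buildContractIds);
  if (scope.length > MAX_FEATURE_IDS) {
    throw new Error(`PRD has ${scope.length} FeatureIDs; the maximum is ${MAX_FEATURE_IDS}`);
  }
  return scope;
}

// Returns FeatureIDs present now that were not in the frozen scope.
// Any result here means the addition belongs in a new PRD.
function scopeViolations(frozen: string[], current: string[]): string[] {
  return distinctIds(current).filter((id) => frozen.indexOf(id) === -1);
}
```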
Parser Detection
| Header contains... | Format | Job source |
|---|---|---|
| Safety Test | FAVV v2.1 | H3 heading above table |
| Verification | FAVV v2.0 | H3 heading above table |
| Feature | FFO (legacy) | Job column in table |
Search order: spec/index.md first, then spec/protocols/index.md fallback. New PRDs use FAVV v2.1. Old PRDs migrate when touched.
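
The detection table can be read as an ordered set of checks against a table's header row. The sketch below is one possible reading, not the parser's actual code: it assumes whole-cell matching and checks the most specific column first, so that a FAVV header containing `FeatureID` is not mistaken for the legacy `Feature` column. The `detectFormat` name and return shape are hypothetical.

```typescript
type SpecFormat = "FAVV v2.1" | "FAVV v2.0" | "FFO (legacy)";
type JobSource = "H3 heading above table" | "Job column in table";

interface Detection {
  format: SpecFormat;
  jobSource: JobSource;
}

// Takes a markdown header row such as "| FeatureID | Function | ... |".
// Returns null when no known marker column is present.
function detectFormat(headerRow: string): Detection | null {
  const cells = headerRow
    .split("|")
    .map((cell) => cell.trim())
    .filter((cell) => cell !== "");
  const has = (name: string): boolean => cells.indexOf(name) !== -1;

  // Order matters: newest format first, legacy last.
  if (has("Safety Test")) return { format: "FAVV v2.1", jobSource: "H3 heading above table" };
  if (has("Verification")) return { format: "FAVV v2.0", jobSource: "H3 heading above table" };
  if (has("Feature")) return { format: "FFO (legacy)", jobSource: "Job column in table" };
  return null;
}
```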
Bookend Gates
Every plan has two bookends. They are not optional.
| Bookend | When | Outcome |
|---|---|---|
| start-prd-to-specs | Before phase-1 | Every FAVV row has a test that fails |
| end-jtbd-validation | After last implementation phase | Every spec passes |
The bookends ARE the definition of done. The start bookend converts stories into machine-verifiable tests (RED). The end bookend proves they pass (GREEN). Everything between is engineering's domain.
Plans with a prdRef must populate prdRef.path so that it points to spec/index.md. The ffo-bridge.ts plan-tasks tool converts Build Contract rows into plan tasks. Merge these with the template tasks so that PRD-specific work is explicit in the plan.
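
A plan lint along the following lines could enforce the bookend and prdRef rules mechanically. This is a sketch under assumptions: the real plan.json schema is not shown in this document, so the `Plan` shape (a `phases` array of names in execution order) is hypothetical, and only `prdRef.path` and the two bookend names come from the text above.

```typescript
// Hypothetical plan shape; only prdRef.path is named in the source.
interface Plan {
  prdRef?: { path: string };
  phases: string[]; // phase names in execution order
}

const START_BOOKEND = "start-prd-to-specs";
const END_BOOKEND = "end-jtbd-validation";
const SPEC_SUFFIX = "spec/index.md";

// Returns a list of problems; an empty array means the plan passes.
function lintPlan(plan: Plan): string[] {
  const problems: string[] = [];
  const first = plan.phases[0];
  const last = plan.phases[plan.phases.length - 1];
  if (first !== START_BOOKEND) problems.push(`First phase must be ${START_BOOKEND}`);
  if (last !== END_BOOKEND) problems.push(`Last phase must be ${END_BOOKEND}`);
  if (plan.prdRef !== undefined) {
    const path = plan.prdRef.path;
    if (path.slice(-SPEC_SUFFIX.length) !== SPEC_SUFFIX) {
      problems.push(`prdRef.path must point to ${SPEC_SUFFIX}`);
    }
  }
  return problems;
}
```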
Enforcement Hierarchy
Push enforcement up. Every decision an agent makes is a chance to get it wrong.
| Tier | Mechanism | Guarantee | Failure Mode |
|---|---|---|---|
| 1 | Nx Generators | Code IS correct by construction | None — deterministic |
| 2 | Plan Templates | Phase order + best practice reminders | Agent skips bookend |
| 3 | Rules | Architecture constraints always in context | Agent ignores under load |
| 4 | Skills | Procedural memory for complex workflows | Agent forgets to invoke |
| 5 | Agent Memory | Domain knowledge, judgment calls | Drift, hallucination, forgetting |
The best code is code you don't write. Generators produce correct code by construction. Plan templates remind engineers to reuse helper functions, shared components, and library patterns before writing new code. Spend Tier 5 tokens on edge cases, not boilerplate.
Plan Templates
Plans have two jobs: define the task sequence AND encode best practices. Each template carries institutional memory — which generators to run, which helper libraries to use, which shared components exist. The plan reminds the agent what to reuse so it doesn't reinvent what the factory already built.
When engineering receives a spec, template selection depends on the work:
| PRD Contains | Template | What It Does |
|---|---|---|
| New data entities | data-crud-flow | Schema, repos, actions, verified CRUD |
| Existing entity bugs | corrective-crud-action | Trace root cause through layers, fix, harden |
| UI verification needed | e2e-intent-validation | Playwright specs proving user journeys work |
| Issues to validate | issue-validation-sweep | Batch validation across entities |
| Entity at L1 needing L3 | entity-commissioning | Schema exists, commission through to CRUD |
Gate: If the PRD has no Story Contract, engineering flags this before starting. Stories are the source for Safety Tests — without them, negative testing is guesswork.
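
Template selection and the Story Contract gate can be expressed as a lookup plus a guard. In this sketch the template names come from the table above, while the `WorkKind` labels, the `PrdSummary` shape, and `selectTemplates` are hypothetical names introduced for illustration.

```typescript
// Hypothetical labels for the "PRD Contains" column.
type WorkKind =
  | "new-data-entities"
  | "existing-entity-bugs"
  | "ui-verification"
  | "issues-to-validate"
  | "entity-l1-to-l3";

// Template names as listed in the selection table.
const TEMPLATE_FOR: Record<WorkKind, string> = {
  "new-data-entities": "data-crud-flow",
  "existing-entity-bugs": "corrective-crud-action",
  "ui-verification": "e2e-intent-validation",
  "issues-to-validate": "issue-validation-sweep",
  "entity-l1-to-l3": "entity-commissioning",
};

interface PrdSummary {
  hasStoryContract: boolean;
  work: WorkKind[];
}

interface Selection {
  templates: string[];
  flags: string[];
}

function selectTemplates(prd: PrdSummary): Selection {
  // Gate: without stories there is no source for Safety Tests, so flag and stop.
  if (!prd.hasStoryContract) {
    return { templates: [], flags: ["PRD has no Story Contract; flag before starting"] };
  }
  const names = prd.work.map((kind) => TEMPLATE_FOR[kind]);
  const templates = names.filter((name, i) => names.indexOf(name) === i);
  return { templates, flags: [] };
}
```

Returning flags instead of throwing keeps the gate visible in the plan output, which is one way to make "engineering flags this before starting" an explicit step.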
What Goes Wrong
| Failure | Symptom | Fix |
|---|---|---|
| No Story Contract | Infrastructure built, admin/user flows missed | Write stories before FAVV rows |
| Function column uses nouns | Parser cannot generate test cases | Use verbs: "Admin invites user" not "Invitations" |
| No Forbidden Outcomes | Safety Test column empty | Every story gets at least one |
| Stories miss a role | Admin governance UI never built | One story per role per critical flow |
| Tier checklist without spec | Implementation tasks, not acceptance criteria | Spec rows are the contract, tiers are build order hints |
| FeatureID missing | Row not tracked in feature-matrix | Every FAVV row links to a RaaS catalog ID |
The Handoff
| Step | Owner | Action |
|---|---|---|
| 1 | Dream Team | Write spec/index.md (Intent + Stories + FAVV) |
| 2 | Dream Team | Signal via agent-comms --channel=meta |
| 3 | Engineering | project-from-prd reads spec/index.md (prdRef set on project) |
| 4 | Engineering | start-prd-to-specs: failing specs (RED) |
| 5 | Engineering | Build: RED → GREEN (plans carry prdRef, reuse generators + helpers from templates) |
| 6 | Engineering | end-jtbd-validation: all specs GREEN |
| 7 | Engineering → Dream Team | commissioning-update posted to #dream-team (level, entities, evidence) |
| 8 | Dream Team | Commission via browser (dream team validates L4) |
| 9 | Dream Team | Update feature-matrix.md (advance L-level) |
Context
- PRD Template — Reference implementation of FAVV v2.1 spec
- Feature Matrix — Commissioning status for all platform features
- Commissioning Protocol — L0-L4 maturity model
- PRD Creation — How to spec capabilities
- Priorities — Active PRD table (build order)
Questions
- When a Story Contract is missing, does engineering build the wrong thing or build the right thing without safety tests?
- What's the cost of a Forbidden Outcome the story writer never imagined vs one they wrote but engineering skipped?
- If the parser can generate tests from FAVV rows automatically, what role does human judgment play in test design?
- When should engineering push back on a PRD vs start building and flag gaps as they find them?