PRD Handoff Protocol

How does a specification become a plan that becomes working software?

Dream Team                                  Engineering
    |                                            |
    | 1. Write spec (Intent+Stories+FAVV)        |
    | 2. Populate SPEC-MAP rows                  |
    | 3. Signal handoff via comms                |
    |                                            | 4. Parse spec → fitness check
    | 4b. BOUNCE (if fitness fails)        <-----|
    |                                            | 5. Write failing tests (RED)
    |                                            | 6. Build: RED → GREEN
    |                                            | 7. Validate: all specs GREEN
    | 8. Commission via browser            <-----|
    | 9. Update feature-matrix                   |
    |           --- REVERSE SIGNAL ---           |
    |                                            | 10. Behavior changes → spec-delta
    | 11. Update spec, re-commission       <-----|

The gap between "what to build" and "how to build it" is where most projects fail. This protocol defines the contract: what the dream team writes, what engineering reads, and how the spec parser bridges them.

The Contract

| Team | Owns | Authoritative Source | Shared Format |
| --- | --- | --- | --- |
| Dream Team | What and Why | PRD creation template | spec/index.md (FAVV v2.1) |
| Engineering | How | Plan templates (task sequence + reuse) | Implementation plan with reference back to spec |

Each side has one authoritative source. Dream team follows the PRD creation template to produce specs. Engineering follows plan templates to produce implementations. The spec is the shared contract — engineering never builds from messages, verbal agreements, or tier checklists without stories.

Ownership Boundary

The dream team writes stories that state what must be true — not how to make it true. "Admin can toggle permissions per entity per role, change persists, unauthorized user denied" is a truth test. Engineering decides whether that's a server action, an API endpoint, or a database trigger.

| Dream Team Decides | Engineering Decides |
| --- | --- |
| User stories and acceptance criteria | Implementation architecture |
| Forbidden outcomes (safety tests) | Which layer handles enforcement |
| FeatureIDs and scope | Template selection and plan structure |
| Kill date and priority order | Sprint sequencing and task dependencies |
| What "done" looks like | How to get there |

Non-negotiable from both sides: monorepo structure, hexagonal layering, generators for boilerplate, tests before implementation. These are the substrate that makes continuous improvement possible — not implementation choices.

PRD Structure

prd-{name}/
  index.md        <- Decision surface (10s read)
  pictures/       <- Pre-flight maps (thinking instruments)
  prompt-deck/    <- Sales compression (2min read)
  spec/           <- Engineering spec (30min read)
    index.md      <- Intent Contract + Story Contract + Build Contract

Spec Anatomy

The spec parser reads three contracts from spec/index.md:

Intent Contract

Nine dimensions that define the agent's autonomy boundaries. They are not parsed into tasks; they are used for judgment when instructions run out.

| Dimension | Engineering Gets |
| --- | --- |
| Objective | Problem context for judgment calls |
| Outcomes | Observable state changes to verify |
| Health Metrics | What must NOT degrade (Goodhart guard) |
| Constraints | Hard stops vs steering guidance |
| Autonomy | Allowed / Escalate / Never boundaries |
| Stop Rules | When to stop building, when to halt |
| Counter-metrics | Numbers that must stay stable while optimizing |
| Blast Radius | What systems, data, or users this touches |
| Rollback | How to undo if things go wrong |
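The Autonomy dimension lends itself to a small lookup. The following is a minimal sketch, assuming a hypothetical `AutonomyBoundaries` shape and made-up action names; the rule that unlisted actions escalate is also an assumption, not something the spec format states.

```typescript
// Hypothetical encoding of the Autonomy dimension. Type names, action names,
// and the default-to-escalate behavior are illustrative assumptions.
type AutonomyDecision = "allowed" | "escalate" | "never";

interface AutonomyBoundaries {
  allowed: string[];
  escalate: string[];
  never: string[];
}

// "never" is checked first; anything not explicitly allowed goes to a human.
function classifyAction(action: string, b: AutonomyBoundaries): AutonomyDecision {
  if (b.never.indexOf(action) !== -1) return "never";
  if (b.allowed.indexOf(action) !== -1) return "allowed";
  return "escalate";
}

const exampleBoundaries: AutonomyBoundaries = {
  allowed: ["add-migration", "write-tests"],
  escalate: ["change-auth-flow"],
  never: ["drop-production-table"],
};
```

Defaulting to escalation keeps the agent conservative when an action was never anticipated by the spec author.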

Story Contract

Stories convert user intent into testable scenarios. Each becomes a Given/When/Then.

| Column | Engineering Reads As |
| --- | --- |
| Intention | User-facing state change: what the user experiences, not what the code does |
| Trigger | Observable event; becomes the test's "Given/When" |
| Observable Success | Binary or thresholded; verifiable without the builder present |
| Forbidden Outcome | What must NOT happen; feeds Safety Test in Build Contract |
| Evidence Type | unit / integration / e2e / eval / replay / monitor |
| Escalation | When the agent must stop and ask a human |

Every story must have at least one Forbidden Outcome. This is the counterfeit progress detector — it catches work that looks done but isn't safe.

Story rows map many-to-many to Build Contract rows: one story may need multiple features, and one feature may serve multiple stories.
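The "at least one Forbidden Outcome" rule can be checked mechanically before handoff. A minimal sketch, assuming a hypothetical `StoryRow` shape whose fields mirror the Story Contract columns; the field names, story IDs, and `findStoriesMissingForbiddenOutcome` are illustrative and not the parser's actual API.

```typescript
// Hypothetical shape for one Story Contract row (field names are assumptions).
type EvidenceType = "unit" | "integration" | "e2e" | "eval" | "replay" | "monitor";

interface StoryRow {
  id: string;
  intention: string;
  trigger: string;
  observableSuccess: string;
  forbiddenOutcomes: string[];
  evidenceType: EvidenceType;
  escalation: string;
}

// Returns the ids of stories with no non-blank Forbidden Outcome.
function findStoriesMissingForbiddenOutcome(stories: StoryRow[]): string[] {
  return stories
    .filter((s) => s.forbiddenOutcomes.filter((o) => o.trim().length > 0).length === 0)
    .map((s) => s.id);
}

const exampleStories: StoryRow[] = [
  {
    id: "S1",
    intention: "Admin can toggle permissions per entity per role",
    trigger: "Admin toggles a permission",
    observableSuccess: "Change persists",
    forbiddenOutcomes: ["Unauthorized user can change permissions"],
    evidenceType: "e2e",
    escalation: "Role model is ambiguous",
  },
  {
    id: "S2",
    intention: "User can browse contacts",
    trigger: "User opens the contact list",
    observableSuccess: "Contacts are listed",
    forbiddenOutcomes: [],
    evidenceType: "e2e",
    escalation: "None",
  },
];
```

With the sample data, only the second story is reported, which is the signal to fix the spec before handoff.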

Build Contract

The deliverable. Every row has an acceptance test. Format: FAVV v2.1.

| Column | Engineering Reads As |
| --- | --- |
| FeatureID | Links to feature-matrix row (the RaaS catalog ID) |
| Function | Feature name + behavior. Verbs generate test cases. "Browse, search, filter contacts" = 3 tests |
| Artifact | Concrete deliverable: TypeScript client, PostgreSQL migration, React component |
| Success Test | Happy-path acceptance criteria with thresholds; this IS the e2e test spec |
| Safety Test | What must NOT happen; populated from Story Contract Forbidden Outcomes |
| Regression Test | What existing capability must NOT degrade; names the specific capability + threshold |
| Value | Business outcome in one sentence |
| State | Enum: Live, Built, Dormant, Partial, Not verified, Gap, Stub, Broken |

Function column rule: Use verbs. "Browse, search, filter contacts" generates 3 test cases. "Contacts" generates a BLOCKER — the parser cannot derive tests from nouns.
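To illustrate the verb rule, here is a rough sketch of how a Function cell could be turned into test titles. How the real parser recognizes verbs is not documented here, so the `KNOWN_VERBS` allow-list and the `deriveTestTitles` function are hypothetical stand-ins.

```typescript
// Hypothetical verb allow-list; the real parser's verb detection is not shown in this document.
const KNOWN_VERBS: string[] = ["browse", "search", "filter", "create", "edit", "delete", "invite", "toggle"];

interface DerivedTests {
  blocker: boolean;
  testTitles: string[];
}

// Splits the cell on commas and "and", keeps parts that start with a known verb,
// and attaches the object phrase (the words after the last verb) to every verb.
function deriveTestTitles(functionCell: string): DerivedTests {
  const parts = functionCell
    .split(/,|\band\b/)
    .map((p) => p.trim())
    .filter((p) => p.length > 0);
  const verbs: string[] = [];
  let objectPhrase = "";
  for (const part of parts) {
    const words = part.split(/\s+/);
    const first = words[0].toLowerCase();
    if (KNOWN_VERBS.indexOf(first) !== -1) {
      verbs.push(first);
      if (words.length > 1) objectPhrase = words.slice(1).join(" ");
    }
  }
  // No verbs means no derivable tests: report a BLOCKER.
  if (verbs.length === 0) return { blocker: true, testTitles: [] };
  return { blocker: false, testTitles: verbs.map((v) => (v + " " + objectPhrase).trim()) };
}
```

Under these assumptions, `deriveTestTitles("Browse, search, filter contacts")` yields three titles, while `deriveTestTitles("Contacts")` returns a blocker.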

Job groupings: H3 headings above FAVV table sections group rows by user job. Each heading includes the FeatureIDs that job advances.

Frozen Scope

A PRD's scope is the set of FeatureIDs in its Build Contract at registration time. That set is frozen.

  • Max 5 distinct FeatureIDs per PRD
  • Adding a new FeatureID after registration requires a new PRD
  • The Feature Matrix is the source of truth for which PRD advances which feature
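The frozen-scope rules above can be expressed as a small check. This is a sketch only: `checkFrozenScope` and the FeatureID strings used with it are hypothetical, and the document does not say how registration is actually recorded.

```typescript
const MAX_FEATURE_IDS = 5; // limit stated in the rules above

interface ScopeCheck {
  ok: boolean;
  reason: string;
}

// Removes duplicates while keeping first-seen order.
function distinctIds(ids: string[]): string[] {
  return ids.filter((id, i) => ids.indexOf(id) === i);
}

// registered: FeatureIDs captured at registration time (the frozen set).
// current: FeatureIDs found in the Build Contract now.
function checkFrozenScope(registered: string[], current: string[]): ScopeCheck {
  const frozen = distinctIds(registered);
  if (frozen.length > MAX_FEATURE_IDS) {
    return { ok: false, reason: "more than " + MAX_FEATURE_IDS + " distinct FeatureIDs" };
  }
  const added = distinctIds(current).filter((id) => frozen.indexOf(id) === -1);
  if (added.length > 0) {
    return { ok: false, reason: "new FeatureID after registration (requires a new PRD): " + added.join(", ") };
  }
  return { ok: true, reason: "scope unchanged" };
}
```

For example, `checkFrozenScope(["F-1", "F-2"], ["F-1", "F-2", "F-3"])` fails because `F-3` was added after registration (the IDs are made up for illustration).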

Parser Detection

| Header contains... | Format | Job source |
| --- | --- | --- |
| Safety Test | FAVV v2.1 | H3 heading above table |
| Verification | FAVV v2.0 | H3 heading above table |
| Feature | FFO (legacy) | Job column in table |

Search order: spec/index.md first, then spec/protocols/index.md fallback. New PRDs use FAVV v2.1. Old PRDs migrate when touched.
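The detection table and search order can be sketched as follows. Two assumptions are baked in: matching is by substring on the table header line, and rows are tried top to bottom (so a FAVV header containing "FeatureID" is not mistaken for the legacy "Feature" format). `detectSpecFormat` and `pickSpecPath` are illustrative names.

```typescript
type SpecFormat = "FAVV v2.1" | "FAVV v2.0" | "FFO (legacy)" | "unknown";

interface Detection {
  format: SpecFormat;
  jobSource: string;
}

// Checks the header line against the detection table, most specific marker first.
function detectSpecFormat(headerLine: string): Detection {
  if (headerLine.indexOf("Safety Test") !== -1) {
    return { format: "FAVV v2.1", jobSource: "H3 heading above table" };
  }
  if (headerLine.indexOf("Verification") !== -1) {
    return { format: "FAVV v2.0", jobSource: "H3 heading above table" };
  }
  if (headerLine.indexOf("Feature") !== -1) {
    return { format: "FFO (legacy)", jobSource: "Job column in table" };
  }
  return { format: "unknown", jobSource: "" };
}

// Search order from the text: spec/index.md first, then the protocols fallback.
const SPEC_SEARCH_ORDER: string[] = ["spec/index.md", "spec/protocols/index.md"];

// existingPaths: relative paths known to exist inside the PRD folder.
function pickSpecPath(existingPaths: string[]): string | null {
  for (const candidate of SPEC_SEARCH_ORDER) {
    if (existingPaths.indexOf(candidate) !== -1) return candidate;
  }
  return null;
}
```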

Bookend Gates

Every plan has two bookends. They are not optional.

| Bookend | When | Outcome |
| --- | --- | --- |
| Start: spec-to-tests | Before phase-1 | Every FAVV row has a test that fails |
| End: JTBD validation | After last implementation phase | Every spec passes |

The bookends ARE the definition of done. The start bookend converts stories into machine-verifiable tests (RED). The end bookend proves they pass (GREEN). Everything between is engineering's domain.
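As a sketch of the two bookends as predicates, assume each FAVV row carries a hypothetical `testState` of `RED`, `GREEN`, or `MISSING` (no test written yet). The type and function names are illustrative.

```typescript
type BookendTestState = "RED" | "GREEN" | "MISSING";

interface FavvRowTest {
  featureId: string;
  testState: BookendTestState;
}

// Start bookend: every FAVV row has a test, and that test currently fails.
function startBookendPasses(rows: FavvRowTest[]): boolean {
  return rows.length > 0 && rows.every((r) => r.testState === "RED");
}

// End bookend: every spec passes.
function endBookendPasses(rows: FavvRowTest[]): boolean {
  return rows.length > 0 && rows.every((r) => r.testState === "GREEN");
}
```

A row that is already `GREEN` at the start bookend fails the check on purpose: a test that never failed proves nothing about the work that follows.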

Plans reference the spec via a path pointer to spec/index.md. The plan task generator converts Build Contract rows into plan tasks, which are then merged with the template's tasks so that PRD-specific work is explicit.

Fitness Gate

Between spec parsing (step 4) and test writing (step 5), engineering runs an architecture fitness check. This gate can bounce a spec back to the Dream Team before any building starts.

Gate Checks

| Check | Question | Bounce if... |
| --- | --- | --- |
| Capability overlap | Does an existing module already deliver this? | Feature-matrix shows another PRD at L2+ for the same FeatureIDs |
| Hex boundary | Does the spec cross hexagonal layer boundaries? | A single FAVV row mixes domain logic with infrastructure or presentation |
| Generator coverage | Does a generator exist for the pattern this spec describes? | Spec asks for hand-coded CRUD when a generator produces it correctly |
| Sibling collision | Does an in-flight plan already scaffold the same artifact? | Active project list shows overlapping components, routes, or server actions |
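The four gate checks can be combined into one result. A minimal sketch, assuming the findings have already been gathered from the feature-matrix, the spec, and the active project list; the `FitnessFindings` fields and `runFitnessGate` are hypothetical names.

```typescript
type FitnessCheckName =
  | "capability-overlap"
  | "hex-boundary"
  | "generator-coverage"
  | "sibling-collision";

// Hypothetical inputs: one field per gate check in the table above.
interface FitnessFindings {
  otherPrdAtL2PlusForSameFeatureIds: boolean;
  rowsMixingHexLayers: string[];
  handCodedPatternsWithGenerator: string[];
  artifactsScaffoldedByInFlightPlans: string[];
}

interface FitnessResult {
  bounce: boolean;
  failedChecks: FitnessCheckName[];
}

// Any failed check bounces the spec before building starts.
function runFitnessGate(f: FitnessFindings): FitnessResult {
  const failed: FitnessCheckName[] = [];
  if (f.otherPrdAtL2PlusForSameFeatureIds) failed.push("capability-overlap");
  if (f.rowsMixingHexLayers.length > 0) failed.push("hex-boundary");
  if (f.handCodedPatternsWithGenerator.length > 0) failed.push("generator-coverage");
  if (f.artifactsScaffoldedByInFlightPlans.length > 0) failed.push("sibling-collision");
  return { bounce: failed.length > 0, failedChecks: failed };
}
```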

Dream Pre-Check

Before signaling handoff, verify:

  • Read Feature Matrix — confirm no other PRD at L2+ advances the same FeatureIDs
  • Read the Build Contract Function column — confirm each row stays within one hexagonal layer
  • Check if the pattern has an existing generator (entity CRUD, e2e test, UI component) — if so, the spec should reference it, not describe the pattern from scratch

Bounce Protocol

A bounce is not a rejection. It is a return signal. Three pre-build gates can trigger one:

| Gate | Fires when... | Signal |
| --- | --- | --- |
| Fitness check | Spec overlaps existing capability or crosses hex boundaries | spec-bounce: which check failed and what to change |
| Missing stories | No Story Contract → no Safety Tests → negative testing is guesswork | spec-incomplete: which contracts are missing |
| Noun-only functions | Function column has nouns ("Contacts") instead of verbs ("Browse, search, filter contacts") | spec-incomplete: parser cannot derive tests |
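The mapping from pre-build gate to signal might look like the sketch below. Only the signal names `spec-bounce` and `spec-incomplete` come from the table; the message shape, the `detail` wording, and `preBuildMessages` are assumptions.

```typescript
type PreBuildSignal = "spec-bounce" | "spec-incomplete";

interface PreBuildMessage {
  signal: PreBuildSignal;
  detail: string;
}

// Hypothetical summary of what the three pre-build gates found.
interface PreBuildInput {
  failedFitnessChecks: string[];
  hasStoryContract: boolean;
  nounOnlyFunctionCells: string[];
}

// Produces one message per gate that fired; an empty list means no bounce.
function preBuildMessages(input: PreBuildInput): PreBuildMessage[] {
  const out: PreBuildMessage[] = [];
  if (input.failedFitnessChecks.length > 0) {
    out.push({ signal: "spec-bounce", detail: "failed checks: " + input.failedFitnessChecks.join(", ") });
  }
  if (!input.hasStoryContract) {
    out.push({ signal: "spec-incomplete", detail: "missing contract: Story Contract" });
  }
  if (input.nounOnlyFunctionCells.length > 0) {
    out.push({
      signal: "spec-incomplete",
      detail: "parser cannot derive tests from: " + input.nounOnlyFunctionCells.join(", "),
    });
  }
  return out;
}
```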

Bounce Response

  1. Read the bounce message — it names the specific failure
  2. Update spec/index.md to address the gap
  3. Re-run the pre-handoff checklist (see Create PRD Stories checklist)
  4. Re-signal handoff via comms to #meta

If the bounce changes scope (new FeatureIDs needed), create a new PRD. Original PRD scope stays frozen.

The return-signal section of Create PRD Stories documents the post-build commissioning signals.

Enforcement Tiers

Push enforcement up. Every decision an agent makes is a chance to get it wrong.

| Tier | Mechanism | Guarantee | Failure Mode |
| --- | --- | --- | --- |
| 1 | Generators | Code IS correct by construction | None (deterministic) |
| 2 | Plan Templates | Phase order + best practice reminders | Agent skips bookend |
| 3 | Rules | Architecture constraints always in context | Agent ignores under load |
| 4 | Skills | Procedural memory for complex workflows | Agent forgets to invoke |
| 5 | Agent Memory | Domain knowledge, judgment calls | Drift, hallucination, forgetting |

The best code is code you don't write. Generators produce correct code by construction. Plan templates remind engineers to reuse helper functions, shared components, and library patterns before writing new code. Spend Tier 5 tokens on edge cases, not boilerplate.

Plan Templates

Plans have two jobs: define the task sequence AND encode best practices. Each template carries institutional memory — which generators to run, which helper libraries to use, which shared components exist. The plan reminds the agent what to reuse so it doesn't reinvent what the factory already built.

When engineering receives a spec, template selection depends on the work:

| PRD Contains | Template | What It Does |
| --- | --- | --- |
| New data entities | data-crud-flow | Schema, repos, actions, verified CRUD |
| Existing entity bugs | corrective-crud-action | Trace root cause through layers, fix, harden |
| UI verification needed | e2e-intent-validation | Playwright specs proving user journeys work |
| Issues to validate | issue-validation-sweep | Batch validation across entities |
| Entity at L1 needing L3 | entity-commissioning | Schema exists, commission through to CRUD |

Gate: If the PRD has no Story Contract, engineering flags this before starting. Stories are the source for Safety Tests — without them, negative testing is guesswork.
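Template selection and the Story Contract gate can be sketched as a lookup. The template names come from the table above; the `PrdWorkKind` labels, `selectPlanTemplates`, and the flag text are hypothetical.

```typescript
// Hypothetical labels for the "PRD Contains" column.
type PrdWorkKind =
  | "new-data-entities"
  | "existing-entity-bugs"
  | "ui-verification-needed"
  | "issues-to-validate"
  | "entity-at-l1-needing-l3";

// Template names as listed in the table above.
const TEMPLATE_BY_WORK: Record<PrdWorkKind, string> = {
  "new-data-entities": "data-crud-flow",
  "existing-entity-bugs": "corrective-crud-action",
  "ui-verification-needed": "e2e-intent-validation",
  "issues-to-validate": "issue-validation-sweep",
  "entity-at-l1-needing-l3": "entity-commissioning",
};

interface TemplateSelection {
  templates: string[];
  flags: string[];
}

// Maps each kind of work to its template and raises the Story Contract flag if needed.
function selectPlanTemplates(work: PrdWorkKind[], hasStoryContract: boolean): TemplateSelection {
  const templates: string[] = work.map((w) => TEMPLATE_BY_WORK[w]);
  const flags: string[] = hasStoryContract ? [] : ["no Story Contract: flag before starting"];
  return { templates: templates, flags: flags };
}
```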

Failure Modes

| Failure | Symptom | Fix |
| --- | --- | --- |
| No Story Contract | Infrastructure built, admin/user flows missed | Write stories before FAVV rows |
| Function column uses nouns | Parser cannot generate test cases | Use verbs: "Admin invites user" not "Invitations" |
| No Forbidden Outcomes | Safety Test column empty | Every story gets at least one |
| Stories miss a role | Admin governance UI never built | One story per role per critical flow |
| Tier checklist without spec | Implementation tasks, not acceptance criteria | Spec rows are the contract, tiers are build order hints |
| FeatureID missing | Row not tracked in feature-matrix | Every FAVV row links to a RaaS catalog ID |
| Template framing too loose | Plan tasks vague, agent builds wrong thing | Template needs tighter phase gates or generator references |

SPEC-MAP

The shared traceability artifact. Both sides write to it. Neither side can claim done with empty cells.

A SPEC-MAP lives in the engineering repo's E2E domain folder. It has one row per Story Contract row.

| Column | Written By | When |
| --- | --- | --- |
| Story # | Dream | At handoff |
| WHEN/THEN | Dream | At handoff |
| Test File | Engineering | At spec-to-tests bookend |
| Test Status | Engineering | At JTBD validation bookend (RED/GREEN) |
| L-Level | Dream | At commissioning |
| Last Verified | Dream | At commissioning |

SPEC-MAP Catches

| Gap | Without SPEC-MAP | With SPEC-MAP |
| --- | --- | --- |
| Feature works but has no test | Commissioning scores L4, CI has no regression protection | Empty Test File cell, visible before commissioning starts |
| Engineering changes Screen Contract state | Dream's spec drifts from reality | Engineering updates WHEN/THEN when behavior changes, Dream sees the delta |
| Story Contract row is untestable | Engineering silently skips it | Test File cell stays empty; BLOCKER surfaces at validation bookend |
| Commissioning misses a regression | Next deploy breaks a feature nobody re-checked | Last Verified column shows stale dates, which triggers re-commission |

SPEC-MAP Rules

  • Every Story Contract row from spec/index.md must appear as a SPEC-MAP row
  • Engineering fills Test File during the start bookend — if a story can't map to a test, that's a spec-bounce, not a skip
  • Test Status updates automatically from CI (GREEN/RED)
  • Dream fills L-Level and Last Verified during commissioning
  • A capability cannot reach L4 with any empty cells in its SPEC-MAP
  • When engineering changes behavior that affects WHEN/THEN, engineering updates the SPEC-MAP row AND posts a spec-delta to #meta
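The "no empty cells before L4" rule is straightforward to check. A minimal sketch, assuming a hypothetical `SpecMapRow` with one string field per SPEC-MAP column; the on-disk format of the SPEC-MAP is not specified here, and any sample values passed in are made up.

```typescript
// One field per SPEC-MAP column; an empty string means the cell is not filled in.
interface SpecMapRow {
  storyNumber: string;
  whenThen: string;
  testFile: string;
  testStatus: string; // "RED", "GREEN", or "" when not yet reported
  lLevel: string;
  lastVerified: string;
}

const SPEC_MAP_COLUMNS: (keyof SpecMapRow)[] = [
  "storyNumber",
  "whenThen",
  "testFile",
  "testStatus",
  "lLevel",
  "lastVerified",
];

// Names of the columns that are still blank in this row.
function emptyCellsOf(row: SpecMapRow): string[] {
  return SPEC_MAP_COLUMNS.filter((c) => row[c].trim().length === 0);
}

// A capability cannot reach L4 while any row has an empty cell.
function canReachL4(rows: SpecMapRow[]): boolean {
  return rows.length > 0 && rows.every((r) => emptyCellsOf(r).length === 0);
}
```

Running `emptyCellsOf` per row before commissioning starts gives the Dream Team a concrete list of what is still missing and who owes it.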

Reverse Signal

When engineering changes implementation in ways that affect the spec:

  1. Engineering updates the SPEC-MAP row with the new behavior
  2. Engineering posts spec-delta via comms to #meta
  3. Message includes: which story row changed, old behavior, new behavior, why
  4. Dream Team updates spec/index.md Story Contract to match reality
  5. Dream re-commissions the affected rows

Without this, specs fossilize. The Screen Contract says "loading skeleton appears" but the component now renders immediately. The spec is wrong. Nobody updates it. The next commissioner reads a spec that describes a product that no longer exists.
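A spec-delta message could be modeled as below. The four required fields (story row, old behavior, new behavior, why) come from the steps above; the `SpecDelta` type, the text layout produced by `formatSpecDelta`, and any story identifier or reason passed in are assumptions for illustration.

```typescript
interface SpecDelta {
  signal: "spec-delta";
  storyRow: string;
  oldBehavior: string;
  newBehavior: string;
  why: string;
}

// Renders the delta as plain text and refuses to post an incomplete message.
function formatSpecDelta(d: SpecDelta): string {
  const fields = [d.storyRow, d.oldBehavior, d.newBehavior, d.why];
  if (fields.some((f) => f.trim().length === 0)) {
    throw new Error("spec-delta needs story row, old behavior, new behavior, and why");
  }
  return [
    "spec-delta " + d.storyRow,
    "old: " + d.oldBehavior,
    "new: " + d.newBehavior,
    "why: " + d.why,
  ].join("\n");
}
```

For the example in the text, the old behavior would be "loading skeleton appears" and the new behavior "component renders immediately", with the reason supplied by engineering.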

Full Sequence

Dream Team                                  Engineering
    |                                            |
    | 1. Write spec/index.md                     |
    |    (Intent + Stories + FAVV)               |
    |                                            |
    | 2. Populate SPEC-MAP columns 1-2           |
    |    (Story #, WHEN/THEN)                    |
    |                                            |
    | 3. Signal handoff via comms                |
    |    to #meta                                |
    |                                            |
    |                                            | 4. Parse spec
    |                                            |    (spec reference set on project)
    |                                            |
    |                                            | 4a. Fitness check
    |                                            |     (overlap, hex boundary,
    |                                            |     generator coverage,
    |                                            |     sibling collision)
    |                                            |
    | 4b. BOUNCE (if fitness fails)        <-----|
    |     spec-bounce via comms                  |
    |     Update spec, re-signal                 |
    |                                            |
    |                                            | 5. Spec-to-tests bookend
    |                                            |    Populate SPEC-MAP column 3
    |                                            |    (Test File paths)
    |                                            |    Failing specs (RED)
    |                                            |
    |                                            | 6. Build: RED → GREEN
    |                                            |    (plans reference spec,
    |                                            |    reuse generators +
    |                                            |    helpers from templates)
    |                                            |
    |                                            | 7. JTBD validation bookend
    |                                            |    All specs GREEN
    |                                            |    Update SPEC-MAP column 4
    |                                            |    (Test Status = GREEN)
    |                                            |
    | 8. commissioning-update              <-----|
    |    posted to #meta                         |
    |    (level, entities, evidence)             |
    |                                            |
    | 9. Commission via browser                  |
    |    Read SPEC-MAP: verify                   |
    |    zero empty cells                        |
    |    (dream team validates L4)               |
    |    Update SPEC-MAP columns 5-6             |
    |    (L-Level, Last Verified)                |
    |                                            |
    | 10. Update feature-matrix                  |
    |     (advance L-level)                      |
    |                                            |
    |           --- REVERSE SIGNAL ---           |
    |                                            |
    |                                            | 11. Engineering changes behavior
    |                                            |     Update SPEC-MAP WHEN/THEN
    |                                            |     Post spec-delta to #meta
    |                                            |
    | 12. Dream updates spec/index.md      <-----|
    |     Re-commissions affected rows           |

Questions

When a Story Contract is missing, does engineering build the wrong thing or build the right thing without safety tests?

  • What's the cost of a Forbidden Outcome the story writer never imagined vs one they wrote but engineering skipped?
  • If the parser can generate tests from FAVV rows automatically, what role does human judgment play in test design?
  • When should engineering push back on a PRD vs start building and flag gaps as they find them?
  • If a SPEC-MAP has zero empty cells but the market loop (Loop 3) returns no signal, is the problem the spec or the product?
  • When engineering posts a spec-delta that changes a Story Contract's WHEN/THEN, who decides whether the new behavior is better — Dream or Engineering?