PRD Handoff Protocol

How does a specification become a plan that becomes working software?

Dream Team                              Engineering
    |                                        |
    |  1. Write spec (Intent+Stories+FAVV)   |
    |  2. Populate SPEC-MAP rows             |
    |  3. Signal handoff via comms           |
    |                                   4. Parse spec → fitness check
    |  4b. BOUNCE (if fitness fails)   <-----|
    |                                   5. Write failing tests (RED)
    |                                   6. Build: RED → GREEN
    |                                   7. Validate: all specs GREEN
    |  8. Commission via browser        <----|
    |  9. Update feature-matrix              |
    |     --- REVERSE SIGNAL ---             |
    |                                  10. Behavior changes → spec-delta
    | 11. Update spec, re-commission    <----|

The gap between "what to build" and "how to build it" is where most projects fail. This protocol defines the contract: what the dream team writes, what engineering reads, and how the spec parser bridges them.

The Contract

Team	Owns	Authoritative Source	Shared Format
Dream Team	What and Why	PRD creation template	`spec/index.md` (FAVV v2.1)
Engineering	How	Plan templates (task sequence + reuse)	Implementation plan with reference back to spec

Each side has one authoritative source. Dream team follows the PRD creation template to produce specs. Engineering follows plan templates to produce implementations. The spec is the shared contract — engineering never builds from messages, verbal agreements, or tier checklists without stories.

Ownership Boundary

The dream team writes stories that state what must be true — not how to make it true. "Admin can toggle permissions per entity per role, change persists, unauthorized user denied" is a truth test. Engineering decides whether that's a server action, an API endpoint, or a database trigger.

Dream Team Decides	Engineering Decides
User stories and acceptance criteria	Implementation architecture
Forbidden outcomes (safety tests)	Which layer handles enforcement
FeatureIDs and scope	Template selection and plan structure
Kill date and priority order	Sprint sequencing and task dependencies
What "done" looks like	How to get there

Non-negotiable from both sides: monorepo structure, hexagonal layering, generators for boilerplate, tests before implementation. These are the substrate that makes continuous improvement possible — not implementation choices.

PRD Structure

prd-{name}/
 index.md           <- Decision surface (10s read)
 pictures/          <- Pre-flight maps (thinking instruments)
 prompt-deck/       <- Sales compression (2min read)
 spec/              <- Engineering spec (30min read)
   index.md         <- Intent Contract + Story Contract + Build Contract

Spec Anatomy

The spec parser reads three contracts from spec/index.md:

Intent Contract

9 dimensions defining agent autonomy boundaries. Not parsed into tasks — used for judgment when instructions run out.

Dimension	Engineering Gets
Objective	Problem context for judgment calls
Outcomes	Observable state changes to verify
Health Metrics	What must NOT degrade (Goodhart guard)
Constraints	Hard stops vs steering guidance
Autonomy	Allowed / Escalate / Never boundaries
Stop Rules	When to stop building, when to halt
Counter-metrics	Numbers that must stay stable while optimizing
Blast Radius	What systems, data, or users this touches
Rollback	How to undo if things go wrong

Story Contract

Stories convert user intent into testable scenarios. Each becomes a Given/When/Then.

Column	Engineering Reads As
Intention	User-facing state change — what the user experiences, not what the code does
Trigger	Observable event — becomes the test's "Given/When"
Observable Success	Binary or thresholded — verifiable without the builder present
Forbidden Outcome	What must NOT happen — feeds Safety Test in Build Contract
Evidence Type	unit / integration / e2e / eval / replay / monitor
Escalation	When the agent must stop and ask a human

Every story must have at least one Forbidden Outcome. This is the counterfeit progress detector — it catches work that looks done but isn't safe.
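The Forbidden Outcome rule lends itself to a mechanical check. The sketch below is one possible way to model a Story Contract row and flag stories that break the rule. The `StoryRow` shape, its field names, and both helper functions are illustrative assumptions, not the spec parser's actual API. The example row paraphrases the permissions story from Ownership Boundary.

```typescript
// Hypothetical shape for one Story Contract row; field names mirror the table columns.
type EvidenceType = "unit" | "integration" | "e2e" | "eval" | "replay" | "monitor";

interface StoryRow {
  intention: string;
  trigger: string;
  observableSuccess: string;
  forbiddenOutcomes: string[];
  evidenceType: EvidenceType;
  escalation: string;
}

// Render a story as scenario lines: the trigger feeds Given/When, success feeds Then.
function toScenarioLines(story: StoryRow): string[] {
  const lines = ["WHEN " + story.trigger, "THEN " + story.observableSuccess];
  for (const outcome of story.forbiddenOutcomes) {
    lines.push("FORBIDDEN " + outcome);
  }
  return lines;
}

// Return the stories that break the "at least one Forbidden Outcome" rule.
function storiesMissingForbiddenOutcome(stories: StoryRow[]): StoryRow[] {
  return stories.filter((s) => s.forbiddenOutcomes.length === 0);
}

const permissionStory: StoryRow = {
  intention: "Admin controls permissions per entity per role",
  trigger: "admin toggles a permission for a role",
  observableSuccess: "change persists after reload",
  forbiddenOutcomes: ["unauthorized user can access the entity"],
  evidenceType: "integration",
  escalation: "ask a human if role definitions are ambiguous", // made-up example text
};

// A draft with no Forbidden Outcome, which the check should flag.
const draftStory: StoryRow = { ...permissionStory, intention: "Draft story", forbiddenOutcomes: [] };

console.log(toScenarioLines(permissionStory));
console.log(storiesMissingForbiddenOutcome([permissionStory, draftStory]).map((s) => s.intention));
```

Running a check like this before handoff would surface the gap on the Dream side, rather than as a `spec-incomplete` bounce later.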

Story rows map many-to-many to Build Contract rows: one story may need multiple features, and one feature may serve multiple stories.

Build Contract

The deliverable. Every row has an acceptance test. Format: FAVV v2.1.

Column	Engineering Reads As
FeatureID	Links to feature-matrix row — the RaaS catalog ID
Function	Feature name + behavior. Verbs generate test cases. "Browse, search, filter contacts" = 3 tests
Artifact	Concrete deliverable: TypeScript client, PostgreSQL migration, React component
Success Test	Happy-path acceptance criteria with thresholds — this IS the e2e test spec
Safety Test	What must NOT happen — populated from Story Contract Forbidden Outcomes
Regression Test	What existing capability must NOT degrade — names the specific capability + threshold
Value	Business outcome in one sentence
State	Enum: Live, Built, Dormant, Partial, Not verified, Gap, Stub, Broken

Function column rule: Use verbs. "Browse, search, filter contacts" generates 3 test cases. "Contacts" generates a BLOCKER — the parser cannot derive tests from nouns.
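To make the verb rule concrete, here is a minimal sketch of how a parser might turn a Function cell into test case names, or a BLOCKER when no verb is present. The verb list, the plural-stripping heuristic, and the `deriveTestCases` name are all assumptions for illustration. The real parser's verb recognition is not described in this document.

```typescript
// Result of parsing one Function cell: derived test case names, or a BLOCKER.
type FunctionDerivation =
  | { kind: "tests"; cases: string[] }
  | { kind: "blocker"; reason: string };

// Assumption: a tiny illustrative verb list. A real parser would need a richer way to recognize verbs.
const ILLUSTRATIVE_VERBS = ["browse", "search", "filter", "create", "update", "delete", "invite", "toggle"];

function deriveTestCases(functionCell: string): FunctionDerivation {
  const words = functionCell.split(/[\s,]+/).filter((w) => w.length > 0);
  const verbs: string[] = [];
  let objectWords: string[] = [];
  for (const word of words) {
    const lower = word.toLowerCase();
    // Naive heuristic: also try the word without a trailing "s" ("invites" -> "invite").
    const stem = lower.charAt(lower.length - 1) === "s" ? lower.slice(0, -1) : lower;
    const verb =
      ILLUSTRATIVE_VERBS.indexOf(lower) !== -1 ? lower :
      ILLUSTRATIVE_VERBS.indexOf(stem) !== -1 ? stem : null;
    if (verb !== null) {
      verbs.push(verb);
      objectWords = []; // the object is whatever follows the last verb
    } else {
      objectWords.push(word);
    }
  }
  if (verbs.length === 0) {
    return { kind: "blocker", reason: 'no verbs found in "' + functionCell + '"' };
  }
  const objectPhrase = objectWords.join(" ");
  // One test case per verb.
  return { kind: "tests", cases: verbs.map((v) => (v + " " + objectPhrase).trim()) };
}

console.log(deriveTestCases("Browse, search, filter contacts"));
console.log(deriveTestCases("Contacts"));
```

With this sketch, "Browse, search, filter contacts" yields three cases ("browse contacts", "search contacts", "filter contacts"), while "Contacts" yields a blocker.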

Job groupings: H3 headings above FAVV table sections group rows by user job. Each heading includes the FeatureIDs that job advances.

Frozen Scope

A PRD's scope is the set of FeatureIDs in its Build Contract at registration time. That set is frozen.

Max 5 distinct FeatureIDs per PRD
Adding a new FeatureID after registration requires a new PRD
The Feature Matrix is the source of truth for which PRD advances which feature
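The frozen-scope rules above can be sketched as two small checks: one at registration time, one whenever the Build Contract is edited later. The function names and the FeatureID values are hypothetical. Only the cap of 5 and the "new FeatureID means new PRD" rule come from this protocol.

```typescript
const MAX_FEATURE_IDS_PER_PRD = 5;

// Deduplicate while preserving first-seen order.
function distinctIds(ids: string[]): string[] {
  return ids.filter((id, index) => ids.indexOf(id) === index);
}

// Capture the scope at registration time; reject PRDs that exceed the cap.
function freezeScope(buildContractIds: string[]): ReadonlyArray<string> {
  const scope = distinctIds(buildContractIds);
  if (scope.length > MAX_FEATURE_IDS_PER_PRD) {
    throw new Error("PRD has " + scope.length + " FeatureIDs; max is " + MAX_FEATURE_IDS_PER_PRD);
  }
  return scope;
}

// FeatureIDs present now but absent from the frozen set require a new PRD.
function idsNeedingNewPrd(frozen: ReadonlyArray<string>, currentIds: string[]): string[] {
  return distinctIds(currentIds).filter((id) => frozen.indexOf(id) === -1);
}

// The IDs below are made up for illustration.
const frozenScope = freezeScope(["CRM-001", "CRM-002", "CRM-001"]);
console.log(idsNeedingNewPrd(frozenScope, ["CRM-001", "CRM-003"]));
```

Here the frozen scope holds two distinct IDs, and `CRM-003` is reported as needing its own PRD.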

Parser Detection

Header contains...	Format	Job source
`Safety Test`	FAVV v2.1	H3 heading above table
`Verification`	FAVV v2.0	H3 heading above table
`Feature`	FFO (legacy)	Job column in table

Search order: spec/index.md first, then spec/protocols/index.md fallback. New PRDs use FAVV v2.1. Old PRDs migrate when touched.
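The detection table can be read as an ordered rule list. The sketch below assumes the rules are tried top to bottom and matched against whole header cells, so that a `FeatureID` column does not accidentally trigger the legacy `Feature` rule. Both that precedence and the `detectSpecFormat` name are assumptions. The document only states which header keyword maps to which format.

```typescript
interface FormatDetection {
  format: "FAVV v2.1" | "FAVV v2.0" | "FFO (legacy)";
  jobSource: "H3 heading above table" | "Job column in table";
}

// Assumption: rules are tried in table order and matched against whole header cells.
function detectSpecFormat(headerCells: string[]): FormatDetection | null {
  const cells = headerCells.map((c) => c.trim());
  const has = (column: string): boolean => cells.indexOf(column) !== -1;
  if (has("Safety Test")) return { format: "FAVV v2.1", jobSource: "H3 heading above table" };
  if (has("Verification")) return { format: "FAVV v2.0", jobSource: "H3 heading above table" };
  if (has("Feature")) return { format: "FFO (legacy)", jobSource: "Job column in table" };
  return null; // Unknown header: no format detected.
}

// Header taken from the Build Contract column list.
const favv21Header = [
  "FeatureID", "Function", "Artifact", "Success Test",
  "Safety Test", "Regression Test", "Value", "State",
];
console.log(detectSpecFormat(favv21Header));

// Hypothetical legacy header; only the "Feature" and "Job" columns are implied by the table.
console.log(detectSpecFormat(["Job", "Feature", "Outcome"]));
```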

Bookend Gates

Every plan has two bookends. They are not optional.

Bookend	When	Outcome
Start: spec-to-tests	Before phase-1	Every FAVV row has a test that fails
End: JTBD validation	After last implementation phase	Every spec passes

The bookends ARE the definition of done. The start bookend converts stories into machine-verifiable tests (RED). The end bookend proves they pass (GREEN). Everything between is engineering's domain.
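As a rough illustration, both bookends reduce to a predicate over the test status of each FAVV row. The types and function names below are hypothetical, and the sketch reads "a test that fails" literally: a row whose test is missing, or already passing, does not satisfy the start bookend.

```typescript
type FavvTestStatus = "RED" | "GREEN" | "MISSING";

interface FavvRowStatus {
  featureId: string;
  status: FavvTestStatus;
}

// Start bookend: every FAVV row has a test, and that test currently fails.
function startBookendMet(rows: FavvRowStatus[]): boolean {
  return rows.length > 0 && rows.every((r) => r.status === "RED");
}

// End bookend: every spec passes.
function endBookendMet(rows: FavvRowStatus[]): boolean {
  return rows.length > 0 && rows.every((r) => r.status === "GREEN");
}

// FeatureIDs that keep a bookend from being met.
function rowsBlocking(rows: FavvRowStatus[], required: "RED" | "GREEN"): string[] {
  return rows.filter((r) => r.status !== required).map((r) => r.featureId);
}

// Made-up rows for illustration.
const beforePhaseOne: FavvRowStatus[] = [
  { featureId: "CRM-001", status: "RED" },
  { featureId: "CRM-002", status: "MISSING" },
];
console.log(startBookendMet(beforePhaseOne)); // false: CRM-002 has no test yet
console.log(rowsBlocking(beforePhaseOne, "RED")); // ["CRM-002"]
```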

Plans reference the spec via a path pointer to spec/index.md. The plan task generator converts Build Contract rows into plan tasks, which are then merged with template tasks so PRD-specific work is explicit.

Fitness Gate

Between spec parsing (step 4) and test writing (step 5), engineering runs an architecture fitness check. This gate can bounce a spec back to the Dream Team before any building starts.

Gate Checks

Check	Question	Bounce if...
Capability overlap	Does an existing module already deliver this?	Feature-matrix shows another PRD at L2+ for the same FeatureIDs
Hex boundary	Does the spec cross hexagonal layer boundaries?	A single FAVV row mixes domain logic with infrastructure or presentation
Generator coverage	Does a generator exist for the pattern this spec describes?	Spec asks for hand-coded CRUD when a generator produces it correctly
Sibling collision	Does an in-flight plan already scaffold the same artifact?	Active project list shows overlapping components, routes, or server actions
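Of the four checks, capability overlap is the most mechanical. A minimal sketch follows, assuming the feature matrix can be read as a list of (FeatureID, PRD, level) entries with levels 0 to 4. That shape, the PRD names, and the `findCapabilityOverlap` function are assumptions for illustration.

```typescript
// Hypothetical feature-matrix entry; level is the numeric part of L0-L4.
interface FeatureMatrixEntry {
  featureId: string;
  prd: string;
  level: number;
}

// Bounce condition: another PRD is already at L2+ for one of the incoming FeatureIDs.
function findCapabilityOverlap(
  matrix: FeatureMatrixEntry[],
  incomingPrd: string,
  incomingFeatureIds: string[],
): FeatureMatrixEntry[] {
  return matrix.filter(
    (entry) =>
      entry.prd !== incomingPrd &&
      entry.level >= 2 &&
      incomingFeatureIds.indexOf(entry.featureId) !== -1,
  );
}

// Made-up matrix for illustration.
const sampleMatrix: FeatureMatrixEntry[] = [
  { featureId: "CRM-001", prd: "prd-contacts", level: 3 },
  { featureId: "CRM-002", prd: "prd-contacts", level: 1 },
  { featureId: "CRM-003", prd: "prd-invites", level: 2 },
];

const overlaps = findCapabilityOverlap(sampleMatrix, "prd-new-crm", ["CRM-001", "CRM-002"]);
console.log(overlaps.map((o) => o.featureId + " already at L" + o.level + " via " + o.prd));
```

Only `CRM-001` is reported: `CRM-002` is below L2, and `CRM-003` is not in the incoming scope.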

Dream Pre-Check

Before signaling handoff, verify:

Read Feature Matrix — confirm no other PRD at L2+ advances the same FeatureIDs
Read the Build Contract Function column — confirm each row stays within one hexagonal layer
Check if the pattern has an existing generator (entity CRUD, e2e test, UI component) — if so, the spec should reference it, not describe the pattern from scratch

Bounce Protocol

A bounce is not a rejection. It is a return signal. Three pre-build gates can trigger one:

Gate	Fires when...	Signal
Fitness check	Spec overlaps existing capability or crosses hex boundaries	`spec-bounce` — which check failed and what to change
Missing stories	No Story Contract → no Safety Tests → negative testing is guesswork	`spec-incomplete` — which contracts are missing
Noun-only functions	Function column has nouns ("Contacts") instead of verbs ("Browse, search, filter contacts")	`spec-incomplete` — parser cannot derive tests

Bounce Response

Read the bounce message — it names the specific failure
Update spec/index.md to address the gap
Re-run the pre-handoff checklist (see Create PRD Stories checklist)
Re-signal handoff via comms to #meta

If the bounce changes scope (new FeatureIDs needed), create a new PRD. Original PRD scope stays frozen.

Post-build commissioning signals are documented separately, under the return signal in Create PRD Stories.

Enforcement Tiers

Push enforcement up. Every decision an agent makes is a chance to get it wrong.

Tier	Mechanism	Guarantee	Failure Mode
1	Generators	Code IS correct by construction	None — deterministic
2	Plan Templates	Phase order + best practice reminders	Agent skips bookend
3	Rules	Architecture constraints always in context	Agent ignores under load
4	Skills	Procedural memory for complex workflows	Agent forgets to invoke
5	Agent Memory	Domain knowledge, judgment calls	Drift, hallucination, forgetting

The best code is code you don't write. Generators produce correct code by construction. Plan templates remind engineers to reuse helper functions, shared components, and library patterns before writing new code. Spend Tier 5 tokens on edge cases, not boilerplate.

Plan Templates

Plans have two jobs: define the task sequence AND encode best practices. Each template carries institutional memory — which generators to run, which helper libraries to use, which shared components exist. The plan reminds the agent what to reuse so it doesn't reinvent what the factory already built.

When engineering receives a spec, template selection depends on the work:

PRD Contains	Template	What It Does
New data entities	`data-crud-flow`	Schema, repos, actions, verified CRUD
Existing entity bugs	`corrective-crud-action`	Trace root cause through layers, fix, harden
Failing E2E test	`test-failure-investigation`	Walk the pipe, identify correct layer, fix at source
UI verification needed	`e2e-intent-validation`	Playwright specs proving user journeys work
Issues to validate	`issue-validation-sweep`	Batch validation across entities
Entity at L1 needing L3	`entity-commissioning`	Schema exists, commission through to CRUD

Trophy Layer Selection: Every build template must include L1 → L2 → L3 cascade phases. Walk the pipe before writing any L3 test — call the server action directly. If it fails, the fix is at L2, not L3. The E2E Admission Gate prevents wasted E2E effort.

Gate: If the PRD has no Story Contract, engineering flags this before starting. Stories are the source for Safety Tests — without them, negative testing is guesswork.

Failure Modes

Failure	Symptom	Fix
No Story Contract	Infrastructure built, admin/user flows missed	Write stories before FAVV rows
Function column uses nouns	Parser cannot generate test cases	Use verbs: "Admin invites user" not "Invitations"
No Forbidden Outcomes	Safety Test column empty	Every story gets at least one
Stories miss a role	Admin governance UI never built	One story per role per critical flow
Tier checklist without spec	Implementation tasks, not acceptance criteria	Spec rows are the contract, tiers are build order hints
FeatureID missing	Row not tracked in feature-matrix	Every FAVV row links to a RaaS catalog ID
Template framing too loose	Plan tasks vague, agent builds wrong thing	Template needs tighter phase gates or generator references

SPEC-MAP

The shared traceability artifact. Both sides write to it. Neither side can claim done with empty cells.

A SPEC-MAP lives in the engineering repo's E2E domain folder. It has one row per Story Contract row.

Column	Written By	When
Story #	Dream	At handoff
WHEN/THEN	Dream	At handoff
Test Layer	Engineering	At spec-to-tests bookend (L1/L2/L3)
Test File	Engineering	At spec-to-tests bookend
Test Status	Engineering	At JTBD validation bookend (RED/GREEN)
L-Level	Dream	At commissioning
Last Verified	Dream	At commissioning

Test Layer selection: L1 for schema/validation logic, L2 for server action wiring (most stories land here), L3 only when the browser is the proof. Apply the E2E Admission Gate — if you can prove the assertion by calling the server action directly, it's L2, not L3.
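The selection rule reads as a short decision function. The two boolean inputs below are a simplifying assumption about what engineering knows about a story at the start bookend. They are not fields defined by this protocol.

```typescript
type TestLayer = "L1" | "L2" | "L3";

// Hypothetical answers engineering gives for one story.
interface StoryProof {
  schemaOrValidationOnly: boolean; // assertion is pure schema/validation logic
  provableByServerAction: boolean; // assertion can be proven by calling the server action directly
}

function selectTestLayer(proof: StoryProof): TestLayer {
  if (proof.schemaOrValidationOnly) return "L1";
  // E2E Admission Gate: if the server action alone proves it, stay at L2.
  if (proof.provableByServerAction) return "L2";
  // Only when the browser is the proof.
  return "L3";
}

console.log(selectTestLayer({ schemaOrValidationOnly: false, provableByServerAction: true })); // "L2"
```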

SPEC-MAP Catches

Gap	Without SPEC-MAP	With SPEC-MAP
Feature works but has no test	Commissioning scores L4, CI has no regression protection	Empty Test File cell — visible before commissioning starts
Engineering changes Screen Contract state	Dream's spec drifts from reality	Engineering updates WHEN/THEN when behavior changes, Dream sees the delta
Story Contract row is untestable	Engineering silently skips it	Test File cell stays empty — BLOCKER surfaces at validation bookend
Commissioning misses a regression	Next deploy breaks a feature nobody re-checked	Last Verified column shows stale dates — triggers re-commission

SPEC-MAP Rules

Every Story Contract row from spec/index.md must appear as a SPEC-MAP row
Engineering fills Test Layer (L1/L2/L3) and Test File during the start bookend — if a story can't map to a test, that's a spec-bounce, not a skip
Test Status updates automatically from CI (GREEN/RED)
Dream fills L-Level and Last Verified during commissioning
A capability cannot reach L3 with any empty Test Layer cells in its SPEC-MAP
A capability cannot reach L4 with any empty cells in its SPEC-MAP
When engineering changes behavior that affects WHEN/THEN, engineering updates the SPEC-MAP row AND posts a spec-delta to #meta
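The two level-gating rules can be checked directly against the rows. This sketch assumes a SPEC-MAP row is held as seven string cells, with an empty string meaning an empty cell. The `SpecMapRow` shape, the file path, and the helper names are illustrative, not an existing tool.

```typescript
// Hypothetical in-memory form of one SPEC-MAP row; one field per column.
interface SpecMapRow {
  story: string;
  whenThen: string;
  testLayer: string;
  testFile: string;
  testStatus: string;
  lLevel: string;
  lastVerified: string;
}

const SPEC_MAP_COLUMNS: (keyof SpecMapRow)[] = [
  "story", "whenThen", "testLayer", "testFile", "testStatus", "lLevel", "lastVerified",
];

// Names of the columns that are still empty in a row.
function emptyColumns(row: SpecMapRow): string[] {
  return SPEC_MAP_COLUMNS.filter((column) => row[column].trim() === "");
}

// L3 requires every Test Layer cell to be filled.
function canReachL3(rows: SpecMapRow[]): boolean {
  return rows.every((row) => row.testLayer.trim() !== "");
}

// L4 requires zero empty cells anywhere in the SPEC-MAP.
function canReachL4(rows: SpecMapRow[]): boolean {
  return rows.every((row) => emptyColumns(row).length === 0);
}

// Made-up row: engineering has filled its cells, commissioning has not happened yet.
const builtRow: SpecMapRow = {
  story: "1",
  whenThen: "WHEN admin toggles a permission THEN change persists",
  testLayer: "L2",
  testFile: "permissions.spec.ts",
  testStatus: "GREEN",
  lLevel: "",
  lastVerified: "",
};

console.log(canReachL3([builtRow])); // true
console.log(canReachL4([builtRow])); // false
console.log(emptyColumns(builtRow)); // ["lLevel", "lastVerified"]
```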

Reverse Signal

When engineering changes implementation in ways that affect the spec:

Engineering updates the SPEC-MAP row with the new behavior
Engineering posts spec-delta via comms to #meta
Message includes: which story row changed, old behavior, new behavior, why
Dream Team updates spec/index.md Story Contract to match reality
Dream re-commissions the affected rows

Without this, specs fossilize. The Screen Contract says "loading skeleton appears" but the component now renders immediately. The spec is wrong. Nobody updates it. The next commissioner reads a spec that describes a product that no longer exists.
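A `spec-delta` message has four required parts: which story row changed, old behavior, new behavior, and why. The sketch below shows one possible way to enforce that before posting. The message layout, the `formatSpecDelta` name, and the example row number and reason are assumptions. The protocol only fixes the four fields.

```typescript
interface SpecDelta {
  storyRow: string;
  oldBehavior: string;
  newBehavior: string;
  why: string;
}

// Build the message body, refusing to produce one if any required part is blank.
function formatSpecDelta(delta: SpecDelta): string {
  const parts = [
    { label: "story row", value: delta.storyRow },
    { label: "old behavior", value: delta.oldBehavior },
    { label: "new behavior", value: delta.newBehavior },
    { label: "why", value: delta.why },
  ];
  const missing = parts.filter((p) => p.value.trim() === "").map((p) => p.label);
  if (missing.length > 0) {
    throw new Error("spec-delta is missing: " + missing.join(", "));
  }
  return ["spec-delta"].concat(parts.map((p) => p.label + ": " + p.value)).join("\n");
}

// Based on the loading-skeleton example; the row number and reason are made up.
const skeletonDelta: SpecDelta = {
  storyRow: "Story 3",
  oldBehavior: "loading skeleton appears",
  newBehavior: "component renders immediately",
  why: "data is available at first render",
};

console.log(formatSpecDelta(skeletonDelta));
```

Rejecting a delta with a blank "why" keeps the Dream Team from having to guess whether the new behavior was intended.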

Full Sequence

Dream Team                          Engineering
    |                                    |
    |  1. Write spec/index.md            |
    |     (Intent + Stories + FAVV)       |
    |                                    |
    |  2. Populate SPEC-MAP columns 1-2  |
    |     (Story #, WHEN/THEN)           |
    |                                    |
    |  3. Signal handoff via comms       |
    |     to #meta                       |
    |                                    |
    |                               4. Parse spec
    |                                  (spec reference set on project)
    |                                    |
    |                               4a. Fitness check
    |                                   (overlap, hex boundary,
    |                                    generator coverage,
    |                                    sibling collision)
    |                                    |
    |  4b. BOUNCE (if fitness fails) <---|
    |      spec-bounce via comms         |
    |      Update spec, re-signal        |
    |                                    |
    |                               5. Spec-to-tests bookend
    |                                  Populate SPEC-MAP columns 3-4
    |                                  (Test Layer, Test File)
    |                                  Failing specs (RED)
    |                                    |
    |                               6. Build: RED → GREEN
    |                                  (plans reference spec,
    |                                   reuse generators +
    |                                   helpers from templates)
    |                                    |
    |                               7. JTBD validation bookend
    |                                  All specs GREEN
    |                                  Update SPEC-MAP column 5
    |                                  (Test Status = GREEN)
    |                                    |
    |  8. commissioning-update      <----|
    |     posted to #meta                |
    |     (level, entities, evidence)    |
    |                                    |
    |  9. Commission via browser         |
    |     Read SPEC-MAP — verify         |
    |     zero empty cells               |
    |     (dream team validates L4)      |
    |     Update SPEC-MAP columns 6-7    |
    |     (L-Level, Last Verified)       |
    |                                    |
    | 10. Update feature-matrix          |
    |     (advance L-level)              |
    |                                    |
    |     --- REVERSE SIGNAL ---         |
    |                                    |
    |                              11. Engineering changes behavior
    |                                  Update SPEC-MAP WHEN/THEN
    |                                  Post spec-delta to #meta
    |                                    |
    | 12. Dream updates spec/index.md <--|
    |     Re-commissions affected rows   |

Field Lessons

Patterns discovered during builds. Not rules — observations that survived contact with reality.

Multi-Surface Builds

When a Build Contract touches multiple surfaces (CLI, web, MCP), the per-surface implementations diverge unless they share a single application layer.

Pattern	What Happened	Story Nudge
Factory-first	A use-case factory unified four surfaces. Without it, CLI and server actions diverged silently	If Build Contract rows touch 2+ surfaces, add FORBIDDEN: "implementation diverges between surfaces"
Schema-first	Defining validation schemas before handlers unlocked describe, MCP, and validation in one pass	If Function column includes "parse", "validate", or "accept input", name the schema as the artifact
Delegation	Server actions called repositories directly, bypassing the use case layer. Two codepaths, invisible drift	Every "change persists" story should include FORBIDDEN: "mutation bypasses application layer"

Build Discipline

Pattern	What Happened	Story Nudge
Bookend skipped	Start bookend (failing tests) was treated as optional under time pressure. Tests written after implementation missed design errors	RED tests are the definition of "ready to build." If they don't exist, building hasn't started
Empty SPEC-MAP	Features shipped with empty Test File cells. Commissioning passed on visual inspection alone	Empty cells are blockers. A feature without a test file is a feature without evidence
Orphaned composition	A composition file existed for weeks with zero importers. Nobody noticed	If a Build Contract row creates shared infrastructure, a story should verify something consumes it

Protocol Reminders

Five nudges the protocol expects engineering to surface at key moments. Dream defines WHAT the reminder says and WHEN it appears. Engineering decides HOW.

These are nudges, not blocks. The goal is a friendly prompt in the right direction.

Nudge	When	Reminder
SPEC-MAP completeness	Before PR merge	"N empty Test File cells. Stories without tests are wishes, not contracts."
Reverse signal	Test assertion changes	"SPEC-MAP WHEN/THEN still match? If not, post spec-delta to #meta."
Bookend start	Plan first task begins	"Every FAVV row needs a failing test before Phase 1. Run spec-to-tests."
Factory pattern	Build rows touch 2+ surfaces	"Multiple surfaces for one capability. Consider a use-case factory to prevent divergence."
Counterfeit bridge	Writing Safety Tests	"Safety Test should match Story Contract counterfeit verbatim. If it doesn't, the bridge is broken."

Reminder Tiers

Engineering can implement reminders at any of the tiers below. In this table, higher-numbered tiers catch more.

Tier	Mechanism	Example
2	Plan template	Task description includes the reminder text
3	Rule file	Always-loaded context nudges the agent
4	Skill step	Procedural gate checks the condition
5	Hook	Automated check on commit or PR

The best reminder is unnecessary because the generator produces correct code by construction (Tier 1). Until then, Tier 2-5 reminders reduce the chance of silent drift.

Context

PRD Template — Reference implementation of FAVV v2.1 spec
Feature Matrix — Commissioning status for all platform features
Commissioning Protocol — L0-L4 maturity model
PRD Creation — How to spec capabilities, including Return Signals
Priorities — Active PRD table (build order)
Credibility — Three loops: inner (engineering), story (predictions), market (external validation). Market is the greatest force
Development Pipeline — The full PAIN → COMMISSION chain that the SPEC-MAP traces
Protocols — This IS a coordination protocol between two repos
Verifiable Intent — SPEC-MAP is a non-cryptographic VI chain: spec=L2 intent, build=L3 action, commission=verification
Meetings — Handoff is a Decision meeting between Dream and Engineering

Questions

When a Story Contract is missing, does engineering build the wrong thing or build the right thing without safety tests?

What's the cost of a Forbidden Outcome the story writer never imagined vs one they wrote but engineering skipped?
If the parser can generate tests from FAVV rows automatically, what role does human judgment play in test design?
When should engineering push back on a PRD vs start building and flag gaps as they find them?
If a SPEC-MAP has zero empty cells but the market loop (Loop 3) returns no signal, is the problem the spec or the product?
When engineering posts a spec-delta that changes a Story Contract's WHEN/THEN, who decides whether the new behavior is better — Dream or Engineering?
When a field lesson contradicts the original protocol, which wins — the lesson or the rule it amends?
