PRD Handoff Protocol

How does a specification become a plan that becomes working software?

The gap between "what to build" and "how to build it" is where most projects fail. This protocol defines the contract: what the dream team writes, what engineering reads, and how the parser bridges them.

The Contract

| Team | Owns | Bible | Format |
| --- | --- | --- | --- |
| Dream Team | What and Why | template.json (PRD creation workflow) | spec/index.md (FAVV v2.1) |
| Engineering | How | Template Plans (task sequence + reuse) | plan.json with prdRef back to spec |

Each side has one authoritative resource. Dream team follows template.json to produce specs. Engineering follows template plans to produce implementations. The spec is the shared contract — engineering never builds from Slack messages, verbal agreements, or tier checklists without stories.

Ownership Boundary

The dream team writes stories that state what must be true — not how to make it true. "Admin can toggle permissions per entity per role, change persists, unauthorized user denied" is a truth test. Engineering decides whether that's a server action, an API endpoint, or a database trigger.

| Dream Team Decides | Engineering Decides |
| --- | --- |
| User stories and acceptance criteria | Implementation architecture |
| Forbidden outcomes (safety tests) | Which layer handles enforcement |
| FeatureIDs and scope | Template selection and plan structure |
| Kill date and priority order | Sprint sequencing and task dependencies |
| What "done" looks like | How to get there |

Non-negotiable from both sides: Nx monorepo structure, hexagonal layering, generators for boilerplate, tests before implementation. These are the substrate that makes continuous improvement possible — not implementation choices.

PRD Structure

prd-{name}/
  index.md        <- Decision surface (10s read)
  pictures/       <- Pre-flight maps (thinking instruments)
  prompt-deck/    <- Sales compression (2min read)
  spec/           <- Engineering spec (30min read)
    index.md      <- Intent Contract + Story Contract + Build Contract

Spec Anatomy

The project-from-prd parser reads three contracts from spec/index.md:

Intent Contract

Nine dimensions define the agent's autonomy boundaries. They are not parsed into tasks; engineering uses them for judgment when instructions run out.

| Dimension | Engineering Gets |
| --- | --- |
| Objective | Problem context for judgment calls |
| Outcomes | Observable state changes to verify |
| Health Metrics | What must NOT degrade (Goodhart guard) |
| Constraints | Hard stops vs steering guidance |
| Autonomy | Allowed / Escalate / Never boundaries |
| Stop Rules | When to stop building, when to halt |
| Counter-metrics | Numbers that must stay stable while optimizing |
| Blast Radius | What systems, data, or users this touches |
| Rollback | How to undo if things go wrong |

Story Contract

Stories convert user intent into testable scenarios. Each becomes a Given/When/Then scenario.

| Column | Engineering Reads As |
| --- | --- |
| Intention | User-facing state change: what the user experiences, not what the code does |
| Trigger | Observable event; becomes the test's "Given/When" |
| Observable Success | Binary or thresholded; verifiable without the builder present |
| Forbidden Outcome | What must NOT happen; feeds Safety Test in Build Contract |
| Evidence Type | unit / integration / e2e / eval / replay / monitor |
| Escalation | When the agent must stop and ask a human |

Every story must have at least one Forbidden Outcome. This is the counterfeit progress detector — it catches work that looks done but isn't safe.

Story rows map many-to-many to Build Contract rows. One story may need multiple features, and one feature may serve multiple stories.
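The "at least one Forbidden Outcome" rule is easy to check mechanically. The following is a minimal sketch, assuming a hypothetical `StoryRow` shape whose fields mirror the Story Contract columns; the type, the function name, and the example rows are illustrative and are not taken from the actual parser.

```typescript
// Hypothetical row shape; field names mirror the Story Contract columns above.
type EvidenceType = "unit" | "integration" | "e2e" | "eval" | "replay" | "monitor";

interface StoryRow {
  intention: string;
  trigger: string;
  observableSuccess: string;
  forbiddenOutcomes: string[];
  evidenceType: EvidenceType;
  escalation: string;
}

// Returns the intentions of stories that have no non-empty Forbidden Outcome.
function storiesMissingForbiddenOutcome(stories: StoryRow[]): string[] {
  return stories
    .filter((s) => s.forbiddenOutcomes.filter((o) => o.trim() !== "").length === 0)
    .map((s) => s.intention);
}

// Illustrative data only, loosely based on the permissions example in this document.
const exampleStories: StoryRow[] = [
  {
    intention: "Admin toggles permissions per entity per role",
    trigger: "Admin saves a permission change",
    observableSuccess: "Change persists after reload",
    forbiddenOutcomes: ["Unauthorized user changes a permission"],
    evidenceType: "e2e",
    escalation: "Permission model is ambiguous",
  },
  {
    intention: "User browses contacts",
    trigger: "User opens the contacts list",
    observableSuccess: "Contacts are listed",
    forbiddenOutcomes: [],
    evidenceType: "e2e",
    escalation: "None stated",
  },
];

// Holds ["User browses contacts"]: the second story would be flagged before build.
const flaggedStories = storiesMissingForbiddenOutcome(exampleStories);
```

Running a check like this before FAVV rows are written keeps the Safety Test column from being filled in by guesswork later.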

Build Contract (FAVV v2.1)

The deliverable. Every row has an acceptance test.

| Column | Engineering Reads As |
| --- | --- |
| FeatureID | Links to feature-matrix row; the RaaS catalog ID |
| Function | Feature name + behavior. Verbs generate test cases. "Browse, search, filter contacts" = 3 tests |
| Artifact | Concrete deliverable: TypeScript client, PostgreSQL migration, React component |
| Success Test | Happy-path acceptance criteria with thresholds; this IS the e2e test spec |
| Safety Test | What must NOT happen; populated from Story Contract Forbidden Outcomes |
| Regression Test | What existing capability must NOT degrade; names the specific capability + threshold |
| Value | Business outcome in one sentence |
| State | Enum: Live, Built, Dormant, Partial, Not verified, Gap, Stub, Broken |

Function column rule: Use verbs. "Browse, search, filter contacts" generates 3 test cases. "Contacts" generates a BLOCKER — the parser cannot derive tests from nouns.
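The verb rule can be illustrated with a small sketch. This document does not describe how the real parser recognizes verbs, so the example assumes a hypothetical allowlist (`KNOWN_VERBS`) and a hypothetical `parseFunctionColumn` helper; only the two input strings come from the rule above.

```typescript
// Hypothetical allowlist; the real parser's verb detection is not specified here.
const KNOWN_VERBS: string[] = [
  "browse", "search", "filter", "create", "update", "delete", "invite", "toggle",
];

type FunctionParse =
  | { kind: "tests"; verbs: string[] }
  | { kind: "blocker"; reason: string };

// One test case per recognized verb; no verbs at all is reported as a BLOCKER.
function parseFunctionColumn(text: string): FunctionParse {
  const words = text
    .toLowerCase()
    .split(/[^a-z]+/)
    .filter((w) => w.length > 0);
  const verbs = words.filter((w) => KNOWN_VERBS.indexOf(w) !== -1);
  if (verbs.length === 0) {
    return { kind: "blocker", reason: 'No verbs found in "' + text + '"' };
  }
  return { kind: "tests", verbs: verbs };
}

// Three verbs (browse, search, filter), so three test cases.
const verbResult = parseFunctionColumn("Browse, search, filter contacts");

// A bare noun yields a blocker, because there is nothing to derive a test from.
const nounResult = parseFunctionColumn("Contacts");
```

An allowlist is the simplest stand-in for verb detection; a real implementation would need to handle inflected forms and verbs outside the list.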

Job groupings: H3 headings above FAVV table sections group rows by user job. Each heading includes the FeatureIDs that job advances.

Frozen Scope

A PRD's scope is the set of FeatureIDs in its Build Contract at registration time. That set is frozen.

  • Max 5 distinct FeatureIDs per PRD
  • Adding a new FeatureID after registration requires a new PRD
  • The Feature Matrix is the source of truth for which PRD advances which feature
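The frozen-scope rules above can be sketched as a check that compares the FeatureIDs recorded at registration with the ones currently in the Build Contract. This is a minimal sketch under stated assumptions: `checkFrozenScope` is a hypothetical helper, and the `F-1` style IDs are placeholders because the real FeatureID format is not shown in this document.

```typescript
const MAX_FEATURE_IDS_PER_PRD = 5;

// Keeps the first occurrence of each ID, preserving order.
function distinctIds(ids: string[]): string[] {
  return ids.filter((id, i) => ids.indexOf(id) === i);
}

// Returns a list of scope violations; an empty list means the scope is intact.
function checkFrozenScope(registeredIds: string[], currentIds: string[]): string[] {
  const problems: string[] = [];
  const frozen = distinctIds(registeredIds);
  if (frozen.length > MAX_FEATURE_IDS_PER_PRD) {
    problems.push(
      "PRD registers " + frozen.length + " FeatureIDs; the limit is " + MAX_FEATURE_IDS_PER_PRD
    );
  }
  for (const id of distinctIds(currentIds)) {
    if (frozen.indexOf(id) === -1) {
      problems.push(id + " was added after registration and needs a new PRD");
    }
  }
  return problems;
}

// Placeholder IDs: "F-3" appears after registration, so it is reported.
const scopeProblems = checkFrozenScope(["F-1", "F-2"], ["F-1", "F-2", "F-3"]);
```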

Parser Detection

| Header contains... | Format | Job source |
| --- | --- | --- |
| Safety Test | FAVV v2.1 | H3 heading above table |
| Verification | FAVV v2.0 | H3 heading above table |
| Feature | FFO (legacy) | Job column in table |

Search order: spec/index.md first, then spec/protocols/index.md as a fallback. New PRDs use FAVV v2.1. Old PRDs migrate when touched.
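The detection table can be read as an ordered series of header checks. The sketch below assumes detection is a substring match on header cells, which this document does not confirm; `detectFormat` and `specCandidates` are hypothetical names, and the FAVV v2.0 header used in the example is invented for illustration.

```typescript
type SpecFormat = "FAVV v2.1" | "FAVV v2.0" | "FFO (legacy)" | "unknown";

// Checks run from most specific to least specific. Under a substring match,
// "Feature" also matches "FeatureID", so it has to be tested last.
function detectFormat(headerCells: string[]): SpecFormat {
  const has = (needle: string): boolean =>
    headerCells.some((cell) => cell.indexOf(needle) !== -1);
  if (has("Safety Test")) return "FAVV v2.1";
  if (has("Verification")) return "FAVV v2.0";
  if (has("Feature")) return "FFO (legacy)";
  return "unknown";
}

// Candidate spec files in search order for a PRD directory.
function specCandidates(prdDir: string): string[] {
  return [prdDir + "/spec/index.md", prdDir + "/spec/protocols/index.md"];
}

// The Build Contract columns listed in this document resolve to FAVV v2.1.
const detected = detectFormat([
  "FeatureID", "Function", "Artifact", "Success Test",
  "Safety Test", "Regression Test", "Value", "State",
]);
```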

Bookend Gates

Every plan has two bookends. They are not optional.

| Bookend | When | Outcome |
| --- | --- | --- |
| start-prd-to-specs | Before phase-1 | Every FAVV row has a test that fails |
| end-jtbd-validation | After last implementation phase | Every spec passes |

The bookends ARE the definition of done. The start bookend converts stories into machine-verifiable tests (RED). The end bookend proves they pass (GREEN). Everything between is engineering's domain.

Plans with a prdRef must populate prdRef.path so that it points to spec/index.md. The ffo-bridge.ts plan-tasks tool converts Build Contract rows into plan tasks; merge these with the template tasks so PRD-specific work is explicit.
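A plan linter for the bookend rule might look like the sketch below. The real plan.json schema is not shown in this document, so `PlanSketch` is a hypothetical, reduced shape (an ordered list of phase IDs plus an optional prdRef), and `checkBookends` is an illustrative helper rather than part of ffo-bridge.ts.

```typescript
// Hypothetical, reduced plan shape; the real plan.json schema is not shown here.
interface PlanSketch {
  prdRef?: { path?: string };
  phases: string[]; // ordered phase IDs
}

const START_BOOKEND = "start-prd-to-specs";
const END_BOOKEND = "end-jtbd-validation";

// Returns a list of violations; an empty list means both bookends are in place.
function checkBookends(plan: PlanSketch): string[] {
  const problems: string[] = [];
  if (plan.phases[0] !== START_BOOKEND) {
    problems.push("first phase must be " + START_BOOKEND);
  }
  if (plan.phases[plan.phases.length - 1] !== END_BOOKEND) {
    problems.push("last phase must be " + END_BOOKEND);
  }
  if (plan.prdRef !== undefined) {
    const refPath = plan.prdRef.path;
    if (refPath === undefined || !/(^|\/)spec\/index\.md$/.test(refPath)) {
      problems.push("prdRef.path must point to spec/index.md");
    }
  }
  return problems;
}

// Illustrative plan: both bookends present and prdRef.path populated, so no problems.
const bookendProblems = checkBookends({
  prdRef: { path: "prd-example/spec/index.md" },
  phases: [START_BOOKEND, "phase-1", "phase-2", END_BOOKEND],
});
```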

Enforcement Hierarchy

Push enforcement up. Every decision an agent makes is a chance to get it wrong.

| Tier | Mechanism | Guarantee | Failure Mode |
| --- | --- | --- | --- |
| 1 | Nx Generators | Code IS correct by construction | None (deterministic) |
| 2 | Plan Templates | Phase order + best practice reminders | Agent skips bookend |
| 3 | Rules | Architecture constraints always in context | Agent ignores under load |
| 4 | Skills | Procedural memory for complex workflows | Agent forgets to invoke |
| 5 | Agent Memory | Domain knowledge, judgment calls | Drift, hallucination, forgetting |

The best code is code you don't write. Generators produce correct code by construction. Plan templates remind engineers to reuse helper functions, shared components, and library patterns before writing new code. Spend Tier 5 tokens on edge cases, not boilerplate.

Plan Templates

Plans have two jobs: define the task sequence AND encode best practices. Each template carries institutional memory — which generators to run, which helper libraries to use, which shared components exist. The plan reminds the agent what to reuse so it doesn't reinvent what the factory already built.

When engineering receives a spec, template selection depends on the work:

| PRD Contains | Template | What It Does |
| --- | --- | --- |
| New data entities | data-crud-flow | Schema, repos, actions, verified CRUD |
| Existing entity bugs | corrective-crud-action | Trace root cause through layers, fix, harden |
| UI verification needed | e2e-intent-validation | Playwright specs proving user journeys work |
| Issues to validate | issue-validation-sweep | Batch validation across entities |
| Entity at L1 needing L3 | entity-commissioning | Schema exists, commission through to CRUD |

Gate: If the PRD has no Story Contract, engineering flags this before starting. Stories are the source for Safety Tests — without them, negative testing is guesswork.
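Template selection plus the Story Contract gate can be expressed as a lookup with a flag. This is a sketch only: the `WorkKind` labels and the `selectTemplate` helper are hypothetical, while the template names are taken from the table above.

```typescript
// Hypothetical labels for the "PRD Contains" column.
type WorkKind =
  | "new-data-entities"
  | "existing-entity-bugs"
  | "ui-verification-needed"
  | "issues-to-validate"
  | "entity-l1-needing-l3";

// Template names as listed in the table above.
const TEMPLATE_BY_WORK: Record<WorkKind, string> = {
  "new-data-entities": "data-crud-flow",
  "existing-entity-bugs": "corrective-crud-action",
  "ui-verification-needed": "e2e-intent-validation",
  "issues-to-validate": "issue-validation-sweep",
  "entity-l1-needing-l3": "entity-commissioning",
};

interface TemplateChoice {
  template: string;
  flags: string[]; // issues to raise with the dream team before starting
}

function selectTemplate(work: WorkKind, hasStoryContract: boolean): TemplateChoice {
  const flags: string[] = [];
  if (!hasStoryContract) {
    flags.push("PRD has no Story Contract: Safety Tests have no source");
  }
  return { template: TEMPLATE_BY_WORK[work], flags: flags };
}

// Picks data-crud-flow and carries one flag, because the Story Contract is missing.
const templateChoice = selectTemplate("new-data-entities", false);
```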

What Goes Wrong

| Failure | Symptom | Fix |
| --- | --- | --- |
| No Story Contract | Infrastructure built, admin/user flows missed | Write stories before FAVV rows |
| Function column uses nouns | Parser cannot generate test cases | Use verbs: "Admin invites user" not "Invitations" |
| No Forbidden Outcomes | Safety Test column empty | Every story gets at least one |
| Stories miss a role | Admin governance UI never built | One story per role per critical flow |
| Tier checklist without spec | Implementation tasks, not acceptance criteria | Spec rows are the contract, tiers are build order hints |
| FeatureID missing | Row not tracked in feature-matrix | Every FAVV row links to a RaaS catalog ID |

The Handoff

Dream Team                          Engineering
|                                   |
| 1. Write spec/index.md            |
|    (Intent + Stories + FAVV)      |
|                                   |
| 2. Signal via agent-comms         |
|    --channel=meta                 |
|                                   |
|                                   3. project-from-prd
|                                      reads spec/index.md
|                                      (prdRef set on project)
|                                   |
|                                   4. start-prd-to-specs
|                                      failing specs (RED)
|                                   |
|                                   5. Build: RED → GREEN
|                                      (plans carry prdRef,
|                                      reuse generators +
|                                      helpers from templates)
|                                   |
|                                   6. end-jtbd-validation
|                                      all specs GREEN
|                                   |
| 7. commissioning-update <---------|
|    posted to #dream-team          |
|    (level, entities, evidence)    |
|                                   |
| 8. Commission via browser         |
|    (dream team validates L4)      |
|                                   |
| 9. Update feature-matrix.md       |
|    (advance L-level)              |

Questions

When a Story Contract is missing, does engineering build the wrong thing or build the right thing without safety tests?

  • What's the cost of a Forbidden Outcome the story writer never imagined vs one they wrote but engineering skipped?
  • If the parser can generate tests from FAVV rows automatically, what role does human judgment play in test design?
  • When should engineering push back on a PRD vs start building and flag gaps as they find them?