Flow Engineering
How do you turn a picture into a product?
OUTCOME → VALUE STREAM → DEPENDENCIES → CAPABILITIES → A&ID
   │            │              │              │          │
   ▼            ▼              ▼              ▼          ▼
Contracts   Processes     Sequencing      Readiness Orchestration
The same way factories get built: draw it first. P&IDs (piping and instrumentation diagrams) became steel and concrete. Flow maps become working systems. The drawing IS the engineering.
The Maps
Five maps. Five questions. In sequence. Each produces inputs for the next.
| Map | Question | Produces |
|---|---|---|
| Outcome Map | What does success look like? | Domain contracts, success measures |
| Value Stream Map | Where's the waste? | Use cases, repositories, adapters |
| Dependency Map | What must happen first? | Composition, task ordering |
| Capability Map | What can we do? | Generators, skills, work charts |
| A&ID | How do agents orchestrate? | Agent configs, feedback loops |
4 Key Maps = WHAT to build
A&ID = HOW agents work together to build it
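The sequence above can be sketched as data. This is a minimal illustration, not a prescribed format: the map names, questions, and outputs come from the table, while `FlowMap`, `MAP_SEQUENCE`, and `inputsFor` are hypothetical names, and the sketch assumes that everything produced upstream is available to later maps.

```typescript
// Map names, questions, and outputs are taken from the table above.
// The type and function names are hypothetical.
type MapName = "Outcome" | "Value Stream" | "Dependency" | "Capability" | "A&ID";

interface FlowMap {
  name: MapName;
  question: string;
  produces: string[];
}

// Order matters: each map consumes what the maps before it produced.
const MAP_SEQUENCE: FlowMap[] = [
  { name: "Outcome", question: "What does success look like?", produces: ["domain contracts", "success measures"] },
  { name: "Value Stream", question: "Where's the waste?", produces: ["use cases", "repositories", "adapters"] },
  { name: "Dependency", question: "What must happen first?", produces: ["composition", "task ordering"] },
  { name: "Capability", question: "What can we do?", produces: ["generators", "skills", "work charts"] },
  { name: "A&ID", question: "How do agents orchestrate?", produces: ["agent configs", "feedback loops"] },
];

// Assumption: outputs accumulate, so a map can draw on every upstream output.
function inputsFor(name: MapName): string[] {
  const index = MAP_SEQUENCE.findIndex((m) => m.name === name);
  const inputs: string[] = [];
  for (const map of MAP_SEQUENCE.slice(0, index)) {
    inputs.push(...map.produces);
  }
  return inputs;
}
```

Calling `inputsFor("Outcome")` returns an empty list, which matches the idea that the Outcome Map is where the sequence starts.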
The Capstone
The Agent & Instrument Diagram (A&ID) extends P&ID discipline to AI and crypto systems.
| Element | Role | Domain |
|---|---|---|
| Agents (AG-XXX) | Actors that take action | Claude, humans, DePIN |
| Instruments (QC/VC/FC) | Sensors that measure | Smart contracts, oracles |
| Feedback Loops | Data improving agents | VVFL, tokenomics, governance |
Products Loop
Flow engineering connects to every dimension of product development:
| Dimension | Connection | How Maps Help |
|---|---|---|
| Jobs To Be Done | Outcome Map IS a job analysis | "What does success look like?" = "What job are we hired for?" |
| AI Products | A&ID IS agent orchestration | Define evals, build loops, measure distributions |
| Product Design | Value Stream maps the design audit | Rendering, visual, responsive, interaction — in sequence |
| Software | Capability Map reveals build vs buy | Core capabilities build, generic capabilities buy |
The Outcome Map starts where JTBD starts — what progress is the customer trying to make? The A&ID ends where AI Products begins — how do agents deliver outcomes in a feedback loop?
Maps to Execution
Maps don't produce documentation. They produce the inputs for plan templates and generators.
| Map | Plan Phase | Generator Input | What It Produces |
|---|---|---|---|
| Outcome Map | Explore | Domain contracts | Ports, DTOs, entities, acceptance criteria |
| Value Stream Map | Define Types | Schema definitions | Repository interfaces, test expectations |
| Dependency Map | Write Test Specs | Ordering constraints | Failing tests that define "done" |
| Capability Map | Build | Generator selection | Scaffolded code in correct layer order |
| A&ID | Orchestrate | Agent configs | Plan templates, feedback loops |
Map the flow → Encode as types → Generate test specs → Scaffold implementation → Validate outcomes
     │                │                   │                       │                      │
     ▼                ▼                   ▼                       ▼                      ▼
Exploration    Contracts that    Failing tests         Generators enforce        Did outcomes match
produces       the compiler      define what           correct layer             what exploration
contracts      enforces          "done" means          order automatically       predicted?
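A small sketch of the first three steps in that pipeline, under stated assumptions: the `PhaseDto` shape, the `PhaseRepository` port, and the spec function are all hypothetical stand-ins for whatever an Outcome Map actually yields. The point is only the ordering: the contract exists first, the spec is written against the contract, and an adapter comes last.

```typescript
// Hypothetical contract for an outcome such as "every saved phase can be read back".
interface PhaseDto {
  planId: string;
  phaseSlug: string;
}

// Port: the domain states what it needs; adapters supply it later.
interface PhaseRepository {
  save(phase: PhaseDto): void;
  findByPlan(planId: string): PhaseDto[];
}

// Test spec written against the port, before any real adapter exists.
function phaseIsRetrievableAfterSave(repo: PhaseRepository): boolean {
  repo.save({ planId: "plan-1", phaseSlug: "explore" });
  const found = repo.findByPlan("plan-1");
  return found.length === 1 && found.every((p) => p.phaseSlug === "explore");
}

// An unimplemented adapter: the spec fails against it, which defines "done".
const notYetBuilt: PhaseRepository = {
  save: () => undefined,
  findByPlan: () => [],
};

// Minimal in-memory adapter that makes the spec pass.
class InMemoryPhaseRepository implements PhaseRepository {
  private readonly phases: PhaseDto[] = [];

  save(phase: PhaseDto): void {
    this.phases.push(phase);
  }

  findByPlan(planId: string): PhaseDto[] {
    return this.phases.filter((p) => p.planId === planId);
  }
}
```

The spec returns `false` for `notYetBuilt` and `true` for a fresh `InMemoryPhaseRepository`, so the same function serves as the failing test and the acceptance check.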
Each map iteration improves the generators. The Capability Map tracks which patterns are codified (generator exists) versus manual (hand-coded). When a manual pattern appears twice, it becomes a generator. When a generator exists, using it is mandatory. The map IS the generator improvement tracker.
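The two rules in that paragraph (a second manual occurrence triggers a generator, and an existing generator is mandatory) can be sketched as a small tracker. `CapabilityMap`, `PatternStatus`, and the method names are hypothetical; this is an illustration of the bookkeeping, not the actual tooling.

```typescript
// Hypothetical tracker for the codified-versus-manual distinction.
type PatternStatus = "manual" | "generator-due" | "codified";

class CapabilityMap {
  private readonly manualCounts = new Map<string, number>();
  private readonly generators = new Set<string>();

  // Record one hand-coded occurrence of a pattern.
  recordManualUse(pattern: string): PatternStatus {
    if (this.generators.has(pattern)) {
      // Rule: when a generator exists, using it is mandatory.
      throw new Error(`A generator exists for "${pattern}"; using it is mandatory`);
    }
    const count = (this.manualCounts.get(pattern) ?? 0) + 1;
    this.manualCounts.set(pattern, count);
    return this.status(pattern);
  }

  // Mark a pattern as codified once its generator has been written.
  registerGenerator(pattern: string): void {
    this.generators.add(pattern);
    this.manualCounts.delete(pattern);
  }

  status(pattern: string): PatternStatus {
    if (this.generators.has(pattern)) return "codified";
    // Rule: a manual pattern that appears twice becomes a generator.
    return (this.manualCounts.get(pattern) ?? 0) >= 2 ? "generator-due" : "manual";
  }
}
```

Throwing on manual use of a codified pattern is one possible reading of "mandatory"; a softer variant could return a warning status instead.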
Plan templates compose from multiple sources — a single feature plan might derive tasks from entity commissioning, UI component, and e2e testing templates simultaneously. Each template contributes its gates (TDD enforcement, CDD file limits, security triads, proof commands). The plan inherits all gates from all templates. Cross-team task routing (meta, intelligence, UI, platform engineering) emerges from which templates contribute tasks.
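As a rough sketch of that composition: the gate names below come from the paragraph above, but the `PlanTemplate` shape, the task names, and the assignment of gates and teams to templates are assumptions made for illustration.

```typescript
// Hypothetical template and plan shapes.
interface PlanTemplate {
  name: string;
  team: string;
  tasks: string[];
  gates: string[];
}

interface FeaturePlan {
  tasks: { task: string; team: string; template: string }[];
  gates: string[];
}

// The plan takes every task from every template and the union of all gates.
function composePlan(templates: PlanTemplate[]): FeaturePlan {
  const plan: FeaturePlan = { tasks: [], gates: [] };
  for (const template of templates) {
    for (const task of template.tasks) {
      plan.tasks.push({ task, team: template.team, template: template.name });
    }
    for (const gate of template.gates) {
      if (plan.gates.indexOf(gate) === -1) plan.gates.push(gate);
    }
  }
  return plan;
}

// Routing is not configured anywhere; it falls out of which templates contributed tasks.
function teamsInvolved(plan: FeaturePlan): string[] {
  const teams: string[] = [];
  for (const { team } of plan.tasks) {
    if (teams.indexOf(team) === -1) teams.push(team);
  }
  return teams;
}

// Assumed example data: which template carries which gate is illustrative only.
const featurePlan = composePlan([
  { name: "entity commissioning", team: "platform engineering", tasks: ["define entity", "write repository"], gates: ["TDD enforcement", "security triads"] },
  { name: "UI component", team: "UI", tasks: ["build form"], gates: ["TDD enforcement", "CDD file limits"] },
  { name: "e2e testing", team: "platform engineering", tasks: ["write e2e flow"], gates: ["proof commands"] },
]);
```

Here `featurePlan.gates` holds four distinct gates even though "TDD enforcement" appears in two templates, and `teamsInvolved(featurePlan)` lists each contributing team once.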
Failure Anatomy
`plan-cli.ts` accepted phase data from stdin. The input was missing `phaseSlug`. Nothing checked it until PostgreSQL rejected the insert with error 23502 (NOT NULL violation). Debugging took 10+ minutes.
MAP         Boundary validation designed?    NO (skipped)
 |
 v
TYPE        Types used at boundary?          NO (`as` cast)
 |
 v
TEST        Test for invalid input?          NO (none existed)
 |
 v
IMPLEMENT   Code trusts stdin blindly?       YES
 |
 v
ERROR       Where did it surface?            PostgreSQL (most expensive)
Three stages skipped. The error fell through to the most expensive layer — and the one where agents have the least signal to self-correct.
// Before: trusts stdin. The `as` cast asserts a shape without checking it.
const phases = JSON.parse(stdin) as Record<string, unknown>[];
await db.insert(planningPhase).values(phases.map(p => ({ ...p, planId })));

// After: validates at the boundary. A missing phaseSlug throws here, not in PostgreSQL.
const phases = phasesInputSchema.parse(JSON.parse(stdin));
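The definition of `phasesInputSchema` is not shown here. As a dependency-free stand-in for a Zod schema, the sketch below hand-rolls the same boundary check. It assumes only what the incident implies, that each phase must carry a non-empty string `phaseSlug`; `PhaseInput` and `parsePhasesInput` are hypothetical names.

```typescript
// Assumed shape: only phaseSlug is known to be required; other fields pass through.
interface PhaseInput {
  [key: string]: unknown;
  phaseSlug: string;
}

// Parse and validate raw stdin text. Throws with a path-like message on bad input.
function parsePhasesInput(raw: string): PhaseInput[] {
  const data: unknown = JSON.parse(raw);
  if (!Array.isArray(data)) {
    throw new Error("phases: expected an array");
  }
  return data.map((item: unknown, index: number): PhaseInput => {
    if (typeof item !== "object" || item === null || Array.isArray(item)) {
      throw new Error(`phases[${index}]: expected an object`);
    }
    // Copy fields into a typed record instead of casting the parsed value.
    const record: Record<string, unknown> = {};
    for (const [key, value] of Object.entries(item)) {
      record[key] = value;
    }
    const phaseSlug = record["phaseSlug"];
    if (typeof phaseSlug !== "string" || phaseSlug.length === 0) {
      throw new Error(`phases[${index}].phaseSlug: required non-empty string`);
    }
    return { ...record, phaseSlug };
  });
}
```

With this in place, input such as `[{"title":"Explore"}]` fails at the CLI with a message naming `phases[0].phaseSlug`, rather than surfacing later as a database constraint error.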
This fixes one function. The structural fix prevents the class:
| Level | Fix | Mechanism | Scope |
|---|---|---|---|
| Instance | Add Zod to `plan-cli.ts` | Import schema, call `.parse()` | This function |
| Rule | `boundary-validation.md` in `.claude/rules/` | Auto-loaded every session | Every developer, every session |
| Hook | Post-edit hook detecting `as` casts on `JSON.parse` | Fires on every `.ts` edit | Every edit, zero effort |
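The detection logic behind such a hook could look roughly like the sketch below. It covers only the pattern check. How the hook is registered and invoked is not shown, and `findUnsafeParseCasts` and `CastWarning` are hypothetical names. A regular expression is a deliberate simplification; an AST-based check would be more precise.

```typescript
// Matches JSON.parse(...) followed directly by an `as` cast on the same line.
// Allows one level of nested parentheses inside the call.
const UNSAFE_PARSE = /JSON\.parse\([^()]*(?:\([^()]*\)[^()]*)*\)\s+as\s/;

interface CastWarning {
  line: number; // 1-based line number in the edited file
  text: string; // the offending line, trimmed
}

// Scan file contents line by line and report every unchecked cast.
function findUnsafeParseCasts(source: string): CastWarning[] {
  const warnings: CastWarning[] = [];
  source.split("\n").forEach((text, index) => {
    if (UNSAFE_PARSE.test(text)) {
      warnings.push({ line: index + 1, text: text.trim() });
    }
  });
  return warnings;
}
```

Known limits of this sketch: it works per line, so a cast split across lines is missed, and a matching line inside a comment is still reported.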
Enforcement Hierarchy
Three levels of response to any failure:
| Level | Response | Scope |
|---|---|---|
| Instance | Fix the bug | One function, one file |
| Class | Prevent the category | Every file handling external data, every session |
| Structure | Engineer it away | Every edit, zero effort, zero memory required |
Most teams stop at the first level. The question after every incident: at what level did we fix it?
| Tier | Mechanism | Effort | Failure Mode |
|---|---|---|---|
| Generator | Code IS correct by construction | None | Cannot produce wrong pattern |
| Template | Phase ordering prevents skipping | Follow the template | Skip a phase |
| Hook | Auto-fires on edit | None | Developer ignores warning |
| Rule | Auto-loaded context | Read and follow | Developer skims |
| Skill | On-trigger procedure | Invoke the skill | Developer forgets to invoke |
| Expertise | Developer memory | Remember and apply | Developer forgets |
Push enforcement UP. A hook that detects `JSON.parse(x) as` at edit time prevents the entire class. A memory of "validate stdin" prevents one instance, and only if you remember.
INCIDENT
|
v
Fix the instance (necessary, not sufficient)
|
v
What CLASS of error? (specific → general)
|
v
Prevent the class (rule: advisory)
|
v
Can this be STRUCTURAL? (advisory → enforcement)
|
v
Engineer the structure (hook/generator: automatic)
|
v
CANNOT RECUR
Each bug can only happen once — because the structure that allowed it is replaced by a structure that prevents it.
Cost of Quality
The enforcement hierarchy describes six tiers. Cost tracking measures whether they work.
Every incident produces a cost annotation:
| Field | What It Records |
|---|---|
| Where caught | Which tier actually caught it (generator / template / hook / rule / skill / expertise) |
| Where it should have been caught | Which tier SHOULD have caught it |
| Time to resolve | Clock time from detection to fix merged |
| Layer | TypeScript / Zod / PostgreSQL / Production |
Three metrics compound from these annotations:
| Metric | What It Measures | Signal |
|---|---|---|
| Catch rate by tier | % of incidents caught at each enforcement level | Hooks catching most = healthy. Expertise catching most = fragile. |
| Escalation rate | % of incidents that fell past their intended tier | Rising = enforcement gaps. Falling = tiers are wired correctly. |
| Cost per miss | Time-to-resolve when an incident escapes its tier | Validates the 10x multiplier from cost escalation |
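The annotation and the three metrics can be sketched as follows. Field names, the use of minutes as the time unit, and the choice to express rates as fractions between 0 and 1 are assumptions; the tier order follows the enforcement hierarchy, strongest first.

```typescript
// Tiers ordered from strongest enforcement to weakest.
const TIERS = ["generator", "template", "hook", "rule", "skill", "expertise"] as const;
type Tier = (typeof TIERS)[number];

// Hypothetical field names for the cost annotation.
interface CostAnnotation {
  caughtAt: Tier;
  shouldHaveBeenCaughtAt: Tier;
  minutesToResolve: number; // assumed unit
  layer: "TypeScript" | "Zod" | "PostgreSQL" | "Production";
}

// An incident escalated if it was caught at a weaker tier than intended.
function escalated(a: CostAnnotation): boolean {
  return TIERS.indexOf(a.caughtAt) > TIERS.indexOf(a.shouldHaveBeenCaughtAt);
}

// Fraction of incidents caught at each tier.
function catchRateByTier(annotations: CostAnnotation[]): Record<Tier, number> {
  const rates: Record<Tier, number> = { generator: 0, template: 0, hook: 0, rule: 0, skill: 0, expertise: 0 };
  if (annotations.length === 0) return rates;
  for (const a of annotations) rates[a.caughtAt] += 1;
  for (const tier of TIERS) rates[tier] = rates[tier] / annotations.length;
  return rates;
}

// Fraction of incidents that fell past their intended tier.
function escalationRate(annotations: CostAnnotation[]): number {
  if (annotations.length === 0) return 0;
  return annotations.filter(escalated).length / annotations.length;
}

// Mean time to resolve, counting only escalated incidents.
function costPerMiss(annotations: CostAnnotation[]): number {
  const misses = annotations.filter(escalated);
  if (misses.length === 0) return 0;
  const total = misses.reduce((sum, a) => sum + a.minutesToResolve, 0);
  return total / misses.length;
}
```

Comparing `costPerMiss` against the mean resolve time of incidents caught at their intended tier is one way to check the multiplier empirically, though that comparison is not implemented here.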
The connection to the hierarchy:
| Tier | What To Track | Healthy State |
|---|---|---|
| Generator | Incidents in generated code | Zero — if a generator produces bugs, fix the generator |
| Template | Phases skipped or reordered | Zero — template gates should prevent this |
| Hook | Hook fire count vs violations shipped | High fire count, zero violations in commits |
| Rule | Incidents in rule-covered areas | Low — rules without hooks are suggestions under load |
| Skill | Incidents in skill-covered areas where skill wasn't invoked | Decreasing — skill invocation should become habit |
| Expertise | Incidents with no structural prevention | Decreasing — every expertise-caught incident should produce a hook or generator |
The cost tracking loop: incident → annotate → identify tier gap → push enforcement up → measure whether that class recurs.
Two Dimensions
Every map has two layers, and the gap between them defines the work:
| Layer | What It Captures |
|---|---|
| Dream | Future state — what we're building |
| Engineering | Current state — what exists |
| Gap | What we must build to close the distance |
Fill maps with REALITY (evidence, not hopes). Keep them FRESH (stale maps are worse than no maps).
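In the simplest reading, the gap is a set difference between the two layers. A minimal sketch, assuming each layer can be listed as named items; `LayeredMap` and `gap` are hypothetical names.

```typescript
// Hypothetical representation: each layer is a list of named capabilities or artifacts.
interface LayeredMap {
  dream: string[];       // future state: what we're building
  engineering: string[]; // current state: what exists, backed by evidence
}

// The gap is everything in the dream layer that does not yet exist.
function gap(map: LayeredMap): string[] {
  const existing = new Set(map.engineering);
  return map.dream.filter((item) => !existing.has(item));
}
```

A stale `engineering` list makes the computed gap wrong in either direction, which is the practical reason the maps need to stay fresh.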
PLANS ARE WORTHLESS, PLANNING IS ESSENTIAL.
GOOD PLANNING ALWAYS STARTS WITH MAPPING REALITY.
Picture the dream. Map reality. Close the gap.
Context
- Pictures — The tools that make thinking visible
- Type-First Development — What this looks like at the keyboard
- Type-First: Cost Escalation — What happens when you skip a step
- Testing Strategy — Proving each layer honors the contracts
- Products — Great products deliver great outcomes
- Process Optimisation — Improve the flow
- Control System — Enforcement hierarchy maps to PID mechanics
- Flow Engineering (Steve Pereira) — Origin methodology
- Flow Collective — Community of practice