
Flow Engineering

How do you turn a picture into a product?

OUTCOME → VALUE STREAM → DEPENDENCIES → CAPABILITIES → A&ID
   │           │              │              │           │
   ▼           ▼              ▼              ▼           ▼
Contracts  Processes     Sequencing     Readiness   Orchestration

The same way factories get built — draw it first. P&IDs became steel and concrete. Flow maps become working systems. The drawing IS the engineering.

The Maps

Five maps. Five questions. In sequence. Each produces inputs for the next.

| Map | Question | Produces |
|---|---|---|
| Outcome Map | What does success look like? | Domain contracts, success measures |
| Value Stream Map | Where's the waste? | Use cases, repositories, adapters |
| Dependency Map | What must happen first? | Composition, task ordering |
| Capability Map | What can we do? | Generators, skills, work charts |
| A&ID | How do agents orchestrate? | Agent configs, feedback loops |
4 Key Maps = WHAT to build
A&ID = HOW agents work together to build it
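The sequencing rule ("each produces inputs for the next") can be sketched as data. This is an illustrative encoding only; the stage names and outputs come from the table above, but the types and function are assumptions:

```typescript
// Illustrative sketch: the five maps as an ordered pipeline where
// each stage consumes what the previous stage produced.
interface MapStage {
  name: string;
  question: string;
  produces: string[];
}

const pipeline: MapStage[] = [
  { name: "Outcome Map",      question: "What does success look like?", produces: ["domain contracts", "success measures"] },
  { name: "Value Stream Map", question: "Where's the waste?",           produces: ["use cases", "repositories", "adapters"] },
  { name: "Dependency Map",   question: "What must happen first?",      produces: ["composition", "task ordering"] },
  { name: "Capability Map",   question: "What can we do?",              produces: ["generators", "skills", "work charts"] },
  { name: "A&ID",             question: "How do agents orchestrate?",   produces: ["agent configs", "feedback loops"] },
];

// The inputs to any stage are exactly the outputs of the stage before it.
function inputsFor(index: number): string[] {
  return index === 0 ? [] : pipeline[index - 1].produces;
}
```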

The Capstone

The Agent & Instrument Diagram extends P&ID discipline to AI and Crypto systems.

| Element | Role | Domain |
|---|---|---|
| Agents (AG-XXX) | Actors that take action | Claude, humans, DePIN |
| Instruments (QC/VC/FC) | Sensors that measure | Smart contracts, oracles |
| Feedback Loops | Data improving agents | VVFL, tokenomics, governance |
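A minimal type sketch of these elements. The AG/QC/VC/FC tag prefixes come from the table; the field names, shapes, and the guard function are assumptions for illustration:

```typescript
// Hedged sketch of A&ID elements as types (tag prefixes from the table above).
type AgentId = `AG-${string}`;
type InstrumentTag = `${"QC" | "VC" | "FC"}-${string}`;

interface Agent { id: AgentId; actor: "claude" | "human" | "depin"; }
interface Instrument { tag: InstrumentTag; measures: string; }
// A feedback loop routes an instrument's readings back to an agent.
interface FeedbackLoop { reading: InstrumentTag; improves: AgentId; signal: string; }

// Runtime guard matching the tag convention.
function isInstrumentTag(tag: string): tag is InstrumentTag {
  return /^(QC|VC|FC)-/.test(tag);
}
```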

Products Loop

Flow engineering connects to every dimension of product development:

| Dimension | Connection | How Maps Help |
|---|---|---|
| Jobs To Be Done | Outcome Map IS a job analysis | "What does success look like?" = "What job are we hired for?" |
| AI Products | A&ID IS agent orchestration | Define evals, build loops, measure distributions |
| Product Design | Value Stream maps the design audit | Rendering, visual, responsive, interaction — in sequence |
| Software | Capability Map reveals build vs buy | Build core capabilities, buy generic ones |

The Outcome Map starts where JTBD starts — what progress is the customer trying to make? The A&ID ends where AI Products begins — how do agents deliver outcomes in a feedback loop?

Maps to Execution

Maps don't produce documentation. They produce the inputs for plan templates and generators.

| Map | Plan Phase | Generator Input | What It Produces |
|---|---|---|---|
| Outcome Map | Explore | Domain contracts | Ports, DTOs, entities, acceptance criteria |
| Value Stream Map | Define Types | Schema definitions | Repository interfaces, test expectations |
| Dependency Map | Write Test Specs | Ordering constraints | Failing tests that define "done" |
| Capability Map | Build | Generator selection | Scaffolded code in correct layer order |
| A&ID | Orchestrate | Agent configs | Plan templates, feedback loops |
Map the flow → Encode as types → Generate test specs → Scaffold implementation → Validate outcomes
     │               │                   │                      │                       │
     ▼               ▼                   ▼                      ▼                       ▼
Exploration     Contracts that      Failing tests         Generators enforce     Did outcomes match
produces        the compiler        define what           correct layer          what exploration
contracts       enforces            "done" means          order automatically    predicted?

Each map iteration improves the generators. The Capability Map tracks which patterns are codified (generator exists) versus manual (hand-coded). When a manual pattern appears twice, it becomes a generator. When a generator exists, using it is mandatory. The map IS the generator improvement tracker.
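The promotion rule ("a manual pattern that appears twice becomes a generator") is simple enough to sketch. The type and function names here are illustrative assumptions, not the actual tracker:

```typescript
// Hedged sketch of the pattern-promotion rule: the second manual
// occurrence of a pattern promotes it to a generator, after which
// using the generator is mandatory.
type PatternStatus = "manual" | "generator";
interface PatternEntry { status: PatternStatus; occurrences: number; }

function record(tracker: Map<string, PatternEntry>, pattern: string): PatternEntry {
  const entry = tracker.get(pattern) ?? { status: "manual", occurrences: 0 };
  entry.occurrences += 1;
  // Seen twice while still manual: codify it.
  if (entry.status === "manual" && entry.occurrences >= 2) entry.status = "generator";
  tracker.set(pattern, entry);
  return entry;
}
```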

Plan templates compose from multiple sources — a single feature plan might derive tasks from entity commissioning, UI component, and e2e testing templates simultaneously. Each template contributes its gates (TDD enforcement, CDD file limits, security triads, proof commands). The plan inherits all gates from all templates. Cross-team task routing (meta, intelligence, UI, platform engineering) emerges from which templates contribute tasks.
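Template composition can be sketched as a union of gates and contributing teams. The shapes and names below are assumptions; only the gate and team examples come from the text above:

```typescript
// Hedged sketch: a plan inherits ALL gates from ALL contributing templates,
// and task routing falls out of which teams those templates belong to.
interface Template { name: string; team: string; gates: string[]; }

function composePlan(templates: Template[]) {
  return {
    // Union of gates, deduplicated: every template's gates apply.
    gates: [...new Set(templates.flatMap(t => t.gates))],
    // Cross-team routing emerges from which templates contribute tasks.
    teams: [...new Set(templates.map(t => t.team))],
  };
}
```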

Failure Anatomy

plan-cli.ts accepted phase data from stdin with a missing phaseSlug. PostgreSQL caught it as error 23502 (NOT NULL violation). Debugging took 10+ minutes.

MAP        Boundary validation designed?   NO — skipped
 |
 v
TYPE       Types used at boundary?         NO — `as` cast
 |
 v
TEST       Test for invalid input?         NO — none existed
 |
 v
IMPLEMENT  Code trusts stdin blindly?      YES
 |
 v
ERROR      Where did it surface?           PostgreSQL (most expensive)

Three stages skipped. The error fell through to the most expensive layer — and the one where agents have the least signal to self-correct.

```typescript
// Before: trusts stdin
const phases = JSON.parse(stdin) as Record<string, unknown>[];
await db.insert(planningPhase).values(phases.map(p => ({ ...p, planId })));

// After: validates at the boundary, before the database sees anything
const phases = phasesInputSchema.parse(JSON.parse(stdin));
await db.insert(planningPhase).values(phases.map(p => ({ ...p, planId })));
```
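The `phasesInputSchema` above is a Zod schema. As a dependency-free sketch of the same boundary check (everything beyond `phaseSlug` is an assumption), the validation could look like this:

```typescript
// Minimal boundary validator mirroring what a Zod schema enforces here.
// Only phaseSlug is known from the incident; other fields are illustrative.
interface PhaseInput { phaseSlug: string; [key: string]: unknown; }

function parsePhases(raw: string): PhaseInput[] {
  const data: unknown = JSON.parse(raw);
  if (!Array.isArray(data)) throw new Error("phases: expected an array");
  return data.map((item, i) => {
    if (typeof item !== "object" || item === null)
      throw new Error(`phases[${i}]: expected an object`);
    const slug = (item as Record<string, unknown>).phaseSlug;
    if (typeof slug !== "string" || slug.length === 0)
      // Fails HERE, at the boundary, not as a 23502 inside PostgreSQL.
      throw new Error(`phases[${i}].phaseSlug: required non-empty string`);
    return item as PhaseInput;
  });
}
```

The point is where the error surfaces: a malformed payload now fails with a named field at the process boundary instead of ten minutes later in the database.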

This fixes one function. The structural fix prevents the class:

| Level | Fix | Mechanism | Scope |
|---|---|---|---|
| Instance | Add Zod to plan-cli.ts | Import schema, call .parse() | This function |
| Rule | boundary-validation.md in .claude/rules/ | Auto-loaded every session | Every developer, every session |
| Hook | post-edit detecting `as` casts on JSON.parse | Fires on every .ts edit | Every edit, zero effort |

Enforcement Hierarchy

Three levels of response to any failure:

| Level | Response | Scope |
|---|---|---|
| Instance | Fix the bug | One function, one file |
| Class | Prevent the category | Every file handling external data, every session |
| Structure | Engineer it away | Every edit, zero effort, zero memory required |

Most teams stop at level 1. The question after every incident: at what level did we fix it?

| Tier | Mechanism | Effort | Failure Mode |
|---|---|---|---|
| Generator | Code IS correct by construction | None | Cannot produce wrong pattern |
| Template | Phase ordering prevents skipping | Follow the template | Skip a phase |
| Hook | Auto-fires on edit | None | Developer ignores warning |
| Rule | Auto-loaded context | Read and follow | Developer skims |
| Skill | On-trigger procedure | Invoke the skill | Developer forgets to invoke |
| Expertise | Developer memory | Remember and apply | Developer forgets |

Push enforcement UP. A hook detecting `JSON.parse(x) as T` at edit time prevents the entire class. A memory of "validate stdin" prevents one instance, if you remember.
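The detection itself is cheap. A hedged sketch of the check such a post-edit hook might run (the function name is made up, and a regex is a heuristic, not a parser):

```typescript
// Sketch of a post-edit hook check: flag lines that cast JSON.parse output
// with `as`, bypassing boundary validation. Returns 1-based line numbers.
function findUnvalidatedParses(source: string): number[] {
  return source.split("\n").flatMap((line, i) =>
    /JSON\.parse\([^)]*\)\s+as\s+/.test(line) ? [i + 1] : []);
}
```

A wrapped call like `schema.parse(JSON.parse(input))` is not flagged, because the `as` never touches the raw parse result.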

INCIDENT
|
v
Fix the instance (necessary, not sufficient)
|
v
What CLASS of error? (specific → general)
|
v
Prevent the class (rule: advisory)
|
v
Can this be STRUCTURAL? (advisory → enforcement)
|
v
Engineer the structure (hook/generator: automatic)
|
v
CANNOT RECUR

Each bug can only happen once — because the structure that allowed it is replaced by a structure that prevents it.

Cost of Quality

The enforcement hierarchy describes six tiers. Cost tracking measures whether they work.

Every incident produces a cost annotation:

| Field | What It Records |
|---|---|
| Where caught | Which tier actually caught it (generator / template / hook / rule / skill / expertise) |
| Where it should have been caught | Which tier SHOULD have caught it |
| Time to resolve | Clock time from detection to fix merged |
| Layer | TypeScript / Zod / PostgreSQL / Production |

Three metrics compound from these annotations:

| Metric | What It Measures | Signal |
|---|---|---|
| Catch rate by tier | % of incidents caught at each enforcement level | Hooks catching most = healthy. Expertise catching most = fragile. |
| Escalation rate | % of incidents that fell past their intended tier | Rising = enforcement gaps. Falling = tiers are wired correctly. |
| Cost per miss | Time-to-resolve when an incident escapes its tier | Validates the 10x multiplier from cost escalation |
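A minimal sketch of how the first two metrics fall out of the annotations (the field and function names are assumptions):

```typescript
// Hedged sketch: catch rate by tier and escalation rate, computed
// directly from cost annotations.
interface CostAnnotation {
  caughtAt: string;            // tier that actually caught the incident
  shouldHaveCaughtAt: string;  // tier that should have caught it
  minutesToResolve: number;    // clock time from detection to fix merged
}

function metrics(incidents: CostAnnotation[]) {
  const catchRate: Record<string, number> = {};
  for (const inc of incidents) {
    catchRate[inc.caughtAt] = (catchRate[inc.caughtAt] ?? 0) + 1 / incidents.length;
  }
  // An incident "escalates" when it falls past its intended tier.
  const escaped = incidents.filter(i => i.caughtAt !== i.shouldHaveCaughtAt);
  return { catchRate, escalationRate: escaped.length / incidents.length };
}
```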

The connection to the hierarchy:

| Tier | What To Track | Healthy State |
|---|---|---|
| Generator | Incidents in generated code | Zero — if a generator produces bugs, fix the generator |
| Template | Phases skipped or reordered | Zero — template gates should prevent this |
| Hook | Hook fire count vs violations shipped | High fire count, zero violations in commit |
| Rule | Incidents in rule-covered areas | Low — rules without hooks are suggestions under load |
| Skill | Incidents in skill-covered areas where the skill wasn't invoked | Decreasing — skill invocation should become habit |
| Expertise | Incidents with no structural prevention | Decreasing — every expertise-caught incident should produce a hook or generator |

The cost tracking loop: incident → annotate → identify tier gap → push enforcement up → measure whether that class recurs.

Two Dimensions

Every map has two layers — and the gap between them:

| Layer | What It Captures |
|---|---|
| Dream | Future state — what we're building |
| Engineering | Current state — what exists |
| Gap | What we must build to close the distance |

Fill maps with REALITY (evidence, not hopes). Keep them FRESH (stale maps are worse than no maps).

PLANS ARE WORTHLESS, PLANNING IS ESSENTIAL.
GOOD PLANNING ALWAYS STARTS WITH MAPPING REALITY.

Picture the dream. Map reality. Close the gap.
