Flow Discovery Kick-off
The first session of an AI transformation engagement is not a tech meeting. It is a flow audit.
It is a ninety-minute working session that converts an owner's vague "we want AI" into a mapped flow: a single unit traced end-to-end, hops counted, work classified. The output is the input to the AI Strategy Meeting and feeds the Work Charts Matrix.
This page is the high-level method. Depth lives in two companion pages:
- Pilot Selection Scorecard — the rubric and formula for ranking workflows
- Worked Examples — AEO and client onboarding, fully populated
The Inversion
Most engagements start with "clean the data." That answer stalls projects for six months while the landscape shifts. The kick-off uses a different premise: map the flow first, clean nothing.
Modern LLMs read messy data well enough that hygiene is rarely the binding constraint. The binding constraints are:
- Logic trapped in one person's head, undocumented and untransferable
- Sprawl — one artifact ricocheting through five or six platforms with no authoritative system
- Coordination — senior specialists spending six of eight hours assembling and chasing data instead of doing the work they were hired for
Flow discovery surfaces these in half a day. Data cleaning surfaces them in six months.
Pick One Flow
Resist mapping the whole business. The kick-off does one thing well: trace one workflow end-to-end.
- High frequency — daily, per-deal, or per-ticket
- Owner-visible pain — cost named in hours, dollars, or missed throughput
- Named defender — one person owns it today and will defend the redesign
- Atomic — "customer onboarding" is too broad; "the next inbound enterprise onboarding" is right
- JTBD-named — [domain]-[verb]--[outcome-type] (e.g. client-onboard--activate)
That single row is the kick-off scope. Everything else waits.
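The JTBD naming pattern can be checked mechanically before the name goes on the diagram. A minimal Python sketch, assuming each segment is lowercase letters and digits and that the outcome type may itself contain single hyphens. The function name `parse_jtbd_name` and those character rules are illustrative assumptions, not part of the method.

```python
import re
from typing import Dict, Optional

# Assumed shape: [domain]-[verb]--[outcome-type], lowercase segments.
# The exact character rules are an illustration, not a rule from this page.
JTBD_PATTERN = re.compile(
    r"^(?P<domain>[a-z0-9]+)-(?P<verb>[a-z0-9]+)--"
    r"(?P<outcome>[a-z0-9]+(?:-[a-z0-9]+)*)$"
)


def parse_jtbd_name(name: str) -> Optional[Dict[str, str]]:
    """Return the three named parts, or None if the name does not fit."""
    match = JTBD_PATTERN.match(name)
    return match.groupdict() if match else None


print(parse_jtbd_name("client-onboard--activate"))
print(parse_jtbd_name("customer onboarding"))
```

A name that fails the check is usually a sign that the flow is still too broad or has no stated outcome.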
Intention → Action → Consequence
Every workflow step is a chain of three things. Name all three for every step:
- Intention — what outcome is wanted, in business language
- Action — the verb (created · manipulated · shared · deleted) acting on an artifact, by an owner (human, agent, system)
- Consequence — the state change after the action; if it cannot be measured or observed, there is no consequence
Any step missing one of these has its logic in someone's head — and that head will be the bottleneck of the transformation.
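One way to make the three-link rule checkable is to record each step as a small record and list whichever links are blank. This is a sketch under assumptions: the `Step` class, the `missing_links` helper, and the sample steps are hypothetical, invented only to illustrate the check.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Step:
    name: str
    intention: str = ""    # outcome wanted, in business language
    action: str = ""       # verb acting on an artifact, by an owner
    consequence: str = ""  # observable state change after the action


def missing_links(step: Step) -> List[str]:
    """Names of the three links that were left blank for this step."""
    return [
        link
        for link in ("intention", "action", "consequence")
        if not getattr(step, link).strip()
    ]


# Hypothetical sample steps, for illustration only.
steps = [
    Step("Send welcome pack", "Client can log in on day one",
         "system shares credentials email", "Client account shows first login"),
    Step("Check contract terms", "Catch non-standard clauses",
         "human manipulates contract PDF"),
]

for step in steps:
    gaps = missing_links(step)
    if gaps:
        print(f"{step.name}: missing {', '.join(gaps)}")
```

Any step the loop prints is a step whose logic still sits in a head rather than on the page.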
The Four-Verb Frame (CMSD)
Parallel to CRUD, interrogated from a flow perspective. For each verb ask who, why, when.
- Created — where does the artifact enter, from what source, in what format
- Manipulated — every transformation after creation; each one is a hop
- Shared — every recipient; each consumer reveals a dependency
- Deleted — retention and end-of-life; the most-ignored verb, source of legal and security risk
A workflow with no Deleted column accumulates risk silently. A workflow with no Manipulated column is either trivial or has its logic in someone's head.
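The empty-column check can be run on a trace as soon as each step carries its verb. A minimal sketch, assuming each step has been tagged with exactly one lowercase CMSD verb; the helper names and the sample trace are hypothetical.

```python
from collections import Counter
from typing import Dict, List

CMSD = ("created", "manipulated", "shared", "deleted")


def cmsd_coverage(step_verbs: List[str]) -> Dict[str, int]:
    """Count how many traced steps fall under each of the four verbs."""
    counts = Counter(step_verbs)
    return {verb: counts.get(verb, 0) for verb in CMSD}


def missing_verbs(step_verbs: List[str]) -> List[str]:
    """Verbs with no step at all, i.e. the empty columns."""
    return [verb for verb, n in cmsd_coverage(step_verbs).items() if n == 0]


# Hypothetical trace: one verb per step, in order walked.
trace = ["created", "manipulated", "manipulated", "shared", "shared"]
print(cmsd_coverage(trace))
print(missing_verbs(trace))
```

Here the trace has no Deleted step, which is the silent-risk case described above.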
Real · Artifact · Hybrid
Every step sorts into one of three buckets. This is the lever that decides what AI touches and what AI must not.
- Real — irreducible human work: trust, judgment, taste, brand voice, relationship. Automating these destroys the asset they depend on.
- Artifact — mechanical assembly, retrieval, formatting, routing. The work exists only because the previous step needed a human to touch the output. Every Artifact row is a candidate for replacement.
- Hybrid — assembly is Artifact, judgment is Real. The proposal sign-off is the classic case. Split it — Artifact half to agents and systems, Real half stays human for the judgment moment.
Count the Hops
A hop is any tool-switch, copy, re-entry, or handoff. Each introduces latency, error surface, and provenance loss. The hop count compresses workflow inefficiency into a number the owner can react to.
- Count every hop literally. A record copied from CRM to spreadsheet, pasted from spreadsheet into email, and handed to the next owner is three hops.
- Name the redesign target by hop count. Twenty current, five target = fifteen-hop reduction the room can argue about.
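The counting rule is simple enough to sketch in a few lines. The version below assumes each traced event has been tagged with a kind, and that only the four kinds named in the definition count as hops. The event list and the "work" kind are hypothetical illustrations.

```python
from typing import List, Tuple

# The four hop kinds named in the definition of a hop.
HOP_KINDS = {"tool-switch", "copy", "re-entry", "handoff"}


def count_hops(events: List[Tuple[str, str]]) -> int:
    """Count events whose kind is one of the four hop kinds."""
    return sum(1 for kind, _description in events if kind in HOP_KINDS)


def hop_reduction(current: int, target: int) -> int:
    """Hops the redesign is expected to remove."""
    return current - target


# Hypothetical trace events: (kind, description).
events = [
    ("copy", "CRM record exported to spreadsheet"),
    ("work", "analyst reviews figures"),
    ("copy", "spreadsheet rows pasted into email"),
    ("handoff", "email sent to account manager"),
]

print(count_hops(events))      # 3
print(hop_reduction(20, 5))    # 15
```

Keeping non-hop events in the list, tagged differently, keeps the trace complete while the count stays literal.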
The Two-Tier Matrix
The trace populates a matrix in two tiers.
Tier 1 — Workflow Row (one per chosen flow): JTBD name and outcome · task count · frequency · trigger (Schedule / Transaction) · HiTL summary · template · lead DE Agents · Measure Method · QA Agent · KPI · Good / Bad · Decision
Tier 2 — Task Rows (one per step): Task ID · DE Agent · Tool · Skill · Prompt · Template · QA Agent · KPI · Good / Bad · Decision · Agent Type
Fill Tier 1 first; it sets the outcome the Tier 2 rows must serve. Full column-by-column populated examples live in Worked Examples.
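For teams that keep the matrix in code as well as in a spreadsheet, the two tiers map naturally onto two record types. This is a sketch, not a schema: the field names are snake_case renderings of the columns listed above, every column is assumed to be free text, and `blank_columns` is a hypothetical helper for surfacing unfilled columns.

```python
from dataclasses import dataclass, field, fields
from typing import List


@dataclass
class TaskRow:  # Tier 2: one per step
    task_id: str
    de_agent: str = ""
    tool: str = ""
    skill: str = ""
    prompt: str = ""
    template: str = ""
    qa_agent: str = ""
    kpi: str = ""
    good_bad: str = ""
    decision: str = ""
    agent_type: str = ""


@dataclass
class WorkflowRow:  # Tier 1: one per chosen flow
    jtbd_name: str
    outcome: str = ""
    frequency: str = ""
    trigger: str = ""  # "Schedule" or "Transaction"
    hitl_summary: str = ""
    template: str = ""
    lead_de_agents: str = ""
    measure_method: str = ""
    qa_agent: str = ""
    kpi: str = ""
    good_bad: str = ""
    decision: str = ""
    tasks: List[TaskRow] = field(default_factory=list)

    @property
    def task_count(self) -> int:
        """Task count is derived from the Tier 2 rows, not typed in."""
        return len(self.tasks)


def blank_columns(row) -> List[str]:
    """Columns still empty on a row: the honest gaps."""
    return [f.name for f in fields(row) if getattr(row, f.name) == ""]


flow = WorkflowRow(jtbd_name="client-onboard--activate", trigger="Transaction")
flow.tasks.append(TaskRow(task_id="T1", tool="CRM"))
print(flow.task_count)
print(blank_columns(flow.tasks[0]))
```

Deriving the task count from the Tier 2 rows is a design choice made here so the two tiers cannot drift apart.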
Agent Types — pick one per Task Row
- Taskers — execute a defined task with a defined input. Pattern-match only. Highest-frequency, lowest-novelty work.
- Automators — chain Tasker outputs into deterministic workflows. Own routing logic.
- Collaborators — pair with a human on judgment work. The human leads; the Collaborator drafts and surfaces.
- Orchestrators — coordinate across multiple agents toward a workflow outcome. Own conversation, escalation, exception path.
Mis-typing is a common failure. A Collaborator job assigned to a Tasker produces hallucinated confidence. A Tasker job assigned to a Collaborator burns judgment cycles on mechanics.
Measure Method — pick one primary axis per workflow
- Accuracy — output correctness against the canonical Good. Default for compliance, financial, legal, customer-facing artifacts.
- Timeliness — clock time from trigger to delivery. Default when latency is the constraint.
- Friction — hops, interruptions, context-switches per unit. Default when senior-time-recapture is the goal.
Pick before the build starts. Wrong primary axis = redesign optimising the wrong thing.
The Two Artifacts: Table + Diagram
The kick-off produces two artifacts. Both. Always.
Diagram (pencil and paper first) — draw the flow on shared paper before any tool opens. Every participant sees the same picture; every hop is drawn by a hand the room watches. Tooling comes after — Excalidraw, Miro, clean redraw. Paper costs nothing to redraw when a trace surfaces a surprise.
Minimum diagram conventions (enough to prevent chaos, not enough to freeze):
- Three owner icons — human, agent, system. Pick one shape language per kick-off and hold it.
- Verb labels on every edge — CMSD verb + the artifact moved. No unlabelled arrows.
- One artifact per handoff — split the edge if two artifacts; otherwise the hop count lies.
- Hop count marked visibly — a number in the corner that grows with the trace.
Table (spreadsheet) — one Tier 1 row + N Tier 2 rows. Sortable (by hop count, AI %, Measure Method), filterable (by Real/Artifact/Hybrid, Agent Type), comparable across workflows. Filled in the room, projected, edited live. Never filled afterward by one person.
A diagram without a table is a poster. A table without a diagram is a database nobody trusts.
The Four Maps the Kick-off Populates
The diagram is not one drawing — it is four maps, drawn in sequence. Each map is already documented as a blueprint elsewhere in /docs/. The kick-off runs them in compressed form. Reusing the four maps activates the flow-engineering toolkit the wider system already speaks.
- 1. Outcome Map — "What does success look like?" Populates the Workflow Row's outcome columns.
- 2. Value Stream Map — "Where's the waste?" Captures Cycle Time, Wait Time, Flow Efficiency. The Real/Artifact/Hybrid hops are Toyota's seven wastes in knowledge-work form.
- 3. Dependency Map — "What must happen first?" Critical path inside the workflow that decides which Task Row to attack first.
- 4. Capability Map — "What can we do?" Maturity 0–5 on the capabilities the redesigned workflow needs.
The next map — Agent and Instrument Diagram — is not the kick-off's deliverable. That belongs to the AI-Native Assessment, after funding is decided. Discovery is cheap; orchestration design is expensive and only happens on workflows the scorecard has ranked.
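Of the Value Stream Map measures, Flow Efficiency is the one that reduces to a single ratio. The sketch below assumes the common lean definition, active work time divided by total elapsed time (work plus wait). The companion Value Stream Map page may define it differently, and the step timings are hypothetical.

```python
from typing import List, Tuple


def flow_efficiency(work_minutes: float, wait_minutes: float) -> float:
    """Share of elapsed time spent on active work, from 0.0 to 1.0."""
    total = work_minutes + wait_minutes
    if total <= 0:
        raise ValueError("total elapsed time must be positive")
    return work_minutes / total


# Hypothetical step timings: (active work minutes, wait minutes).
step_timings: List[Tuple[float, float]] = [(15, 240), (30, 1440), (10, 60)]

work = sum(w for w, _ in step_timings)
wait = sum(q for _, q in step_timings)
print(f"Flow Efficiency: {flow_efficiency(work, wait):.1%}")
```

With timings like these, almost all elapsed time is waiting, which is the pattern the map is meant to expose.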
The chain — visible end-to-end
Flow Discovery Kick-off (this page — 90-min session)
│
├─→ Outcome Map (what success looks like)
├─→ Value Stream Map (where waste hides — Flow Efficiency %)
├─→ Dependency Map (what blocks what — critical path)
├─→ Capability Map (what we can do — maturity 0–5)
│
├─→ Two-Tier Matrix (synthesis: Workflow Row + Task Rows)
└─→ Pilot Scorecard (synthesis: ranked first-pilot pick)
│
▼
AI Strategy Meeting (Owner chairs · decides funding)
│
▼
AI-Native Assessment (clean-sheet redesign)
│
▼
Agent & Instrument (how agents orchestrate the redesign)
│
▼
Build · Commission · Measure · Re-score
The Two-Half Contract
The kick-off has two halves. Naming the split prevents both over-promising the room and under-delivering the analysis.
- In-room session (90 minutes) — minimum viable trace: one workflow, one unit walked, CMSD + classification + hop count + rough pillar scores + first-target nomination. Owner and room produce this together, live, on paper and projected spreadsheet.
- Analyst pass (3 hours within 48h) — tech advisor completes the four maps to gate quality, normalises pillar scores against the scorecard rubric, computes final Pilot Fit, packages outputs for the Strategy Meeting. Owner reviews in a 30-minute sync. Nothing in the analyst pass changes the trace; it formalises it.
In-room is process. Analyst pass is artifact polish. Mixing the two — trying to compute final Pilot Fit while the trace is still drying — is the failure mode the split prevents.
The Kick-off Meeting (90 minutes)
The owner chairs (same rule as the AI Strategy Meeting). The tech advisor facilitates. A note-taker captures into the projected spreadsheet.
Phase 1 — Frame (10 min)
Which workflow today? Atomic, JTBD-named, with a defender in the room. Write the name on the diagram before anyone else speaks.
Phase 2 — Trace + CMSD (45 min)
Walk the one unit end-to-end. For each step: verb (C/M/S/D), owner (human/agent/system), input, output, time, tool. No skipping. No "usually we…". Hop count visible in the corner; update as the trace grows.
Friction in this phase is the deliverable. Steps that "everyone knows" turn out to have three undocumented sub-steps. Senior specialists are discovered doing work the owner thought was automated.
Phase 3 — Classify (15 min)
Sort every step into Real · Artifact · Hybrid. Compute waste ratio (Artifact hops ÷ total hops) on the paper. Tag each Hybrid step with which sub-part is Real (the judgment) vs Artifact (the assembly around it).
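The waste ratio is a one-line calculation once every hop carries a classification. A minimal sketch, assuming one classification label per hop and that Hybrid hops stay out of the numerator until they are split. That treatment of Hybrid is an assumption, since the formula names only Artifact hops.

```python
from collections import Counter
from typing import List


def waste_ratio(classifications: List[str]) -> float:
    """Artifact hops divided by total hops."""
    if not classifications:
        raise ValueError("no hops traced")
    counts = Counter(classifications)
    return counts["Artifact"] / len(classifications)


# Hypothetical trace: 12 Artifact, 5 Real, 3 Hybrid hops.
hops = ["Artifact"] * 12 + ["Real"] * 5 + ["Hybrid"] * 3
print(f"Waste ratio: {waste_ratio(hops):.0%}")
```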
Phase 4 — Rough Score + First Target (10 min)
Run the scorecard at rough level: one pillar score per pillar from the room's evidence. Compute rough Pilot Fit. Nominate the first-target task: the highest-waste Artifact-or-Hybrid Task Row on the critical path inside the chosen workflow.
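The nomination rule can be written as a filter followed by a maximum. The sketch below assumes each Task Row carries a numeric `waste` value (for example hops or minutes lost) and a flag for critical-path membership. Both fields and the sample rows are hypothetical, since this page does not fix how task-level waste is measured.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class TaskRow:
    task_id: str
    classification: str     # "Real", "Artifact", or "Hybrid"
    on_critical_path: bool
    waste: float            # assumed measure, e.g. hops or minutes lost


def nominate_first_target(tasks: List[TaskRow]) -> Optional[TaskRow]:
    """Highest-waste Artifact-or-Hybrid row on the critical path, if any."""
    candidates = [
        t for t in tasks
        if t.classification in ("Artifact", "Hybrid") and t.on_critical_path
    ]
    return max(candidates, key=lambda t: t.waste, default=None)


# Hypothetical Task Rows for one workflow.
tasks = [
    TaskRow("T1", "Real", True, 9.0),       # excluded: Real work stays human
    TaskRow("T2", "Artifact", True, 4.0),
    TaskRow("T3", "Hybrid", True, 6.0),
    TaskRow("T4", "Artifact", False, 8.0),  # excluded: off the critical path
]

target = nominate_first_target(tasks)
print(target.task_id if target else "no eligible task")
```

The two excluded rows show why the filter comes first: the largest waste number alone would pick the wrong task.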
Phase 5 — Settle (10 min)
Set the AI Strategy Meeting date. Name the business-logic-doc backlog (every Task Row where Prompt is "lives in someone's head"). Confirm who runs the analyst pass and by when.
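The business-logic-doc backlog falls straight out of the Tier 2 rows. A minimal sketch, assuming the note-taker recorded the literal phrase "lives in someone's head" in the Prompt column; the row layout and sample rows are hypothetical.

```python
from typing import Dict, List

HEAD_MARKER = "lives in someone's head"


def logic_gaps(task_rows: List[Dict[str, str]]) -> List[str]:
    """Task IDs whose Prompt column holds the head marker."""
    return [
        row["task_id"]
        for row in task_rows
        if row.get("prompt", "").strip().lower() == HEAD_MARKER
    ]


# Hypothetical Task Rows as captured in the room.
task_rows = [
    {"task_id": "T1", "prompt": "Summarise intake form into CRM fields"},
    {"task_id": "T2", "prompt": "Lives in someone's head"},
    {"task_id": "T3", "prompt": "lives in someone's head "},
]

print(logic_gaps(task_rows))
```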
Pre-Meeting Brief
The tech advisor sends three things before the kick-off:
- The unit-of-work request — "Bring one instance of [workflow] you can walk us through start to finish, with the tools open in front of you." Without the live unit, the kick-off becomes theoretical.
- The frame — this page, or a one-pager summary. The owner does not need to be expert in the vocabulary, but the room should not be learning it during the trace.
- The spreadsheet — blank Tier 1 and Tier 2 column sets, ready to project.
Outputs
In-room (90 minutes, paper + projected spreadsheet)
- Trace — one unit walked end-to-end with verb, owner, input, output, time, tool per step
- Classification — every step tagged Real, Artifact, or Hybrid
- Hop count — current state number marked visibly on the diagram
- Rough Pilot Fit — one pillar score per pillar, rough composite
- First-target nomination — highest-waste Artifact-or-Hybrid Task Row on the critical path
- Logic-gap list — every Task Row where Prompt was "lives in someone's head"
Analyst pass (3 hours within 48h)
- Four maps to gate quality — Outcome, Value Stream, Dependency, Capability
- Two-Tier Matrix complete — Workflow Row + N Task Rows, all columns filled
- Normalised scorecard — raw evidence, per-pillar 1-5 against rubric, Pilot Fit
- Pilot Fit ranking — this workflow positioned against any other workflows previously traced
- Strategy-meeting pack — one-paragraph first-target recommendation backed by trace evidence
Critical distinction — Pilot Fit ranks Workflows, not Tasks
Pilot Fit scores a Workflow Row — the entire JTBD as a single bet. The first-target Task Row is then chosen inside the highest-ranked workflow as the highest-waste Artifact-or-Hybrid step on the critical path. Tasks do not carry their own Pain × Waste × Readiness × Confidence × Risk profile. Rank workflows; nominate tasks within the winner.
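The two-stage order, workflows first and tasks second, can be sketched as follows. Pilot Fit values are taken as given inputs here, since the formula lives in the Pilot Selection Scorecard. The sketch assumes a higher Pilot Fit is better and that each task carries a numeric `waste` value. The field names, the second workflow name, and all numbers are hypothetical.

```python
from typing import Any, Dict, List, Optional, Tuple


def pick_workflow_then_task(
    workflows: List[Dict[str, Any]],
) -> Tuple[str, Optional[str]]:
    """Rank workflows by Pilot Fit, then nominate a task inside the winner."""
    winner = max(workflows, key=lambda w: w["pilot_fit"])
    eligible = [
        t for t in winner["tasks"]
        if t["classification"] in ("Artifact", "Hybrid") and t["on_critical_path"]
    ]
    target = max(eligible, key=lambda t: t["waste"], default=None)
    return winner["name"], (target["task_id"] if target else None)


# Hypothetical data: Pilot Fit scores come from the scorecard, not from here.
workflows = [
    {"name": "client-onboard--activate", "pilot_fit": 3.8, "tasks": [
        {"task_id": "A1", "classification": "Artifact",
         "on_critical_path": True, "waste": 5},
        {"task_id": "A2", "classification": "Real",
         "on_critical_path": True, "waste": 9},
    ]},
    {"name": "invoice-chase--collect", "pilot_fit": 2.1, "tasks": [
        {"task_id": "B1", "classification": "Artifact",
         "on_critical_path": True, "waste": 12},
    ]},
]

print(pick_workflow_then_task(workflows))
```

Task B1 has the most waste of any row, yet it is never considered, because its workflow lost the ranking.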
After the Kick-off
The kick-off does not authorise build. Three more gates:
- AI Strategy Meeting — owner decides which first-target Task Row to fund, with cost in dollars and hours, and a review date.
- AI-Native Assessment — targeted workflow gets a clean-sheet redesign, anchored to nothing. The four maps from the kick-off are its inputs.
- Business Logic Documentation — every logic gap surfaced in the kick-off gets named, owned, written down. Tacit knowledge becomes explicit before an agent is asked to act on it.
The kick-off makes those three gates productive. Without it, the strategy meeting argues about which workflow to attack; the assessment redesigns from anchored assumptions; the build hits the logic gaps mid-implementation.
Anti-Patterns
- Starting with "clean the data." The flow needs mapping, not the data. Discover before you cleanse.
- Abstracting away from the single unit. "Usually we do…" hides failure modes. Stay on the one instance.
- Tech advisor chairing the meeting. Sales pitch for tools. Owner chairs.
- No live unit available. Reschedule until a real instance is in front of the room.
- Skipping the classification. Every step tagged Real, Artifact, or Hybrid — no exceptions.
- Treating hop count as cosmetic. A redesign without a target hop count has no measurable success criterion.
- Mapping the whole business in one session. Pick one flow. Trace it fully. Park the rest.
- Filling the matrix without the maps. Matrix without maps is opinion in a spreadsheet.
- Computing final Pilot Fit in the room. That belongs to the analyst pass. In-room is rough; polished comes after.
- Treating the matrix as a completion artifact. Half-filled with honest gaps beats fully-filled with assumed answers.
Process Over Results
The kick-off produces a matrix, four maps, and a scorecard. None of those are the deliverable on day one. The deliverable is the conversation that produces them.
The first half-dozen kick-offs are themselves the experiment — the team is learning which columns get filled honestly, which need to wait for the analyst pass, which need to be split. Friction during the in-room session is signal, not failure. Capture what blanked the room, what felt awkward, what produced argument that should have been pre-empted by a clearer column. Feed that friction back into the method between sessions. This page documents the current best version. It will change as we learn.
Context
- Pilot Selection Scorecard — Rubric, formula, calibration
- Worked Examples — AEO and client onboarding, fully populated
- AI Strategy Meeting — Decision meeting that consumes the kick-off output
- Work Charts — Matrix that captures Real / Artifact / Hybrid rows
- AI-Native Assessment — Clean-sheet redesign the kick-off feeds
- Outcome Map — Map 1 of 4
- Value Stream Map — Map 2 of 4
- Dependency Map — Map 3 of 4
- Capability Map — Map 4 of 4
- Agent and Instrument Diagram — Next map in chain, produced during AI-Native Assessment
- Process Mapping — General method; this page is the AI-specific specialisation
- Business Process Reengineering — Radical-redesign frame the kick-off enables
Questions
If you cannot name the verb, the owner, and the consequence for every step in a workflow — does the workflow exist as designed, or only as performed by the person who happens to know it?
- Which workflow in your business has the highest hop count — and how many of those hops would survive an honest classification into Real, Artifact, and Hybrid?
- What is your senior specialist actually paid to do — and what percentage of their week is artifact assembly an agent could absorb?
- If you traced one unit of work end-to-end tomorrow, how many steps would have logic that lives only in one person's head — and what happens to your AI roadmap if that person leaves?
- What is the Flow Efficiency of the workflow you would most like to redesign — and if you do not know, does that not tell you something about whether you have actually measured the work?
- What stops you from running the kick-off this week — and is the reason a real constraint or a hygiene-first reflex?