Flow Discovery Kick-off

The first session of an AI transformation engagement is not a tech meeting. It is a flow audit.

It is a ninety-minute working session that converts an owner's vague "we want AI" into a mapped flow: a single unit traced end-to-end, hops counted, work classified. The output precedes the AI Strategy Meeting and feeds the Work Charts Matrix.

This page is the high-level method. Depth lives in two companion pages:

Pilot Selection Scorecard — the rubric and formula for ranking workflows
Worked Examples — AEO and client onboarding, fully populated

The Inversion

Most engagements start with "clean the data." That answer stalls projects for six months while the landscape shifts. The kick-off uses a different premise: map the flow first, clean nothing.

Modern LLMs read messy data well enough that hygiene is rarely the binding constraint. The binding constraints are:

Logic trapped in one person's head, undocumented and untransferable
Sprawl — one artifact ricocheting through five or six platforms with no authoritative system
Coordination — senior specialists spending six of eight hours assembling and chasing data instead of doing the work they were hired for

Flow discovery surfaces these in half a day. Data cleaning surfaces them in six months.

Pick One Flow

Resist mapping the whole business. The kick-off does one thing well: trace one workflow end-to-end. Choose a flow that meets all five criteria:

High frequency — daily, per-deal, or per-ticket
Owner-visible pain — cost named in hours, dollars, or missed throughput
Named defender — one person owns it today and will defend the redesign
Atomic — "customer onboarding" is too broad; "the next inbound enterprise onboarding" is right
JTBD-named — named as a job to be done, in the form [domain]-[verb]--[outcome-type] (e.g. client-onboard--activate)

That single row is the kick-off scope. Everything else waits.
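The naming pattern above can be checked mechanically before the name is written on the diagram. This is a minimal sketch, assuming lowercase alphanumeric segments and a double hyphen before the outcome type; the function name `parse_jtbd_name` and the exact character set are illustrative assumptions, not part of the method.

```python
import re
from typing import Dict, Optional

# Assumed grammar: domain-verb--outcome-type, lowercase letters and digits.
# The outcome type may contain single hyphens (assumption).
JTBD_PATTERN = re.compile(
    r"^(?P<domain>[a-z0-9]+)-(?P<verb>[a-z0-9]+)--"
    r"(?P<outcome>[a-z0-9]+(?:-[a-z0-9]+)*)$"
)


def parse_jtbd_name(name: str) -> Optional[Dict[str, str]]:
    """Return the domain, verb, and outcome parts, or None if the name does not fit."""
    match = JTBD_PATTERN.match(name)
    return match.groupdict() if match else None


print(parse_jtbd_name("client-onboard--activate"))
print(parse_jtbd_name("customer onboarding"))
```

A name that fails to parse is usually a sign the flow is not yet atomic enough to trace.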

Intention → Action → Consequence

Every workflow step is a chain of three things. Name all three for every step:

Intention — what outcome is wanted, in business language
Action — the verb (created · manipulated · shared · deleted) acting on an artifact, by an owner (human, agent, system)
Consequence — the state change after the action; if it cannot be measured or observed, there is no consequence

Any step missing one of these has its logic in someone's head — and that head will be the bottleneck of the transformation.
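The three-part check can be run against the trace as it is captured. The sketch below is one possible shape, not a prescribed schema: the `Step` class, the `missing_links` helper, and the two sample steps are all hypothetical.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Step:
    name: str
    intention: str = ""    # outcome wanted, in business language
    action: str = ""       # verb + artifact + owner
    consequence: str = ""  # observable state change


def missing_links(step: Step) -> List[str]:
    """List which of intention, action, consequence are blank for a step."""
    return [
        part
        for part in ("intention", "action", "consequence")
        if not getattr(step, part).strip()
    ]


# Hypothetical sample steps, for illustration only.
steps = [
    Step("Send welcome pack", "Client knows next steps",
         "shared: welcome pack, by account manager", "Client confirms receipt"),
    Step("Approve discount", "Protect margin",
         "manipulated: quote, by sales lead", ""),
]
gaps = {s.name: missing_links(s) for s in steps if missing_links(s)}
print(gaps)  # {'Approve discount': ['consequence']}
```

Every entry in `gaps` is a candidate for the logic-gap list, since the missing part currently lives in someone's head.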

The Four-Verb Frame (CMSD)

Parallel to CRUD, interrogated from a flow perspective. For each verb ask who, why, when.

Created — where does the artifact enter, from what source, in what format
Manipulated — every transformation after creation; each one is a hop
Shared — every recipient; each consumer reveals a dependency
Deleted — retention and end-of-life; the most-ignored verb, source of legal and security risk

A workflow with no Deleted column accumulates risk silently. A workflow with no Manipulated column is either trivial or has its logic in someone's head.
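The two warnings above lend themselves to a quick coverage check over the traced steps. This is a sketch under the assumption that each step has been tagged with exactly one lowercase verb; the function names are illustrative.

```python
from collections import Counter
from typing import Dict, Iterable, List

VERBS = ("created", "manipulated", "shared", "deleted")


def cmsd_coverage(step_verbs: Iterable[str]) -> Dict[str, int]:
    """Count how many traced steps carry each of the four verbs."""
    counts = Counter(step_verbs)
    unknown = set(counts) - set(VERBS)
    if unknown:
        raise ValueError(f"not a CMSD verb: {sorted(unknown)}")
    return {verb: counts.get(verb, 0) for verb in VERBS}


def coverage_warnings(coverage: Dict[str, int]) -> List[str]:
    """Flag the two empty columns the method treats as risk signals."""
    warnings = []
    if coverage["deleted"] == 0:
        warnings.append("no Deleted step: retention and end-of-life unexamined")
    if coverage["manipulated"] == 0:
        warnings.append("no Manipulated step: trivial flow, or logic in a head")
    return warnings


coverage = cmsd_coverage(["created", "shared", "shared"])
print(coverage)
print(coverage_warnings(coverage))
```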

Real · Artifact · Hybrid

Every step sorts into one of three buckets. This is the lever that decides what AI touches and what AI must not.

Real — irreducible human work: trust, judgment, taste, brand voice, relationship. Automating these destroys the asset they depend on.
Artifact — mechanical assembly, retrieval, formatting, routing. The work exists only because the previous step needed a human to touch the output. Every Artifact row is a candidate for replacement.
Hybrid — assembly is Artifact, judgment is Real. The proposal sign-off is the classic case. Split it — Artifact half to agents and systems, Real half stays human for the judgment moment.

Count the Hops

A hop is any tool-switch, copy, re-entry, or handoff. Each introduces latency, error surface, and provenance loss. The hop count compresses workflow inefficiency into a number the owner can react to.

Count every hop literally. A copy from CRM to spreadsheet to email is three hops.
Name the redesign target by hop count. Twenty current, five target = fifteen-hop reduction the room can argue about.
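The counting rule and the reduction arithmetic can be sketched in a few lines. The event format here, a `(kind, note)` pair, and the sample trace are assumptions for illustration; the four hop kinds come from the definition above.

```python
from typing import Iterable, Tuple

HOP_KINDS = {"tool-switch", "copy", "re-entry", "handoff"}


def count_hops(events: Iterable[Tuple[str, str]]) -> int:
    """Count trace events whose kind is one of the four hop kinds."""
    return sum(1 for kind, _note in events if kind in HOP_KINDS)


def hop_reduction(current: int, target: int) -> int:
    """Hops the redesign must remove to reach the target."""
    if target > current:
        raise ValueError("target hop count exceeds current hop count")
    return current - target


# Hypothetical trace fragment: three hops and one non-hop step.
trace = [
    ("copy", "CRM record pasted into spreadsheet"),
    ("re-entry", "totals retyped into proposal"),
    ("review", "sales lead reads proposal"),
    ("handoff", "proposal emailed to finance"),
]
print(count_hops(trace))     # 3
print(hop_reduction(20, 5))  # 15
```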

The Two-Tier Matrix

The trace populates a matrix in two tiers.

Tier 1 — Workflow Row (one per chosen flow): JTBD name and outcome · task count · frequency · trigger (Schedule / Transaction) · HiTL summary · template · lead DE Agents · Measure Method · QA Agent · KPI · Good / Bad · Decision

Tier 2 — Task Rows (one per step): Task ID · DE Agent · Tool · Skill · Prompt · Template · QA Agent · KPI · Good / Bad · Decision · Agent Type

Fill Tier 1 first; it sets the outcome the Tier 2 rows must serve. Full column-by-column populated examples live in Worked Examples.
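For teams that keep the matrix in code as well as in a spreadsheet, the two tiers map naturally onto two record types. The sketch below is one assumed representation: the field names are derived from the column lists above, every column is free text, and the `open_gaps` helper is hypothetical. Task count is derived from the Task Rows rather than typed in.

```python
from dataclasses import dataclass, field, fields
from typing import List, Optional


@dataclass
class TaskRow:  # Tier 2: one per step
    task_id: str
    de_agent: Optional[str] = None
    tool: Optional[str] = None
    skill: Optional[str] = None
    prompt: Optional[str] = None
    template: Optional[str] = None
    qa_agent: Optional[str] = None
    kpi: Optional[str] = None
    good_bad: Optional[str] = None
    decision: Optional[str] = None
    agent_type: Optional[str] = None


@dataclass
class WorkflowRow:  # Tier 1: one per chosen flow
    jtbd_name: str
    outcome: Optional[str] = None
    frequency: Optional[str] = None
    trigger: Optional[str] = None  # "Schedule" or "Transaction"
    hitl_summary: Optional[str] = None
    template: Optional[str] = None
    lead_de_agents: Optional[str] = None
    measure_method: Optional[str] = None
    qa_agent: Optional[str] = None
    kpi: Optional[str] = None
    good_bad: Optional[str] = None
    decision: Optional[str] = None
    tasks: List[TaskRow] = field(default_factory=list)

    @property
    def task_count(self) -> int:
        return len(self.tasks)


def open_gaps(row) -> List[str]:
    """Names of columns still unfilled (None) on a row."""
    return [f.name for f in fields(row) if getattr(row, f.name) is None]


workflow = WorkflowRow("client-onboard--activate", trigger="Transaction")
workflow.tasks.append(TaskRow("T1", tool="CRM"))
print(workflow.task_count, open_gaps(workflow.tasks[0]))
```

Leaving unfilled columns as `None` keeps honest gaps visible instead of papering over them with assumed answers.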

Agent Types — pick one per Task Row

Taskers — execute a defined task with a defined input. Pattern-match only. Highest-frequency, lowest-novelty work.
Automators — chain Tasker outputs into deterministic workflows. Own routing logic.
Collaborators — pair with a human on judgment work. The human leads; the Collaborator drafts and surfaces.
Orchestrators — coordinate across multiple agents toward a workflow outcome. Own conversation, escalation, exception path.

Mis-typing is a common failure. A Collaborator job assigned to a Tasker produces hallucinated confidence. A Tasker job assigned to a Collaborator burns judgment cycles on mechanics.
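The one-type-per-row rule can be enforced when the Agent Type cell is read. A minimal sketch, assuming the cell holds a singular type name in any letter case; `parse_agent_type` is an illustrative helper, not part of the method.

```python
from enum import Enum


class AgentType(Enum):
    TASKER = "Tasker"
    AUTOMATOR = "Automator"
    COLLABORATOR = "Collaborator"
    ORCHESTRATOR = "Orchestrator"


def parse_agent_type(cell: str) -> AgentType:
    """Map a spreadsheet cell to exactly one AgentType, rejecting blends."""
    try:
        return AgentType(cell.strip().capitalize())
    except ValueError:
        allowed = ", ".join(t.value for t in AgentType)
        raise ValueError(f"{cell!r} is not one of: {allowed}") from None


print(parse_agent_type(" tasker "))
```

A blended entry such as "Tasker/Collaborator" raises an error, which forces the room to split the row or pick one type.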

Measure Method — pick one primary axis per workflow

Accuracy — output correctness against the canonical Good. Default for compliance, financial, legal, customer-facing artifacts.
Timeliness — clock time from trigger to delivery. Default when latency is the constraint.
Friction — hops, interruptions, context-switches per unit. Default when senior-time-recapture is the goal.

Pick before the build starts. Wrong primary axis = redesign optimising the wrong thing.

The Two Artifacts: Table + Diagram

The kick-off produces two artifacts. Both. Always.

Diagram (pencil and paper first) — draw the flow on shared paper before any tool opens. Every participant sees the same picture; every hop is drawn by a hand the room watches. Tooling comes after — Excalidraw, Miro, clean redraw. Paper costs nothing to redraw when the trace surprises you.

Minimum diagram conventions (enough to prevent chaos, not enough to freeze):

Three owner icons — human, agent, system. Pick one shape language per kick-off and hold it.
Verb labels on every edge — CMSD verb + the artifact moved. No unlabelled arrows.
One artifact per handoff — split the edge if two artifacts; otherwise the hop count lies.
Hop count marked visibly — a number in the corner that grows with the trace.

Table (spreadsheet) — one Tier 1 row + N Tier 2 rows. Sortable (by hop count, AI %, Measure Method), filterable (by Real/Artifact/Hybrid, Agent Type), comparable across workflows. Filled in the room, projected, edited live. Never filled afterward by one person.

A diagram without a table is a poster. A table without a diagram is a database nobody trusts.

The Four Maps the Kick-off Populates

The diagram is not one drawing — it is four maps, drawn in sequence. Each map is already documented as a blueprint elsewhere in /docs/. The kick-off runs them in compressed form. Reusing the four maps activates the flow-engineering toolkit the wider system already speaks.

1. Outcome Map — "What does success look like?" Populates the Workflow Row's outcome columns.
2. Value Stream Map — "Where's the waste?" Captures Cycle Time, Wait Time, Flow Efficiency. The Real/Artifact/Hybrid classification is the knowledge-work counterpart of Toyota's seven wastes, with Artifact hops as the waste.
3. Dependency Map — "What must happen first?" Critical path inside the workflow that decides which Task Row to attack first.
4. Capability Map — "What can we do?" Maturity 0–5 on the capabilities the redesigned workflow needs.
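Flow Efficiency on the Value Stream Map can be computed directly from the per-step times captured in the trace. The sketch below assumes the common definition, active work time divided by total elapsed time (active plus wait); the companion Value Stream Map page may define it differently, and the sample timings are hypothetical.

```python
from typing import Iterable, Tuple


def flow_efficiency(steps: Iterable[Tuple[float, float]]) -> float:
    """Active time / (active + wait) over (active_minutes, wait_minutes) pairs."""
    steps = list(steps)
    active = sum(a for a, _w in steps)
    wait = sum(w for _a, w in steps)
    total = active + wait
    if total <= 0:
        raise ValueError("trace has no recorded time")
    return active / total


# Hypothetical unit: 60 minutes of work spread across 540 minutes of waiting.
timings = [(20, 240), (30, 300), (10, 0)]
print(f"{flow_efficiency(timings):.0%}")  # 10%
```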

The next map — Agent and Instrument Diagram — is not the kick-off's deliverable. That belongs to the AI-Native Assessment, after funding is decided. Discovery is cheap; orchestration design is expensive and only happens on workflows the scorecard has ranked.

The chain — visible end-to-end

Flow Discovery Kick-off   (this page — 90-min session)
  │
  ├─→ Outcome Map         (what success looks like)
  ├─→ Value Stream Map    (where waste hides — Flow Efficiency %)
  ├─→ Dependency Map      (what blocks what — critical path)
  ├─→ Capability Map      (what we can do — maturity 0–5)
  │
  ├─→ Two-Tier Matrix     (synthesis: Workflow Row + Task Rows)
  └─→ Pilot Scorecard     (synthesis: ranked first-pilot pick)
        │
        ▼
  AI Strategy Meeting     (Owner chairs · decides funding)
        │
        ▼
  AI-Native Assessment    (clean-sheet redesign)
        │
        ▼
  Agent & Instrument      (how agents orchestrate the redesign)
        │
        ▼
  Build · Commission · Measure · Re-score

The Two-Half Contract

The kick-off has two halves. Naming the split prevents both over-promising the room and under-delivering the analysis.

In-room session (90 minutes) — minimum viable trace: one workflow, one unit walked, CMSD + classification + hop count + rough pillar scores + first-target nomination. Owner and room produce this together, live, on paper and projected spreadsheet.
Analyst pass (3 hours within 48h) — tech advisor completes the four maps to gate quality, normalises pillar scores against the scorecard rubric, computes final Pilot Fit, packages outputs for the Strategy Meeting. Owner reviews in a 30-minute sync. Nothing in the analyst pass changes the trace; it formalises it.

In-room is process. Analyst pass is artefact polish. Mixing the two — trying to compute final Pilot Fit while the trace is still drying — is the failure mode the split prevents.

The Kick-off Meeting (90 minutes)

The owner chairs (same rule as the AI Strategy Meeting). The tech advisor facilitates. A note-taker captures into the projected spreadsheet.

Phase 1 — Frame (10 min)

Which workflow today? Atomic, JTBD-named, with a defender in the room. Write the name on the diagram before anyone else speaks.

Phase 2 — Trace + CMSD (45 min)

Walk the one unit end-to-end. For each step: verb (C/M/S/D), owner (human/agent/system), input, output, time, tool. No skipping. No "usually we…". Hop count visible in the corner; update as the trace grows.

Friction in this phase is the deliverable. Steps that "everyone knows" turn out to have three undocumented sub-steps. Senior specialists are discovered doing work the owner thought was automated.

Phase 3 — Classify (15 min)

Sort every step into Real · Artifact · Hybrid. Compute waste ratio (Artifact hops ÷ total hops) on the paper. Tag each Hybrid step with which sub-part is Real (the judgment) vs Artifact (the assembly around it).
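The waste ratio is simple enough to compute on paper, but a sketch makes the rule explicit. It counts only hops classified Artifact, exactly as the formula states; how the hops of a Hybrid step are split between its Real and Artifact halves is left to the room and is not modelled here. The sample numbers are hypothetical.

```python
from typing import Iterable, Tuple


def waste_ratio(steps: Iterable[Tuple[str, int]]) -> float:
    """Artifact hops / total hops over (classification, hops) pairs."""
    steps = list(steps)
    total = sum(hops for _cls, hops in steps)
    if total == 0:
        raise ValueError("no hops recorded")
    artifact = sum(hops for cls, hops in steps if cls == "Artifact")
    return artifact / total


# Hypothetical classified trace: 16 Artifact hops out of 20.
classified = [("Artifact", 6), ("Real", 2), ("Hybrid", 2), ("Artifact", 10)]
print(waste_ratio(classified))  # 0.8
```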

Phase 4 — Rough Score + First Target (10 min)

Run the scorecard at rough level: one pillar score per pillar from the room's evidence. Compute rough Pilot Fit. Nominate the first-target task: the highest-waste Artifact-or-Hybrid Task Row on the critical path inside the chosen workflow.
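The nomination rule reads as a filter followed by a maximum. The sketch below assumes waste per task is measured in Artifact hops and that critical-path membership has already been marked during the trace; the `TracedTask` class and the sample tasks are hypothetical.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass
class TracedTask:
    task_id: str
    classification: str    # "Real", "Artifact", or "Hybrid"
    artifact_hops: int     # assumed waste measure
    on_critical_path: bool


def nominate_first_target(tasks: Iterable[TracedTask]) -> Optional[TracedTask]:
    """Highest-waste Artifact-or-Hybrid task on the critical path, if any."""
    candidates = [
        t for t in tasks
        if t.on_critical_path and t.classification in ("Artifact", "Hybrid")
    ]
    return max(candidates, key=lambda t: t.artifact_hops, default=None)


# Hypothetical trace: T2 has the most waste but sits off the critical path.
tasks = [
    TracedTask("T1", "Real", 0, True),
    TracedTask("T2", "Artifact", 7, False),
    TracedTask("T3", "Hybrid", 4, True),
    TracedTask("T4", "Artifact", 5, True),
]
print(nominate_first_target(tasks).task_id)  # T4
```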

Phase 5 — Settle (10 min)

Set the AI Strategy Meeting date. Name the business-logic-doc backlog (every Task Row where Prompt is "lives in someone's head"). Confirm who runs the analyst pass and by when.

Pre-Meeting Brief

The tech advisor sends three things before the kick-off:

The unit-of-work request — "Bring one instance of [workflow] you can walk us through start to finish, with the tools open in front of you." Without the live unit, the kick-off becomes theoretical.
The frame — this page, or a one-pager summary. The owner does not need to be expert in the vocabulary, but the room should not be learning it during the trace.
The spreadsheet — blank Tier 1 and Tier 2 column sets, ready to project.
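The blank column sets can be generated rather than typed by hand. This is a minimal sketch that emits CSV text for each tier; the header labels follow the Two-Tier Matrix column lists, while the function name and the choice of CSV are assumptions.

```python
import csv
import io
from typing import List

TIER1_COLUMNS = [
    "JTBD name", "Outcome", "Task count", "Frequency", "Trigger",
    "HiTL summary", "Template", "Lead DE Agents", "Measure Method",
    "QA Agent", "KPI", "Good / Bad", "Decision",
]
TIER2_COLUMNS = [
    "Task ID", "DE Agent", "Tool", "Skill", "Prompt", "Template",
    "QA Agent", "KPI", "Good / Bad", "Decision", "Agent Type",
]


def blank_sheet(columns: List[str], empty_rows: int = 0) -> str:
    """Return CSV text with a header row and optional empty rows."""
    buffer = io.StringIO()
    writer = csv.writer(buffer)
    writer.writerow(columns)
    for _ in range(empty_rows):
        writer.writerow([""] * len(columns))
    return buffer.getvalue()


print(blank_sheet(TIER1_COLUMNS, empty_rows=1))
print(blank_sheet(TIER2_COLUMNS, empty_rows=10))
```

Each string can be saved to a file or pasted into the projected spreadsheet before the session.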

Outputs

In-room (90 minutes, paper + projected spreadsheet)

Trace — one unit walked end-to-end with verb, owner, input, output, time, tool per step
Classification — every step tagged Real, Artifact, or Hybrid
Hop count — current state number marked visibly on the diagram
Rough Pilot Fit — one pillar score per pillar, rough composite
First-target nomination — highest-waste Artifact-or-Hybrid Task Row on the critical path
Logic-gap list — every Task Row where Prompt was "lives in someone's head"

Analyst pass (3 hours within 48h)

Four maps to gate quality — Outcome, Value Stream, Dependency, Capability
Two-Tier Matrix complete — Workflow Row + N Task Rows, all columns filled
Normalised scorecard — raw evidence, per-pillar 1-5 against rubric, Pilot Fit
Pilot Fit ranking — this workflow positioned against any other workflows previously traced
Strategy-meeting pack — one-paragraph first-target recommendation backed by trace evidence

Critical distinction — Pilot Fit ranks Workflows, not Tasks

Pilot Fit scores a Workflow Row — the entire JTBD as a single bet. The first-target Task Row is then chosen inside the highest-ranked workflow as the highest-waste Artifact-or-Hybrid step on the critical path. Tasks do not carry their own Pain × Waste × Readiness × Confidence × Risk profile. Rank workflows; nominate tasks within the winner.

After the Kick-off

The kick-off does not authorise build. Three more gates:

AI Strategy Meeting — owner decides which first-target Task Row to fund, with cost in dollars and hours, and a review date.
AI-Native Assessment — targeted workflow gets a clean-sheet redesign, anchored to nothing. The four maps from the kick-off are its inputs.
Business Logic Documentation — every logic gap surfaced in the kick-off gets named, owned, written down. Tacit knowledge becomes explicit before an agent is asked to act on it.

The kick-off makes those three gates productive. Without it, the strategy meeting argues about which workflow to attack; the assessment redesigns from anchored assumptions; the build hits the logic gaps mid-implementation.

Anti-Patterns

Starting with "clean the data." The flow needs mapping, not the data. Discover before you cleanse.
Abstracting away from the single unit. "Usually we do…" hides failure modes. Stay on the one instance.
Tech advisor chairing the meeting. The session turns into a sales pitch for tools. The owner chairs.
No live unit available. Reschedule for a date when a real instance can be in front of the room.
Skipping the classification. Every step tagged Real, Artifact, or Hybrid — no exceptions.
Treating hop count as cosmetic. A redesign without a target hop count has no measurable success criterion.
Mapping the whole business in one session. Pick one flow. Trace it fully. Park the rest.
Filling the matrix without the maps. Matrix without maps is opinion in a spreadsheet.
Computing final Pilot Fit in the room. That belongs to the analyst pass. In-room is rough; polished comes after.
Treating the matrix as a completion artifact. Half-filled with honest gaps beats fully-filled with assumed answers.

Process Over Results

The kick-off produces a matrix, four maps, and a scorecard. None of those are the deliverable on day one. The deliverable is the conversation that produces them.

The first half-dozen kick-offs are themselves the experiment — the team is learning which columns get filled honestly, which need to wait for the analyst pass, which need to be split. Friction during the in-room session is signal, not failure. Capture what blanked the room, what felt awkward, what produced argument that should have been pre-empted by a clearer column. Feed that friction back into the method between sessions. This page documents the current best version. It will change as we learn.

Context

Pilot Selection Scorecard — Rubric, formula, calibration
Worked Examples — AEO and client onboarding, fully populated
AI Strategy Meeting — Decision meeting that consumes the kick-off output
Work Charts — Matrix that captures Real / Artifact / Hybrid rows
AI-Native Assessment — Clean-sheet redesign the kick-off feeds
Outcome Map — Map 1 of 4
Value Stream Map — Map 2 of 4
Dependency Map — Map 3 of 4
Capability Map — Map 4 of 4
Agent and Instrument Diagram — Next map in chain, produced during AI-Native Assessment
Process Mapping — General method; this page is the AI-specific specialisation
Business Process Reengineering — Radical-redesign frame the kick-off enables

Questions

If you cannot name the verb, the owner, and the consequence for every step in a workflow — does the workflow exist as designed, or only as performed by the person who happens to know it?

Which workflow in your business has the highest hop count — and how many of those hops would survive an honest classification into Real, Artifact, and Hybrid?
What is your senior specialist actually paid to do — and what percentage of their week is artifact assembly an agent could absorb?
If you traced one unit of work end-to-end tomorrow, how many steps would have logic that lives only in one person's head — and what happens to your AI roadmap if that person leaves?
What is the Flow Efficiency of the workflow you would most like to redesign — and if you do not know, does that not tell you something about whether you have actually measured the work?
What stops you from running the kick-off this week — and is the reason a real constraint or a hygiene-first reflex?
