Flow Discovery — Worked Examples

Two fully populated examples that show what the Flow Discovery Kick-off actually produces. Read the kick-off first for the method; read this page to see the method applied.

The two examples teach different things on purpose:

AEO — teaches the matrix shape. New idea, familiar mental model. Shows how Tier 1 and Tier 2 columns get filled across a workflow with all four agent types.
Client Onboarding (Gil case) — teaches the scorecard and the central thesis. Uses the transcript's headline numbers (4.5h → 5min) to show how the scorecard turns a trace into a defensible first-pilot pick.

Both ship together. A reader who skips one loses half the leverage.

Example 1 — AEO (Answer Engine Optimisation)

AEO is to LLM answers (Claude, ChatGPT, Perplexity, Gemini) what SEO is to Google results. It is new enough that most readers have not run the workflow before, so they cannot pretend they already know how it works, yet familiar enough to grok in a sentence. It is also high-leverage and current. The work classifies cleanly into Real (brand voice, claim accuracy), Artifact (schema markup, publish, monitoring), and Hybrid (question discovery, draft structuring), and it naturally produces all four CMSD verbs and all four agent types in one workflow.

Pick the flow

aeo-content--be-cited — content that AI answer engines cite when answering target questions for our ICP.

Workflow Row (Tier 1)

JTBD outcome: brand surfaces accurately in AI answer-engine responses to target ICP questions
Task count: 6
Frequency: per piece, weekly cadence
Trigger: Transaction (per brief) + Schedule (weekly batch)
HiTL: brand voice, claim verification, gap diagnosis
Template: question-driven article (Question → direct Answer → Evidence → adjacent Questions)
Lead DE Agents: content strategist, writer, AEO analyst
Measure Method (primary): Accuracy — citations are worthless if misattributed
QA Agent: editor + cite-tracker
KPI: cited-and-accurate answer count per monitored prompt set per week
Good: cited in 3+ engines per target query, all claims attributable
Bad: not cited, OR cited inaccurately, OR cited but brand misnamed
Decision: re-brief on accuracy fail; re-structure on no-citation

Task Rows (Tier 2) — 6 rows, every column filled

Task 1 — Question discovery

DE Agent (today): content strategist
Agent Type: Collaborator
Tool: prompt-monitoring service + ICP interview notes + SEO keyword tool
Skill: question framing, ICP empathy
Prompt: "Given ICP segment X this quarter, list the questions they would ask Claude / ChatGPT / Perplexity / Gemini that our brand could credibly answer. Tag each by intent (research / compare / buy / troubleshoot)."
Template: question list with intent classification and ICP-source attribution
QA Agent: content strategist self-review
KPI: validated questions per week
Good: 10+ ICP-verified questions, each with a named source
Bad: speculative list with no ICP source; questions our brand cannot credibly answer
Decision when Bad: re-run ICP interview round before proceeding to Task 2

Task 2 — Answer structuring

DE Agent (today): writer
Agent Type: Collaborator
Tool: writing surface + internal research index + brand voice guide
Skill: direct-answer prose, evidence citation, brand voice fidelity
Prompt: "Answer Q in one paragraph the AI can lift verbatim. Back the answer with three evidence pieces, each with a linked source. Close with two adjacent questions the reader would ask next."
Template: Question → direct Answer (≤80 words) → three Evidence pieces (each cited) → adjacent Questions
QA Agent: editor + fact-checker
KPI: % claims with linked source
Good: every claim sourced; direct answer present and lift-able; brand voice intact
Bad: hedged answer; uncited claims; voice flat or off-brand
Decision when Bad: editor returns to writer for one revision pass; second failure escalates to strategist for re-brief

Task 3 — Citation hygiene

DE Agent (today): technical SEO
Agent Type: Tasker
Tool: CMS + schema.org validator + canonical URL checker
Skill: structured data, schema.org Article + FAQPage authoring
Prompt: apply schema template per article type — deterministic, no judgment
Template: JSON-LD block per article with Article schema (headline, author, datePublished) + FAQPage schema for any Q→A blocks
QA Agent: schema validator (automated)
KPI: % articles with valid schema on first publish
Good: 100% valid schema, all canonical URLs resolve, author entity attributable
Bad: missing schema, invalid schema, canonical mis-set
Decision when Bad: block publish until validator passes
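
The Task 3 template can be sketched in code. This is a minimal illustration, assuming a hypothetical helper named build_json_ld and made-up article values; only the schema.org types (Article, FAQPage, Question, Answer) and the three Article fields named in the template come from the row above.

```python
import json


def build_json_ld(headline, author, date_published, qa_pairs):
    """Return the JSON-LD objects for one article.

    Always emits an Article object; emits a FAQPage object only when the
    article contains Q->A blocks. The function name and argument names are
    hypothetical, chosen for this sketch.
    """
    article = {
        "@context": "https://schema.org",
        "@type": "Article",
        "headline": headline,
        "author": {"@type": "Person", "name": author},
        "datePublished": date_published,
    }
    blocks = [article]
    if qa_pairs:
        blocks.append({
            "@context": "https://schema.org",
            "@type": "FAQPage",
            "mainEntity": [
                {
                    "@type": "Question",
                    "name": question,
                    "acceptedAnswer": {"@type": "Answer", "text": answer},
                }
                for question, answer in qa_pairs
            ],
        })
    return blocks


# Illustrative values only; none of these come from a real article.
blocks = build_json_ld(
    headline="What is answer engine optimisation?",
    author="Example Author",
    date_published="2025-01-15",
    qa_pairs=[("What is AEO?", "AEO is to LLM answers what SEO is to Google results.")],
)
print(json.dumps(blocks, indent=2))
```

Because the template is deterministic, the same function can run inside the publish gate: if it cannot produce both objects for an article with Q→A blocks, the piece stays blocked until the validator passes.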

Task 4 — Publish + ping

DE Agent (today): publisher
Agent Type: Automator
Tool: CMS + sitemap submission + cross-channel notify (RSS, social, newsletter)
Skill: distribution sequencing
Prompt: publish checklist — deterministic workflow rule
Template: publish-state record with timestamps per channel
QA Agent: publish verifier (automated)
KPI: time-to-first-engine-crawl
Good: under 24h to first AI engine crawl across all monitored engines
Bad: over 7 days; any channel silently failed
Decision when Bad: investigate channel; re-ping; flag chronic failure to engineering

Task 5 — Answer-engine monitoring

DE Agent (today): AEO analyst
Agent Type: Tasker today, Automator at scale (the first-target nomination)
Tool: prompt-test runner across LLM APIs
Skill: prompt design, citation parsing, multi-engine result normalisation
Prompt: "Run prompt set P weekly across the four engines. For each run, log: was our brand cited, in what position, with what attribution, with what accuracy."
Template: citation log row per prompt × engine × date with attribution + accuracy fields
QA Agent: strategist reviews accuracy column on weekly digest
KPI: cited-and-accurate citation count per week
Good: cited in 3+ engines per target query, attribution correct
Bad: not cited; cited but brand misnamed; cited inaccurately
Decision when Bad: Bad rows feed Task 6; chronic Bad rows escalate to strategist for re-brief
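
The citation log and its KPI can be sketched as a small data structure. The field names, the sample rows, and the choice to treat anything short of three accurate engine citations as Bad are assumptions made for illustration; the row above only fixes the shape (prompt × engine × date, with attribution and accuracy fields) and the Good threshold.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class CitationLogRow:
    """One row per prompt x engine x date (field names are hypothetical)."""
    prompt: str
    engine: str
    run_date: str
    cited: bool
    attribution_ok: bool = False
    accurate: bool = False

    @property
    def cited_and_accurate(self) -> bool:
        return self.cited and self.attribution_ok and self.accurate


def cited_and_accurate_count(rows):
    """KPI: number of rows where the brand is cited, attributed, and accurate."""
    return sum(1 for row in rows if row.cited_and_accurate)


def classify_prompts(rows, min_engines=3):
    """Label each prompt Good when it is cited accurately in min_engines or more.

    Assumption: every prompt below the threshold is labelled Bad.
    """
    engines_ok = defaultdict(set)
    prompts = set()
    for row in rows:
        prompts.add(row.prompt)
        if row.cited_and_accurate:
            engines_ok[row.prompt].add(row.engine)
    return {
        prompt: "Good" if len(engines_ok[prompt]) >= min_engines else "Bad"
        for prompt in sorted(prompts)
    }


# Made-up sample week: q1 lands in three engines, q2 is cited once but misnamed.
rows = [
    CitationLogRow("q1", "claude", "2025-01-06", True, True, True),
    CitationLogRow("q1", "chatgpt", "2025-01-06", True, True, True),
    CitationLogRow("q1", "perplexity", "2025-01-06", True, True, True),
    CitationLogRow("q1", "gemini", "2025-01-06", False),
    CitationLogRow("q2", "claude", "2025-01-06", False),
    CitationLogRow("q2", "chatgpt", "2025-01-06", True, attribution_ok=False, accurate=True),
]
print(cited_and_accurate_count(rows))
print(classify_prompts(rows))
```

The Bad prompts from classify_prompts are the rows that would feed Task 6.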

Task 6 — Feedback loop

DE Agent (today): content strategist
Agent Type: Collaborator
Tool: citation log + brief queue + analytics dashboard
Skill: gap diagnosis, root-cause inference, re-brief prioritisation
Prompt: "Given the citation log for the last 30 days, identify prompts where citation is dropping or accuracy is failing. For each, name the likely cause and the re-brief that would fix it."
Template: re-brief queue with prompt, gap, hypothesis, recommended brief
QA Agent: editor verifies re-brief quality before it enters production
KPI: % re-briefs that move citation count within 30 days
Good: ≥60% of re-briefs lift citation count within 30 days
Bad: re-brief shipped, no movement; diagnostic was wrong
Decision when Bad: retro on the failed re-brief; sharpen the diagnostic logic; update Task 6 Prompt with the new heuristic

CMSD trace of one piece

Created — brief (created by strategist on ICP question; per-piece trigger)
Manipulated — draft, edit, schema-wrap, publish-format (4 manipulations today, target 2)
Shared — published to site, sitemap-pinged to engines, monitored weekly across 4 engines, citation log shared with strategist
Deleted — superseded versions archived per evergreen-vs-dated policy

Real / Artifact / Hybrid classification

Hybrid: Question discovery (ICP empathy is Real, query enumeration is Artifact); Answer structuring (taste is Real, structure is Artifact); Feedback loop (diagnosis is Real, citation log query is Artifact)
Artifact: Citation hygiene; Publish + ping; Monitoring (when at scale)
Real: brand voice override (cuts across draft + edit), final accuracy sign-off

Hop count today vs target

Today: brief → Google Doc → editor pass → CMS paste → schema plugin → publish → manual citation check across 4 engines → spreadsheet log → re-brief = ~10 hops per piece
Target: brief in structured editor (schema baked in) → review → publish (auto-distribute) → automated weekly monitoring → diagnostic queue surfaces gaps → re-brief = ~4 hops
Delta: ~6 hops removed

First-target nomination

Task 5 (Answer-engine monitoring). Currently a manual weekly chore consuming senior strategist time; cleanest Artifact win; produces the telemetry that feeds Task 6 and validates the whole workflow's success metric.

Logic-gap list

Task 6 (Feedback loop) currently runs on the strategist's intuition. Before automation, the diagnostic logic must be written down: "if cited count = 0 and prompt is in ICP set, re-brief with X; if cited inaccurately, escalate to Y."
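
Written down, the two quoted rules might look like the sketch below. The function name and the fallback branch are assumptions; X and Y are left as the placeholders the text uses, because the actual re-brief and escalation targets are not specified here.

```python
def diagnose(prompt, cited_count, cited_inaccurately, icp_prompts):
    """Apply the two documented rules to one prompt from the citation log.

    The rules are mutually exclusive: a prompt with zero citations cannot
    also be cited inaccurately.
    """
    if cited_count == 0 and prompt in icp_prompts:
        return "re-brief with X"
    if cited_inaccurately:
        return "escalate to Y"
    # Assumption: anything the written rules do not cover is surfaced as a
    # gap rather than silently ignored.
    return "no documented rule"


icp_prompts = {"best onboarding tool for agencies"}  # hypothetical ICP prompt set
print(diagnose("best onboarding tool for agencies", 0, False, icp_prompts))
print(diagnose("best onboarding tool for agencies", 2, True, icp_prompts))
print(diagnose("unrelated prompt", 0, False, icp_prompts))
```

Each time the fallback branch fires, it marks a case where the strategist's intuition is still doing the work and another rule needs writing down.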

Teaching takeaway

AEO walks the novice through all four verbs, all four agent types, all three measurement axes (Accuracy primary, Timeliness and Friction secondary), the full CMSD trace, the full classification, and the hop-count delta — in one familiar workflow. Once the reader has walked AEO, they can walk their own workflow with the same scaffold.

Example 2 — Client Onboarding (Gil case)

AEO teaches the matrix shape. Client onboarding teaches the scorecard and the central thesis: data quality does not matter, data movement matters. The contract data is already clean; the problem is the relay race. Every services business has this workflow.

Pick the flow

client-onboard--activate — proposal-signed event to onboarded-client state.

The trace (10 hops today)

Proposal signed in e-sign tool → notification fires
Sales rep manually updates CRM record (Created)
Sales rep alerts onboarding coordinator (Gil) via chat (Shared)
Gil pulls contract from e-sign, copies PDF to shared drive (Manipulated)
Gil drafts welcome email referencing contract terms (Manipulated)
Gil schedules kickoff call via calendar tool (Shared)
Gil opens new project in project-management tool, copies fields from CRM (Manipulated — same data, third system)
Gil drafts welcome packet, pulls templates from shared drive (Manipulated)
Gil sends welcome email + packet to client (Shared)
Gil hands off to project manager (Shared) — PM repeats steps 7–8 in their own view

Tools touched per unit: 5–6 (e-sign, CRM, chat, drive, calendar, PM tool).

Classification of the 10 hops

Pure Artifact hops: 8 (steps 1, 2, 3, 4, 6, 7, 8, 10 — pure data movement, copy, scheduling, hand-off; no judgment, no brand voice carried)
Hybrid hops: 2 (step 5 welcome email draft is Artifact-assembly with Real-voice approval needed before send; step 9 send is the Real moment where Gil's voice and the firm's relationship signal lands on the client)
Pure Real hops: 0 today — but a redesign that strips the Artifact assembly out of step 5 surfaces the Real send moment as a clean, isolated human touchpoint

Waste ratio: Artifact-eliminable hops ÷ total hops = (8 + 2 × 0.5) ÷ 10 = 0.9 (Hybrid hops count as half — their assembly half is eliminable; their judgment half stays human).

The simpler shorthand most engagements use: waste ratio ≈ 1.0, because every hop today contains some Artifact work. The Real work inside the two Hybrid hops is not waste, either today or after the redesign; it is the part the redesign protects.
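
The weighted waste ratio can be reproduced from the hop classification. This is a minimal sketch; the dictionary and function names are hypothetical, while the per-step classes and the half weight for Hybrid hops come from the classification above.

```python
# Step number -> classification, as listed for the 10-hop trace.
HOP_CLASSES = {
    1: "Artifact", 2: "Artifact", 3: "Artifact", 4: "Artifact",
    5: "Hybrid",
    6: "Artifact", 7: "Artifact", 8: "Artifact",
    9: "Hybrid",
    10: "Artifact",
}


def waste_ratio(hop_classes, hybrid_weight=0.5):
    """Artifact-eliminable hops divided by total hops.

    Artifact hops count fully, Hybrid hops count at hybrid_weight
    (their assembly half), and Real hops count as zero.
    """
    if not hop_classes:
        raise ValueError("trace has no hops")
    weights = {"Artifact": 1.0, "Hybrid": hybrid_weight, "Real": 0.0}
    eliminable = sum(weights[cls] for cls in hop_classes.values())
    return eliminable / len(hop_classes)


print(waste_ratio(HOP_CLASSES))  # 0.9
```

Setting hybrid_weight=1.0 reproduces the ≈ 1.0 shorthand, which makes the difference between the two figures explicit: it is only the treatment of the two Hybrid hops.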

Scorecard — stated assumption + derived numbers

Stated assumption: average 6 onboardings per month (midpoint of the 4–8 observed range). All annual figures below derive from this number. Change the assumption, re-derive.

Volume: 6 onboardings/month → 72/year
Time per unit: 4.5h clock time (of which ~1h is Gil's senior capacity)
Annual direct hours consumed: 72 × 4.5h = 324h/yr
Annual senior hours consumed: 72 × 1h = 72h/yr
Senior-time share: 72 ÷ 324 = 22% of the workflow's total touched time
Error / rework rate: ~15% (contract terms misread, kickoff missed, packet stale)
Hop count: 10
Artifact / Hybrid / Real split: 8 / 2 / 0
Waste ratio: 0.9 (shorthand ≈ 1.0)
Tool count: 6
SSOT gap count: 3 (CRM, drive, PM tool each claim authority for client state)
Logic status: tacit (lives in Gil's head)
Data availability: 100% (every field already exists somewhere)
Integration difficulty: low–medium (all major SaaS, all with APIs)
Trust / customer-risk: medium (client-facing; first impression)
Expected cycle-time delta: 4.5h → 5min per unit (~54× faster)
Expected capacity reclaimed: up to 324h/yr direct, of which 72h/yr is senior time; Gil rebriefed to higher-leverage work
Confidence: HIGH — owner-attested numbers, trace evidence, every system named
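
Since every annual figure derives from the stated assumption, the derivation can be kept as a small function and re-run when the assumption changes. The function and key names below are hypothetical; the inputs are the scorecard's own numbers.

```python
def derive_scorecard(onboardings_per_month=6, hours_per_unit=4.5,
                     senior_hours_per_unit=1.0, target_minutes_per_unit=5):
    """Re-derive the annual scorecard figures from the volume assumption."""
    units_per_year = onboardings_per_month * 12
    direct_hours = units_per_year * hours_per_unit
    senior_hours = units_per_year * senior_hours_per_unit
    return {
        "units_per_year": units_per_year,
        "annual_direct_hours": direct_hours,
        "annual_senior_hours": senior_hours,
        # Senior hours are a subset of direct hours, so this is a share.
        "senior_share": senior_hours / direct_hours,
        "speedup": hours_per_unit * 60 / target_minutes_per_unit,
    }


card = derive_scorecard()
print(card["units_per_year"])           # 72
print(card["annual_direct_hours"])      # 324.0
print(card["annual_senior_hours"])      # 72.0
print(round(card["senior_share"], 2))   # 0.22
print(card["speedup"])                  # 54.0
```

Re-running with 4 or 8 onboardings per month, the ends of the observed range, gives 216h and 432h of annual direct hours, which shows how much the headline figure moves with the assumption.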

Pilot Fit — running the formula

Pilot Fit = Pain × Waste × Readiness × Confidence × Risk Modifier
         = 4.5  × 5.0   × 4.5       × 5.0        × 0.75
         = 379.6875 ≈ 380

Per-pillar scores against the rubric:

Pain 4.5/5 — annual hours 324h (rubric 4), senior share 22% (rubric 4), error rate 15% (rubric 4), per-unit time 4.5h (rubric 5). Mean of the four populated sub-axes = 4.25, entered in the formula as 4.5 (the nearest half point above).
Waste 5/5 — waste ratio 0.9 (rubric 5), hop count 10 (rubric 5), tool count 6 (rubric 5), SSOT gaps 3 (rubric 5). Pillar = 5.
Readiness 4.5/5 — logic tacit but interview-able (rubric 4), data 100% available (rubric 5), integrations available (rubric 4), capabilities present (rubric 5). Average ≈ 4.5.
Confidence 5/5 — every number owner-attested + trace evidence on the page.
Risk Modifier 0.75 — client-facing first impression is medium.

This row dominates almost every other workflow a services business will trace. The formula encodes the rule: not just painful, but waste-heavy, ready, evidenced, and not existentially risky. It is the textbook first pilot.
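
The formula is a plain product of the five pillar values, so it is easy to check by hand or in code. A minimal sketch, assuming a hypothetical function name and taking the pillar scores exactly as given above:

```python
from math import prod


def pilot_fit(pain, waste, readiness, confidence, risk_modifier):
    """Pilot Fit = Pain x Waste x Readiness x Confidence x Risk Modifier."""
    return prod([pain, waste, readiness, confidence, risk_modifier])


score = pilot_fit(pain=4.5, waste=5.0, readiness=4.5, confidence=5.0,
                  risk_modifier=0.75)
print(round(score, 2))  # 379.69
```

Because the pillars multiply rather than add, a weak score on any single pillar pulls the whole row down sharply, which is how the formula enforces "not just painful, but waste-heavy, ready, evidenced, and not existentially risky."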

What disappears in the AI-native redesign

Steps 2, 4, 7 — manual data copy between systems (replaced by one-way sync to central store)
Step 3 — chat alert (replaced by event trigger)
Step 5 — welcome email draft (replaced by templated draft auto-filled from event, human approval pre-send)
Step 8 — welcome packet assembly (replaced the same way as step 5: templated packet auto-filled from the event, human approval pre-send)
Step 10 — PM rework (eliminated; PM views same record, no re-entry)

What stays human (the Real and Real-half-of-Hybrid)

Welcome email send — the Real half of Hybrid step 5/9. Gil approves the auto-drafted email pre-send; the firm's voice and relationship signal still land via Gil, not via an agent
Exception handling on novel contract terms — judgment outside the documented logic; routed to Gil or escalation path
The kickoff call itself — pure Real; trust transfer between humans
First-impression accountability — someone the client knows owns the welcome end-to-end, even if assembly is automated

Target state

4 hops, 1 tool surface (the workflow engine), Gil reclaimed to higher-leverage work, PM not duplicating data entry. The scorecard's Expected Outcome columns become the success metrics in the AI Strategy Meeting Settle phase.

Teaching takeaway

AEO showed how to populate the matrix. Client onboarding shows how the scorecard turns the matrix into a defensible first-pilot pick. Together they cover both halves of the kick-off output: the structured map of the work, and the ranked recommendation that drives the funding decision.

Context

Flow Discovery Kick-off — The method these examples apply
Pilot Selection Scorecard — The rubric the Gil example scores against
AI Strategy Meeting — Where these examples land as funded pilots (or not)
Work Charts — Matrix where Tier 1 and Tier 2 rows live
AI-Native Assessment — Where the redesign happens after the AI Strategy Meeting funds it

Questions

If you applied the AEO matrix shape to a workflow your team runs weekly — which column would be hardest to fill honestly, and what does that reveal?

If you ran the Gil scorecard on your three most painful workflows, would the highest-Pilot-Fit row match the workflow your leadership team would pick by intuition?
Which of your workflows has the highest waste ratio — and which Real moments inside it have to be protected when the Artifact assembly is automated?
In your business, who is the Gil — the senior specialist quietly absorbing the relay race that should not exist?
