Flow Discovery — Worked Examples

Two fully populated examples that show what the Flow Discovery Kick-off actually produces. Read the kick-off first for the method; read this page to see the method applied.

The two examples teach different things on purpose:

  • AEO — teaches the matrix shape. New idea, familiar mental model. Shows how Tier 1 and Tier 2 columns get filled across a workflow with all four agent types.
  • Client Onboarding (Gil case) — teaches the scorecard and the central thesis. Uses the transcript's headline numbers (4.5h → 5min) to show how the scorecard turns a trace into a defensible first-pilot pick.

Both ship together. A reader who skips one loses half the leverage.


Example 1 — AEO (Answer Engine Optimisation)

AEO is to LLM answers (Claude, ChatGPT, Perplexity, Gemini) what SEO is to Google results. It is new enough that most readers have not run the workflow before, so they cannot pretend they already know how it works, yet familiar enough to grok in a sentence. It is also high-leverage and current. It classifies cleanly into Real (brand voice, claim accuracy), Artifact (schema markup, publish, monitoring), and Hybrid (question discovery, draft structuring), and it naturally produces all four CMSD verbs and all four agent types in one workflow.

Pick the flow

aeo-content--be-cited — content that AI answer engines cite when answering target questions for our ICP.

Workflow Row (Tier 1)

  • JTBD outcome: brand surfaces accurately in AI answer-engine responses to target ICP questions
  • Task count: 6
  • Frequency: per piece, weekly cadence
  • Trigger: Transaction (per brief) + Schedule (weekly batch)
  • HiTL: brand voice, claim verification, gap diagnosis
  • Template: question-driven article (Question → direct Answer → Evidence → adjacent Questions)
  • Lead DE Agents: content strategist, writer, AEO analyst
  • Measure Method (primary): Accuracy — citations are worthless if misattributed
  • QA Agent: editor + cite-tracker
  • KPI: cited-and-accurate answer count per monitored prompt set per week
  • Good: cited in 3+ engines per target query, all claims attributable
  • Bad: not cited, OR cited inaccurately, OR cited but brand misnamed
  • Decision: re-brief on accuracy fail; re-structure on no-citation

Task Rows (Tier 2) — 6 rows, every column filled

Task 1 — Question discovery

  • DE Agent (today): content strategist
  • Agent Type: Collaborator
  • Tool: prompt-monitoring service + ICP interview notes + SEO keyword tool
  • Skill: question framing, ICP empathy
  • Prompt: "Given ICP segment X this quarter, list the questions they would ask Claude / ChatGPT / Perplexity / Gemini that our brand could credibly answer. Tag each by intent (research / compare / buy / troubleshoot)."
  • Template: question list with intent classification and ICP-source attribution
  • QA Agent: content strategist self-review
  • KPI: validated questions per week
  • Good: 10+ ICP-verified questions, each with a named source
  • Bad: speculative list with no ICP source; questions our brand cannot credibly answer
  • Decision when Bad: re-run ICP interview round before proceeding to Task 2

Task 2 — Answer structuring

  • DE Agent (today): writer
  • Agent Type: Collaborator
  • Tool: writing surface + internal research index + brand voice guide
  • Skill: direct-answer prose, evidence citation, brand voice fidelity
  • Prompt: "Answer Q in one paragraph the AI can lift verbatim. Back the answer with three evidence pieces, each with a linked source. Close with two adjacent questions the reader would ask next."
  • Template: Question → direct Answer (≤80 words) → three Evidence pieces (each cited) → adjacent Questions
  • QA Agent: editor + fact-checker
  • KPI: % claims with linked source
  • Good: every claim sourced; direct answer present and lift-able; brand voice intact
  • Bad: hedged answer; uncited claims; voice flat or off-brand
  • Decision when Bad: editor returns to writer for one revision pass; second failure escalates to strategist for re-brief
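
The Task 2 template is regular enough that its shape can be checked mechanically before the editor reads a draft. The following is a minimal sketch under stated assumptions: the class names, field names, and `lint_answer` function are hypothetical, and the check covers structure only (word limit, evidence count, linked sources, adjacent questions). Hedging and voice still need the editor.

```python
from dataclasses import dataclass, field


@dataclass
class Evidence:
    claim: str
    source_url: str  # empty string means the claim has no linked source


@dataclass
class StructuredAnswer:
    question: str
    direct_answer: str
    evidence: list = field(default_factory=list)            # list of Evidence
    adjacent_questions: list = field(default_factory=list)  # list of str


def lint_answer(piece, max_words=80):
    """Return template violations; an empty list means the shape is met."""
    problems = []
    if len(piece.direct_answer.split()) > max_words:
        problems.append(f"direct answer exceeds {max_words} words")
    if len(piece.evidence) != 3:
        problems.append("expected exactly three evidence pieces")
    if any(not e.source_url for e in piece.evidence):
        problems.append("evidence piece without a linked source")
    if len(piece.adjacent_questions) != 2:
        problems.append("expected two adjacent questions")
    return problems


def pct_claims_sourced(piece):
    """Task 2 KPI for one piece: percentage of evidence claims with a linked source."""
    if not piece.evidence:
        return 0.0
    sourced = sum(1 for e in piece.evidence if e.source_url)
    return 100.0 * sourced / len(piece.evidence)


# Illustrative draft with one uncited claim (made-up content).
draft = StructuredAnswer(
    question="What is AEO?",
    direct_answer="AEO is to LLM answers what SEO is to Google results.",
    evidence=[
        Evidence("claim A", "https://example.com/a"),
        Evidence("claim B", "https://example.com/b"),
        Evidence("claim C", ""),
    ],
    adjacent_questions=["How is AEO measured?", "Which engines matter?"],
)
print(lint_answer(draft))
print(round(pct_claims_sourced(draft), 1))
```

A non-empty result from `lint_answer` would map to the "Bad" row above and send the draft back for its one revision pass.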

Task 3 — Citation hygiene

  • DE Agent (today): technical SEO
  • Agent Type: Tasker
  • Tool: CMS + schema.org validator + canonical URL checker
  • Skill: structured data, schema.org Article + FAQPage authoring
  • Prompt: apply schema template per article type — deterministic, no judgment
  • Template: JSON-LD block per article with Article schema (headline, author, datePublished) + FAQPage schema for any Q→A blocks
  • QA Agent: schema validator (automated)
  • KPI: % articles with valid schema on first publish
  • Good: 100% valid schema, all canonical URLs resolve, author entity attributable
  • Bad: missing schema, invalid schema, canonical mis-set
  • Decision when Bad: block publish until validator passes
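
The JSON-LD template in Task 3 can be sketched as a small deterministic builder. This is an illustration, not the CMS's actual mechanism: the function name `build_json_ld` and its parameters are hypothetical, and modelling the author as a schema.org `Person` (rather than an `Organization`) is an assumption. The property names (`headline`, `author`, `datePublished`, `mainEntity`, `acceptedAnswer`) are standard schema.org vocabulary.

```python
import json


def build_json_ld(headline, author_name, date_published, qa_pairs):
    """Build Article and FAQPage JSON-LD objects for one article.

    qa_pairs is a list of (question, answer) tuples taken from the Q->A blocks.
    The FAQPage object is emitted only when the article has Q->A blocks.
    """
    article = {
        "@context": "https://schema.org",
        "@type": "Article",
        "headline": headline,
        # Assumption: the author is a person; an Organization is also valid.
        "author": {"@type": "Person", "name": author_name},
        "datePublished": date_published,  # ISO 8601 date string
    }
    blocks = [article]
    if qa_pairs:
        blocks.append({
            "@context": "https://schema.org",
            "@type": "FAQPage",
            "mainEntity": [
                {
                    "@type": "Question",
                    "name": question,
                    "acceptedAnswer": {"@type": "Answer", "text": answer},
                }
                for question, answer in qa_pairs
            ],
        })
    return blocks


# Illustrative values only.
blocks = build_json_ld(
    headline="What is AEO?",
    author_name="Example Author",
    date_published="2024-01-08",
    qa_pairs=[("What is AEO?", "AEO is to LLM answers what SEO is to Google results.")],
)
for block in blocks:
    print(json.dumps(block, indent=2))
```

Each printed object would be embedded in the page and then run through the schema validator, which stays the gate that blocks publish.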

Task 4 — Publish + ping

  • DE Agent (today): publisher
  • Agent Type: Automator
  • Tool: CMS + sitemap submission + cross-channel notify (RSS, social, newsletter)
  • Skill: distribution sequencing
  • Prompt: publish checklist — deterministic workflow rule
  • Template: publish-state record with timestamps per channel
  • QA Agent: publish verifier (automated)
  • KPI: time-to-first-engine-crawl
  • Good: under 24h to first AI engine crawl across all monitored engines
  • Bad: over 7 days; any channel silently failed
  • Decision when Bad: investigate channel; re-ping; flag chronic failure to engineering

Task 5 — Answer-engine monitoring

  • DE Agent (today): AEO analyst
  • Agent Type: Tasker today, Automator at scale (the first-target nomination)
  • Tool: prompt-test runner across LLM APIs
  • Skill: prompt design, citation parsing, multi-engine result normalisation
  • Prompt: "Run prompt set P weekly across the four engines. For each run, log: was our brand cited, in what position, with what attribution, with what accuracy."
  • Template: citation log row per prompt × engine × date with attribution + accuracy fields
  • QA Agent: strategist reviews accuracy column on weekly digest
  • KPI: cited-and-accurate citation count per week
  • Good: cited in 3+ engines per target query, attribution correct
  • Bad: not cited; cited but brand misnamed; cited inaccurately
  • Decision when Bad: Bad rows feed Task 6; chronic Bad rows escalate to strategist for re-brief
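
The citation log and its KPI lend themselves to a small data structure. The sketch below is one possible shape, assuming hypothetical names (`CitationLogRow`, `classify_prompts`) and one loud assumption the source does not settle: a prompt cited cleanly in fewer than three engines is labelled Bad, since the source defines only the Good threshold and three Bad cases.

```python
from collections import defaultdict
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass(frozen=True)
class CitationLogRow:
    prompt: str
    engine: str
    run_date: date
    cited: bool
    attribution_correct: bool
    accurate: bool
    position: Optional[int] = None  # rank of the citation in the answer, if cited


def _clean(row):
    """A row counts toward the KPI only if cited, correctly attributed, and accurate."""
    return row.cited and row.attribution_correct and row.accurate


def cited_and_accurate_count(rows):
    """Task 5 KPI over one weekly run."""
    return sum(1 for r in rows if _clean(r))


def classify_prompts(rows, min_engines=3):
    """Label each prompt Good or Bad for one weekly run.

    Assumption: anything short of the Good threshold is treated as Bad.
    """
    by_prompt = defaultdict(list)
    for r in rows:
        by_prompt[r.prompt].append(r)
    labels = {}
    for prompt, group in by_prompt.items():
        clean_engines = {r.engine for r in group if _clean(r)}
        flawed = any(r.cited and not _clean(r) for r in group)
        ok = len(clean_engines) >= min_engines and not flawed
        labels[prompt] = "Good" if ok else "Bad"
    return labels


# Made-up rows for illustration.
week = date(2024, 1, 8)
rows = [
    CitationLogRow("what is aeo", "Claude", week, True, True, True, 1),
    CitationLogRow("what is aeo", "ChatGPT", week, True, True, True, 2),
    CitationLogRow("what is aeo", "Perplexity", week, True, True, True, 1),
    CitationLogRow("what is aeo", "Gemini", week, False, False, False),
    CitationLogRow("aeo vs seo", "Claude", week, True, False, True, 3),  # brand misnamed
]
print(cited_and_accurate_count(rows))
print(classify_prompts(rows))
```

Prompts labelled Bad are the rows that would feed Task 6.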

Task 6 — Feedback loop

  • DE Agent (today): content strategist
  • Agent Type: Collaborator
  • Tool: citation log + brief queue + analytics dashboard
  • Skill: gap diagnosis, root-cause inference, re-brief prioritisation
  • Prompt: "Given the citation log for the last 30 days, identify prompts where citation is dropping or accuracy is failing. For each, name the likely cause and the re-brief that would fix it."
  • Template: re-brief queue with prompt, gap, hypothesis, recommended brief
  • QA Agent: editor verifies re-brief quality before it enters production
  • KPI: % re-briefs that move citation count within 30 days
  • Good: ≥60% of re-briefs lift citation count within 30 days
  • Bad: re-brief shipped, no movement; diagnostic was wrong
  • Decision when Bad: retro on the failed re-brief; sharpen the diagnostic logic; update Task 6 Prompt with the new heuristic

CMSD trace of one piece

  • Created — brief (created by strategist on ICP question; per-piece trigger)
  • Manipulated — draft, edit, schema-wrap, publish-format (4 manipulations today, target 2)
  • Shared — published to site, sitemap-pinged to engines, monitored weekly across 4 engines, citation log shared with strategist
  • Deleted — superseded versions archived per evergreen-vs-dated policy

Real / Artifact / Hybrid classification

  • Hybrid: Question discovery (ICP empathy is Real, query enumeration is Artifact); Answer structuring (taste is Real, structure is Artifact); Feedback loop (diagnosis is Real, citation log query is Artifact)
  • Artifact: Citation hygiene; Publish + ping; Monitoring (when at scale)
  • Real: brand voice override (cuts across draft + edit), final accuracy sign-off

Hop count today vs target

  • Today: brief → Google Doc → editor pass → CMS paste → schema plugin → publish → manual citation check across 4 engines → spreadsheet log → re-brief = ~10 hops per piece
  • Target: brief in structured editor (schema baked in) → review → publish (auto-distribute) → automated weekly monitoring → diagnostic queue surfaces gaps → re-brief = ~4 hops
  • Delta: ~6 hops removed

First-target nomination

Task 5 (Answer-engine monitoring). Currently a manual weekly chore consuming senior strategist time; cleanest Artifact win; produces the telemetry that feeds Task 6 and validates the whole workflow's success metric.

Logic-gap list

Task 6 (Feedback loop) currently runs on the strategist's intuition. Before automation, the diagnostic logic must be written down: "if cited count = 0 and prompt is in ICP set, re-brief with X; if cited inaccurately, escalate to Y."
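
Written down, the two quoted rules might look like the sketch below. The placeholders X and Y are kept exactly as the source leaves them; the function name, its parameters, and the "no action" fallback are assumptions added for illustration.

```python
def diagnose(prompt, cited_count, inaccurate_count, icp_prompts):
    """Return the next action for one monitored prompt.

    Rules are checked in the order the source lists them; the first match wins.
    X and Y are the source's own placeholders and are deliberately left unfilled.
    """
    if cited_count == 0 and prompt in icp_prompts:
        return "re-brief with X"
    if inaccurate_count > 0:
        return "escalate to Y"
    # Assumption: prompts matching neither rule need no action this cycle.
    return "no action"


icp_prompts = {"what is aeo", "aeo vs seo"}  # made-up ICP prompt set
print(diagnose("what is aeo", cited_count=0, inaccurate_count=0, icp_prompts=icp_prompts))
print(diagnose("aeo vs seo", cited_count=2, inaccurate_count=1, icp_prompts=icp_prompts))
```

Keeping the rules in one function gives each retro a single, reviewable place to add a new heuristic.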

Teaching takeaway

AEO walks the novice through all four verbs, all four agent types, all three measurement axes (Accuracy primary, Timeliness and Friction secondary), the full CMSD trace, the full classification, and the hop-count delta — in one familiar workflow. Once the reader has walked AEO, they can walk their own workflow with the same scaffold.


Example 2 — Client Onboarding (Gil case)

AEO teaches the matrix shape. Client onboarding teaches the scorecard and the central thesis: data quality does not matter, data movement matters. The contract data is already clean; the problem is the relay race. Every services business has this workflow.

Pick the flow

client-onboard--activate — proposal-signed event to onboarded-client state.

The trace (10 hops today)

  1. Proposal signed in e-sign tool → notification fires
  2. Sales rep manually updates CRM record (Created)
  3. Sales rep alerts onboarding coordinator (Gil) via chat (Shared)
  4. Gil pulls contract from e-sign, copies PDF to shared drive (Manipulated)
  5. Gil drafts welcome email referencing contract terms (Manipulated)
  6. Gil schedules kickoff call via calendar tool (Shared)
  7. Gil opens new project in project-management tool, copies fields from CRM (Manipulated — same data, third system)
  8. Gil drafts welcome packet, pulls templates from shared drive (Manipulated)
  9. Gil sends welcome email + packet to client (Shared)
  10. Gil hands off to project manager (Shared) — PM repeats steps 7–8 in their own view

Tools touched per unit: 5–6 (e-sign, CRM, chat, shared drive, calendar, project-management tool).

Classification of the 10 hops

  • Pure Artifact hops: 8 (steps 1, 2, 3, 4, 6, 7, 8, 10 — pure data movement, copy, scheduling, hand-off; no judgment, no brand voice carried)
  • Hybrid hops: 2 (step 5 welcome email draft is Artifact-assembly with Real-voice approval needed before send; step 9 send is the Real moment where Gil's voice and the firm's relationship signal lands on the client)
  • Pure Real hops: 0 today — but a redesign that strips the Artifact assembly out of step 5 surfaces the Real send moment as a clean, isolated human touchpoint

Waste ratio: Artifact-eliminable hops ÷ total hops = (8 + 2 × 0.5) ÷ 10 = 0.9 (Hybrid hops count as half — their assembly half is eliminable; their judgment half stays human).

The simpler shorthand most engagements use: waste ratio ≈ 1.0 because every hop today contains Artifact work, and the Real work that survives the redesign is the part that should survive — it is not waste in the current process either; it is the part being protected.
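
The waste-ratio arithmetic can be reproduced from the hop classification. A minimal sketch, assuming hypothetical names (`HOP_CLASSES`, `ELIMINABLE_SHARE`, `waste_ratio`); the step-to-class mapping and the half weight for Hybrid hops come from the text above.

```python
# Classification of the 10 hops in the onboarding trace, keyed by step number.
HOP_CLASSES = {
    1: "artifact", 2: "artifact", 3: "artifact", 4: "artifact", 5: "hybrid",
    6: "artifact", 7: "artifact", 8: "artifact", 9: "hybrid", 10: "artifact",
}

# Share of each hop type that a redesign can eliminate.
# Hybrid hops count as half: the assembly half goes, the judgment half stays.
ELIMINABLE_SHARE = {"artifact": 1.0, "hybrid": 0.5, "real": 0.0}


def waste_ratio(hop_classes):
    """Artifact-eliminable hops divided by total hops."""
    eliminable = sum(ELIMINABLE_SHARE[c] for c in hop_classes.values())
    return eliminable / len(hop_classes)


print(waste_ratio(HOP_CLASSES))  # (8 + 2 * 0.5) / 10
```

Reclassifying a hop as Real lowers the ratio, which makes the result sensitive to how honestly the Real moments are identified.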

Scorecard — stated assumption + derived numbers

Stated assumption: average 6 onboardings per month (midpoint of the 4–8 observed range). All annual figures below derive from this number. Change the assumption, re-derive.

  • Volume: 6 onboardings/month → 72/year
  • Time per unit: 4.5h clock time (of which ~1h is Gil's senior capacity)
  • Annual direct hours consumed: 72 × 4.5h = 324h/yr
  • Annual senior hours consumed: 72 × 1h = 72h/yr
  • Senior-time share: 72 ÷ 324 = 22% of the workflow's total touched time
  • Error / rework rate: ~15% (contract terms misread, kickoff missed, packet stale)
  • Hop count: 10
  • Artifact / Hybrid / Real split: 8 / 2 / 0
  • Waste ratio: 0.9 (shorthand ≈ 1.0)
  • Tool count: 6
  • SSOT gap count: 3 (CRM, drive, PM tool each claim authority for client state)
  • Logic status: tacit (lives in Gil's head)
  • Data availability: 100% (every field already exists somewhere)
  • Integration difficulty: low–medium (all major SaaS, all with APIs)
  • Trust / customer-risk: medium (client-facing; first impression)
  • Expected cycle-time delta: 4.5h → 5min per unit (~54× faster)
  • Expected capacity reclaimed: up to 324h/yr of direct time, of which 72h/yr is senior time (the senior hours sit inside the 324h, not on top of it); Gil redeployed to higher-leverage work
  • Confidence: HIGH — owner-attested numbers, trace evidence, every system named
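
Because every annual figure derives from the stated assumption, re-deriving is a few lines of arithmetic. The sketch below assumes a hypothetical function name and parameter names; the default values are the ones stated in the scorecard.

```python
def derive_scorecard(onboardings_per_month=6, hours_per_unit=4.5,
                     senior_hours_per_unit=1.0, target_minutes_per_unit=5):
    """Derive the annual scorecard figures from the per-unit assumptions."""
    units_per_year = onboardings_per_month * 12
    direct_hours = units_per_year * hours_per_unit
    senior_hours = units_per_year * senior_hours_per_unit
    return {
        "units_per_year": units_per_year,
        "annual_direct_hours": direct_hours,
        "annual_senior_hours": senior_hours,
        # Senior hours are a subset of direct hours, so this is a share, not an add-on.
        "senior_share": senior_hours / direct_hours,
        "speedup": hours_per_unit * 60 / target_minutes_per_unit,
    }


base = derive_scorecard()
print(base)

# Re-derive at the ends of the observed 4-8 range.
for monthly in (4, 8):
    print(monthly, derive_scorecard(onboardings_per_month=monthly)["annual_direct_hours"])
```

At the ends of the observed range, the annual direct hours come out at 216h (4 per month) and 432h (8 per month), which brackets the 324h midpoint figure.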

Pilot Fit — running the formula

Pilot Fit = Pain × Waste × Readiness × Confidence × Risk Modifier
= 4.5 × 5.0 × 4.5 × 5.0 × 0.75
≈ 379.7

Per-pillar scores against the rubric:

  • Pain 4.5/5 — annual hours 324h (rubric 4), senior share 22% (rubric 4), error rate 15% (rubric 4), per-unit time 4.5h (rubric 5). The plain average across the populated sub-axes is 4.25, rounded up to the nearest half point to give 4.5.
  • Waste 5/5 — waste ratio 0.9 (rubric 5), hop count 10 (rubric 5), tool count 6 (rubric 5), SSOT gaps 3 (rubric 5). Pillar = 5.
  • Readiness 4.5/5 — logic tacit but interview-able (rubric 4), data 100% available (rubric 5), integrations available (rubric 4), capabilities present (rubric 5). Average ≈ 4.5.
  • Confidence 5/5 — every number owner-attested + trace evidence on the page.
  • Risk Modifier 0.75 — client-facing first impression, rated medium on Trust / customer-risk.
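
The Pilot Fit product can be checked with a short script. This sketch assumes a hypothetical function name, `pilot_fit`; the five inputs are the pillar scores as listed above.

```python
from math import prod


def pilot_fit(pain, waste, readiness, confidence, risk_modifier):
    """Pilot Fit = Pain x Waste x Readiness x Confidence x Risk Modifier."""
    return prod([pain, waste, readiness, confidence, risk_modifier])


gil_case = pilot_fit(pain=4.5, waste=5.0, readiness=4.5, confidence=5.0, risk_modifier=0.75)
print(gil_case)  # 379.6875

# Same pillars with no risk discount, to show how much the modifier costs.
print(pilot_fit(4.5, 5.0, 4.5, 5.0, 1.0))  # 506.25
```

Because the formula is a pure product, a low score on any single pillar drags the whole row down, which is how it encodes "not just painful".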

This row dominates almost every other workflow a services business will trace. The formula encodes the rule: not just painful, but waste-heavy, ready, evidenced, and not existentially risky. It is the textbook first pilot.

What disappears in the AI-native redesign

  • Steps 2, 4, 7 — manual data copy between systems (replaced by one-way sync to central store)
  • Step 3 — chat alert (replaced by event trigger)
  • Step 5 — welcome email draft (replaced by templated draft auto-filled from event, human approval pre-send)
  • Step 8 — welcome packet assembly (replaced the same way as step 5: templated packet auto-filled from the event, human approval pre-send)
  • Step 10 — PM rework (eliminated; PM views same record, no re-entry)

What stays human (the Real and Real-half-of-Hybrid)

  • Welcome email send — the Real half of Hybrid steps 5 and 9. Gil approves the auto-drafted email before it is sent; the firm's voice and relationship signal still land via Gil, not via an agent
  • Exception handling on novel contract terms — judgment outside the documented logic; routed to Gil or escalation path
  • The kickoff call itself — pure Real; trust transfer between humans
  • First-impression accountability — someone the client knows owns the welcome end-to-end, even if assembly is automated

Target state

4 hops, 1 tool surface (the workflow engine), Gil's time reclaimed for higher-leverage work, and no duplicate data entry by the PM. The scorecard's Expected Outcome columns become the success metrics in the AI Strategy Meeting Settle phase.

Teaching takeaway

AEO showed how to populate the matrix. Client onboarding shows how the scorecard turns the matrix into a defensible first-pilot pick. Together they cover both halves of the kick-off output: the structured map of the work, and the ranked recommendation that drives the funding decision.


Questions

If you applied the AEO matrix shape to a workflow your team runs weekly — which column would be hardest to fill honestly, and what does that reveal?

  • If you ran the Gil scorecard on your three most painful workflows, would the highest-Pilot-Fit row match the workflow your leadership team would pick by intuition?
  • Which of your workflows has the highest waste ratio — and which Real moments inside it have to be protected when the Artifact assembly is automated?
  • In your business, who is the Gil — the senior specialist quietly absorbing the relay race that should not exist?