

How did we arrive at this proposal — and how do we track that value is delivered?

Map | Question | Key Finding
Outcome Map | What does success look like? | 50%+ tasks via conversation. 0% today. Kill signal: <10% after 30 days.
Value Stream Map | Where does time die? | 4-module navigation per task. Copy-paste PDF to form. 1x1 modality out of 49 possible.
Dependency Map | What must happen first? | Agent Platform (identity + memory) and Identity & Access (auth) block everything.
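
The Outcome Map's thresholds can be checked mechanically. The sketch below is one possible way to compute the conversational task rate and compare it with the 50% target and the kill signal. The `TaskEvent` record, its `via_conversation` flag, and the status labels are hypothetical names introduced for illustration. Reading "after 30 days" as "30 or more days since launch" is also an assumption.

```python
from dataclasses import dataclass

# Thresholds taken from the Outcome Map row above.
TARGET_RATE = 0.50       # success: 50%+ of tasks completed via conversation
KILL_RATE = 0.10         # kill signal: rate below 10% ...
KILL_AFTER_DAYS = 30     # ... once 30 days have passed (assumed: >= 30 days)


@dataclass
class TaskEvent:
    """Hypothetical record of one completed task."""
    task_id: str
    via_conversation: bool  # True if the task was completed through the agent


def conversational_task_rate(events: list[TaskEvent]) -> float:
    """Share of completed tasks that went through conversation."""
    if not events:
        return 0.0
    return sum(e.via_conversation for e in events) / len(events)


def outcome_status(rate: float, days_since_launch: int) -> str:
    """Classify the current rate against the target and the kill signal."""
    if rate >= TARGET_RATE:
        return "target met"
    if days_since_launch >= KILL_AFTER_DAYS and rate < KILL_RATE:
        return "kill signal"
    return "keep measuring"


if __name__ == "__main__":
    # 1 conversational task out of 20 gives a 5% rate.
    events = [TaskEvent(f"t{i}", via_conversation=(i == 0)) for i in range(20)]
    rate = conversational_task_rate(events)
    print(rate, outcome_status(rate, days_since_launch=30))
```

With 1 conversational task out of 20, the rate is 5%, which trips the kill signal at day 30 but only yields "keep measuring" at day 10.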

Templates: docs/pictures/patterns/

The Bridge

Pictures sit between sales and engineering. The prompt deck sells the vision in five cards. The spec builds the machine. These three maps prove the thinking is sound.

Prompt Deck (sales) | Pictures (bridge) | Spec (engineering)
"Forms serve databases, not users" | Outcome: 50%+ conversational task rate, 3 jobs | T0: Text chat + session memory + feature flag
"50% tasks via conversation" | Value Stream: 4-module navigation, 1x1 modality | T1: Modality router + WorkChart adapter + normaliser
"WorkCharts built, wiring missing" | Dependency: Agent Platform + Identity block all work | T2: File upload + streaming progress + rich results
"Text first, voice second" | Outcome: text MVP proves value before adding voice | T3: CRM query adapter + conversation history
"Colleague, not chatbot" | Value Stream: broken promises = AI skepticism | T4: Voice input (STT)

Context

Questions

What is the most important visual missing from the multimodal agent interface picture set — and why does it matter?

  • Which relationship between elements in this diagram is most underspecified — and what would happen if it were wrong?
  • If this picture were shown to a new engineer on day one, what would they misunderstand — and how should the picture be changed?
  • What assumption does this visual make that should be made explicit in the spec?