
Protocol Coverage

When deploying agent commerce, every protocol interface must be provably correct — coverage maps protocols to contracts to tests so trust is computed, not claimed.

1,500

Priority Score

Pain × Demand × Edge × Trend × Conversion

Customer Journey

Why should I care?

Five cards that sell the dream

1. Why

Trust is computed, not claimed.

When an agent spends money on your behalf, what proves it stayed within bounds?

The friction: 49 protocols documented. Only 20 tested. The Commerce Authorization Chain has gaps at every stage after USER AUTH. Trust exists in documentation, not in evidence.

The desire: Every protocol interface provably correct. Coverage computed from test results, not hand-counted from markdown. The chain verified end-to-end.

The proof: 20 protocols already pass with Zod contracts. The pattern works. 29 more need the same treatment.

The 11 gaps that matter most

Picture

A control room with 49 protocol indicators — 20 green, 29 dark. The Commerce Authorization Chain glowing as a red thread through the center. Cinematic, 16:9

2. Evidence

41% tested. 59% unknown.

Would you trust a bridge where 59% of the welds were uninspected?

Domain	Coverage
Agent Communication	63% (5/8)
Verifiable Intent	0% (0/6)
Payment Execution	17% (1/6)
Identity	40% (2/5)
On-Chain Trust	0% (0/4)

The maps that reveal the gap

Picture

A dashboard showing five domain bars — two partially filled in green, three empty with red outlines. Numbers prominent. Dark background. Cinematic, 16:9

3. Platform

Contract-first, three tiers.

What if every test started with the shape of success before writing a line of code?

The pattern: Zod schema defines the contract. Test validates the contract against the real service. Trophy layer (L1-L3) assigned automatically. Coverage computed from results.

Tier	Tests	Entry point
Orchestration	Direct import	IntentTestClient
Protocol	HTTP/JSON-RPC	authenticatedFetch
Network	Remote discovery	/.well-known/
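
A minimal sketch of the contract-first pattern described above. To stay dependency-free it uses a small hand-written `Contract` type whose `safeParse` result mirrors the shape of Zod's; in the real suite a Zod schema would take its place. The `AgentCard` shape, its field names, and the sample payloads are hypothetical.

```typescript
// Result shape modelled on Zod's safeParse: a success flag plus data or an error.
type ParseResult<T> =
  | { success: true; data: T }
  | { success: false; error: string };

interface Contract<T> {
  safeParse(input: unknown): ParseResult<T>;
}

// Hypothetical response shape for an agent discovery endpoint.
interface AgentCard {
  agentName: string;
  capabilities: string[];
}

// The contract is written first: it states what a successful response looks like.
const agentCardContract: Contract<AgentCard> = {
  safeParse(input: unknown): ParseResult<AgentCard> {
    if (typeof input !== "object" || input === null) {
      return { success: false, error: "expected an object" };
    }
    const record = input as Record<string, unknown>;
    const agentName = record["agentName"];
    const capabilities = record["capabilities"];
    if (typeof agentName !== "string") {
      return { success: false, error: "agentName must be a string" };
    }
    if (!Array.isArray(capabilities) || !capabilities.every((c) => typeof c === "string")) {
      return { success: false, error: "capabilities must be a string array" };
    }
    return { success: true, data: { agentName, capabilities: capabilities as string[] } };
  },
};

// The spec then validates whatever the service returned against the contract.
function specPasses<T>(contract: Contract<T>, response: unknown): boolean {
  return contract.safeParse(response).success;
}

// Hypothetical payloads: one matches the contract, one is missing a field.
const goodResponse: unknown = { agentName: "pricing-agent", capabilities: ["quote"] };
const badResponse: unknown = { agentName: "pricing-agent" };
```

In a real spec the payload would come from the tier's entry point (for example the response returned through `authenticatedFetch`) rather than from a literal.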

The value stream from spec to trust

Picture

Three stacked layers — Orchestration (in-process), Protocol (HTTP), Network (remote) — each with Zod schemas connecting them. Dark stage. Cinematic, 16:9

4. Edge

Commerce Authorization Chain.

Can you trace one transaction from human intent to on-chain proof?

Every agent-to-agent transaction requires this chain to be provably correct. Three links are broken. The chain is only as strong as its weakest.

Stage	What it proves	State
USER AUTH	Who authorized	Partial
INTENT	What they said	Missing
ACTION	What agent did	Partial
SETTLEMENT	Value moved	Missing
AUDIT	Proof exists	Missing
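
The "weakest link" rule can be sketched as a small check over the stage table above. The stage names and states are copied from the table; the `Verified` state and the numeric ranking are assumptions added for illustration.

```typescript
type StageState = "Verified" | "Partial" | "Missing";

interface ChainStage {
  stage: string;
  proves: string;
  state: StageState;
}

// States copied from the table; "Verified" is an assumed end state not yet reached.
const chain: ChainStage[] = [
  { stage: "USER AUTH", proves: "Who authorized", state: "Partial" },
  { stage: "INTENT", proves: "What they said", state: "Missing" },
  { stage: "ACTION", proves: "What agent did", state: "Partial" },
  { stage: "SETTLEMENT", proves: "Value moved", state: "Missing" },
  { stage: "AUDIT", proves: "Proof exists", state: "Missing" },
];

// Rank the states so the weakest one can be found (assumed ordering).
const strength: Record<StageState, number> = { Missing: 0, Partial: 1, Verified: 2 };

// The chain as a whole is only as strong as its weakest stage.
function weakestState(stages: ChainStage[]): StageState {
  return stages.reduce<StageState>(
    (weakest, s) => (strength[s.state] < strength[weakest] ? s.state : weakest),
    "Verified",
  );
}

// Stages with no evidence at all: the broken links.
function missingStages(stages: ChainStage[]): string[] {
  return stages.filter((s) => s.state === "Missing").map((s) => s.stage);
}
```

With the states in the table, `weakestState(chain)` is `"Missing"` and `missingStages(chain)` lists INTENT, SETTLEMENT, and AUDIT.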

The dependency map

Picture

Five chain links — USER AUTH, INTENT, ACTION, SETTLEMENT, AUDIT — three links broken, two intact. Red thread connecting them. Cinematic, 16:9

5. Metric

80% coverage or kill.

What percentage of trust would you accept as 'good enough'?

North star: move protocol coverage from 41% to 80% within 60 days. If coverage has not reached 60% by then, the approach is too slow.

Measure	Current	Target
Protocol coverage	41%	> 80%
Commerce chain stages	2/5	5/5
Coverage computation	Manual	Automated

The kill signal and thresholds

Picture

A gauge showing 41% in amber, with an 80% threshold line in green. Below the gauge, a countdown timer. Dark panel. Cinematic, 16:9


Same five positions. Different seat.

The customer sees trust as a guarantee. The builder sees trust as a test suite. The Commerce Authorization Chain is both the product promise and the engineering spec.

Feature Dev Journey

How does this get built?

Five cards that sell the process

1. Job

Agents transact. Prove it works.

Can you state the struggling moment in one sentence?

Situation: 49 protocols mapped. 20 tested. Commerce Authorization Chain has untested links. Trust is claimed in docs, not computed from evidence.

Intention: Every protocol has a Zod contract and a passing test spec. Coverage computed and surfaced. Chain verified end-to-end.

Obstacle: 29 missing tests. 10 missing contracts. External deps (FIDO, Sui testnet) can't be mocked.

Intent contract

Picture

A single card illuminated on a dark stage — the text reads 'When agents transact, every protocol must be provably correct' in crimson. Cinematic, 16:9

2. Stories

11 gaps across 5 domains.

Which gap would cause the most damage if it failed silently?

Group	Stories	Critical
Can agents communicate?	S1-S3	MCP has zero tests
Can agents pay and settle?	S4-S7	Verifiable Intent has zero tests
Can we prove coverage?	S8-S11	No computation

Full story contracts

Picture

A test dashboard — 11 story IDs, each with a red indicator and domain label. Dark background, terminal aesthetic. Cinematic, 16:9

3. Build

Contract, spec, trophy, coverage.

What's already built that you haven't wired yet?

Exists	Build
20 passing protocol tests	29 more contracts + specs
3-tier test architecture	External infra (FIDO, Sui)
Zod contract-first pattern	Coverage computation script
8 protocol docs in Dream repo	Bridge docs to test results

Platform state

Picture

Four pipeline stages — Contract, Spec, Trophy, Coverage — with 20 items flowing through and 29 queued. Cinematic, 16:9

4. Measure

PROTOCOL-COVERAGE.md as truth.

If coverage isn't computed, how do you know it's real?

The coverage map is the decision surface. Today it's hand-counted. Tomorrow it's computed from test results. The script IS the measurement.

Today	Target
Open PROTOCOL-COVERAGE.md	Run coverage script
Count rows manually	Computed per domain
Summary may not match tables	Single source of truth
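
A minimal sketch of what "computed per domain" could look like, assuming test results are available as one record per protocol. The `ProtocolResult` shape and the function name are hypothetical; the sample rows use the Identity and On-Chain Trust protocols named elsewhere in this document.

```typescript
interface ProtocolResult {
  domain: string;
  protocol: string;
  passed: boolean; // false also covers "no test exists yet"
}

interface DomainCoverage {
  tested: number;
  total: number;
  percent: number; // rounded to the nearest whole percent
}

// Group results by domain and derive the percentage from the counts,
// so the summary cannot drift from the underlying rows.
function coverageByDomain(results: ProtocolResult[]): Record<string, DomainCoverage> {
  const byDomain: Record<string, DomainCoverage> = {};
  for (const r of results) {
    const entry = byDomain[r.domain] ?? { tested: 0, total: 0, percent: 0 };
    entry.total += 1;
    if (r.passed) entry.tested += 1;
    entry.percent = Math.round((entry.tested / entry.total) * 100);
    byDomain[r.domain] = entry;
  }
  return byDomain;
}

// Sample input: Identity has 2 of 5 protocols tested, On-Chain Trust 0 of 4.
const sampleResults: ProtocolResult[] = [
  { domain: "Identity", protocol: "Better Auth RBAC", passed: true },
  { domain: "Identity", protocol: "API Key", passed: true },
  { domain: "Identity", protocol: "zkLogin", passed: false },
  { domain: "Identity", protocol: "Worldcoin", passed: false },
  { domain: "Identity", protocol: "W3C VCs", passed: false },
  { domain: "On-Chain Trust", protocol: "Sui escrow", passed: false },
  { domain: "On-Chain Trust", protocol: "Settlement", passed: false },
  { domain: "On-Chain Trust", protocol: "Attestation", passed: false },
  { domain: "On-Chain Trust", protocol: "Data provenance", passed: false },
];
```

Because the percentage is derived from the same records as the counts, there is no separate summary to keep in sync with the tables.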

Measurement stories

Picture

A terminal showing computed coverage percentages per domain, with a diff showing changes. Green numbers replacing red. Cinematic, 16:9

5. Loop

Each test proves one chain link.

What's the feedback loop between a test passing and trust increasing?

The loop: write contract → write spec → spec fails (RED) → implement → spec passes (GREEN) → trophy assigned → coverage % increases → chain gets stronger.

Legacy rule: Each completed protocol test improves the contract pattern for the next domain. The testing template gets smarter with each pass.

The loop that compounds trust

Picture

A chain with five links — each test that passes lights up one link. The chain getting stronger with each green indicator. Cinematic, 16:9


The pitch names the gap. The flow diagrams prove the dependency. The VV stories validate each link.


Problem

Situation

49 protocols mapped across 5 domains. 20 tested (41%). Commerce Authorization Chain has untested links at every stage after USER AUTH. Verifiable Intent has zero tests. On-Chain Trust has zero tests. Dream repo documents 8 agent protocols but no bridge to engineering verification results. Trust is claimed in documentation, not computed from test evidence.

Intention

Every protocol has a Zod contract and a passing test spec. Coverage percentage computed from test results and surfaced in the feature matrix. Commerce Authorization Chain verified end-to-end: USER AUTH → INTENT CAPTURE → AGENT ACTION → SETTLEMENT → AUDIT.

Obstacle

29 protocols lack tests. 10 have no contracts at all. Three distinct test tiers (Orchestration, Protocol, Network) need different infrastructure. Verifiable Intent requires FIDO keys. On-Chain Trust requires Sui testnet. Payment Execution requires x402 handshake servers. Each domain has different external dependencies.

Hardest Thing

Verifiable Intent and On-Chain Trust require real external infrastructure — FIDO authenticators, Sui testnet wallets, escrow contracts. A test that mocks the protocol is not a protocol test. The hardest part is standing up real infrastructure for protocols that don't have commodity test tooling yet.

Scorecard

Priority (5P)

Dimension	Score
Pain	5/5
Demand	4/5
Edge	5/5
Trend	5/5
Convert	3/5
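
The headline Priority Score of 1,500 follows from these five scores under the stated formula (Pain × Demand × Edge × Trend × Conversion):

```latex
\text{Priority} = \text{Pain} \times \text{Demand} \times \text{Edge} \times \text{Trend} \times \text{Conversion}
               = 5 \times 4 \times 5 \times 5 \times 3 = 1500
```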

Readiness (5R)

Dimension	Score
Principles	4/5
Performance	2/5
Platform	4/5
Process	3/5
Players	2/5

What Exists

Component	State	Detail
intents-e2e test app	Working	3-tier architecture (Orchestration, Protocol, Network). 20 passing tests. Zod contract-first pattern.
A2A Protocol tests (Domain 1)	Partial	5/8 tested. Multi-agent chain, capability search, MCP missing.
Auth tests (Domain 4)	Partial	Better Auth RBAC + API Key tested. zkLogin, Worldcoin, W3C VCs not started.
Stripe webhook test (Domain 3)	Working	1/6 payment protocols tested. Payment intent, spending authority, x402 all missing.
Verifiable Intent (Domain 2)	Missing	0/6 tested. Intent capture, instruction fidelity, audit trail, identity binding all TODO or NOT STARTED.
On-Chain Trust (Domain 5)	Missing	0/4 tested. Sui escrow, settlement, attestation, data provenance all missing.
Coverage computation	Missing	Coverage is hand-counted from PROTOCOL-COVERAGE.md. No script computes percentage from test results.
HITL oversight patterns	Missing	3 oversight patterns documented but untested. Human-in-the-loop, human-on-the-loop, automated gates.
MCP documentation	Working	Full docs page with architecture, transport, lifecycle, adoption. Previously a 12-line stub.
MCP servers (active)	Working	4 MCP servers in active use (search, GitHub, database, image generation). L2 evidence: the protocol is live but not formally tested.

Relationships

PRD	Contributes
Agent Platform	Peer: Agent Platform is the PUMP (identity, memory, scaffolds); Protocol Coverage is the GAUGE (trust verification). Different feature IDs, same foundation.
Automated Commissioning	Enables: Protocol Coverage produces test results that Automated Commissioning consumes to compute feature states.
CLI Platform	Depends on: test commands and coverage reporting run through the unified CLI.
Prediction Game (Sui)	Enables: Prediction Game needs PROT-005 (On-Chain Trust) for atomic settlement.
Sales CRM	Enables: Sales CRM needs PROT-003 (Payment Execution) for Stripe + payment intent flows.

Kill Signal

After 60 days of active development, if protocol coverage has not moved from 41% to 60%, the contract-first approach is too slow. Switch to integration-test-only coverage or re-scope to Commerce Authorization Chain only.
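
The kill signal can be sketched as a threshold check. The 60-day window, the 60% floor, and the "> 80%" target come from this document; the verdict names and the decision to keep going when coverage sits between the floor and the target are assumptions.

```typescript
type Verdict = "target-met" | "kill" | "continue";

const WINDOW_DAYS = 60;    // evaluation window from the kill signal
const KILL_FLOOR = 60;     // percent: below this after the window, the approach is too slow
const TARGET_PERCENT = 80; // percent: the north-star target is "> 80%"

function evaluateKillSignal(coveragePercent: number, daysElapsed: number): Verdict {
  if (coveragePercent > TARGET_PERCENT) return "target-met";
  if (daysElapsed >= WINDOW_DAYS && coveragePercent < KILL_FLOOR) return "kill";
  // Assumption: inside the window, or between floor and target, work continues.
  return "continue";
}
```

A `"kill"` verdict corresponds to the fallback described above: switch to integration-test-only coverage or re-scope to the Commerce Authorization Chain.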

Questions

If trust is computed not claimed, what happens when a protocol test fails in production — does the system degrade gracefully or does the whole Commerce Authorization Chain break?

Which of the 29 untested protocols would cause the most damage if it failed silently in a real transaction?
Is contract-first testing (Zod schemas before specs) faster or slower than writing specs first and deriving contracts?
When Verifiable Intent requires FIDO keys, is the test proving the protocol or proving the FIDO implementation?