Advertising Data Flow

What data is critical in driving decision making in advertising?

Terminology

Advertising has its own nomenclature. The naming conventions reveal where power concentrates and where it shifts.

| Term | What It Means | Who Named It | What It Hides |
| --- | --- | --- | --- |
| Impression | An ad was served to a screen | Publishers | Whether a human saw it, or a bot loaded it |
| Click | Someone interacted with the ad | Platforms | Whether intent was real or accidental |
| Conversion | A desired action completed | Advertisers | Which touchpoint actually caused it |
| Attribution | Credit assigned to a touchpoint | Measurement | The model's assumptions about causation |
| Reach | Number of unique users exposed | Platforms | Self-reported, grading their own homework |
| ROAS | Revenue per dollar of ad spend | Performance | Whether the revenue would have happened anyway |
| CPM | Cost per thousand impressions | Media buyers | That "impression" and "attention" are different things |
| Reconciliation | Matching delivery to payment | Finance | 30-90 days of uncertainty treated as normal |

Web3 renames "impression" to "verified attestation" and "reconciliation" to "settlement" to reflect a shift in ownership.
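The ROAS and CPM definitions in the terminology table reduce to simple arithmetic, and so does the thing ROAS hides. The sketch below is illustrative only: the campaign figures are made up, and `baseline_revenue` (revenue that would have arrived with no ads, for example from a holdout group) is an assumption introduced here, not a number from this page.

```python
def cpm(spend: float, impressions: int) -> float:
    """Cost per thousand impressions."""
    if impressions <= 0:
        raise ValueError("impressions must be positive")
    return spend * 1000 / impressions


def roas(revenue: float, spend: float) -> float:
    """Revenue per dollar of ad spend."""
    if spend <= 0:
        raise ValueError("spend must be positive")
    return revenue / spend


def incremental_roas(revenue: float, baseline_revenue: float, spend: float) -> float:
    """ROAS counting only the revenue above the no-ads baseline."""
    return roas(revenue - baseline_revenue, spend)


# Hypothetical campaign figures, chosen to keep the arithmetic easy to follow.
spend = 5_000.0
impressions = 1_000_000
revenue = 20_000.0
baseline_revenue = 12_000.0  # assumed: revenue that would have happened anyway

campaign_cpm = cpm(spend, impressions)                        # 5.0 dollars per 1,000 impressions
reported_roas = roas(revenue, spend)                          # 4.0
true_roas = incremental_roas(revenue, baseline_revenue, spend)  # about 1.6

print(campaign_cpm, reported_roas, true_roas)
```

Under these assumed numbers the reported ROAS of 4.0 shrinks to roughly 1.6 once baseline revenue is removed, which is the gap the "What It Hides" column points at.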

The Data Model

Understanding the industry starts with understanding its data footprint — the entities, relationships, and state transitions that drive every decision. This IS the knowledge schema for advertising.

| Entity | Relationships | State Transitions |
| --- | --- | --- |
| Identity | Joins impressions to conversions, owns audience membership | Anonymous → Cookied → Opted-in → Wallet-linked |
| Impression | Belongs to campaign, targets identity, serves creative | Served → Viewed → Clicked → Converted → Settled |
| Campaign | Has budget, targets audience, deploys creatives | Briefed → Targeting → Live → Optimizing → Reconciled |
| Creative | Belongs to campaign, served via impression | Created → Tested → Winner/Loser → Fatigued |
| Conversion | Attributed to impression(s), generates revenue | Detected → Attributed → Verified → Settled |
| Settlement | Joins impression + conversion + payment | Invoiced → Disputed → Reconciled (30-90d) or Instant |

Every principle, metric, protocol, tool, and player operates on these six entities. The entire $1 trillion industry exists to make the Identity ↔ Conversion join work. Apple broke that join. Wallets restore it — with ownership inverted.
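The state transitions in the table above can be read as small state machines. The following is a minimal sketch, assuming each lifecycle is strictly linear and advances one step at a time (the table does not say whether steps can be skipped). The `Entity` class and `LIFECYCLES` mapping are hypothetical names introduced for illustration.

```python
from dataclasses import dataclass, field

# Ordered states copied from the State Transitions column (three of the six entities).
LIFECYCLES = {
    "Impression": ["Served", "Viewed", "Clicked", "Converted", "Settled"],
    "Campaign": ["Briefed", "Targeting", "Live", "Optimizing", "Reconciled"],
    "Conversion": ["Detected", "Attributed", "Verified", "Settled"],
}


@dataclass
class Entity:
    kind: str
    entity_id: str
    state: str = ""
    history: list = field(default_factory=list)

    def __post_init__(self) -> None:
        # Every entity starts in the first state of its lifecycle.
        if not self.state:
            self.state = LIFECYCLES[self.kind][0]
        self.history.append(self.state)

    def advance(self, new_state: str) -> None:
        """Move to new_state only if it is the immediate next step (assumption)."""
        states = LIFECYCLES[self.kind]
        position = states.index(self.state)
        if position + 1 >= len(states) or states[position + 1] != new_state:
            raise ValueError(f"{self.kind}: cannot go from {self.state} to {new_state}")
        self.state = new_state
        self.history.append(new_state)


impression = Entity("Impression", "imp-1")
impression.advance("Viewed")
impression.advance("Clicked")
print(impression.history)  # ['Served', 'Viewed', 'Clicked']
```

Keeping the history on the entity makes each transition auditable, which matters for the later "Verified" and "Settled" states.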

Data Flow

How data moves through the advertising system — from intent to settlement.

ADVERTISER                    PLATFORM                      PUBLISHER
│                            │                              │
├─ Campaign briefed ────────►│                              │
│  (budget, audience, goal)  │                              │
│                            ├─ Audience matched ──────────►│
│                            │  (identity join)             │
│                            │                              ├─ Impression served
│                            │                              │  (creative rendered)
│                            │◄─ Signal returned ───────────┤
│                            │  (view, click, conversion)   │
│◄─ Attribution reported ────┤                              │
│  (which touchpoints)       │                              │
│                            │                              │
└─ Settlement (30-90 days) ──┼──────────────────────────────┘

The critical join: Step 2 (Audience matched) requires the identity entity. When Apple's ATT removed the cross-site join key, the entire flow from "matched" to "attributed" broke for ~30% of users.
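To make the identity join concrete, here is a small sketch of matching conversions back to impressions on a shared identity key, and of what happens when that key is missing. The records and field names are hypothetical; real ad platforms use far richer identity graphs than a single key.

```python
# Hypothetical records. identity=None models a user whose cross-site key is unavailable.
impressions = [
    {"impression_id": "i1", "identity": "u1"},
    {"impression_id": "i2", "identity": "u2"},
    {"impression_id": "i3", "identity": None},
    {"impression_id": "i4", "identity": "u4"},
]
conversions = [
    {"conversion_id": "c1", "identity": "u1", "revenue": 40.0},
    {"conversion_id": "c2", "identity": None, "revenue": 25.0},
    {"conversion_id": "c3", "identity": "u4", "revenue": 60.0},
]


def join_on_identity(impressions, conversions):
    """Return (matched, unmatched): conversions with and without joinable impressions."""
    by_identity = {}
    for imp in impressions:
        if imp["identity"] is not None:
            by_identity.setdefault(imp["identity"], []).append(imp["impression_id"])

    matched, unmatched = [], []
    for conv in conversions:
        key = conv["identity"]
        seen = by_identity.get(key) if key is not None else None
        if seen:
            matched.append((conv["conversion_id"], seen))
        else:
            unmatched.append(conv["conversion_id"])
    return matched, unmatched


matched, unmatched = join_on_identity(impressions, conversions)
print(matched)    # [('c1', ['i1']), ('c3', ['i4'])]
print(unmatched)  # ['c2']
```

The unmatched conversion still produced revenue; it simply cannot be credited to any impression, so everything downstream of "matched" has nothing to work with for that user.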

Web3 alternative: The wallet IS the identity. The blockchain IS the attribution database. Settlement is atomic: impression, verification, and payment execute together in one programmable transaction block (PTB).
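The all-or-nothing property of atomic settlement can be sketched without any chain at all. The code below is a plain-Python illustration, not a real blockchain or PTB API: the ledger layout, `settle_atomically`, and `SettlementError` are hypothetical, and amounts are integers in an assumed smallest currency unit.

```python
import copy


class SettlementError(Exception):
    """Raised when any part of the settlement cannot complete."""


def settle_atomically(ledger, impression_id, verified, advertiser, publisher, amount):
    """Return a new ledger with record + payment applied, or raise and change nothing."""
    draft = copy.deepcopy(ledger)  # work on a copy so the caller's ledger is never half-updated

    if impression_id in draft["settled"]:
        raise SettlementError("impression already settled")
    if not verified:
        raise SettlementError("verification failed")
    if draft["balances"].get(advertiser, 0) < amount:
        raise SettlementError("insufficient advertiser balance")

    draft["balances"][advertiser] -= amount
    draft["balances"][publisher] = draft["balances"].get(publisher, 0) + amount
    draft["settled"].add(impression_id)
    return draft


ledger = {"balances": {"advertiser": 100, "publisher": 0}, "settled": set()}
ledger_after = settle_atomically(ledger, "imp-1", True, "advertiser", "publisher", 30)
print(ledger_after["balances"])  # {'advertiser': 70, 'publisher': 30}
```

Either all three effects land in the returned ledger or none do, which is the contrast with an invoice-then-reconcile flow where delivery and payment are recorded at different times.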

Data Footprint

Map against the commissioning instrument — what maturity exists across five layers?

| Data Domain | Schema | Data | API | UI | Feedback Loop | Moat Signal |
| --- | --- | --- | --- | --- | --- | --- |
| Identity | Proven | Proven | Proven | Proven | Proven | Platform lock-in (Google, Meta) |
| Impression | Proven | Proven | Proven | Proven | Exists | Self-reported, unverified |
| Audience | Proven | Proven | Proven | Proven | Exists | First-party data = competitive edge |
| Attribution | Exists | Exists | Exists | Exists | None | Broken: last-click default |
| Creative | Proven | Proven | Proven | Proven | Exists | AI-generated, fast commoditizing |
| Settlement | Exists | Exists | None | None | None | 30-90 day delay = hidden fee |
| Verification | None | None | None | None | None | The gap Web3 fills |

Where the moat lives: Identity and Audience — whoever owns the join between these two owns the margin. Google and Meta's moat is not their ad serving technology. It's their identity graph.

Where the gap lives: Attribution, Settlement, and Verification — three domains where data is either broken, delayed, or nonexistent. This is the Web3 opportunity.

Decisions Data Drives

| Decision | Data Required | Entity | Current Quality | Impact of Bad Data |
| --- | --- | --- | --- | --- |
| Where to spend | Channel ROAS, incrementality | Campaign | Medium | Budget allocated to wrong channels |
| Who to target | Audience segments, lookalikes | Identity | Declining | Wasted impressions on wrong people |
| What creative to run | A/B test results, fatigue signals | Creative | High | Low CTR, creative burnout |
| When to stop | Frequency caps, diminishing returns | Impression | Medium | Ad fatigue, negative brand impact |
| What actually worked | Multi-touch attribution, lift testing | Conversion | Low | Wrong channel gets credit |
| How much to pay | Verified delivery, fraud detection | Settlement | Low | Paying for bot traffic |

The bottom two rows — attribution and settlement — have the lowest data quality and the highest financial impact. That's not a coincidence. It's a business model.
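The "wrong channel gets credit" failure is easiest to see by running two attribution models over the same conversion path. This sketch compares last-click with a linear multi-touch model; the path, revenue, and function names are hypothetical, and linear is only one of several multi-touch schemes.

```python
from collections import defaultdict


def last_click(path, revenue):
    """Give all credit to the final touchpoint before conversion."""
    if not path:
        raise ValueError("path must contain at least one touchpoint")
    return {path[-1]: revenue}


def linear(path, revenue):
    """Split credit equally across every touchpoint in the path."""
    if not path:
        raise ValueError("path must contain at least one touchpoint")
    share = revenue / len(path)
    credit = defaultdict(float)
    for channel in path:
        credit[channel] += share
    return dict(credit)


# Hypothetical path: the user saw a display ad, then a social ad, then clicked a search ad.
path = ["display", "social", "search"]
revenue = 90.0

last_click_credit = last_click(path, revenue)
linear_credit = linear(path, revenue)

print(last_click_credit)  # {'search': 90.0}
print(linear_credit)      # {'display': 30.0, 'social': 30.0, 'search': 30.0}
```

Neither model measures causation; they only encode different assumptions about it. A budget decision made from the last-click numbers would defund display and social entirely on this path.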

Questions

If whoever names the entities owns the ontology, and whoever owns the ontology controls value flow — what happens when users rename "tracking" to "my data"?

  • Which row in the data footprint table would you fix first — and does your answer reveal whether you're an advertiser, publisher, or platform?
  • When Attribution has "None" for feedback loop, what compounds instead of learning?
  • If Settlement moved from 30-90 days to sub-second, which intermediaries exist only because of the delay?
  • What would the naming system look like if users designed it instead of platforms?