Advertising Data Flow
What data is critical in driving decision making in advertising?
Terminology
Advertising has its own nomenclature. The naming conventions reveal where power concentrates and where it shifts.
| Term | What It Means | Who Named It | What It Hides |
|---|---|---|---|
| Impression | An ad was served to a screen | Publishers | Whether a human saw it, or a bot loaded it |
| Click | Someone interacted with the ad | Platforms | Whether intent was real or accidental |
| Conversion | A desired action completed | Advertisers | Which touchpoint actually caused it |
| Attribution | Credit assigned to a touchpoint | Measurement | The model's assumptions about causation |
| Reach | Number of unique users exposed | Platforms | That the number is self-reported: platforms grade their own homework |
| ROAS | Revenue per dollar of ad spend | Performance | Whether the revenue would have happened anyway |
| CPM | Cost per thousand impressions | Media buyers | That "impression" and "attention" are different things |
| Reconciliation | Matching delivery to payment | Finance | 30-90 days of uncertainty treated as normal |
Web3 renames "impression" to "verified attestation" and "reconciliation" to "settlement" to reflect a shift in ownership.
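Two of the terms in the table are plain arithmetic, and a small sketch makes clear how little the formulas themselves know. The function names and campaign figures below are hypothetical, chosen only for illustration.

```python
def cpm(cost: float, impressions: int) -> float:
    """Cost per thousand impressions."""
    if impressions <= 0:
        raise ValueError("impressions must be positive")
    return cost * 1000 / impressions


def roas(revenue: float, ad_spend: float) -> float:
    """Revenue per dollar of ad spend."""
    if ad_spend <= 0:
        raise ValueError("ad_spend must be positive")
    return revenue / ad_spend


# Hypothetical campaign figures, for illustration only.
spend = 5_000.0
served = 1_000_000
attributed_revenue = 20_000.0

print(f"CPM:  ${cpm(spend, served):.2f}")
print(f"ROAS: {roas(attributed_revenue, spend):.1f}x")
```

Neither function can tell whether `served` counts humans or bots, or whether `attributed_revenue` would have arrived without the ads. That is the "What It Hides" column expressed in code: the inputs carry the assumptions, not the formulas.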
The Data Model
Understanding the industry starts with understanding its data footprint — the entities, relationships, and state transitions that drive every decision. This IS the knowledge schema for advertising.
| Entity | Relationships | State Transitions |
|---|---|---|
| Identity | Joins impressions to conversions, owns audience membership | Anonymous → Cookied → Opted-in → Wallet-linked |
| Impression | Belongs to campaign, targets identity, serves creative | Served → Viewed → Clicked → Converted → Settled |
| Campaign | Has budget, targets audience, deploys creatives | Briefed → Targeting → Live → Optimizing → Reconciled |
| Creative | Belongs to campaign, served via impression | Created → Tested → Winner/Loser → Fatigued |
| Conversion | Attributed to impression(s), generates revenue | Detected → Attributed → Verified → Settled |
| Settlement | Joins impression + conversion + payment | Invoiced → Disputed → Reconciled (30-90d) or Instant |
Every principle, metric, protocol, tool, and player operates on these six entities. The entire $1 trillion industry exists to make the Identity ↔ Conversion join work. Apple broke that join. Wallets restore it — with ownership inverted.
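As a rough illustration of what "state transitions" means in practice, the Impression row of the table can be sketched as a small state machine. This is a minimal sketch, not a reference model: the class and field names are hypothetical, and it assumes the lifecycle is strictly linear with no skipped states, which real systems may not enforce (a view-through conversion, for example, skips the click).

```python
from enum import Enum


class ImpressionState(Enum):
    SERVED = "served"
    VIEWED = "viewed"
    CLICKED = "clicked"
    CONVERTED = "converted"
    SETTLED = "settled"


# Allowed forward moves, copied from the Impression row of the table.
# Assumption: strictly linear, one step at a time.
NEXT_STATE = {
    ImpressionState.SERVED: ImpressionState.VIEWED,
    ImpressionState.VIEWED: ImpressionState.CLICKED,
    ImpressionState.CLICKED: ImpressionState.CONVERTED,
    ImpressionState.CONVERTED: ImpressionState.SETTLED,
}


class Impression:
    """Hypothetical impression record that tracks its own lifecycle."""

    def __init__(self, impression_id: str, campaign_id: str):
        self.impression_id = impression_id
        self.campaign_id = campaign_id
        self.state = ImpressionState.SERVED
        self.history = [self.state]

    def advance(self, target: ImpressionState) -> None:
        """Move to the next state, rejecting any transition not in the table."""
        allowed = NEXT_STATE.get(self.state)
        if target is not allowed:
            raise ValueError(
                f"cannot move from {self.state.name} to {target.name}"
            )
        self.state = target
        self.history.append(target)


imp = Impression("imp-001", "campaign-spring")
imp.advance(ImpressionState.VIEWED)
imp.advance(ImpressionState.CLICKED)
print([s.name for s in imp.history])
```

The same pattern would apply to the other five entities by swapping in their transition rows.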
Data Flow
How data moves through the advertising system — from intent to settlement.
ADVERTISER                PLATFORM                      PUBLISHER
│                            │                              │
├─ Campaign briefed ────────►│                              │
│  (budget, audience, goal)  │                              │
│                            ├─ Audience matched ──────────►│
│                            │  (identity join)             │
│                            │                              ├─ Impression served
│                            │                              │  (creative rendered)
│                            │◄─ Signal returned ───────────┤
│                            │  (view, click, conversion)   │
│◄─ Attribution reported ────┤                              │
│  (which touchpoints)       │                              │
│                            │                              │
└─ Settlement (30-90 days) ──┼──────────────────────────────┘
The critical join: the second step in the flow (Audience matched) requires the Identity entity. When Apple's App Tracking Transparency (ATT) removed the cross-site join key, the entire flow from "matched" to "attributed" broke for ~30% of users.
Web3 alternative: The wallet IS the identity. The blockchain IS the attribution database. Settlement is atomic — impression + verification + payment in one PTB.
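The dependence on a shared key can be shown with a toy join. This is a sketch under stated assumptions: the record shapes, the `user_key` field, and the sample rows are all hypothetical, and a real pipeline would do this in a warehouse or a platform's identity graph rather than in memory.

```python
# Hypothetical records. `user_key` stands in for whatever identifier
# links an ad exposure to a later purchase; None means no key was available.
impressions = [
    {"impression_id": "i1", "user_key": "u1", "campaign": "spring"},
    {"impression_id": "i2", "user_key": None, "campaign": "spring"},
    {"impression_id": "i3", "user_key": "u3", "campaign": "spring"},
]
conversions = [
    {"conversion_id": "c1", "user_key": "u1", "revenue": 40.0},
    {"conversion_id": "c2", "user_key": None, "revenue": 25.0},
    {"conversion_id": "c3", "user_key": "u3", "revenue": 10.0},
]


def join_on_identity(impressions, conversions):
    """Pair each conversion with the impressions that share its user key."""
    by_user = {}
    for imp in impressions:
        if imp["user_key"] is not None:
            by_user.setdefault(imp["user_key"], []).append(imp)

    matched, unmatched = [], []
    for conv in conversions:
        # Keyless impressions were never indexed, so a keyless
        # conversion finds nothing to join against.
        touches = by_user.get(conv["user_key"], [])
        if touches:
            matched.append((conv, touches))
        else:
            unmatched.append(conv)
    return matched, unmatched


matched, unmatched = join_on_identity(impressions, conversions)
unattributed_revenue = sum(c["revenue"] for c in unmatched)
print(f"matched: {len(matched)}, unattributed revenue: {unattributed_revenue}")
```

The conversion without a key still happened and still produced revenue. It simply cannot be credited to any impression, which is the failure mode the paragraph above describes.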
Data Footprint
Map each data domain against the commissioning instrument: what maturity exists across the five layers (Schema, Data, API, UI, Feedback Loop)?
| Data Domain | Schema | Data | API | UI | Feedback Loop | Moat Signal |
|---|---|---|---|---|---|---|
| Identity | Proven | Proven | Proven | Proven | Proven | Platform lock-in (Google, Meta) |
| Impression | Proven | Proven | Proven | Proven | Exists | Self-reported, unverified |
| Audience | Proven | Proven | Proven | Proven | Exists | First-party data = competitive edge |
| Attribution | Exists | Exists | Exists | Exists | None | Broken — last-click default |
| Creative | Proven | Proven | Proven | Proven | Exists | AI-generated, fast commoditizing |
| Settlement | Exists | Exists | None | None | None | 30-90 day delay = hidden fee |
| Verification | None | None | None | None | None | The gap Web3 fills |
Where the moat lives: Identity and Audience — whoever owns the join between these two owns the margin. Google and Meta's moat is not their ad serving technology. It's their identity graph.
Where the gap lives: Attribution, Settlement, and Verification — three domains where data is either broken, delayed, or nonexistent. This is the Web3 opportunity.
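One way to make the gap visible is to encode the table and rank the domains. The scoring scheme here (None = 0, Exists = 1, Proven = 2, summed across the five layers) is an assumption made for illustration, not part of the commissioning instrument itself.

```python
# Assumed ordinal scale for the three maturity labels used in the table.
LEVEL = {"None": 0, "Exists": 1, "Proven": 2}
LAYERS = ("Schema", "Data", "API", "UI", "Feedback Loop")

# Maturity per domain, transcribed from the table, in LAYERS order.
FOOTPRINT = {
    "Identity":     ("Proven", "Proven", "Proven", "Proven", "Proven"),
    "Impression":   ("Proven", "Proven", "Proven", "Proven", "Exists"),
    "Audience":     ("Proven", "Proven", "Proven", "Proven", "Exists"),
    "Attribution":  ("Exists", "Exists", "Exists", "Exists", "None"),
    "Creative":     ("Proven", "Proven", "Proven", "Proven", "Exists"),
    "Settlement":   ("Exists", "Exists", "None", "None", "None"),
    "Verification": ("None", "None", "None", "None", "None"),
}


def maturity_score(levels):
    """Sum the ordinal maturity across all five layers."""
    return sum(LEVEL[level] for level in levels)


# Lowest score first: these are the least mature domains.
ranked = sorted(FOOTPRINT, key=lambda d: maturity_score(FOOTPRINT[d]))
for domain in ranked:
    print(f"{domain:<13}{maturity_score(FOOTPRINT[domain]):>3} / {2 * len(LAYERS)}")
```

Under this scoring, the three lowest-ranked domains are Verification, Settlement, and Attribution, the same three the text identifies as the gap.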
Decisions Data Drives
| Decision | Data Required | Entity | Current Quality | Impact of Bad Data |
|---|---|---|---|---|
| Where to spend | Channel ROAS, incrementality | Campaign | Medium | Budget allocated to wrong channels |
| Who to target | Audience segments, lookalikes | Identity | Declining | Wasted impressions on wrong people |
| What creative to run | A/B test results, fatigue signals | Creative | High | Low CTR, creative burnout |
| When to stop | Frequency caps, diminishing returns | Impression | Medium | Ad fatigue, negative brand impact |
| What actually worked | Multi-touch attribution, lift testing | Conversion | Low | Wrong channel gets credit |
| How much to pay | Verified delivery, fraud detection | Settlement | Low | Paying for bot traffic |
The bottom two rows — attribution and settlement — have the lowest data quality and the highest financial impact. That's not a coincidence. It's a business model.
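To see how "wrong channel gets credit" arises, compare two common attribution rules on the same conversion path. This is a simplified sketch: the channel names and revenue are hypothetical, and both rules are bookkeeping conventions rather than causal estimates.

```python
from collections import defaultdict


def last_click(path, revenue):
    """Give all credit to the final touchpoint before conversion."""
    if not path:
        raise ValueError("path must contain at least one touchpoint")
    return {path[-1]: revenue}


def linear(path, revenue):
    """Split credit evenly across every touchpoint in the path."""
    if not path:
        raise ValueError("path must contain at least one touchpoint")
    share = revenue / len(path)
    credit = defaultdict(float)
    for channel in path:
        credit[channel] += share
    return dict(credit)


# Hypothetical path: the user saw a display ad, then a social ad,
# then clicked a search ad and bought.
path = ["display", "social", "search"]
revenue = 90.0

print("last-click:", last_click(path, revenue))
print("linear:    ", linear(path, revenue))
```

The same conversion makes search look like the only channel that works under one rule and one of three equal contributors under the other. Neither answers whether any touchpoint caused the purchase, which is why the table lists lift testing alongside multi-touch attribution.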
Context
- Advertising Industry — The FACT hub
- Advertising Principles — The five immutable truths
- Naming Standards — Taxonomy, nomenclature, ontology
- Data Footprint — The commissioning instrument
- Knowledge Schema — Schemas influence attention
- First Principles — Nomenclature is the first principle
Links
- IAB Tech Lab — OpenRTB — The industry's data standard for real-time bidding
- ads.txt — Authorized digital sellers standard
- VAST Standard — Video ad serving template
- Antonio García Martínez — Chaos Monkeys — Inside Facebook's ad machine
Questions
If whoever names the entities owns the ontology, and whoever owns the ontology controls value flow — what happens when users rename "tracking" to "my data"?
- Which row in the data footprint table would you fix first — and does your answer reveal whether you're an advertiser, publisher, or platform?
- When Attribution has "None" for feedback loop, what compounds instead of learning?
- If Settlement moved from 30-90 days to sub-second, which intermediaries exist only because of the delay?
- What would the naming system look like if users designed it instead of platforms?