Finance Data Flow
What entities exist, how they relate, and how a single fact moves from a raw filing into a binding decision.
Naming System
Every entity in finance is named by whoever has the most power to enforce the name. The name hides where the power actually sits.
- Security — named by the issuer; the legal definition lives in the prospectus
- Counterparty — named by the regulator; the identity check lives in the KYC file
- Position — named by the custodian; the legal owner is whoever the custodian's ledger says it is
- Cash flow — named by the accountant; the timing depends on which standard the firm applies
- Risk — named by the model owner; whichever model is signed off becomes the firm's truth
The agentic shift renames two of these. Position moves from the custodian's ledger to the wallet's signature. Counterparty moves from a legal entity to a verified-agent credential under a named human driver.
Data Model
Finance data has five entities. Every workflow is a path through them.
| Entity | Definition | Primary keys | Lives on |
|---|---|---|---|
| Issuer | Legal entity that creates the security or token | LEI, ticker, contract address | Registry or blockchain |
| Security | The instrument itself — equity, debt, token | CUSIP, ISIN, token address | Registry or contract |
| Counterparty | Who is on the other side of the trade | LEI, wallet, verified-agent credential | KYC file or chain |
| Position | The holding — who owns what, how much, when | Account ID + security ID, or wallet + token | Custodian or wallet |
| Cash flow | The economic outcome — coupon, dividend, fee, yield | Date + amount + source | GL or on-chain event |
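The table can be read as a set of linked record types. The sketch below is one possible encoding, not a reference schema: the class and field names, the optional-key layout (an off-chain key or an on-chain key per record), the link from a cash flow to its security, and the `position_key` helper are all assumptions made for illustration.

```python
from dataclasses import dataclass
from datetime import date
from decimal import Decimal
from typing import Optional


@dataclass(frozen=True)
class Issuer:
    name: str
    lei: Optional[str] = None               # off-chain key
    contract_address: Optional[str] = None  # on-chain key


@dataclass(frozen=True)
class Security:
    issuer: Issuer
    kind: str                               # "equity", "debt", or "token"
    isin: Optional[str] = None
    token_address: Optional[str] = None


@dataclass(frozen=True)
class Counterparty:
    lei: Optional[str] = None
    wallet: Optional[str] = None
    agent_credential: Optional[str] = None  # verified-agent credential


@dataclass(frozen=True)
class Position:
    holder: str                             # account ID or wallet address
    security: Security
    quantity: Decimal
    as_of: date


@dataclass(frozen=True)
class CashFlow:
    security: Security                      # assumed link; the table keys on date + amount + source
    pay_date: date
    amount: Decimal
    source: str                             # GL entry or on-chain event reference


def position_key(p: Position) -> tuple:
    """Composite key: holder plus whichever security identifier is present."""
    security_id = p.security.isin or p.security.token_address
    return (p.holder, security_id)
```

A short usage example with placeholder identifiers (none of these are real codes):

```python
issuer = Issuer(name="Example Co", lei="PLACEHOLDER-LEI")
bond = Security(issuer=issuer, kind="debt", isin="PLACEHOLDER-ISIN")
holding = Position(holder="ACC-1", security=bond,
                   quantity=Decimal("100"), as_of=date(2024, 1, 2))
print(position_key(holding))
```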
State Transitions
A finance fact moves through five states. The state model is the same on both rails, off-chain and on-chain.
- Raw — the filing, transcript, or block event arrives. Untrusted.
- Structured — mapped to the firm's schema. Each field has a source citation.
- Reconciled — cross-checked against an independent source. Variance recorded.
- Signed — qualified human approves before it binds a decision.
- Bound — the fact is now an obligation, a position, or a precedent.
Signed, the fourth state, is the gate. The agentic shift compresses the first three states (Raw, Structured, Reconciled); it does not remove the sign-off. Bound, the fifth, is the only state the regulator and the auditor recognise as final.
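The five states and the sign-off gate can be sketched as a small state machine. This is a minimal illustration under stated assumptions: the `State`, `Fact`, and `advance` names are hypothetical, a fact may only move forward one state at a time, and "qualified human" is reduced to a non-empty approver string.

```python
from dataclasses import dataclass, field
from enum import IntEnum
from typing import Optional


class State(IntEnum):
    RAW = 1
    STRUCTURED = 2
    RECONCILED = 3
    SIGNED = 4
    BOUND = 5


@dataclass
class Fact:
    value: str
    state: State = State.RAW
    signed_by: Optional[str] = None
    history: list = field(default_factory=list)  # (from_state, to_state) pairs


def advance(fact: Fact, approver: Optional[str] = None) -> Fact:
    """Move a fact forward exactly one state.

    The transition into SIGNED is the gate: it fails unless a named
    approver is supplied. Every other transition can be automated.
    """
    if fact.state is State.BOUND:
        raise ValueError("BOUND is final; the fact cannot advance further")
    target = State(fact.state + 1)
    if target is State.SIGNED:
        if not approver:
            raise PermissionError("a qualified human must sign before the fact can bind")
        fact.signed_by = approver
    fact.history.append((fact.state, target))
    fact.state = target
    return fact
```

In this sketch an agent can call `advance(fact)` freely up to Reconciled; the next call raises `PermissionError` until an approver is named, which mirrors the point that automation shortens the path to the gate without removing it.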
Data Footprint — Schema to Feedback
The maturity of any finance data flow is set by how far it travels through the footprint's five layers, from Schema to Feedback. Most firms stall between Schema and API. The agentic shift demands all five.
| Layer | What it means in finance | Typical maturity (today) | Agentic-era requirement |
|---|---|---|---|
| Schema | Field names, types, constraints | Mature for filings; weak for on-chain | Both rails on one schema |
| Data | The values populated, with provenance | Mature for prices; weak for intent | Every fact carries its source date |
| API | Programmatic access, versioned | Mature for market data; uneven for filings | Stable, agent-accessible |
| UI | Human-readable surface for review | Mature for terminals; poor for receipts | Receipt-first, table-second |
| Feedback | The loop: model output → reality → recalibration | Weak — variance reports often unread | Continuous; agent-driver reads variance daily |
A firm that closes the Feedback layer first wins the next decade. Most firms have never tried, because the loop requires admitting the model was wrong.
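The first half of the Feedback loop, comparing model output with reality, can be sketched as a variance report that flags the lines a model owner should read. The function name, the row layout, and the 5% default tolerance are assumptions for illustration only; recalibration itself is left out.

```python
from decimal import Decimal


def variance_report(forecast: dict, actual: dict,
                    tolerance: Decimal = Decimal("0.05")) -> list:
    """Return one row per forecast line.

    A row is flagged when the actual is missing, or when
    |actual - forecast| / |forecast| exceeds the tolerance.
    """
    rows = []
    for line, expected in forecast.items():
        observed = actual.get(line)
        if observed is None:
            # A missing actual is itself a finding.
            rows.append({"line": line, "variance": None, "flag": True})
            continue
        variance = observed - expected
        if expected == 0:
            flag = observed != 0
        else:
            flag = abs(variance / expected) > tolerance
        rows.append({"line": line, "variance": variance, "flag": flag})
    return rows


# Illustrative numbers, not real data.
report = variance_report(
    forecast={"coupon": Decimal("100"), "fee": Decimal("20")},
    actual={"coupon": Decimal("101"), "fee": Decimal("25")},
)
flagged = [row["line"] for row in report if row["flag"]]
```

Here `flagged` contains only `"fee"`: the coupon is 1% off and passes, while the fee is 25% off and does not. Closing the loop means someone, human or agent-driver, is accountable for reading that list.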
Decisions Data Drives
Each entity supplies the input to a specific decision. Map the decision to the entity before building the model.
| Entity | Decision it informs | Action it triggers |
|---|---|---|
| Issuer | Should we underwrite or partner? | Coverage call, NDA, IC pre-screen |
| Security | What is it worth, and what is the risk? | Valuation, position sizing, hedge |
| Counterparty | Can we transact safely? | Onboarding, line limit, settlement instruction |
| Position | Are we exposed where we expect? | Risk report, margin call, rebalance |
| Cash flow | Did reality match the model? | Variance commentary, reforecast, post-close review |
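One way to make the mapping enforceable is to keep it as a lookup that a model-building workflow must consult first. The sketch below simply transcribes the table; the `DECISION_MAP` and `decision_for` names are hypothetical.

```python
# Entity -> (decision it informs, actions it triggers), transcribed from the table.
DECISION_MAP = {
    "issuer": ("Should we underwrite or partner?",
               ["coverage call", "NDA", "IC pre-screen"]),
    "security": ("What is it worth, and what is the risk?",
                 ["valuation", "position sizing", "hedge"]),
    "counterparty": ("Can we transact safely?",
                     ["onboarding", "line limit", "settlement instruction"]),
    "position": ("Are we exposed where we expect?",
                 ["risk report", "margin call", "rebalance"]),
    "cash flow": ("Did reality match the model?",
                  ["variance commentary", "reforecast", "post-close review"]),
}


def decision_for(entity: str) -> tuple:
    """Return (decision, actions) for an entity, or raise if it is unmapped."""
    key = entity.strip().lower()
    if key not in DECISION_MAP:
        raise ValueError(f"no decision mapped for entity {entity!r}")
    return DECISION_MAP[key]
```

Raising on an unmapped entity is a deliberate choice in this sketch: a model with no decision behind it fails before it is built rather than after.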
Provenance Becomes the Audit Trail
When an agent runs the model, the receipt is the audit trail. Three fields make a receipt audit-grade:
- Inputs — every value the agent used, with a source citation and a timestamp
- Prompt + version — the exact instruction the agent received, hashed
- Output + sign-off — what the agent returned and who bound it
A receipt missing any one of these is not yet an audit trail. It is a draft.
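The three fields can be sketched as a record with a completeness check. The class and field names below are assumptions, as is the choice of SHA-256 over the version string plus prompt text; the source only says the instruction is hashed.

```python
import hashlib
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Optional


@dataclass(frozen=True)
class Input:
    name: str
    value: str
    source: str            # citation: filing reference or block/event ID
    observed_at: datetime  # timestamp of the value


@dataclass(frozen=True)
class Receipt:
    inputs: tuple          # tuple of Input records
    prompt: str
    prompt_version: str
    output: str
    signed_by: Optional[str] = None

    @property
    def prompt_hash(self) -> str:
        """SHA-256 over version and prompt text, so editing either changes the hash."""
        payload = f"{self.prompt_version}\n{self.prompt}".encode("utf-8")
        return hashlib.sha256(payload).hexdigest()

    def is_audit_grade(self) -> bool:
        """True only when inputs, prompt + version, and output + sign-off are all present."""
        has_inputs = bool(self.inputs) and all(
            bool(i.source) and i.observed_at is not None for i in self.inputs
        )
        has_prompt = bool(self.prompt) and bool(self.prompt_version)
        has_signoff = bool(self.output) and bool(self.signed_by)
        return has_inputs and has_prompt and has_signoff


# Illustrative values only.
coupon = Input("coupon_rate", "4.25%", "prospectus (placeholder citation)",
               datetime(2024, 1, 2, tzinfo=timezone.utc))
draft = Receipt(inputs=(coupon,), prompt="Value the bond.",
                prompt_version="v1", output="price: 99.10")
signed = Receipt(inputs=(coupon,), prompt="Value the bond.",
                 prompt_version="v1", output="price: 99.10", signed_by="j.doe")
```

In this sketch `draft.is_audit_grade()` is false and `signed.is_audit_grade()` is true: the only difference is the sign-off, which is what separates a draft from an audit trail.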
Context
- Finance Principles — Five questions the data must serve
- Finance Performance — The metrics this data flow produces
- Naming Standards — Why naming is the first principle
Questions
What does the data layer look like when the agent is one of the readers — and one of the writers?
- Which of your five entities still lives in a system that cannot emit a receipt?
- Which decision in your firm runs on data with no provenance trail?
- Where does your variance report die before it reaches the model owner?