Pipeline Nowcast Spec
How do you know the factory is on track before the weekly meeting tells you it isn't?
Build Contract
| # | Feature | Function | Outcome | Job | State |
|---|---|---|---|---|---|
| 1 | Signal Collectors (5x) | Extract raw measurements from each source table | All 5 sources feeding one algorithm | Collect | Gap |
| 2 | Signal Normalization | Normalize each signal to 0-1 scale per type | Apples-to-apples comparison | Normalize | Gap |
| 3 | Exponential Decay | Weight signals by recency with configurable half-life | Recent signals matter more | Weight | Gap |
| 4 | Variance Computation | Per-signal (actual - forecast) / forecast | Know which signal is drifting | Measure | Gap |
| 5 | Composite Scoring | Weighted sum with configurable signal weights | Single number: are we on track? | Compose | Gap |
| 6 | Classification | Map composite to on_track / warning / critical | Status matches varianceStatus enum | Classify | Gap |
| 7 | Confidence Calculation | Score based on signal coverage and freshness | Trust the number or gather more data | Calibrate | Gap |
| 8 | Recommendations Engine | Top risk, top momentum, action list | Know what to do next, not just status | Advise | Gap |
| 9 | Constants Registration | PIPELINE_NOWCAST block in algorithm-constants.ts | All thresholds tunable without code | Configure | Gap |
| 10 | Forecast Baselines | Target values per signal type (config) | Variance has something to compare to | Baseline | Gap |
| 11 | Prediction Evidence Ledger | Store signal observations against predictions | Predictions have data, not just opinions | Evidence | Gap |
| 12 | State Transition Log | Timestamp every L0-L4 commissioning state change | Commissioning velocity measurable | Velocity | Gap |
| 13 | Internal Signal Collectors | Collect prediction evidence from own platform | Receipt accuracy, boot time, trust scores feed predictions | Introspect | Gap |
| 14 | Market Signal Collectors | Automated external scans (RWA TVL, dev counts, agent commerce) | Predictions validated against market reality | Scan | Gap |
| 15 | Bayesian Update Triggers | Threshold-based conviction reassessment protocol | Stale predictions auto-flag for review | Update | Gap |
| 16 | Resolution Protocol | Confirm or falsify predictions with timestamped evidence | Prediction database has L4 entries | Resolve | Gap |
Principles
What truths constrain the design?
The Job
| Element | Detail |
|---|---|
| Situation | Five signal systems built independently. Each has a dashboard. None connect. |
| Intention | A single composite score answering "are we on track, drifting, or in trouble?" |
| Obstacle | No algorithm normalizes, weights, and composes these signals with recency decay. |
Why Now
All five signal sources exist in production:
- CRM (pipeline + activity signals): 5 deals, 10 activities, $1.2M pipeline (commissioning 2026-03-02)
- Agent comms: Convex event stream with typed messages
- Commissioning: 74 features at L0-L3 (no velocity tracking — state transitions have no timestamps)
- Predictions: 76 predictions in markdown (database tables empty), conviction-scored, zero evidence feeds
The data exists. The synthesis doesn't. Every commissioning session (45 min manual) could be a 2-second algorithm call.
Prediction validation gap: The prediction database has 76 entries with conviction scores but no live data updating them. The Bayesian protocol exists on paper (prompt-predictions.md Step 4) but has no automated triggers. Predictions without evidence feeds are opinions with timestamps. The nowcast closes this gap by treating prediction evidence as a first-class signal source — same normalization, same decay, same composite scoring.
Design Constraints
| Constraint | Rationale |
|---|---|
| Pure function, no side effects | Testable, composable, matches algorithm framework |
| All thresholds in constants file | Tunable without code changes |
| Minimum 3 signals for confidence > 0.5 | Prevents false confidence from sparse data |
| Match existing varianceStatus enum | Interop with prediction schemas |
| Exponential decay, not linear | Recent signals should dominate, old signals fade smoothly |
Refusal Spec
| Category | Action | Response |
|---|---|---|
| Insufficient signals | Zero signals available | Return confidence: 0, status: critical, reasoning: "No signals" |
| Stale data | All signals older than 2x half-life | Return confidence below 0.3, flag staleness |
| Invalid forecasts | Forecast value is 0 or negative | Skip signal, reduce confidence, log warning |
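The refusal rules above can be sketched as guard clauses. This is a hedged sketch: the function names and the minimal `SignalObs` shape are illustrative, not from the spec.

```typescript
// Illustrative guards for the refusal spec. SignalObs is a hypothetical
// minimal shape; the real signal inputs are richer (see Algorithm Interface).
interface SignalObs {
  ageDays: number;  // age of the newest measurement
  forecast: number; // baseline value to compare against
}

// Cold start: no signals at all -> confidence 0, status critical.
function isColdStart(signals: SignalObs[]): boolean {
  return signals.length === 0;
}

// Staleness: every signal older than 2x the decay half-life.
function isAllStale(signals: SignalObs[], halfLifeDays = 14): boolean {
  return signals.length > 0 && signals.every((s) => s.ageDays > 2 * halfLifeDays);
}

// Invalid forecast: zero or negative baselines are skipped, never divided by.
function hasValidForecast(s: SignalObs): boolean {
  return s.forecast > 0;
}
```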
Performance
How do we know it's working?
Priority Score
| Dimension | Score | Evidence |
|---|---|---|
| Pain | 4 | 45 min/session manual synthesis across 5 dashboards. 76 predictions with zero evidence feeds. Drift invisible between meetings. |
| Demand | 4 | Internal demand (commissioning + prediction validation). Prediction evidence validates the thesis the whole platform depends on. |
| Edge | 4 | Proprietary signal combination (CRM + agent comms + commissioning state + prediction evidence). Nobody else has this data shape. |
| Trend | 5 | Nowcasting + prediction markets is the dominant pattern. Superforecasting discipline going mainstream. AI operations demand real-time variance. |
| Conversion | 2 | Internal tool first. Path to customer-facing via BOaaS later. |
| Composite | 640 | 4 x 4 x 4 x 5 x 2. Demand up (prediction validation is existential). Trend up (prediction markets + nowcasting convergence). |
Quality Targets
| Metric | Target | Method |
|---|---|---|
| Execution time | <500ms for 5 signals | Benchmark test |
| Classification accuracy | Matches manual assessment 5 consecutive days | Human comparison |
| Signal coverage confidence | Degrades gracefully below 3 signals | Unit test with partial inputs |
Eval Strategy
| What | How | When |
|---|---|---|
| Classification accuracy | Compare nowcast status vs dream team manual assessment | Daily for first 2 weeks |
| Signal freshness | Check timestamp of newest signal per type | Every run |
| Threshold calibration | Review false positive/negative rate | Weekly for first month |
Kill signal: Nowcast status disagrees with manual assessment for 5 consecutive days after 2-week calibration period. Algorithm is wrong or signals are wrong.
Platform
What do we control?
Current State
| Component | Built | Wired | Working | Notes |
|---|---|---|---|---|
| Algorithm framework | Yes | Yes | Yes | libs/agency/src/lib/algorithms/ |
| AlgorithmMetadata pattern | Yes | Yes | Yes | Standard export |
| algorithm-constants.ts | Yes | Yes | Yes | Extensible |
| CRM data (Drizzle) | Yes | Yes | Yes | 5 deals, 10 activities |
| Convex agent messages | Yes | Yes | Yes | HTTP client working |
| Commissioning state | Yes | Partial | Partial | Manual markdown, needs parser. No timestamps on state transitions. |
| Prediction schemas | Yes | No | No | Tables empty. 76 predictions in markdown, zero in DB. |
| Prediction evidence ledger | No | No | No | Build — junction between predictions and data |
| State transition log | No | No | No | Build — timestamps on L0-L4 changes |
| Internal signal collectors | No | No | No | Build — receipt accuracy, sprout boot time, trust scores |
| Market signal collectors | No | No | No | Build — RWA TVL, dev counts, agent commerce volume |
| Signal normalization | No | No | No | Build |
| Exponential decay | No | No | No | Build |
| Composite scoring | No | No | No | Build |
| Insights UI | Yes | No | No | Components exist, no data |
Build Ratio
~60% composition, ~40% new code. The prediction evidence system is the net-new domain.
Algorithm Interface
Input
```typescript
interface NowcastInput {
  pipeline: {
    deals: Array<{ amount: number; probability: number; stage: string; closeDate: string }>;
    targetCoverage: number; // default 3.0x
  };
  activity: {
    activities: Array<{ type: string; outcome: string; startDate: string }>;
    targetPerDealPerWeek: number; // default 2
    dealCount: number;
  };
  agentVelocity: {
    messages: Array<{ type: string; createdAt: number }>;
    plansCompleted: number;
    blockersOpen: number;
  };
  commissioning: {
    features: Array<{ name: string; currentLevel: number; forecastLevel: number }>;
  };
  predictions: {
    entries: Array<{ confidenceScore: number; accuracyScore: number | null; status: string }>;
    evidence: Array<PredictionEvidence>; // evidence ledger entries
    maturity: Array<PredictionMaturity>; // per-prediction maturity state
  };
  config?: {
    weights?: Partial<SignalWeights>;
    decayHalfLifeDays?: number;
    thresholds?: { onTrack?: number; warning?: number };
  };
}
```
Output
```typescript
interface NowcastResult {
  result: {
    composite: number; // 0-1
    status: "on_track" | "warning" | "critical";
    signals: NowcastSignal[]; // per-signal breakdown
    topRisk: string; // highest-variance signal name
    topMomentum: string; // most-improving signal name
  };
  metadata: {
    algorithm: "pipeline-nowcast";
    version: string;
    executionTimeMs: number;
    signalCount: number;
  };
  reasoning: string[];
  confidence: number; // 0-1 based on signal coverage
  recommendations: {
    status: "on_track" | "warning" | "critical";
    actions: string[];
    signalsNeedingAttention: string[];
  };
}

interface NowcastSignal {
  name: string;
  score: number; // 0-1 normalized
  weight: number; // configured weight
  variance: number; // (actual - forecast) / forecast
  trend: "improving" | "stable" | "declining";
  freshness: number; // 0-1 decay factor
}
```
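The `variance` field follows the formula from feature #4. A one-line helper with the zero/negative-forecast guard from the refusal spec (the helper name is illustrative):

```typescript
// Relative variance as defined in the spec: (actual - forecast) / forecast.
// Throws on zero/negative forecasts, which the refusal spec says to skip.
function signalVariance(actual: number, forecast: number): number {
  if (forecast <= 0) {
    throw new Error("Invalid forecast: must be positive");
  }
  return (actual - forecast) / forecast;
}
```

A signal 20% over forecast yields 0.2; 10% under yields -0.1.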
Constants
```typescript
export const PIPELINE_NOWCAST = {
  PIPELINE_WEIGHT: 0.35,
  ACTIVITY_WEIGHT: 0.25,
  AGENT_VELOCITY_WEIGHT: 0.15,
  COMMISSIONING_WEIGHT: 0.15,
  PREDICTION_WEIGHT: 0.1,
  DECAY_HALF_LIFE_DAYS: 14,
  ON_TRACK_THRESHOLD: 0.7,
  WARNING_THRESHOLD: 0.4,
  MIN_SIGNALS_FOR_CONFIDENCE: 3,
  TARGET_PIPELINE_COVERAGE: 3.0,
  TARGET_ACTIVITY_PER_DEAL_WEEK: 2,
} as const;
```
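Classification (feature #6) is then a pair of threshold checks against these constants. A minimal sketch, with the relevant constants redeclared so it stands alone:

```typescript
// Subset of PIPELINE_NOWCAST, redeclared to keep the sketch self-contained.
const THRESHOLDS = {
  ON_TRACK: 0.7, // PIPELINE_NOWCAST.ON_TRACK_THRESHOLD
  WARNING: 0.4,  // PIPELINE_NOWCAST.WARNING_THRESHOLD
} as const;

type NowcastStatus = "on_track" | "warning" | "critical";

// Maps a 0-1 composite score to the varianceStatus-compatible enum.
function classifyComposite(composite: number): NowcastStatus {
  if (composite >= THRESHOLDS.ON_TRACK) return "on_track";
  if (composite >= THRESHOLDS.WARNING) return "warning";
  return "critical";
}
```

Boundary behavior matches the test contract: exactly 0.7 classifies as on_track and exactly 0.4 as warning.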
Prediction Evidence Ledger
The junction table that connects predictions to data. Without this, predictions are opinions with timestamps.
```typescript
interface PredictionEvidence {
  predictionId: string; // links to prediction-database.md row
  signalType: "internal" | "market" | "resolution";
  dataPoint: string; // what was measured
  value: number; // the measurement
  source: string; // where it came from
  timestamp: string; // ISO 8601
  direction: "strengthens" | "weakens" | "neutral";
  convictionDelta: number; // how much this moved the score (-1 to +1)
}

interface PredictionMaturity {
  level: "L0_stated" | "L1_instrumented" | "L2_tracked" | "L3_tested" | "L4_resolved";
  evidenceCount: number;
  lastUpdated: string;
  convictionHistory: Array<{ score: number; date: string; reason: string }>;
}
```
Prediction maturity model (mirrors commissioning):
| State | Meaning | Criteria |
|---|---|---|
| L0 Stated | Conviction score assigned, no evidence collected | Default for all 76 current predictions |
| L1 Instrumented | Data sources identified, collection method defined | Evidence ledger row exists with source field populated |
| L2 Tracked | Evidence flowing, conviction updating quarterly | >= 2 evidence entries, conviction updated at least once |
| L3 Tested | Prediction survived 2+ Bayesian updates with evidence | >= 2 conviction changes backed by data |
| L4 Resolved | Confirmed or falsified with timestamped evidence | Resolution entry with pass/fail and evidence chain |
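One way to read the Criteria column is as a derivation function from ledger stats to a level. This is an interpretation sketch, not confirmed logic; the `LedgerStats` input shape is hypothetical:

```typescript
type MaturityLevel =
  | "L0_stated"
  | "L1_instrumented"
  | "L2_tracked"
  | "L3_tested"
  | "L4_resolved";

// Hypothetical per-prediction stats rolled up from the evidence ledger.
interface LedgerStats {
  evidenceCount: number;     // rows in the evidence ledger
  dataBackedUpdates: number; // conviction changes backed by data
  instrumented: boolean;     // evidence row exists with source populated
  resolved: boolean;         // resolution entry with evidence chain
}

// Returns the highest level whose criteria are met, per the table above.
function deriveMaturity(s: LedgerStats): MaturityLevel {
  if (s.resolved) return "L4_resolved";
  if (s.dataBackedUpdates >= 2) return "L3_tested";
  if (s.evidenceCount >= 2 && s.dataBackedUpdates >= 1) return "L2_tracked";
  if (s.instrumented) return "L1_instrumented";
  return "L0_stated";
}
```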
State Transition Log
Timestamps on commissioning state changes enable velocity tracking — rate of L0 to L4 progression per month.
```typescript
interface StateTransition {
  featureId: string; // commissioning dashboard row
  fromLevel: number; // 0-4
  toLevel: number; // 0-4
  timestamp: string; // ISO 8601
  evidence: string; // what proved the transition
  agent: string; // who/what triggered it
}
```
This feeds two consumers: the commissioning signal (signal #4) gains velocity data, and prediction #5 (Small Teams + Agents) gains its own validation metric.
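Given the log, velocity is a fold over recent transitions: net levels gained within a window. A hedged sketch; the function name and the 30-day default window are assumptions, and only the fields the calculation needs are included.

```typescript
// Subset of the StateTransition interface (only the fields used here).
interface StateTransition {
  featureId: string;
  fromLevel: number; // 0-4
  toLevel: number;   // 0-4
  timestamp: string; // ISO 8601
}

// Net commissioning levels gained across all features in the window.
function commissioningVelocity(
  transitions: StateTransition[],
  windowDays = 30,
): number {
  const cutoffMs = Date.now() - windowDays * 24 * 60 * 60 * 1000;
  return transitions
    .filter((t) => Date.parse(t.timestamp) >= cutoffMs)
    .reduce((sum, t) => sum + (t.toLevel - t.fromLevel), 0);
}
```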
Signal Source Expansion
Internal signals (cost: $0, already collectible):
| Signal | Source | Validates Prediction | Collection |
|---|---|---|---|
| Agent receipt intent-match rate | .ai/receipts/ manual_interventions field | Intent Verification | Parse receipts directory |
| Sprout boot time | A2A acceptance test | Composable Primitives | Run and time test |
| ETL trust score completion | Trust scoring pipeline output | Trust Quantified | Count scored entities |
| Commissioning velocity | State transition log (new) | Small Teams + Agents | L0 to L4 transitions/month |
| Agent profile capability evidence | agent_profiles table | Identity Inversion | Count verified capabilities |
Market signals (cost: $0, automated scan):
| Signal | Source | Validates Prediction | Collection |
|---|---|---|---|
| Tokenized RWA total value | RWA.xyz | RWA Boring | Monthly scrape |
| On-chain identity adoption | Dune Analytics, Sui explorer | Identity Inversion, Trust Quantified | Monthly query |
| Agent commerce volume | a16z State of Crypto, Stripe reports | Intent Verification | Quarterly Perplexity scan |
| AI-native company revenue/employee | Crunchbase, public filings | Small Teams + Agents | Quarterly scan |
| Composable platform adoption | ProductHunt, YC batch analysis | Composable Primitives | Quarterly scan |
| NZ FMA tokenization guidance | FMA publications | RWA Boring | Quarterly check |
Bayesian Update Protocol
Automated trigger schedule per prediction category:
| Category | Weekly Check | Reassessment Trigger (>10% conviction shift) | Abandon Trigger |
|---|---|---|---|
| Intent Verification | Agent liability news | >$100M agent insurance products launched | Major jurisdiction bans agent transactions |
| Composable Primitives | YC batch composition | A2A/MCP reaches stable spec | Hyperscaler ships "Business in a Box," captures 80% |
| Trust Quantified | On-chain reputation TVL | Portable reputation standard proposed at W3C/SIP | Privacy backlash legislation in 3+ jurisdictions |
| Identity Inversion | "Portfolio hire" postings | On-chain professional identity >10M users | Deepfake receipts become trivial |
| Small Teams + Agents | Revenue-per-employee reports | >50 companies at <10 people, >$10M revenue | Agent reliability plateaus 18+ months |
| RWA Tokenization | RWA.xyz total value | NZ FMA issues formal guidance | Major jurisdiction criminalizes tokenized securities |
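A minimal trigger check, reading ">10% conviction shift" as an absolute move of more than 0.1 on a 0-1 conviction scale. The spec may intend a relative shift instead; names and shapes here are illustrative.

```typescript
// Subset of PredictionEvidence: only the field the trigger needs.
interface EvidenceDelta {
  convictionDelta: number; // -1 to +1, per the evidence ledger
}

// Flags a prediction for reassessment when evidence accumulated since the
// last review shifts conviction past the threshold in either direction.
function needsReassessment(
  evidenceSinceReview: EvidenceDelta[],
  threshold = 0.1,
): boolean {
  const shift = evidenceSinceReview.reduce((s, e) => s + e.convictionDelta, 0);
  return Math.abs(shift) > threshold;
}
```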
Core Algorithm
```
1. For each signal type with data:
   a. Normalize raw measurement to 0-1 scale
   b. Apply exponential decay: factor = exp(-0.693 * ageDays / halfLife)
   c. Compute variance: (actual - forecast) / forecast
   d. Record trend from last 3 measurements
2. Compute confidence:
   confidence = signalsPresent / totalSignals * avgFreshness
3. Compute composite:
   composite = sum(signal.score * signal.weight * signal.freshness)
             / sum(signal.weight * signal.freshness)
4. Classify:
   >= 0.7 → on_track
   >= 0.4 → warning
   <  0.4 → critical
5. Generate recommendations:
   topRisk = signal with lowest score
   topMomentum = signal with best trend
   actions = per-signal actionable suggestions
```
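Steps 1b and 3 above can be sketched directly. The 0.693 constant is ln 2, so a signal exactly one half-life old carries 50% freshness. Shapes and names below are illustrative, not the production implementation.

```typescript
// Minimal signal shape for the sketch; the real NowcastSignal is richer.
interface WeightedSignal {
  score: number;   // 0-1 normalized
  weight: number;  // configured weight
  ageDays: number; // age of newest measurement
}

const LN2 = Math.log(2); // ≈ 0.693, as in the decay formula above

// Step 1b: exponential decay by age, default half-life 14 days.
function freshness(ageDays: number, halfLifeDays = 14): number {
  return Math.exp((-LN2 * ageDays) / halfLifeDays);
}

// Step 3: freshness-weighted average of signal scores.
function compositeScore(signals: WeightedSignal[], halfLifeDays = 14): number {
  let numerator = 0;
  let denominator = 0;
  for (const s of signals) {
    const f = freshness(s.ageDays, halfLifeDays);
    numerator += s.score * s.weight * f;
    denominator += s.weight * f;
  }
  return denominator === 0 ? 0 : numerator / denominator;
}
```

Note that because freshness appears in both numerator and denominator, uniformly stale signals compose to the same weighted average; staleness lowers confidence (step 2), not the composite itself.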
Protocols
How does the system coordinate?
Build Order
| Sprint | Features | What | Effort | Acceptance |
|---|---|---|---|---|
| S0 | #9, #10 | Constants + forecast baselines | 0.5d | Constants in file, baselines documented |
| S1 | #1, #2, #3 | Signal collectors + normalization + decay | 2d | Each collector returns normalized 0-1 with decay |
| S2 | #4, #5, #6, #7 | Variance + composite + classification + confidence | 2d | calculateNowcast() returns valid NowcastResult with 5 mock signals |
| S3 | #8 | Recommendations engine | 1d | topRisk, topMomentum, actions populated from signal breakdown |
| S4 | — | Wire to production data sources | 1d | Real signals flowing, composite rendered in Insights |
| S5 | #11, #12 | Prediction evidence ledger + state transition log | 2d | Evidence table accepts entries, transitions timestamped |
| S6 | #13 | Internal signal collectors | 1d | Receipt accuracy, sprout time, trust scores flowing to ledger |
| S7 | #14 | Market signal collectors (automated scans) | 2d | Monthly scan pipeline produces evidence entries for 6 predictions |
| S8 | #15, #16 | Bayesian triggers + resolution protocol | 1d | Stale predictions flagged, resolved predictions at L4 |
Commissioning
| # | Feature | Install | Test | Operational | Optimize |
|---|---|---|---|---|---|
| 1 | Signal Collectors (5x) | --- | --- | --- | --- |
| 2 | Signal Normalization | --- | --- | --- | --- |
| 3 | Exponential Decay | --- | --- | --- | --- |
| 4 | Variance Computation | --- | --- | --- | --- |
| 5 | Composite Scoring | --- | --- | --- | --- |
| 6 | Classification | --- | --- | --- | --- |
| 7 | Confidence Calculation | --- | --- | --- | --- |
| 8 | Recommendations Engine | --- | --- | --- | --- |
| 9 | Constants Registration | --- | --- | --- | --- |
| 10 | Forecast Baselines | --- | --- | --- | --- |
| 11 | Prediction Evidence Ledger | --- | --- | --- | --- |
| 12 | State Transition Log | --- | --- | --- | --- |
| 13 | Internal Signal Collectors | --- | --- | --- | --- |
| 14 | Market Signal Collectors | --- | --- | --- | --- |
| 15 | Bayesian Update Triggers | --- | --- | --- | --- |
| 16 | Resolution Protocol | --- | --- | --- | --- |
Agent-Facing Spec
Commands: pnpm test -- --filter=pipeline-nowcast, pnpm tc
Boundaries:
- Always: pure function, no DB writes, no side effects
- Ask first: threshold changes, weight adjustments
- Never: modify signal source data, bypass confidence check
Test Contract:
| # | Feature | Test File | Assertion |
|---|---|---|---|
| 1 | Signal collectors | pipeline-nowcast.test.ts | Each collector returns 0-1 from valid input |
| 2 | Normalization | pipeline-nowcast.test.ts | Edge cases: zero, negative, very large values |
| 3 | Exponential decay | pipeline-nowcast.test.ts | 14-day-old signal at 50% weight |
| 4 | Composite scoring | pipeline-nowcast.test.ts | 5 signals produce weighted sum |
| 5 | Classification | pipeline-nowcast.test.ts | Boundary: 0.7 on_track, 0.4 warning |
| 6 | Confidence | pipeline-nowcast.test.ts | 2 of 5 signals = confidence < 0.5 |
| 7 | Recommendations | pipeline-nowcast.test.ts | topRisk = lowest-scoring signal |
| 8 | Cold start | pipeline-nowcast.test.ts | 0 signals = confidence 0, status critical |
| 9 | Evidence ledger | prediction-evidence.test.ts | Evidence entry updates prediction maturity |
| 10 | State transitions | prediction-evidence.test.ts | Transition timestamps enable velocity calc |
| 11 | Bayesian triggers | prediction-evidence.test.ts | >10% conviction shift flags reassessment |
| 12 | Resolution | prediction-evidence.test.ts | Resolved prediction reaches L4 with evidence |
| 13 | Market collector | prediction-evidence.test.ts | Scan returns structured evidence from source |
| 14 | Internal collector | prediction-evidence.test.ts | Receipt parse returns intent-match rate |
Players
Who creates harmony?
Job 1: Know If We're On Track
| Element | Detail |
|---|---|
| Struggling moment | Weekly commissioning session: 45 min checking 5 dashboards, forming a mental picture |
| Workaround | Manual synthesis, gut feel, "seems fine" until it isn't |
| Progress | Glance at one composite score, see which signal is drifting, act on the recommendation |
| Hidden objection | "A single number can't capture this complexity" |
| Switch trigger | Missed a regression that was visible in the data 3 days earlier |
Features that serve this job: #5, #6, #8
Job 2: Detect Drift Early
| Element | Detail |
|---|---|
| Struggling moment | Problem compounds silently between reviews. Activity drops, nobody notices for a week. |
| Workaround | Hope someone checks. Rely on agent posting to #meta. |
| Progress | Nowcast fires warning when activity velocity drops below threshold, before the weekly review |
| Hidden objection | "False alarms are worse than no alarms" |
| Switch trigger | A deal went cold because nobody noticed zero activity for 10 days |
Features that serve this job: #3, #4, #7
Job 3: Validate Predictions With Evidence
| Element | Detail |
|---|---|
| Struggling moment | 76 predictions in a markdown table. Conviction scores assigned once, never updated. No evidence. |
| Workaround | Quarterly manual review, gut-feel conviction updates, no data to support or falsify. |
| Progress | Evidence ledger collects signals. Bayesian triggers flag stale predictions. Maturity model tracks. |
| Hidden objection | "Predictions are inherently uncertain — instrumenting them is false precision" |
| Switch trigger | A prediction you scored 4/5 turns out wrong and you never collected the evidence that would've told you 6 months earlier |
Features that serve this job: #11, #13, #14, #15, #16
Relationship to Other PRDs
| PRD | Relationship | Data Flow |
|---|---|---|
| Sales CRM & RFP | Peer | Pipeline + activity signals flow IN to nowcast |
| Agent Platform | Peer | Agent velocity signals flow IN from Convex. Receipt accuracy feeds prediction evidence. |
| ETL Data Tool | Peer (upstream) | Trust scoring completion feeds prediction evidence. Market signal collection reuses ETL patterns. |
| Sales Process Optimisation | Peer (downstream) | Nowcast output could feed SPO decision engine |
| Prediction Database | Data source | 76 predictions flow IN. Evidence flows back. Maturity state tracked. |
| Sui Real Estate Tokenization | Evidence subject | RWA TVL market signal validates tokenization prediction |
Context
- Pictures — Pre-flight maps that feed this spec
- Prompt Deck — Sales compression of this spec
- Commissioning Dashboard — The scoreboard this algorithm reads
- AI Product Requirements — Section definitions