Pipeline Nowcast Spec

How do you know the factory is on track before the weekly meeting tells you it isn't?

Build Contract

| # | Feature | Function | Outcome | Job | State |
| --- | --- | --- | --- | --- | --- |
| 1 | Signal Collectors (5x) | Extract raw measurements from each source table | All 5 sources feeding one algorithm | Collect | Gap |
| 2 | Signal Normalization | Normalize each signal to 0-1 scale per type | Apples-to-apples comparison | Normalize | Gap |
| 3 | Exponential Decay | Weight signals by recency with configurable half-life | Recent signals matter more | Weight | Gap |
| 4 | Variance Computation | Per-signal (actual - forecast) / forecast | Know which signal is drifting | Measure | Gap |
| 5 | Composite Scoring | Weighted sum with configurable signal weights | Single number: are we on track? | Compose | Gap |
| 6 | Classification | Map composite to on_track / warning / critical | Status matches varianceStatus enum | Classify | Gap |
| 7 | Confidence Calculation | Score based on signal coverage and freshness | Trust the number or gather more data | Calibrate | Gap |
| 8 | Recommendations Engine | Top risk, top momentum, action list | Know what to do next, not just status | Advise | Gap |
| 9 | Constants Registration | PIPELINE_NOWCAST block in algorithm-constants.ts | All thresholds tunable without code | Configure | Gap |
| 10 | Forecast Baselines | Target values per signal type (config) | Variance has something to compare to | Baseline | Gap |
| 11 | Prediction Evidence Ledger | Store signal observations against predictions | Predictions have data, not just opinions | Evidence | Gap |
| 12 | State Transition Log | Timestamp every L0-L4 commissioning state change | Commissioning velocity measurable | Velocity | Gap |
| 13 | Internal Signal Collectors | Collect prediction evidence from own platform | Receipt accuracy, boot time, trust scores feed predictions | Introspect | Gap |
| 14 | Market Signal Collectors | Automated external scans (RWA TVL, dev counts, agent commerce) | Predictions validated against market reality | Scan | Gap |
| 15 | Bayesian Update Triggers | Threshold-based conviction reassessment protocol | Stale predictions auto-flag for review | Update | Gap |
| 16 | Resolution Protocol | Confirm or falsify predictions with timestamped evidence | Prediction database has L4 entries | Resolve | Gap |

Principles

What truths constrain the design?

The Job

| Element | Detail |
| --- | --- |
| Situation | Five signal systems built independently. Each has a dashboard. None connect. |
| Intention | A single composite score answering "are we on track, drifting, or in trouble?" |
| Obstacle | No algorithm normalizes, weights, and composes these signals with recency decay. |

Why Now

All five signal sources exist in production (the CRM supplies both the pipeline and activity signals):

  • CRM: 5 deals, 10 activities, $1.2M pipeline (commissioning 2026-03-02)
  • Agent comms: Convex event stream with typed messages
  • Commissioning: 74 features at L0-L3 (no velocity tracking — state transitions have no timestamps)
  • Predictions: 76 predictions in database, conviction-scored, zero evidence feeds

The data exists. The synthesis doesn't. Every commissioning session (45 min manual) could be a 2-second algorithm call.

Prediction validation gap: The prediction database has 76 entries with conviction scores but no live data updating them. The Bayesian protocol exists on paper (prompt-predictions.md Step 4) but has no automated triggers. Predictions without evidence feeds are opinions with timestamps. The nowcast closes this gap by treating prediction evidence as a first-class signal source — same normalization, same decay, same composite scoring.

Design Constraints

| Constraint | Rationale |
| --- | --- |
| Pure function, no side effects | Testable, composable, matches algorithm framework |
| All thresholds in constants file | Tunable without code changes |
| Minimum 3 signals for confidence > 0.5 | Prevents false confidence from sparse data |
| Match existing varianceStatus enum | Interop with prediction schemas |
| Exponential decay, not linear | Recent signals should dominate, old signals fade smoothly |

Refusal Spec

| Category | Condition | Response |
| --- | --- | --- |
| Insufficient signals | Zero signals available | Return confidence: 0, status: critical, reasoning: "No signals" |
| Stale data | All signals older than 2x half-life | Return confidence below 0.3, flag staleness |
| Invalid forecasts | Forecast value is 0 or negative | Skip signal, reduce confidence, log warning |
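A minimal sketch of how these refusal rules might gate the confidence output. The helper name and the cap-multiplier approach are illustrative assumptions, not part of the spec:

```ts
// Sketch: a confidence cap derived from the refusal rules above.
// `confidenceCap` is a hypothetical helper; the real algorithm would
// multiply its base confidence by this cap before returning.
function confidenceCap(signalAgeDays: number[], halfLifeDays = 14): number {
  if (signalAgeDays.length === 0) return 0; // no signals: confidence 0, status critical
  const allStale = signalAgeDays.every((age) => age > 2 * halfLifeDays);
  return allStale ? 0.3 : 1; // all stale: keep final confidence below 0.3
}
```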

Performance

How do we know it's working?

Priority Score

| Dimension | Score | Evidence |
| --- | --- | --- |
| Pain | 4 | 45 min/session manual synthesis across 5 dashboards. 76 predictions with zero evidence feeds. Drift invisible between meetings. |
| Demand | 4 | Internal demand (commissioning + prediction validation). Prediction evidence validates the thesis the whole platform depends on. |
| Edge | 4 | Proprietary signal combination (CRM + agent comms + commissioning state + prediction evidence). Nobody else has this data shape. |
| Trend | 5 | Nowcasting + prediction markets is the dominant pattern. Superforecasting discipline going mainstream. AI operations demand real-time variance. |
| Conversion | 2 | Internal tool first. Path to customer-facing via BOaaS later. |
| Composite | 640 | 4 x 4 x 4 x 5 x 2. Demand up (prediction validation is existential). Trend up (prediction markets + nowcasting convergence). |

Quality Targets

| Metric | Target | Method |
| --- | --- | --- |
| Execution time | <500ms for 5 signals | Benchmark test |
| Classification accuracy | Matches manual assessment 5 consecutive days | Human comparison |
| Signal coverage confidence | Degrades gracefully below 3 signals | Unit test with partial inputs |

Eval Strategy

| What | How | When |
| --- | --- | --- |
| Classification accuracy | Compare nowcast status vs dream team manual assessment | Daily for first 2 weeks |
| Signal freshness | Check timestamp of newest signal per type | Every run |
| Threshold calibration | Review false positive/negative rate | Weekly for first month |

Kill signal: Nowcast status disagrees with manual assessment for 5 consecutive days after 2-week calibration period. Algorithm is wrong or signals are wrong.


Platform

What do we control?

Current State

| Component | Built | Wired | Working | Notes |
| --- | --- | --- | --- | --- |
| Algorithm framework | Yes | Yes | Yes | libs/agency/src/lib/algorithms/ |
| AlgorithmMetadata pattern | Yes | Yes | Yes | Standard export |
| algorithm-constants.ts | Yes | Yes | Yes | Extensible |
| CRM data (Drizzle) | Yes | Yes | Yes | 5 deals, 10 activities |
| Convex agent messages | Yes | Yes | Yes | HTTP client working |
| Commissioning state | Yes | Partial | Partial | Manual markdown, needs parser. No timestamps on state transitions. |
| Prediction schemas | Yes | No | No | Tables empty. 76 predictions in markdown, zero in DB. |
| Prediction evidence ledger | No | No | No | Build — junction between predictions and data |
| State transition log | No | No | No | Build — timestamps on L0-L4 changes |
| Internal signal collectors | No | No | No | Build — receipt accuracy, sprout boot time, trust scores |
| Market signal collectors | No | No | No | Build — RWA TVL, dev counts, agent commerce volume |
| Signal normalization | No | No | No | Build |
| Exponential decay | No | No | No | Build |
| Composite scoring | No | No | No | Build |
| Insights UI | Yes | No | No | Components exist, no data |

Build Ratio

~60% composition, ~40% new code. Prediction evidence system is net-new domain.

Algorithm Interface

Input

```ts
interface NowcastInput {
  pipeline: {
    deals: Array<{ amount: number; probability: number; stage: string; closeDate: string }>;
    targetCoverage: number; // default 3.0x
  };
  activity: {
    activities: Array<{ type: string; outcome: string; startDate: string }>;
    targetPerDealPerWeek: number; // default 2
    dealCount: number;
  };
  agentVelocity: {
    messages: Array<{ type: string; createdAt: number }>;
    plansCompleted: number;
    blockersOpen: number;
  };
  commissioning: {
    features: Array<{ name: string; currentLevel: number; forecastLevel: number }>;
  };
  predictions: {
    entries: Array<{ confidenceScore: number; accuracyScore: number | null; status: string }>;
    evidence: Array<PredictionEvidence>; // evidence ledger entries
    maturity: Array<PredictionMaturity>; // per-prediction maturity state
  };
  config?: {
    weights?: Partial<SignalWeights>;
    decayHalfLifeDays?: number;
    thresholds?: { onTrack?: number; warning?: number };
  };
}
```

Output

```ts
interface NowcastResult {
  result: {
    composite: number; // 0-1
    status: "on_track" | "warning" | "critical";
    signals: NowcastSignal[]; // per-signal breakdown
    topRisk: string; // highest-variance signal name
    topMomentum: string; // most-improving signal name
  };
  metadata: {
    algorithm: "pipeline-nowcast";
    version: string;
    executionTimeMs: number;
    signalCount: number;
  };
  reasoning: string[];
  confidence: number; // 0-1 based on signal coverage
  recommendations: {
    status: "on_track" | "warning" | "critical";
    actions: string[];
    signalsNeedingAttention: string[];
  };
}

interface NowcastSignal {
  name: string;
  score: number; // 0-1 normalized
  weight: number; // configured weight
  variance: number; // (actual - forecast) / forecast
  trend: "improving" | "stable" | "declining";
  freshness: number; // 0-1 decay factor
}
```

Constants

```ts
export const PIPELINE_NOWCAST = {
  PIPELINE_WEIGHT: 0.35,
  ACTIVITY_WEIGHT: 0.25,
  AGENT_VELOCITY_WEIGHT: 0.15,
  COMMISSIONING_WEIGHT: 0.15,
  PREDICTION_WEIGHT: 0.1,
  DECAY_HALF_LIFE_DAYS: 14,
  ON_TRACK_THRESHOLD: 0.7,
  WARNING_THRESHOLD: 0.4,
  MIN_SIGNALS_FOR_CONFIDENCE: 3,
  TARGET_PIPELINE_COVERAGE: 3.0,
  TARGET_ACTIVITY_PER_DEAL_WEEK: 2,
} as const;
```
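Since `config.weights` is typed as `Partial<SignalWeights>`, overrides merge onto these defaults. A sketch under stated assumptions: the `SignalWeights` shape, `DEFAULT_WEIGHTS`, and `resolveWeights` are illustrative names (the spec only defines the constants above):

```ts
// Assumed shape for SignalWeights, one key per signal; values mirror the
// PIPELINE_NOWCAST weight defaults above.
interface SignalWeights {
  pipeline: number;
  activity: number;
  agentVelocity: number;
  commissioning: number;
  prediction: number;
}

const DEFAULT_WEIGHTS: SignalWeights = {
  pipeline: 0.35,
  activity: 0.25,
  agentVelocity: 0.15,
  commissioning: 0.15,
  prediction: 0.1,
};

// Hypothetical helper: spread overrides onto defaults, so untouched
// signals keep their configured weights.
const resolveWeights = (overrides?: Partial<SignalWeights>): SignalWeights => ({
  ...DEFAULT_WEIGHTS,
  ...overrides,
});
```

For example, `resolveWeights({ pipeline: 0.5 })` bumps the pipeline weight while leaving the other four at their defaults.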

Prediction Evidence Ledger

The junction table that connects predictions to data. Without this, predictions are opinions with timestamps.

```ts
interface PredictionEvidence {
  predictionId: string; // links to prediction-database.md row
  signalType: "internal" | "market" | "resolution";
  dataPoint: string; // what was measured
  value: number; // the measurement
  source: string; // where it came from
  timestamp: string; // ISO 8601
  direction: "strengthens" | "weakens" | "neutral";
  convictionDelta: number; // how much this moved the score (-1 to +1)
}

interface PredictionMaturity {
  level: "L0_stated" | "L1_instrumented" | "L2_tracked" | "L3_tested" | "L4_resolved";
  evidenceCount: number;
  lastUpdated: string;
  convictionHistory: Array<{ score: number; date: string; reason: string }>;
}
```

Prediction maturity model (mirrors commissioning):

| State | Meaning | Criteria |
| --- | --- | --- |
| L0 Stated | Conviction score assigned, no evidence collected | Default for all 76 current predictions |
| L1 Instrumented | Data sources identified, collection method defined | Evidence ledger row exists with source field populated |
| L2 Tracked | Evidence flowing, conviction updating quarterly | >= 2 evidence entries, conviction updated at least once |
| L3 Tested | Prediction survived 2+ Bayesian updates with evidence | >= 2 conviction changes backed by data |
| L4 Resolved | Confirmed or falsified with timestamped evidence | Resolution entry with pass/fail and evidence chain |
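The criteria above can be expressed as a pure derivation over ledger state. A sketch: the helper name is illustrative, and the boolean `hasResolution` is an assumed stand-in for a pass/fail resolution entry:

```ts
type MaturityLevel =
  | "L0_stated"
  | "L1_instrumented"
  | "L2_tracked"
  | "L3_tested"
  | "L4_resolved";

// Sketch: derive maturity from evidence counts, checked top-down so the
// highest satisfied level wins. Thresholds follow the table above.
function deriveMaturity(opts: {
  evidenceCount: number; // ledger rows for this prediction
  convictionChanges: number; // conviction updates backed by evidence
  hasResolution: boolean; // assumed flag for a pass/fail resolution entry
}): MaturityLevel {
  if (opts.hasResolution) return "L4_resolved";
  if (opts.convictionChanges >= 2) return "L3_tested";
  if (opts.evidenceCount >= 2 && opts.convictionChanges >= 1) return "L2_tracked";
  if (opts.evidenceCount >= 1) return "L1_instrumented";
  return "L0_stated";
}
```

Under this sketch every current prediction derives to L0_stated, matching the table's default.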

State Transition Log

Timestamps on commissioning state changes enable velocity tracking — rate of L0 to L4 progression per month.

```ts
interface StateTransition {
  featureId: string; // commissioning dashboard row
  fromLevel: number; // 0-4
  toLevel: number; // 0-4
  timestamp: string; // ISO 8601
  evidence: string; // what proved the transition
  agent: string; // who/what triggered it
}
```

This feeds two signals: commissioning signal (#4) gets velocity data, and prediction #5 (small teams + agents) gets its own validation metric.
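With timestamps in place, monthly velocity is a fold over the log. A sketch, assuming velocity counts level increments in a trailing 30-day window (the function name, `now` parameter, and trimmed local type are illustrative):

```ts
// Trimmed local copy of the StateTransition fields this sketch needs,
// kept inline so the example is self-contained.
interface Transition {
  featureId: string;
  fromLevel: number; // 0-4
  toLevel: number; // 0-4
  timestamp: string; // ISO 8601
}

// Sketch: commissioning velocity = sum of level increments inside a
// trailing window. Downward transitions are ignored.
function velocityPerMonth(
  log: Transition[],
  now: number = Date.now(),
  windowDays = 30,
): number {
  const cutoff = now - windowDays * 24 * 60 * 60 * 1000;
  return log
    .filter((t) => new Date(t.timestamp).getTime() >= cutoff)
    .reduce((sum, t) => sum + Math.max(0, t.toLevel - t.fromLevel), 0);
}
```

The `now` parameter exists only so the calculation is testable with fixed timestamps.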

Signal Source Expansion

Internal signals (cost: $0, already collectible):

| Signal | Source | Validates Prediction | Collection |
| --- | --- | --- | --- |
| Agent receipt intent-match rate | .ai/receipts/ manual_interventions field | Intent Verification | Parse receipts directory |
| Sprout boot time | A2A acceptance test | Composable Primitives | Run and time test |
| ETL trust score completion | Trust scoring pipeline output | Trust Quantified | Count scored entities |
| Commissioning velocity | State transition log (new) | Small Teams + Agents | L0 to L4 transitions/month |
| Agent profile capability evidence | agent_profiles table | Identity Inversion | Count verified capabilities |

Market signals (cost: $0, automated scan):

| Signal | Source | Validates Prediction | Collection |
| --- | --- | --- | --- |
| Tokenized RWA total value | RWA.xyz | RWA Boring | Monthly scrape |
| On-chain identity adoption | Dune Analytics, Sui explorer | Identity Inversion, Trust Quantified | Monthly query |
| Agent commerce volume | a16z State of Crypto, Stripe reports | Intent Verification | Quarterly Perplexity scan |
| AI-native company revenue/employee | Crunchbase, public filings | Small Teams + Agents | Quarterly scan |
| Composable platform adoption | ProductHunt, YC batch analysis | Composable Primitives | Quarterly scan |
| NZ FMA tokenization guidance | FMA publications | RWA Boring | Quarterly check |

Bayesian Update Protocol

Automated trigger schedule per prediction category:

| Category | Weekly Check | Reassessment Trigger (>10% conviction shift) | Abandon Trigger |
| --- | --- | --- | --- |
| Intent Verification | Agent liability news | >$100M agent insurance products launched | Major jurisdiction bans agent transactions |
| Composable Primitives | YC batch composition | A2A/MCP reaches stable spec | Hyperscaler ships "Business in a Box," captures 80% |
| Trust Quantified | On-chain reputation TVL | Portable reputation standard proposed at W3C/SIP | Privacy backlash legislation in 3+ jurisdictions |
| Identity Inversion | "Portfolio hire" postings | On-chain professional identity >10M users | Deepfake receipts become trivial |
| Small Teams + Agents | Revenue-per-employee reports | >50 companies at <10 people, >$10M revenue | Agent reliability plateaus 18+ months |
| RWA Tokenization | RWA.xyz total value | NZ FMA issues formal guidance | Major jurisdiction criminalizes tokenized securities |
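A minimal sketch of the reassessment trigger shared by every row above (the helper name is illustrative; conviction is treated as a positive score, e.g. the 1-5 scale used for the 76 predictions):

```ts
// Sketch: flag a prediction for reassessment when fresh evidence moves
// conviction by more than 10% relative to its previous value.
const needsReassessment = (
  previousConviction: number,
  updatedConviction: number,
): boolean =>
  Math.abs(updatedConviction - previousConviction) / previousConviction > 0.1;
```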

Core Algorithm

```text
1. For each signal type with data:
   a. Normalize raw measurement to 0-1 scale
   b. Apply exponential decay: factor = exp(-0.693 * ageDays / halfLife)
   c. Compute variance: (actual - forecast) / forecast
   d. Record trend from last 3 measurements

2. Compute confidence:
   confidence = signalsPresent / totalSignals * avgFreshness

3. Compute composite:
   composite = sum(signal.score * signal.weight * signal.freshness)
             / sum(signal.weight * signal.freshness)

4. Classify:
   >= 0.7 → on_track
   >= 0.4 → warning
   <  0.4 → critical

5. Generate recommendations:
   topRisk = signal with lowest score
   topMomentum = signal with best trend
   actions = per-signal actionable suggestions
```
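Steps 1b through 4 can be sketched as pure helpers. This is a minimal sketch, not the implementation: names are illustrative, and scores are assumed already normalized per step 1a:

```ts
interface Sig {
  score: number; // 0-1 normalized (step 1a)
  weight: number; // configured signal weight
  ageDays: number; // age of newest measurement
}

// Step 1b: exponential decay; a signal exactly one half-life old keeps ~50% weight.
const decayFactor = (ageDays: number, halfLifeDays: number): number =>
  Math.exp((-Math.LN2 * ageDays) / halfLifeDays); // ln 2 ≈ 0.693

// Step 2: confidence from signal coverage and average freshness.
function confidence(signals: Sig[], totalSignals = 5, halfLifeDays = 14): number {
  if (signals.length === 0) return 0;
  const avgFreshness =
    signals.reduce((a, s) => a + decayFactor(s.ageDays, halfLifeDays), 0) /
    signals.length;
  return (signals.length / totalSignals) * avgFreshness;
}

// Step 3: freshness-weighted composite, normalized so it stays in 0-1.
function composite(signals: Sig[], halfLifeDays = 14): number {
  let num = 0;
  let den = 0;
  for (const s of signals) {
    const f = decayFactor(s.ageDays, halfLifeDays);
    num += s.score * s.weight * f;
    den += s.weight * f;
  }
  return den > 0 ? num / den : 0;
}

// Step 4: threshold classification.
const classify = (c: number): "on_track" | "warning" | "critical" =>
  c >= 0.7 ? "on_track" : c >= 0.4 ? "warning" : "critical";
```

For example, a fresh pipeline signal at 0.9 (weight 0.35) and a 14-day-old activity signal at 0.5 (weight 0.25) compose to roughly 0.79, which classifies as on_track.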

Protocols

How does the system coordinate?

Build Order

| Sprint | Features | What | Effort | Acceptance |
| --- | --- | --- | --- | --- |
| S0 | #9, #10 | Constants + forecast baselines | 0.5d | Constants in file, baselines documented |
| S1 | #1, #2, #3 | Signal collectors + normalization + decay | 2d | Each collector returns normalized 0-1 with decay |
| S2 | #4, #5, #6, #7 | Variance + composite + classification + confidence | 2d | calculateNowcast() returns valid NowcastResult with 5 mock signals |
| S3 | #8 | Recommendations engine | 1d | topRisk, topMomentum, actions populated from signal breakdown |
| S4 | — | Wire to production data sources | 1d | Real signals flowing, composite rendered in Insights |
| S5 | #11, #12 | Prediction evidence ledger + state transition log | 2d | Evidence table accepts entries, transitions timestamped |
| S6 | #13 | Internal signal collectors | 1d | Receipt accuracy, sprout time, trust scores flowing to ledger |
| S7 | #14 | Market signal collectors (automated scans) | 2d | Monthly scan pipeline produces evidence entries for 6 predictions |
| S8 | #15, #16 | Bayesian triggers + resolution protocol | 1d | Stale predictions flagged, resolved predictions at L4 |

Commissioning

| # | Feature | Install | Test | Operational | Optimize |
| --- | --- | --- | --- | --- | --- |
| 1 | Signal Collectors (5x) | — | — | — | — |
| 2 | Signal Normalization | — | — | — | — |
| 3 | Exponential Decay | — | — | — | — |
| 4 | Variance Computation | — | — | — | — |
| 5 | Composite Scoring | — | — | — | — |
| 6 | Classification | — | — | — | — |
| 7 | Confidence Calculation | — | — | — | — |
| 8 | Recommendations Engine | — | — | — | — |
| 9 | Constants Registration | — | — | — | — |
| 10 | Forecast Baselines | — | — | — | — |
| 11 | Prediction Evidence Ledger | — | — | — | — |
| 12 | State Transition Log | — | — | — | — |
| 13 | Internal Signal Collectors | — | — | — | — |
| 14 | Market Signal Collectors | — | — | — | — |
| 15 | Bayesian Update Triggers | — | — | — | — |
| 16 | Resolution Protocol | — | — | — | — |

Agent-Facing Spec

Commands: `pnpm test -- --filter=pipeline-nowcast`, `pnpm tc`

Boundaries:

  • Always: pure function, no DB writes, no side effects
  • Ask first: threshold changes, weight adjustments
  • Never: modify signal source data, bypass confidence check

Test Contract:

| # | Feature | Test File | Assertion |
| --- | --- | --- | --- |
| 1 | Signal collectors | pipeline-nowcast.test.ts | Each collector returns 0-1 from valid input |
| 2 | Normalization | pipeline-nowcast.test.ts | Edge cases: zero, negative, very large values |
| 3 | Exponential decay | pipeline-nowcast.test.ts | 14-day-old signal at 50% weight |
| 4 | Composite scoring | pipeline-nowcast.test.ts | 5 signals produce weighted sum |
| 5 | Classification | pipeline-nowcast.test.ts | Boundary: 0.7 on_track, 0.4 warning |
| 6 | Confidence | pipeline-nowcast.test.ts | 2 of 5 signals = confidence < 0.5 |
| 7 | Recommendations | pipeline-nowcast.test.ts | topRisk = lowest-scoring signal |
| 8 | Cold start | pipeline-nowcast.test.ts | 0 signals = confidence 0, status critical |
| 9 | Evidence ledger | prediction-evidence.test.ts | Evidence entry updates prediction maturity |
| 10 | State transitions | prediction-evidence.test.ts | Transition timestamps enable velocity calc |
| 11 | Bayesian triggers | prediction-evidence.test.ts | >10% conviction shift flags reassessment |
| 12 | Resolution | prediction-evidence.test.ts | Resolved prediction reaches L4 with evidence |
| 13 | Market collector | prediction-evidence.test.ts | Scan returns structured evidence from source |
| 14 | Internal collector | prediction-evidence.test.ts | Receipt parse returns intent-match rate |

Players

Who creates harmony?

Job 1: Know If We're On Track

| Element | Detail |
| --- | --- |
| Struggling moment | Weekly commissioning session: 45 min checking 5 dashboards, forming a mental picture |
| Workaround | Manual synthesis, gut feel, "seems fine" until it isn't |
| Progress | Glance at one composite score, see which signal is drifting, act on the recommendation |
| Hidden objection | "A single number can't capture this complexity" |
| Switch trigger | Missed a regression that was visible in the data 3 days earlier |

Features that serve this job: #5, #6, #8

Job 2: Detect Drift Early

| Element | Detail |
| --- | --- |
| Struggling moment | Problem compounds silently between reviews. Activity drops, nobody notices for a week. |
| Workaround | Hope someone checks. Rely on agent posting to #meta. |
| Progress | Nowcast fires warning when activity velocity drops below threshold, before the weekly review |
| Hidden objection | "False alarms are worse than no alarms" |
| Switch trigger | A deal went cold because nobody noticed zero activity for 10 days |

Features that serve this job: #3, #4, #7

Job 3: Validate Predictions With Evidence

| Element | Detail |
| --- | --- |
| Struggling moment | 76 predictions in a markdown table. Conviction scores assigned once, never updated. No evidence. |
| Workaround | Quarterly manual review, gut-feel conviction updates, no data to support or falsify. |
| Progress | Evidence ledger collects signals. Bayesian triggers flag stale predictions. Maturity model tracks. |
| Hidden objection | "Predictions are inherently uncertain — instrumenting them is false precision" |
| Switch trigger | A prediction you scored 4/5 turns out wrong and you never collected the evidence that would've told you 6 months earlier |

Features that serve this job: #11, #13, #14, #15, #16


Relationship to Other PRDs

| PRD | Relationship | Data Flow |
| --- | --- | --- |
| Sales CRM & RFP | Peer | Pipeline + activity signals flow IN to nowcast |
| Agent Platform | Peer | Agent velocity signals flow IN from Convex. Receipt accuracy feeds prediction evidence. |
| ETL Data Tool | Peer (upstream) | Trust scoring completion feeds prediction evidence. Market signal collection reuses ETL patterns. |
| Sales Process Optimisation | Peer (downstream) | Nowcast output could feed SPO decision engine |
| Prediction Database | Data source | 76 predictions flow IN. Evidence flows back. Maturity state tracked. |
| Sui Real Estate Tokenization | Evidence subject | RWA TVL market signal validates tokenization prediction |

Context