Superforecaster
What separates the top 2% of forecasters from everyone else?
Not intelligence. Not access. Superforecasters decompose better, update faster, and track everything. Philip Tetlock's research proved it: ordinary people who follow the right process consistently outperform credentialed experts who don't.
The Ten Commandments
Tetlock's rules, compressed. Each maps to a cognitive trap it prevents.
| # | Commandment | Trap Prevented |
|---|---|---|
| 1 | Triage — Focus on questions where effort improves accuracy | Wasting calibration on the unknowable |
| 2 | Decompose — Break every question into sub-components | Gut-feel masquerading as analysis |
| 3 | Outside View — Start with the base rate before adjusting | Anchoring to the vivid, ignoring the statistical |
| 4 | Inside View — Then adjust for what makes this case unique | Ignoring specifics that matter |
| 5 | Synthesize — Combine outside and inside views deliberately | Defaulting to one lens |
| 6 | Update — Change probabilities when evidence changes | Belief persistence, ego protection |
| 7 | Balance — Not too much, not too little revision | Overreaction to noise or underreaction to signal |
| 8 | Hunt Errors — Actively seek what would prove you wrong | Confirmation bias |
| 9 | Team — Use disagreement as signal, not threat | Groupthink |
| 10 | Balance Again — Confidence and humility in equal measure | Overconfidence or paralysis |
The master principle: perpetual beta. Every belief is a hypothesis under test.
Decomposition
The superforecaster's primary weapon. Fermi estimation applied to the future.
The three-view protocol:
- OUTSIDE VIEW (base rate): "How often does this type of thing happen?" Historical frequency, reference class, statistical default.
- INSIDE VIEW (domain signals): "What makes this case different?" Current evidence, unique factors, acceleration/deceleration signals.
- SYNTHESIS (calibrated probability): "Given both views, what's my probability estimate?" Not a gut feel. A number, with reasoning attached.
Example decomposition:
"Will AI agents replace 40% of knowledge worker tasks by end 2027?"
| View | Evidence | Adjustment |
|---|---|---|
| Outside | Historical automation waves displaced 20-30% of targeted tasks within 5 years of maturity. Base rate: ~25% of targeted tasks | Starting point: 25% |
| Inside | AI coding assistants already handle 30-50% of junior programming tasks (2025 data). Enterprise adoption at 65%+. Capability curve steeper than prior automation waves. | Adjust upward: +20 points |
| Synthesis | Outside view anchors at ~25% of tasks; inside-view signals push the expected share to roughly 45%, modestly above the question's 40% threshold, so the odds of crossing it are better than even. | Probability: 70% |
The discipline: never skip the outside view. The temptation is always to jump to domain expertise. The base rate grounds you.
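A minimal sketch of that synthesis arithmetic, assuming you treat the outside view as a base fraction, the inside view as a shift in that fraction, and then map distance above the question's threshold to a probability. The logistic mapping and the `spread` parameter are illustrative assumptions, not part of Tetlock's method.

```python
import math

def synthesize(base_rate: float, inside_adjustment: float, threshold: float,
               spread: float = 0.10) -> float:
    """Combine outside and inside views into a probability of crossing a threshold.

    base_rate         -- outside view: typical fraction displaced (e.g. 0.25)
    inside_adjustment -- inside view: shift implied by current signals (e.g. +0.20)
    threshold         -- the level the question asks about (e.g. 0.40)
    spread            -- assumed uncertainty around the point estimate (illustrative)
    """
    expected = base_rate + inside_adjustment      # point estimate after both views
    distance = (expected - threshold) / spread    # how far above or below the threshold
    return 1 / (1 + math.exp(-distance))          # squash into a probability

# Worked example from the table above (numbers illustrative):
p = synthesize(base_rate=0.25, inside_adjustment=0.20, threshold=0.40)
print(f"{p:.0%}")  # ~62% under these assumptions; the table's 70% adds further judgment
```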
State-of-the-World Protocol
A repeatable process for producing a multi-year forecast. Run quarterly or when major signals shift.
Step 1: Define Domains
Choose 6-8 domains that cover your decision space. Too few and you miss interaction effects. Too many and you lose depth.
| Criterion | Good Domain | Bad Domain |
|---|---|---|
| Actionable | You make decisions affected by this | Interesting but irrelevant |
| Observable | You can track signals | Pure speculation |
| Bounded | Clear enough to decompose | "Everything about the economy" |
| Connected | Interacts with other domains | Isolated curiosity |
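A hypothetical checklist in code, just to make the filter explicit. The `Domain` fields mirror the four criteria above; the candidate names are placeholders.

```python
from dataclasses import dataclass

@dataclass
class Domain:
    name: str
    actionable: bool  # you make decisions affected by this
    observable: bool  # you can track signals for it
    bounded: bool     # clear enough to decompose
    connected: bool   # interacts with other domains

candidates = [
    Domain("Enterprise AI adoption", True, True, True, True),
    Domain("Everything about the economy", True, True, False, True),  # fails "bounded"
]

# Keep only domains that pass all four criteria; aim for 6-8 survivors.
domains = [d for d in candidates
           if d.actionable and d.observable and d.bounded and d.connected]
```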
Step 2: Per Domain — Base Rate + Signals + Trend
For each domain, fill this structure:
| Field | What It Contains | Why |
|---|---|---|
| Base Rate | Historical precedent for this type of change at this speed | Grounds the outside view |
| Current Signals | 3-5 concrete, sourced, date-stamped data points | Evidence, not narrative |
| Trend Direction | Accelerating / Steady / Decelerating / Reversing | Trajectory matters more than position |
Signals must be facts, not interpretations. "Enterprise AI adoption at 67% (Gartner, Oct 2025)" is a signal. "AI is taking over" is not.
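A sketch of the per-domain record, with signals kept as sourced, date-stamped facts. Field names are assumptions; the Gartner data point is the example above, and the base-rate string is a placeholder.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import List, Literal

@dataclass
class Signal:
    claim: str   # the fact itself, not an interpretation
    source: str  # where it came from
    as_of: date  # date stamp

@dataclass
class DomainState:
    name: str
    base_rate: str  # historical precedent for this type of change at this speed
    signals: List[Signal] = field(default_factory=list)  # 3-5 concrete data points
    trend: Literal["accelerating", "steady", "decelerating", "reversing"] = "steady"

ai_adoption = DomainState(
    name="Enterprise AI adoption",
    base_rate="<historical precedent for comparable adoption curves>",  # placeholder
    signals=[Signal("Enterprise AI adoption at 67%", "Gartner", date(2025, 10, 1))],
    trend="accelerating",
)
```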
Step 3: Per Domain — Prediction + Probability
For each domain, write predictions that pass the SMART-BF test:
- Specific — One measurable outcome
- Measurable — Clear resolution criteria
- Assignable — Who resolves it
- Realistic — Within the plausible range
- Time-bound — Resolution date
- Base-rated — Outside view stated
- Falsifiable — What would prove it wrong
Assign both:
- Probability (0-100%) — Calibration instrument. A 70% prediction should be right 70% of the time.
- Conviction (1-5) — Action instrument. How much would you bet? Maps to the existing priority system.
These are different instruments. Probability measures your calibration. Conviction measures your willingness to act.
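One way to hold a prediction so the storable SMART-BF fields, plus both instruments, have to be filled in before it counts. The field names and the example resolution criteria are illustrative assumptions; the statement, base rate, and probability come from the decomposition example above.

```python
from dataclasses import dataclass
from datetime import date

@dataclass
class Prediction:
    statement: str            # Specific: one measurable outcome
    resolution_criteria: str  # Measurable: how it resolves
    resolver: str             # Assignable: who resolves it
    resolves_by: date         # Time-bound: resolution date
    base_rate: str            # Base-rated: the outside view, stated
    falsifier: str            # Falsifiable: what would prove it wrong
    probability: float        # calibration instrument, 0.0-1.0
    conviction: int           # action instrument, 1-5

p = Prediction(
    statement="AI agents replace 40% of knowledge worker tasks by end of 2027",
    resolution_criteria="Two independent published displacement estimates at or above 40%",
    resolver="me, at the quarterly review",
    resolves_by=date(2027, 12, 31),
    base_rate="Prior automation waves displaced 20-30% of targeted tasks",
    falsifier="Displacement estimates still below 20% at end of 2026",
    probability=0.70,
    conviction=3,
)
assert 0.0 <= p.probability <= 1.0 and 1 <= p.conviction <= 5
```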
Step 4: Falsifying Conditions + Watch Signals
For each prediction:
| Field | Question |
|---|---|
| Falsifying conditions | What evidence in 6 months would lower your conviction by 2+ points? |
| Watch signals | What specific data do you check in weekly/monthly reviews? |
| Update triggers | At what threshold do you revise the probability? |
This is where most forecasters fail. They make predictions but never define what would change their mind. Without falsifying conditions, a prediction is a belief, not a hypothesis.
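A sketch of watch signals with explicit update triggers, so the weekly review becomes a mechanical check rather than a judgment call. The signal names, thresholds, and notes are illustrative assumptions tied to the example prediction above.

```python
from dataclasses import dataclass
from typing import Callable, Optional

@dataclass
class WatchSignal:
    name: str                         # what you check in weekly/monthly reviews
    trigger: Callable[[float], bool]  # threshold at which you revise the probability
    note: str                         # what breaching the trigger means

# Illustrative watch signals for the AI-displacement prediction.
watch = [
    WatchSignal("published task-displacement estimate",
                trigger=lambda x: x < 0.20,
                note="Below 20% by end 2026: lower conviction by 2+ points"),
    WatchSignal("enterprise agent adoption rate",
                trigger=lambda x: x > 0.80,
                note="Above 80%: revise probability upward"),
]

def review(signal: WatchSignal, observed: float) -> Optional[str]:
    """Return an update instruction if the observed value breaches the trigger."""
    return signal.note if signal.trigger(observed) else None

print(review(watch[0], observed=0.15))  # update instruction fires
```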
Step 5: Cross-Domain Interactions
The highest-value predictions live at domain intersections. Map compounding effects:
Domain A amplifies Domain B, and Domain B feeds back into Domain A. A reinforcing loop means acceleration.
Ask: where do domains amplify each other? Where do they cancel? The interaction effects are where you find the predictions nobody else is making.
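A minimal sketch of an interaction map, assuming you record pairwise influences by hand with a sign; the domains and signs here are placeholders. Pairs that amplify in both directions are the reinforcing loops worth watching.

```python
# Influence map: +1 if the row domain amplifies the column domain, -1 if it dampens it.
influence = {
    ("AI capability", "Enterprise adoption"): +1,
    ("Enterprise adoption", "AI capability"): +1,  # revenue funds more capability
    ("Regulation", "Enterprise adoption"): -1,
}

def reinforcing_pairs(infl):
    """Find A<->B pairs where both directions amplify: reinforcing loop = acceleration."""
    pairs = set()
    for (a, b), sign in infl.items():
        if sign > 0 and infl.get((b, a), 0) > 0:
            pairs.add(frozenset((a, b)))
    return pairs

print(reinforcing_pairs(influence))  # {frozenset({'AI capability', 'Enterprise adoption'})}
```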
Step 6: Calibration Check
Compare your new predictions against your existing prediction database:
- Do any new predictions contradict existing ones? If so, one must update.
- Is your probability distribution realistic? (All 90%+ predictions = overconfidence. All 40-60% = hedging.)
- Plot your predictions on a calibration curve. If you have history, check past accuracy.
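A rough sanity check on a batch of new predictions; the 80% cutoffs are arbitrary assumptions, and the point is simply to flag the two failure modes named above.

```python
def distribution_flags(probabilities):
    """Flag suspicious probability distributions across a batch of predictions."""
    if not probabilities:
        return []
    n = len(probabilities)
    extreme = sum(1 for p in probabilities if p >= 0.90 or p <= 0.10)
    hedged = sum(1 for p in probabilities if 0.40 <= p <= 0.60)
    flags = []
    if extreme / n > 0.8:
        flags.append("mostly 90%+ (or 10%-) predictions: possible overconfidence")
    if hedged / n > 0.8:
        flags.append("mostly 40-60% predictions: possible hedging, uninformative")
    return flags

print(distribution_flags([0.95, 0.92, 0.90, 0.97]))  # flags overconfidence
```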
Calibration
The measure of a forecaster is not whether individual predictions are right. It is whether their probability estimates are well-calibrated over time.
| Calibration | Meaning |
|---|---|
| Perfect | 70% of your 70% predictions come true |
| Overconfident | 50% of your 70% predictions come true |
| Underconfident | 90% of your 70% predictions come true |
| Uninformative | All predictions cluster around 50% |
Track with Brier scores. Lower is better. 0 = perfect foresight. 0.25 = coin flip. Below 0.2 = good forecaster.
The feedback loop: predict, track, score, adjust process, predict again. This is the VVFL applied to belief.
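A minimal sketch of Brier scoring and bucketed calibration, assuming each tracked prediction resolves to a binary outcome; the toy history is made up.

```python
from collections import defaultdict

def brier(forecasts):
    """Mean squared error between stated probability and outcome (1 = happened, 0 = didn't).
    0 is perfect foresight; always saying 50% scores 0.25."""
    return sum((p - o) ** 2 for p, o in forecasts) / len(forecasts)

def calibration_buckets(forecasts, width=0.1):
    """Group predictions into probability buckets and compare stated vs. realized frequency."""
    buckets = defaultdict(list)
    for p, o in forecasts:
        buckets[round(p / width) * width].append(o)
    return {b: (len(os), sum(os) / len(os)) for b, os in sorted(buckets.items())}

history = [(0.7, 1), (0.7, 1), (0.7, 0), (0.9, 1), (0.3, 0)]  # (probability, outcome)
print(brier(history))                # 0.154 for this toy history (lower is better)
print(calibration_buckets(history))  # e.g. 0.7 bucket: 3 predictions, 67% came true
```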
Anti-Patterns
| Trap | Symptom | Fix |
|---|---|---|
| Narrative bias | Predictions read like a story with a protagonist | Strip to data points and probabilities |
| Hedgehog thinking | One big idea explains everything | Force yourself to name three competing explanations |
| Recency bias | Last week's news dominates the forecast | Always start with the base rate, not the headline |
| Precision theater | "73.2% probability" with no calibration history | Round to nearest 5% until you have 50+ tracked predictions |
| Update failure | Predictions unchanged for 6+ months | Set calendar reminders for review cadence |
Context
- Forecasting — The principles (backward/forward reasoning, discipline framework)
- Probability — Bayesian updating mechanics
- Evaluation — SMART-BF scoring, Brier scores, calibration tracking
- Process — The review cadence (daily, weekly, monthly, quarterly)
- Prediction Database — The living record of all predictions
Links
- Philip Tetlock — Superforecasting (TED) — The research that proved ordinary people can outpredict experts
- Good Judgment Project — The platform that operationalized superforecasting
- Farnam Street — Ten Commandments — Tetlock's rules, expanded
- ai-2027.com — AI trajectory modeling with explicit assumptions
- Metaculus — Community prediction platform for calibration practice
- Prophet Arena — Engineer a superforecaster
Questions
- If decomposition is the superforecaster's primary weapon, which of your current predictions has never been decomposed into sub-questions?
- When your outside view (base rate) and inside view (domain signals) conflict sharply, what decision rule do you use to weight them — and has that rule ever been tested?
- What is the minimum number of tracked predictions required before your calibration curve becomes meaningful?
- If you could only track one domain for the next two years, which would give you the highest decision-relevant signal — and what does that reveal about where your uncertainty actually lives?