Prediction Evaluation

Is this prediction worth tracking?

Not all predictions deserve attention. This checklist separates signal from noise by scoring a prediction's quality before you invest time tracking it.

The SMART-BF Checklist

Seven dimensions, each scored 0–2, for a total of 0–14 points. The table is a scoring rubric: each cell describes what earns that score on that dimension.

Dimension	Question	Score 0	Score 1	Score 2
Specific	Is it precise and unambiguous?	Vague ("AI will change things")	Somewhat specific	Precise ("GDP-Val over 90% by Dec 2026")
Measurable	Can we objectively verify resolution?	No clear verification	Partially measurable	Binary yes/no with clear criteria
Actionable	Does it enable positioning decisions?	Entertainment only	Indirect implications	Direct action if true/false
Resolution	Is there a clear time horizon?	No timeframe	Vague ("soon", "eventually")	Specific date or trigger event
Testable	What would prove it wrong?	Unfalsifiable	Weak falsification criteria	Clear falsifying conditions
Base rate	Is there historical precedent?	No analogous history	Weak analogies	Strong base rate available
Factored	Does it depend on other predictions?	Many hidden dependencies	Some dependencies acknowledged	Independent or dependencies explicit
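The rubric above reduces to summing seven integers, each 0, 1, or 2. A minimal sketch of that arithmetic follows, assuming scores are kept in a dictionary keyed by dimension; the names `DIMENSIONS` and `total_quality` are illustrative and not part of the checklist itself.

```python
# Dimension names mirror the rubric table; the identifiers are illustrative.
DIMENSIONS = (
    "specific", "measurable", "actionable", "resolution",
    "testable", "base_rate", "factored",
)


def total_quality(scores: dict) -> int:
    """Sum seven 0-2 dimension scores into a 0-14 quality total."""
    missing = set(DIMENSIONS) - set(scores)
    unexpected = set(scores) - set(DIMENSIONS)
    if missing or unexpected:
        raise ValueError(
            f"missing={sorted(missing)}, unexpected={sorted(unexpected)}"
        )
    for name, value in scores.items():
        if value not in (0, 1, 2):
            raise ValueError(f"{name} must be 0, 1, or 2, got {value!r}")
    return sum(scores.values())
```

Rejecting missing or out-of-range scores keeps a half-filled rubric from passing as a low but valid total.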

Scoring Guide

10–14 — Excellent — Track actively, assign conviction, position
7–9 — Good — Track, but note quality gaps
4–6 — Marginal — Improve specificity before tracking
0–3 — Poor — Don't track — reframe or discard
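The four bands above can be sketched as a simple threshold lookup. This is one possible encoding; the function name `quality_tier` and the tuple return shape are assumptions made for illustration.

```python
def quality_tier(total: int) -> tuple:
    """Map a 0-14 quality total to (label, recommended action)."""
    if not 0 <= total <= 14:
        raise ValueError(f"total must be between 0 and 14, got {total}")
    if total >= 10:
        return ("Excellent", "Track actively, assign conviction, position")
    if total >= 7:
        return ("Good", "Track, but note quality gaps")
    if total >= 4:
        return ("Marginal", "Improve specificity before tracking")
    return ("Poor", "Don't track; reframe or discard")
```

For example, `quality_tier(8)` returns the "Good" band, and `quality_tier(3)` returns the "Poor" band.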

Quality → Conviction Mapping

High-quality prediction is not the same as high-conviction prediction.

Quality: how well-formed is the prediction itself?
Conviction: how likely do you think it is to occur?

A prediction can score 14/14 on quality ("Bitcoin hits 200K USD by Dec 31, 2026") while you hold low conviction (1/5) that it will happen.

Quality 10–14 — full conviction range eligible (0–5)
Quality 7–9 — cap at 4/5 (quality uncertainty constrains the bet)
Quality 4–6 — cap at 3/5 (prediction itself is unclear)
Quality 0–3 — don't assign conviction at all
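The caps above act as a ceiling on conviction, never a floor: a high quality score permits high conviction but does not imply it. A minimal sketch under that reading follows; `max_conviction` and `capped_conviction` are hypothetical helper names, and returning `None` for the 0–3 band is one assumed way to represent "don't assign conviction at all".

```python
from typing import Optional


def max_conviction(quality: int) -> Optional[int]:
    """Return the highest conviction allowed for a quality total, or None."""
    if not 0 <= quality <= 14:
        raise ValueError(f"quality must be between 0 and 14, got {quality}")
    if quality >= 10:
        return 5
    if quality >= 7:
        return 4
    if quality >= 4:
        return 3
    return None  # quality 0-3: no conviction is assigned


def capped_conviction(quality: int, conviction: int) -> Optional[int]:
    """Clamp a 0-5 conviction to the cap implied by the quality total."""
    if not 0 <= conviction <= 5:
        raise ValueError(f"conviction must be between 0 and 5, got {conviction}")
    cap = max_conviction(quality)
    return None if cap is None else min(conviction, cap)
```

With these definitions, the Bitcoin example (quality 14, conviction 1) stays at 1, while a conviction of 5 on a quality-8 prediction is reduced to 4.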

Worked Example

Prediction: "AI solves at least one Clay Millennium Prize math problem in 2026"

Dimension	Score	Reasoning
Specific	2	Clear outcome (one of 7 named problems)
Measurable	2	Clay Institute verification process exists
Actionable	1	Indirect positioning implications
Resolution	2	"In 2026" = by Dec 31, 2026
Testable	2	No solution announced = falsified
Base rate	1	No prior AI math proof at this level; only weak analogies available
Factored	1	Depends on AI capability trajectory

Total: 11/14 — Excellent quality, worth tracking.

Conviction assignment: 3/5 (uncertain on timeline, confident on direction)
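The arithmetic in this example can be checked in a few lines. The scores are copied from the table above and the caps from the Quality → Conviction Mapping; the variable names are illustrative only.

```python
# Scores copied from the worked-example table; names are illustrative.
clay_scores = {
    "Specific": 2,
    "Measurable": 2,
    "Actionable": 1,
    "Resolution": 2,
    "Testable": 2,
    "Base rate": 1,
    "Factored": 1,
}
conviction = 3  # the 3/5 assigned in the example

total = sum(clay_scores.values())

# Conviction caps from the Quality -> Conviction Mapping section.
cap = 5 if total >= 10 else 4 if total >= 7 else 3 if total >= 4 else None
within_cap = cap is not None and conviction <= cap

print(f"Quality: {total}/14")   # Quality: 11/14
print(f"Within cap: {within_cap}")  # Within cap: True
```

A total of 11 falls in the 10–14 band, so the full conviction range is eligible and 3/5 is allowed.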

Common Quality Failures

Vague predictions (low Specificity)

"AI will transform business" → Better: "50% of Fortune 500 will have AI-native divisions by 2027"
"Crypto will go mainstream" → Better: "US spot Bitcoin ETFs exceed 100B USD AUM by Dec 2026"

Unfalsifiable predictions (low Testability)

"We're in the early innings of AI" → Better: "Frontier Math Tier 4 exceeds 40% by Dec 2026"
"The future belongs to builders" → Better: "Single-founder billion-dollar startup emerges by 2027"

Missing base rates (low Base rate)

"AGI by 2027" → Add: "Based on GPT-2 to GPT-4 capability doubling timeline of around 2 years"
"10x efficiency gains" → Add: "Manufacturing automation precedent: 8 to 12x over 20 years"

Hidden dependencies (low Factored)

"Level-5 autonomy deployed in 2026" → Add: "Depends on regulatory approval, liability framework, OEM adoption"

The Inversion Test

Before scoring, ask: What would make this prediction better?

If the answer includes any of these patterns, you have already named the quality gap:

"Be more specific" → Specificity problem
"Define success" → Measurability problem
"Pick a date" → Resolution problem
"Acknowledge what could prove it wrong" → Testability problem
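The four patterns above amount to a lookup from a suggested fix to the dimension it exposes. The sketch below uses naive substring matching purely for illustration; `FIX_TO_DIMENSION` and `quality_gaps` are hypothetical names, and real answers will rarely use these exact phrases.

```python
# Phrase-to-dimension pairs taken from the list above; matching is naive.
FIX_TO_DIMENSION = {
    "be more specific": "Specific",
    "define success": "Measurable",
    "pick a date": "Resolution",
    "acknowledge what could prove it wrong": "Testable",
}


def quality_gaps(answer: str) -> list:
    """Return the dimensions whose fix phrase appears in the answer."""
    text = answer.lower()
    return [dim for phrase, dim in FIX_TO_DIMENSION.items() if phrase in text]
```

For instance, `quality_gaps("Pick a date and define success")` flags the Measurable and Resolution dimensions.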

Using This Checklist

Before adding to the live forecast — score quality first
When reviewing others' predictions — apply checklist before forming conviction
When your conviction changes — check whether quality score also changed; new information may mean reframing the prediction, not just adjusting probability

Context

Prediction Process — The five questions for every prediction
Superforecasting — Build the discipline
Probability — Size bets correctly
Live Forecast — Track what matters

Questions

Which aspect of this topic compounds most over a 10-year horizon when practiced consistently versus ignored?

At what level of mastery does this topic shift from requiring deliberate effort to becoming an automatic advantage?
How does this topic change when the context shifts from individual practice to organizational culture?
Which assumption about this topic is most commonly held that, if examined, would change how you approach it?
