Prediction Evaluation
Is this prediction worth tracking?
Not all predictions deserve attention. This checklist separates signal from noise by scoring a prediction's quality before you invest time tracking it.
The SMART-BF Checklist
Seven dimensions, each scored 0–2, for a total of 0–14 points. The table is the scoring rubric itself: each cell describes what earns that score on that dimension.
| Dimension | Question | 0 | 1 | 2 |
|---|---|---|---|---|
| Specific | Is it precise and unambiguous? | Vague ("AI will change things") | Somewhat specific | Precise ("GDP-Val over 90% by Dec 2026") |
| Measurable | Can we objectively verify resolution? | No clear verification | Partially measurable | Binary yes/no with clear criteria |
| Actionable | Does it enable positioning decisions? | Entertainment only | Indirect implications | Direct action if true/false |
| Resolution | Is there a clear time horizon? | No timeframe | Vague ("soon", "eventually") | Specific date or trigger event |
| Testable | What would prove it wrong? | Unfalsifiable | Weak falsification criteria | Clear falsifying conditions |
| Base rate | Is there historical precedent? | No analogous history | Weak analogies | Strong base rate available |
| Factored | Does it depend on other predictions? | Many hidden dependencies | Some dependencies acknowledged | Independent or dependencies explicit |
Scoring Guide
- 10–14 — Excellent — Track actively, assign conviction, position
- 7–9 — Good — Track, but note quality gaps
- 4–6 — Marginal — Improve specificity before tracking
- 0–3 — Poor — Don't track — reframe or discard
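The rubric and scoring guide reduce to a sum and a threshold lookup. The following is a minimal sketch of that arithmetic, not a reference implementation; the names `DIMENSIONS`, `total_quality`, and `quality_band` are illustrative assumptions and do not come from the checklist itself.

```python
from typing import Dict

# The seven SMART-BF dimensions, in the order used by the rubric table.
DIMENSIONS = (
    "Specific", "Measurable", "Actionable", "Resolution",
    "Testable", "Base rate", "Factored",
)


def total_quality(scores: Dict[str, int]) -> int:
    """Sum the seven dimension scores after checking each is 0, 1, or 2."""
    missing = [name for name in DIMENSIONS if name not in scores]
    if missing:
        raise ValueError(f"missing dimensions: {missing}")
    for name in DIMENSIONS:
        if scores[name] not in (0, 1, 2):
            raise ValueError(f"{name} must be 0, 1, or 2, got {scores[name]}")
    return sum(scores[name] for name in DIMENSIONS)


def quality_band(total: int) -> str:
    """Map a 0-14 total to the band names used in the scoring guide."""
    if not 0 <= total <= 14:
        raise ValueError("total must be between 0 and 14")
    if total >= 10:
        return "Excellent"
    if total >= 7:
        return "Good"
    if total >= 4:
        return "Marginal"
    return "Poor"
```

Validating each score before summing catches a skipped or mistyped dimension, which would otherwise silently lower the total and push a prediction into the wrong band.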
Quality → Conviction Mapping
A high-quality prediction is not the same as a high-conviction prediction.
- Quality: how well-formed is the prediction itself?
- Conviction: how likely do you think it is to occur?
A prediction can score 14/14 on quality ("Bitcoin hits 200K USD by Dec 31, 2026") while you hold low conviction (1/5) that it will happen.
- Quality 10–14 — full conviction range eligible (0–5)
- Quality 7–9 — cap at 4/5 (quality uncertainty constrains the bet)
- Quality 4–6 — cap at 3/5 (prediction itself is unclear)
- Quality 0–3 — don't assign conviction at all
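The quality score acts as a ceiling on conviction. A minimal sketch of that cap follows, assuming conviction is an integer from 0 to 5 as in the list above; `conviction_cap` and `capped_conviction` are hypothetical helper names.

```python
from typing import Optional


def conviction_cap(quality_total: int) -> Optional[int]:
    """Return the highest conviction allowed for a quality total.

    Returns None when quality is too low to assign any conviction.
    """
    if not 0 <= quality_total <= 14:
        raise ValueError("quality_total must be between 0 and 14")
    if quality_total >= 10:
        return 5
    if quality_total >= 7:
        return 4
    if quality_total >= 4:
        return 3
    return None


def capped_conviction(quality_total: int, conviction: int) -> Optional[int]:
    """Clamp a proposed conviction (0-5) to the cap for this quality total."""
    if not 0 <= conviction <= 5:
        raise ValueError("conviction must be between 0 and 5")
    cap = conviction_cap(quality_total)
    if cap is None:
        return None  # quality 0-3: do not assign conviction at all
    return min(conviction, cap)
```

Returning `None` rather than 0 for the lowest band keeps "no conviction assigned" distinct from "assigned conviction of zero".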
Worked Example
Prediction: "AI solves at least one Clay Millennium Prize math problem in 2026"
| Dimension | Score | Reasoning |
|---|---|---|
| Specific | 2 | Clear outcome (one of 7 named problems) |
| Measurable | 2 | Clay Institute verification process exists |
| Actionable | 1 | Indirect positioning implications |
| Resolution | 2 | "In 2026" = by Dec 31, 2026 |
| Testable | 2 | No solution announced by Dec 31, 2026 = falsified |
| Base rate | 1 | No prior AI math proof at this level; weak analogies only |
| Factored | 1 | Depends on AI capability trajectory |
Total: 11/14 — Excellent quality, worth tracking.
Conviction assignment: 3/5 (uncertain on timeline, confident on direction)
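The arithmetic in the worked example can be checked in a few lines. This sketch simply re-enters the scores from the table; the variable names are illustrative.

```python
# Scores copied from the worked example table.
example_scores = {
    "Specific": 2,
    "Measurable": 2,
    "Actionable": 1,
    "Resolution": 2,
    "Testable": 2,
    "Base rate": 1,
    "Factored": 1,
}

total = sum(example_scores.values())
max_total = 2 * len(example_scores)  # seven dimensions, 2 points each
print(f"Total: {total}/{max_total}")  # Total: 11/14

# 11 falls in the 10-14 band, so any conviction from 0 to 5 is allowed.
conviction = 3
assert total >= 10 and 0 <= conviction <= 5
```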
Common Quality Failures
Vague predictions (low Specificity)
- "AI will transform business" → Better: "50% of Fortune 500 will have AI-native divisions by 2027"
- "Crypto will go mainstream" → Better: "US spot Bitcoin ETFs exceed 100B USD AUM by Dec 2026"
Unfalsifiable predictions (low Testability)
- "We're in the early innings of AI" → Better: "Frontier Math Tier 4 exceeds 40% by Dec 2026"
- "The future belongs to builders" → Better: "Single-founder billion-dollar startup emerges by 2027"
Missing base rates (low Base rate)
- "AGI by 2027" → Add: "Based on GPT-2 to GPT-4 capability doubling timeline of around 2 years"
- "10x efficiency gains" → Add: "Manufacturing automation precedent: 8 to 12x over 20 years"
Hidden dependencies (low Factored)
- "Level-5 autonomy deployed in 2026" → Add: "Depends on regulatory approval, liability framework, OEM adoption"
The Inversion Test
Before scoring, ask: What would make this prediction better?
If the answer includes any of these patterns, you have already named the quality gap:
- "Be more specific" → Specificity problem
- "Define success" → Measurability problem
- "Pick a date" → Resolution problem
- "Acknowledge what could prove it wrong" → Testability problem
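The four patterns above form a small lookup from answer phrase to rubric dimension. As a rough sketch, assuming simple case-insensitive substring matching is good enough for a first pass, the lookup could be written as follows; `INVERSION_PATTERNS` and `quality_gaps` are hypothetical names.

```python
from typing import List

# Hypothetical phrase-to-dimension lookup built from the four patterns above.
INVERSION_PATTERNS = {
    "be more specific": "Specific",
    "define success": "Measurable",
    "pick a date": "Resolution",
    "acknowledge what could prove it wrong": "Testable",
}


def quality_gaps(answer: str) -> List[str]:
    """Return the rubric dimensions flagged by an inversion-test answer."""
    text = answer.lower()
    return [dim for phrase, dim in INVERSION_PATTERNS.items() if phrase in text]
```

For instance, `quality_gaps("Pick a date and define success")` flags the Measurable and Resolution dimensions. Real answers will rarely use these exact phrases, so treat the matching as a prompt for manual review, not a classifier.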
Using This Checklist
- Before adding to the live forecast — score quality first
- When reviewing others' predictions — apply checklist before forming conviction
- When your conviction changes — check whether quality score also changed; new information may mean reframing the prediction, not just adjusting probability
Context
- Prediction Process — The five questions for every prediction
- Superforecasting — Build the discipline
- Probability — Size bets correctly
- Live Forecast — Track what matters
Questions
- Which aspect of this topic compounds most over a 10-year horizon when practiced consistently versus ignored?
- At what level of mastery does this topic shift from requiring deliberate effort to becoming an automatic advantage?
- How does this topic change when the context shifts from individual practice to organizational culture?
- Which assumption about this topic is most commonly held that, if examined, would change how you approach it?