
Software Decisions

Software's job is to help people solve problems that create real-world value.

Understand the job to be done to determine required functions and desired outcomes.

Happy customers pay the bills

Checklist

The quality of your decisions depends on the quality of your questions.

1. Problem

What problem are we solving?

  • What's the core job to be done?
  • What does success look like in 6 months? 2 years?
  • What type of system is this? (CRUD, real-time, batch, analytics, offline-first)

2. Constraints

What narrows the field?

  • Regulatory requirements? (PII, compliance, data residency)
  • Budget ceiling?
  • Timeline to production?
  • What must it integrate with?

3. Requirements

What outcomes define success?

  • Non-functional priorities ranked: latency, throughput, consistency, availability, security
  • Expected scale: users, requests/sec, data volume
  • Who owns the data? Who owns the intelligence?

4. Options

What exists?

  • What's the proven pattern for this class of problem?
  • Who else has solved this? What did they use?
  • What are the 2-3 viable candidates?

5. Team Fit

Can we execute?

  • Do we have the skills? Realistic learning curve?
  • How hard is hiring for this stack?
  • Community health: active releases, clear roadmap, low bus factor?
  • Tooling: IDE support, testing frameworks, CI/CD story?

6. Economics

Total cost of ownership?

  • Infrastructure + licenses + support at scale?
  • Development velocity impact?
  • Maintenance burden over time?
  • Lock-in risk? Exit cost if we need to move?

7. Risk

What happens when things go wrong?

  • Failure modes: partial outages, network splits, dependency failures?
  • Backup/restore, migrations, rollback patterns?
  • Security posture: auth, encryption, patch cadence?

8. Developer Experience

How easy is it to fall off the happy path?

  • Can you run tests without hitting external APIs?
  • Does CI pass on first clone? (env setup, service deps, auth tokens)
  • What fails first — and does it fail fast with a clear message?
  • Can you replicate prod config in staging without manual sync?
  • What's the cold-start time for a new developer to run the full test suite?

This is the gate most teams skip. A tool that scores well on features but breaks in CI costs more than one with fewer features that tests cleanly. See Identity Auth evaluation for a worked example.
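One way to make the first DX question testable: keep externally-dependent tests opt-in behind an environment flag, so the default suite is hermetic and passes on first clone. A minimal sketch using the standard library's `unittest`; the `RUN_EXTERNAL` flag name and the test names are invented for illustration.

```python
import os
import unittest

# Externally-dependent tests are opt-in: the default run stays hermetic.
# RUN_EXTERNAL is a hypothetical flag name chosen for this sketch.
external = unittest.skipUnless(
    os.environ.get("RUN_EXTERNAL") == "1",
    "set RUN_EXTERNAL=1 to run tests that hit external APIs",
)

class TestCore(unittest.TestCase):
    def test_local_logic(self):
        # Runs everywhere, including CI on first clone.
        self.assertEqual(sorted([3, 1, 2]), [1, 2, 3])

    @external
    def test_identity_provider_roundtrip(self):
        # Only runs when explicitly requested; a real external call goes here.
        self.skipTest("placeholder for a real external call")
```

The same pattern works with pytest markers; the point is that a fresh clone never needs credentials or network access to go green.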

9. Validation

Score before you commit.

  • Can we spike the core flow in a day?
  • How painful is migration if we're wrong in 12 months?
  • Write the decision record: options, trade-offs, why this choice now
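The scoring step can be as simple as a weighted matrix over the gates above. A minimal sketch; the candidate names, gate weights, and scores below are invented for illustration, not recommendations.

```python
# Weighted decision matrix: checklist gates, scored 1-5 per candidate.
# Weights and scores are illustrative placeholders.
WEIGHTS = {"team_fit": 3, "economics": 2, "risk": 2, "dev_experience": 3}

candidates = {
    "option_a": {"team_fit": 4, "economics": 3, "risk": 4, "dev_experience": 2},
    "option_b": {"team_fit": 3, "economics": 4, "risk": 3, "dev_experience": 5},
}

def score(scores: dict) -> int:
    # Sum of gate score x gate weight.
    return sum(WEIGHTS[gate] * s for gate, s in scores.items())

ranked = sorted(candidates, key=lambda c: score(candidates[c]), reverse=True)
```

The output of this exercise feeds the decision record directly: the matrix is the "options, trade-offs" half, and a sentence on the winning row is the "why this choice now" half.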

LLM Selection

Which model for which job? Apply situational wisdom — state of task + state of constraints = model choice.

State of Task

| Cognitive Demand | Examples | What You Need |
| --- | --- | --- |
| Deep reasoning | Architecture, complex refactors, multi-file changes | Highest capability, thinking tokens |
| Broad analysis | Repo-wide audits, legacy migration, pattern search | Massive context window |
| Feature building | New endpoints, components, test suites | Balanced reasoning + speed |
| Quick fixes | Typos, simple generation, formatting | Speed and cost efficiency |
| Verification | Code review, second opinion, fact-checking | Different model family (avoid marking own homework) |

State of Constraints

| Constraint | Favours | Avoids |
| --- | --- | --- |
| Budget-limited | Free tiers (Gemini), cheap models (Haiku, Flash) | Opus for commodity tasks |
| Time-pressured | Fast models, parallel agents | Sequential deep reasoning |
| Context-heavy | Large context windows (Gemini 1M+) | Small-window models on big repos |
| Accuracy-critical | Best available + verification agent | Fast models without review |
| Exploratory | Cheap models with high throughput | Expensive models for brainstorming |

Model Decision Matrix

| Task | Primary | Why | Fallback |
| --- | --- | --- | --- |
| Architecture, complex planning | Claude Opus | Deepest reasoning, thinking tokens | o3 for mathematical/logical problems |
| Feature implementation | Claude Sonnet | Balanced speed/quality, tool integration | GPT-4o |
| Repo-wide analysis | Gemini Pro | 1M+ token context, free tier | Claude with chunked context |
| Quick fixes, batch tasks | Claude Haiku / Gemini Flash | Speed, cost | Sonnet if quality insufficient |
| Code review | Different family than writer | Avoid confirmation bias | If Claude wrote it, review with Gemini or GPT |
| Content generation | Claude (any tier) + voice agents | Best at following complex voice instructions | |
| UI prototyping | v0.dev / Cursor | Visual feedback loop, component focus | Claude with frontend-design skill |

The Anti-Pattern

Using the best model for everything. Opus for a typo fix wastes money. Haiku for architecture wastes time. Match capability to demand.

TASK COMPLEXITY × REVERSIBILITY = MODEL TIER

High complexity + irreversible → Best available (Opus, o3)
High complexity + reversible → Mid-tier (Sonnet, GPT-4o)
Low complexity + any → Fast/cheap (Haiku, Flash)
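The rule above reduces to a two-input lookup. A sketch of it as a function; the tier labels ("best", "mid", "fast") are names chosen for this sketch, while the model examples in the comments come from the rule itself.

```python
def model_tier(complexity: str, reversible: bool) -> str:
    """Map task complexity x reversibility to a model tier.

    complexity is "high" or "low"; tiers mirror the rule above.
    """
    if complexity == "high" and not reversible:
        return "best"  # e.g. Opus, o3
    if complexity == "high":
        return "mid"   # e.g. Sonnet, GPT-4o
    return "fast"      # e.g. Haiku, Flash
```

Note the asymmetry: reversibility only matters at high complexity, because a cheap mistake on a low-complexity task is cheap to redo either way.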

Cross-Model Verification

The strongest pattern: write with one model family, verify with another. Models share training biases within families. A Claude review of Claude code catches fewer issues than a Gemini review of Claude code.
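The pattern is model-agnostic: writer and reviewer are just two calls into different families. A sketch with hypothetical text-in, text-out callables standing in for real API clients; no specific SDK or endpoint is assumed.

```python
from typing import Callable

# A "model" here is any text-in, text-out callable; in practice each would
# wrap a different provider's API. The stubs below are hypothetical.
Model = Callable[[str], str]

def write_then_verify(task: str, writer: Model, reviewer: Model) -> dict:
    # The reviewer should come from a different family than the writer,
    # so shared training biases don't mark their own homework.
    draft = writer(task)
    review = reviewer(f"Review this code for defects:\n\n{draft}")
    return {"draft": draft, "review": review}

# Stub models for demonstration only.
claude_stub: Model = lambda prompt: f"[claude] {prompt[:20]}"
gemini_stub: Model = lambda prompt: f"[gemini] {prompt[:20]}"

result = write_then_verify("add retry logic", claude_stub, gemini_stub)
```

Swapping families is a one-argument change, which is what makes the pattern cheap enough to apply by default.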

See Config Architecture for how we configure multiple agents on the same codebase without duplicating context.



Questions

How do you make a tech decision that's still right in 12 months?

  • When does the workaround count cross the threshold from "live with it" to "migrate"?
  • What's the difference between a reversible technology choice and an irreversible one — and how should the decision process differ?
  • If the DX validation gate (step 8) catches most real-world pain, why do teams skip it?