Code That Lasts
What separates code that compounds from code that corrodes?
Not talent. Not tools. Discipline engineered into the process so that good patterns are the path of least resistance and bad patterns are actively difficult.
The system must make bad code harder to write than good code. When a team achieves this, quality stops depending on memory and starts depending on structure.
Pre-Coding Card
Before a single line is written, five questions must have answers. Not rough answers. Specific, falsifiable, written answers.
| Question | What it forces |
|---|---|
| What problem are you actually solving? | Eliminates solutions to the wrong problem |
| What already exists that handles this? | Stops reinvention, reveals real gaps |
| What does done look like — specifically? | Makes acceptance testable, not subjective |
| What do you not yet know? | Surfaces the unknowns before they become surprises |
| What does failure look like? | Makes the blast radius visible before the fuse is lit |
A team that skips this card ships fast and reworks often. A team that completes it ships slower at first and accelerates from there.
Types First
Invalid states should be unrepresentable. This is not an ideal. It is a discipline.
When you model a domain in types before writing any logic, two things happen. First, you discover every edge case the type system forces you to handle. Second, bugs that would have appeared at runtime appear at compile time, where they are cheap.
The contract comes before the implementation:
```typescript
// Wrong order: implement, then figure out the types
function process(data: any): any { ... }

// Right order: define the contract, then implement
type ValidOrder = { id: OrderId; items: NonEmptyArray<LineItem>; status: "pending" }
type ProcessResult =
  | { success: true; orderId: OrderId }
  | { success: false; reason: ProcessError }
function process(order: ValidOrder): ProcessResult { ... }
```
The second version cannot be called with invalid data. The first can be called with anything and will fail at runtime in production.
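The contract above leans on helper types that TypeScript does not provide out of the box. A minimal sketch of how `OrderId` and `NonEmptyArray` might be defined, assuming a branded-type convention — the names and the id format are illustrative:

```typescript
// Branded type: an OrderId is a string at runtime, but the compiler
// refuses a plain string wherever an OrderId is required.
type OrderId = string & { readonly __brand: "OrderId" }

// The only way to obtain an OrderId is through a parser that checks it,
// so every OrderId in the system is valid by construction.
function parseOrderId(raw: string): OrderId | null {
  return /^ord_[a-z0-9]+$/.test(raw) ? (raw as OrderId) : null
}

// A NonEmptyArray provably has a first element: the "empty list"
// edge case cannot reach the domain.
type NonEmptyArray<T> = [T, ...T[]]

function parseNonEmpty<T>(items: T[]): NonEmptyArray<T> | null {
  return items.length > 0 ? (items as NonEmptyArray<T>) : null
}
```

This is the "parse, don't validate" move: each check runs once at the boundary, and the types carry the proof everywhere else.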
Rules:
- Define input type, output type, and invalid states before writing any function body
- Use discriminated unions for error handling — no throwing strings
- Never use `any` in domain or application layers
Pure Core
The domain of a system should be pure functions. No database calls. No HTTP requests. No filesystem access. No side effects of any kind.
Pure functions are trivially testable. They take input and return output. They can be called a thousand times with the same arguments and produce the same result. They compose without surprising you.
Impure operations belong in a shell around the pure core. The shell handles I/O. The core handles logic. They meet at a boundary — an explicit port — and that boundary is where you write most of your tests.
```
┌─────────────────────────────────────────┐
│  SHELL (impure)                         │
│  Database, HTTP, File System, Timers    │
│                                         │
│    ┌────────────────────────────────┐   │
│    │  CORE (pure)                   │   │
│    │  Business rules, calculations, │   │
│    │  transformations, validation   │   │
│    └────────────────────────────────┘   │
│                                         │
│  Port: explicit contract between them   │
└─────────────────────────────────────────┘
```
When your domain leaks I/O, every test needs a database. When I/O leaks into your domain, the business rules are untestable in isolation. Neither failure is obvious until the test suite takes 20 minutes to run.
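A minimal sketch of the shape, assuming a hypothetical `AccountStore` port; the deposit rule stands in for real business logic:

```typescript
// CORE: pure business rule. No database, no clock, no network.
type Account = { id: string; balance: number }
type DepositResult =
  | { ok: true; account: Account }
  | { ok: false; reason: "non_positive_amount" }

function applyDeposit(account: Account, amount: number): DepositResult {
  if (amount <= 0) return { ok: false, reason: "non_positive_amount" }
  return { ok: true, account: { ...account, balance: account.balance + amount } }
}

// SHELL: impure. It loads state, asks the core to decide, persists the result.
interface AccountStore {
  load(id: string): Promise<Account>
  save(account: Account): Promise<void>
}

async function deposit(store: AccountStore, id: string, amount: number): Promise<DepositResult> {
  const account = await store.load(id)          // I/O at the edge
  const result = applyDeposit(account, amount)  // pure decision in the core
  if (result.ok) await store.save(result.account)
  return result
}
```

The only test that needs infrastructure is the adapter test for a real `AccountStore`; everything interesting lives in `applyDeposit` and runs in memory.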
Ports and Adapters
Dependencies point inward. The domain knows nothing about the infrastructure that serves it. This is the hexagonal rule.
```
Infrastructure (databases, APIs, UIs)
                │
                ▼ implements
Port (interface defined by domain)
                │
                ▼ calls
Domain (pure business logic)
```
An adapter translates between the external world and the port contract. Swap the database? Write a new adapter. The domain does not change. The tests do not change. Only the adapter changes.
When the domain calls infrastructure directly, you have inverted the dependency. The domain now needs the infrastructure to exist before it can be tested. A change to the database schema breaks the domain tests. This is the failure mode that causes codebases to become untestable over time.
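As a sketch, with an illustrative `CustomerRepository` port: the domain defines the interface, adapters implement it, and swapping adapters never touches domain code:

```typescript
// Port: an interface the DOMAIN defines. It names what the domain needs,
// in domain language, with no database vocabulary in it.
interface CustomerRepository {
  findEmail(customerId: string): string | undefined
}

// Domain logic depends only on the port.
function greetingFor(repo: CustomerRepository, customerId: string): string {
  const email = repo.findEmail(customerId)
  return email ? `Hello, ${email}` : "Hello, guest"
}

// Adapter: translates between one technology and the port contract.
// This one is in-memory; a Postgres adapter would implement the same
// interface, and the domain would never know the difference.
class InMemoryCustomers implements CustomerRepository {
  constructor(private readonly emails: Record<string, string>) {}
  findEmail(id: string): string | undefined {
    return this.emails[id]
  }
}
```

A database-backed class implementing the same interface drops in without a single change to `greetingFor` or its tests.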
Pipelines
Functions that do one thing compose. Functions that do many things break.
Build pipelines — sequences of single-purpose transformations — rather than god-functions that know everything and do everything.
```typescript
// God function: hard to test, hard to change, hard to reason about
async function handleOrder(orderId: string) {
  const order = await db.orders.find(orderId)
  const customer = await db.customers.find(order.customerId)
  const discount = calculateDiscount(customer, order)
  const total = applyDiscount(order.total, discount)
  await db.orders.update(orderId, { total, discountApplied: discount })
  await email.send(customer.email, buildReceipt(order, total))
  await analytics.track("order_processed", { orderId, total })
}

// Pipeline: each step is a pure function, testable independently
const processOrder = pipe(
  validateOrder,
  calculateDiscount,
  applyDiscount,
  buildReceipt
)
// I/O at the edges only
```
Every pipeline step is testable in isolation. The composition is testable with stubs at the I/O points. The god function requires the database, the email service, and the analytics platform to run even one test.
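The `pipe` helper above usually comes from a library (fp-ts, Ramda, and others ship one). A dependency-free sketch of the idea, threading one illustrative `Order` shape through each step:

```typescript
// A minimal pipe: feed each function's output into the next.
// Library versions use typed overloads so the shape can change between
// steps; this sketch keeps a single shape for brevity.
function pipe<T>(...steps: Array<(input: T) => T>): (input: T) => T {
  return (input) => steps.reduce((acc, step) => step(acc), input)
}

// Each step is a single-purpose pure function.
type Order = { total: number; discount: number }

const calculateDiscount = (o: Order): Order => ({ ...o, discount: o.total * 0.1 })
const applyDiscount = (o: Order): Order => ({ ...o, total: o.total - o.discount })

const priceOrder = pipe(calculateDiscount, applyDiscount)
```

Each step is testable alone; the composed `priceOrder` is testable with plain values, no stubs required.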
Composition
Inheritance couples. Composition decouples.
When a child class inherits from a parent, every change to the parent risks breaking every child. The coupling is invisible until it breaks. The blast radius is impossible to predict.
When you compose behaviours from small, focused functions and types, change is local. Adding a behaviour means adding a function. Removing a behaviour means removing a function. The rest of the system does not notice.
Prefer:
- Functions over classes where state is not needed
- Composition of small functions over inheritance hierarchies
- Explicit data transformations over shared mutable state
- Interfaces over base classes
The discipline is not about ideology. It is about the test suite staying fast and the blast radius of changes staying small.
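A sketch of the difference, using an illustrative message-formatting example rather than anything from the text:

```typescript
// Instead of a Formatter base class with Shouted and Tagged subclasses,
// each behaviour is a small function that wraps another.
type Format = (message: string) => string

const plain: Format = (msg) => msg
const shouted = (next: Format): Format => (msg) => next(msg).toUpperCase()
const tagged = (tag: string, next: Format): Format => (msg) => `[${tag}] ${next(msg)}`

// Adding a behaviour means adding a function; removing one means
// deleting a call. No hierarchy to refactor, no invisible coupling.
const alertFormat = tagged("ALERT", shouted(plain))
```

Changing `shouted` cannot break `tagged`, because neither knows the other exists; they only share the `Format` contract.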
Testing the Contract
Tests are the living documentation of what the contract is. They are not evidence that the code runs — they are evidence that it does what it claims.
Write tests first. Not as ceremony. As the act of thinking through what the function should do before you decide how to implement it.
| Test tier | What it proves | When to use |
|---|---|---|
| Unit | Pure function contracts, type correctness, edge cases | Domain and application layers |
| Integration | Port adapters talk correctly to infrastructure | Infrastructure layer |
| End-to-end | Full journey from user intent to observable outcome | Critical paths only |
Rules:
- Unit tests: in-memory, no I/O, run in under 1ms each
- Integration tests: hit real infrastructure (not mocks), prove the adapter works
- End-to-end tests: prove the system works together; expensive, so minimal
- Property-based tests: generate inputs automatically to find the edge cases you did not think of
Mock at ports, not inside implementations. A test that mocks a database inside the domain is testing the mock, not the domain.
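As a sketch of the property-based idea without a library dependency; real suites would use something like fast-check, which adds input shrinking and reproducible seeds (the discount function is illustrative):

```typescript
// Invariant under test: a discounted total is never negative.
function applyDiscount(total: number, discount: number): number {
  return Math.max(0, total - discount)
}

// Hand-rolled property check: many random inputs, one invariant.
function holdsForAll(runs: number, property: () => boolean): boolean {
  for (let i = 0; i < runs; i++) {
    if (!property()) return false
  }
  return true
}

const neverNegative = holdsForAll(1000, () => {
  const total = Math.random() * 1_000
  const discount = Math.random() * 2_000 // often larger than the total
  return applyDiscount(total, discount) >= 0
})
```

A single example test checks one point; a property check sweeps the input space and finds the edge cases you did not think to write down.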
Evidence Only
Declare done with evidence, not assertions.
"It works" is not evidence. "10 test files pass, covering stories S1–S4, with the E2E test running against the production database schema" is evidence.
Every completed piece of work should produce a named artifact that proves the contract was met. No artifact, no done.
| Claim | Evidence required |
|---|---|
| "Function handles errors correctly" | Unit test with every failure mode |
| "API contract is stable" | Integration test against real endpoint |
| "Performance target met" | Benchmark result with baseline comparison |
| "Scope complete" | Receipts referencing story IDs |
When the team defaults to assertions over evidence, drift accumulates invisibly. A quarterly audit finds that 40% of "done" items are partial, stale, or never validated. The solution is not more audits — it is a culture of evidence at every completion.
Context
- Type-First Development — Types before logic: the discipline in practice
- Hexagonal Architecture — Ports, adapters, and dependency inversion
- Engineering Quality Benchmarks — Thresholds that define "healthy"
- Engineering Anti-Patterns — What the failure modes look like
- Flow Engineering — The workflow that sustains quality under speed
- Standards Index — The full standards library
Links
- Functional Core, Imperative Shell — Gary Bernhardt's canonical explanation
- Parse, don't validate — Alexis King on types-first correctness
Questions
- What would your codebase look like if bad code were structurally harder to write than good code?
- Which of these disciplines would catch the most regressions in your current system?
- If you drew the boundary between your pure core and your impure shell, where would it be — and how much logic currently lives on the wrong side?
- What does "done" mean in your team right now, and what evidence do you require before using that word?
- Where in your test suite do you mock infrastructure instead of testing it — and what real failures does that hide?