Testing Tech
How do you prove the product delivers what you promised?
The Testing Trophy defines the strategy. These are the tools. Each layer proves something different: contracts at the bottom, value at the top.
L4 Commissioning → Agent-Browser (on-demand, production, proves VALUE)
L3 E2E → Playwright (CI-gated, test DB, proves FUNCTION)
L2 Integration → Vitest / Jest + real DB (proves WIRING)
L1 Unit → Vitest / Jest (proves LOGIC)
L0 Static → tsc --noEmit (proves CONTRACTS)
Two E2E Layers
L3 and L4 are both E2E, but they prove different things.
| Dimension | L3 (Playwright) | L4 (Agent-Browser) |
|---|---|---|
| Proves | Code works — functions execute correctly | Value delivered — customer job gets done |
| Runs against | Test DB (port 5433), localhost | Production (dreamineering.com), real data |
| Triggered by | CI on every PR | Dream team on demand (/prd-commissioning) |
| Owned by | Engineering | Dream team (commissioner is never the builder) |
| Speed | 30-120s per test | Minutes per route (interactive) |
| Evidence | GREEN/RED in CI | Screenshots, Story Contract verification |
| Maps to | SPEC-MAP Test Status column | SPEC-MAP L-Level + Last Verified columns |
L3 proves the plumbing works. L4 proves the customer gets what was promised. Both are necessary. Neither substitutes for the other.
Selection Guide
| Scenario | Tool | Why |
|---|---|---|
| PR merge gate | Playwright (L3) | Fast, repeatable, catches regressions |
| Story Contract verification | Agent-Browser (L4) | Follows user journey against real data |
| Route existence check | Agent-Browser (L4) | Navigate + screenshot in seconds |
| Form wiring proof | Playwright (L3) | Needs controlled test data |
| Happy path golden journey | Both | L3 proves it works, L4 proves it matters |
Agent-Browser Commands
# Navigate and screenshot
agent-browser open https://dreamineering.com/crm/contacts
agent-browser wait --load networkidle
agent-browser screenshot contacts.png
# Interactive flow
agent-browser snapshot -i # Get element refs
agent-browser fill @e1 "Acme" # Search
agent-browser click @e3 # Click result
agent-browser screenshot detail.png
# Auth flow
agent-browser open https://dreamineering.com/sign-in
agent-browser snapshot -i
agent-browser fill @e3 "$EMAIL"
agent-browser fill @e5 "$PASSWORD"
agent-browser click @e6
agent-browser wait --url "**/dashboard"
Tool Selection
| Layer | Primary Tool | Proves | Nx Target |
|---|---|---|---|
| L0 Static | TypeScript compiler | Contracts match | typecheck |
| L1 Unit | Vitest / Jest | Logic correct | test-schema |
| L2 Integration | Vitest / Jest + real DB | Wiring correct | test-integration |
| L2 Browser | Vitest Browser Mode | Client rendering correct | test-integration |
| L2 Mocking | MSW | Shared network interception | (used within L1/L2) |
| L3 E2E | Playwright | Function proven (CI) | e2e |
| L4 Commission | Agent-Browser | Value delivered (production) | manual |
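The Nx Target column can be read as a lookup from layer to command. The following is a minimal sketch, assuming targets are invoked with the standard `nx run <project>:<target>` form. The helper `nxCommand` and the project name `my-app` are hypothetical and only illustrate the mapping.

```typescript
// Nx targets copied from the table above; null marks a row with no Nx target.
const nxTargets: Record<string, string | null> = {
  "L0 Static": "typecheck",
  "L1 Unit": "test-schema",
  "L2 Integration": "test-integration",
  "L2 Browser": "test-integration",
  "L3 E2E": "e2e",
  "L4 Commission": null, // manual: run through agent-browser instead
};

// Build an `nx run <project>:<target>` command, or return null for manual layers.
function nxCommand(project: string, layer: string): string | null {
  const target = nxTargets[layer];
  return target ? `nx run ${project}:${target}` : null;
}

// "my-app" is a placeholder project name.
console.log(nxCommand("my-app", "L2 Integration"));
```

Returning null for L4 keeps the manual layer visible in the lookup instead of silently dropping it.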
Story Contract Flow
Every Story Contract row generates tests at multiple layers. The SPEC-MAP traces the chain.
Story Contract S1: "User searches contacts, finds match in <5s"
│
├── L1: contactSearchSchema.test.ts → Schema validates search input
├── L2: searchContacts.integration.ts → Server action returns correct results
├── L3: crm-contacts.spec.ts → Browser renders list, search filters
└── L4: agent-browser /crm/contacts → 29 contacts render, search works on production
Each branch in this tree becomes a row in the SPEC-MAP. The SPEC-MAP format requires Story#, WHEN/THEN, Test Layer, and Test Status columns — see the handoff protocol for the full schema and conversion rules.
The FORBIDDEN column drives safety tests at the same layer. If S1 says "Contact detail shows contacts from another org" is forbidden, the L2 test must prove multi-tenant isolation — not just that search returns results.
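The shape of such a safety test can be sketched as follows. This is an illustration only: the `Contact` shape, the `searchContacts` signature, and the in-memory array standing in for the test database are all assumptions, since the real server action is not shown here. The key point is the seed data: it must contain a record from another org that matches the same search term, otherwise the test cannot detect a leak.

```typescript
// Hypothetical shapes: the real schema and server action are not shown here.
interface Contact {
  id: string;
  orgId: string;
  displayName: string;
}

// In-memory stand-in for seeded test data. A real L2 test would seed the test DB.
// c2 belongs to another org but matches the same search term as c1.
const seeded: Contact[] = [
  { id: "c1", orgId: "org-a", displayName: "Acme Ltd" },
  { id: "c2", orgId: "org-b", displayName: "Acme Holdings" },
  { id: "c3", orgId: "org-a", displayName: "Globex" },
];

// Hypothetical server action: scope by org first, then apply the search term.
function searchContacts(orgId: string, query: string): Contact[] {
  const term = query.trim().toLowerCase();
  return seeded.filter(
    (c) => c.orgId === orgId && c.displayName.toLowerCase().includes(term),
  );
}

// FORBIDDEN check: any result owned by a different org is a leak.
function leakedContacts(orgId: string, results: Contact[]): Contact[] {
  return results.filter((c) => c.orgId !== orgId);
}

const found = searchContacts("org-a", "acme");
console.log(found.map((c) => c.id), leakedContacts("org-a", found).length);
```

The positive assertion (the match is returned) and the safety assertion (nothing leaks) sit in the same test at the same layer, which is what the FORBIDDEN column asks for.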
Story Contract Mapping
Every Story Contract row in a PRD maps to a trophy layer through the SPEC-MAP. Engineering fills the Test Layer column at the spec-to-tests bookend.
| Story Row Pattern | Test Layer | File Convention | Why This Layer |
|---|---|---|---|
| Schema validates input shape | L1 | *.schema.spec.ts | Pure logic, no DB needed |
| Server action returns correct data | L2 | *.integration.spec.ts | Proves wiring through real DB |
| Server action enforces multi-tenant isolation | L2 | *.integration.spec.ts | Safety test — must hit real data |
| Browser renders list from server action | L3 | *.spec.ts (Playwright) | Only if L2 passes and browser wiring is the unknown |
| Customer completes job on production | L4 | manual (agent-browser) | Value verification, not code verification |
The selection rule: Start at L1. Move up only when the layer below cannot prove the assertion. If a Story Contract THEN clause references data correctness, that's L2. If it references what the user sees, check whether an L2 test on the server action covers it first — most "user sees X" stories are L2 tests wearing L3 clothes.
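The selection rule can be sketched as a walk up a ladder of layers. This is a minimal illustration, assuming each THEN clause can be summarized by two flags. The `Assertion` shape and the `selectLayer` helper are hypothetical names introduced for the example.

```typescript
type Layer = "L1" | "L2" | "L3";

// Hypothetical summary of what a THEN clause needs in order to be proven.
interface Assertion {
  needsRealDb: boolean; // data correctness through a real database
  needsRealDom: boolean; // rendering, multi-step forms, navigation
}

// Layers in ascending order, each with the assertions it is able to prove.
const ladder: { layer: Layer; canProve: (a: Assertion) => boolean }[] = [
  { layer: "L1", canProve: (a) => !a.needsRealDb && !a.needsRealDom },
  { layer: "L2", canProve: (a) => !a.needsRealDom },
  { layer: "L3", canProve: () => true },
];

// Start at L1 and move up only when the layer below cannot prove the assertion.
function selectLayer(a: Assertion): Layer {
  for (const rung of ladder) {
    if (rung.canProve(a)) return rung.layer;
  }
  return "L3";
}

// "User sees the filtered contact list": the unknown is the data, not the DOM.
console.log(selectLayer({ needsRealDb: true, needsRealDom: false }));
```

A "user sees X" story whose only unknown is data correctness resolves to L2 here, which matches the point that many such stories do not need a browser.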
SPEC-MAP enforcement: No empty Test Layer cells at L3+. If a story row has no test layer assigned, it's a spec-bounce — engineering flags it before building.
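A check for this rule might look like the sketch below. It takes the stricter reading and flags every row with an empty Test Layer cell; the "L3+" qualifier is not modeled. The `StoryRow` shape and the `specBounces` helper are illustrative names, not an existing tool.

```typescript
// Minimal row shape for this check; field names are illustrative.
interface StoryRow {
  story: string;
  testLayer: string; // "" models an empty Test Layer cell
}

// Return the stories that should bounce back to the spec before building starts.
function specBounces(rows: StoryRow[]): string[] {
  return rows
    .filter((row) => row.testLayer.trim() === "")
    .map((row) => row.story);
}

const bounced = specBounces([
  { story: "S1", testLayer: "L2" },
  { story: "S2", testLayer: "" },
]);
console.log(bounced);
```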
SPEC-MAP Conversion Protocol
Existing SPEC-MAPs may use the old | Tier | Feature | Spec | Status | format. Convert to the new trophy-aware format using this 4-step protocol.
Inventory L2 Specs
# Find all integration specs for a domain
ls libs/app-server/app-drmg-sales-server/src/actions/*<domain>*.integration.spec.ts
These specs already prove data correctness (CRUD, search, validation) at L2 — they're invisible to old-format SPEC-MAPs that only list L3 Playwright specs.
Apply Selection Rule
For each old SPEC-MAP row, ask: "Does an L2 integration spec already prove this feature?"
| Old Row Pattern | New Test Layer | Why |
|---|---|---|
| Feature is data CRUD (create, read, update, delete) | L2 | Integration spec hits real DB, proves wiring |
| Feature is search/filter | L2 | Server action returns correct results |
| Feature is "page renders with data" | L3 (keep) | Browser rendering is the unknown |
| Feature is "form interaction" | L3 (keep) | Multi-step UI flow needs browser |
| Feature is schema validation | L1 | Pure transform, no DB or browser needed |
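The table above is effectively a lookup from feature kind to test layer. The sketch below encodes it directly. The `FeatureKind` identifiers are illustrative labels for the five row patterns, and deciding which kind an old row belongs to is still a human judgment.

```typescript
type TestLayer = "L1" | "L2" | "L3";

// Feature kinds taken from the table above; the identifiers are illustrative.
type FeatureKind =
  | "crud"
  | "search"
  | "page-render"
  | "form-interaction"
  | "schema-validation";

// New Test Layer for each old row pattern.
const newLayer: Record<FeatureKind, TestLayer> = {
  "crud": "L2", // integration spec hits the real DB
  "search": "L2", // server action returns correct results
  "page-render": "L3", // browser rendering is the unknown
  "form-interaction": "L3", // multi-step UI flow needs a browser
  "schema-validation": "L1", // pure transform
};

function layerFor(kind: FeatureKind): TestLayer {
  return newLayer[kind];
}

// Example features with a hand-assigned kind.
const oldFeatures: { feature: string; kind: FeatureKind }[] = [
  { feature: "Contact search", kind: "search" },
  { feature: "Contacts page renders", kind: "page-render" },
];

for (const f of oldFeatures) {
  console.log(`${f.feature} -> ${layerFor(f.kind)}`);
}
```

Typing the lookup as `Record<FeatureKind, TestLayer>` means the compiler reports any row pattern that has no layer assigned.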
Reclassify Duplicates
Many stories have BOTH an L2 and L3 spec. Keep both rows — they prove different things:
- L2 row proves the data is correct (server action → DB → response)
- L3 row proves the browser renders it (DataTable loads, form submits)
A story that has only L3 coverage, and whose L3 test checks data values rather than browser rendering, should be pushed down to L2.
Browser-Only Proof
L3 stays when the assertion requires a real DOM:
- DataTable renders with search, sort, pagination
- Form multi-step interaction (fill → validate → submit → redirect)
- Navigation flow (breadcrumbs, back links, mobile nav)
- Responsive layout at breakpoints
- Keyboard navigation and accessibility
New Format Schema
See PRD Handoff Protocol — SPEC-MAP for the full column schema:
| Story# | WHEN/THEN | Test Layer | Test File | Test Status | L-Level | Last Verified |
Each tree branch from the Story Contract Flow above becomes one SPEC-MAP row. Engineering fills Test Layer (L1/L2/L3) and Test File. Dream fills Story#, L-Level, and Last Verified during commissioning.
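The seven columns can be modeled as a typed row. This is a sketch under assumptions: the column order follows the schema line above, but the field names, the plain string value types, and the `toMarkdownRow` helper are hypothetical. The authoritative schema is the one in the handoff protocol.

```typescript
type TestLayer = "L1" | "L2" | "L3";

// One SPEC-MAP row. Comments note who fills each column, per the text above.
interface SpecMapRow {
  story: string; // Story#: Dream
  whenThen: string; // WHEN/THEN
  testLayer: TestLayer | ""; // Engineering
  testFile: string; // Engineering
  testStatus: string; // for example "UNMAPPED"
  lLevel: string; // Dream, during commissioning
  lastVerified: string; // Dream, during commissioning
}

// Render one row in the pipe-delimited column order of the schema.
function toMarkdownRow(row: SpecMapRow): string {
  const cells = [
    row.story,
    row.whenThen,
    row.testLayer,
    row.testFile,
    row.testStatus,
    row.lLevel,
    row.lastVerified,
  ];
  return `| ${cells.join(" | ")} |`;
}

// The L2 branch of Story Contract S1, before commissioning fills the last two cells.
const s1Integration: SpecMapRow = {
  story: "S1",
  whenThen: "WHEN user searches contacts THEN finds match in <5s",
  testLayer: "L2",
  testFile: "searchContacts.integration.ts",
  testStatus: "UNMAPPED",
  lLevel: "",
  lastVerified: "",
};

console.log(toMarkdownRow(s1Integration));
```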
CRM Example
Old format (3 rows, all L3):
| Tier | Feature | Spec | Status |
|---|---|---|---|
| — | Contact list + search | crm-contacts.authenticated.spec.ts | UNMAPPED |
| — | Contact CRUD | crm-contacts-crud.authenticated.spec.ts | UNMAPPED |
| — | Contact edit flow | crm-contact-edit.authenticated.spec.ts | UNMAPPED |
New format (8 rows, 5 L2 + 3 L3):
| Story# | WHEN/THEN | Test Layer | Test File | Test Status |
|---|---|---|---|---|
| — | WHEN user creates contact ... THEN persists | L2 | crm-contacts.actions.integration.spec.ts | UNMAPPED |
| — | WHEN user retrieves contact ... THEN returned | L2 | crm-contacts.actions.integration.spec.ts | UNMAPPED |
| — | WHEN user updates contact ... THEN persists | L2 | crm-contacts.actions.integration.spec.ts | UNMAPPED |
| — | WHEN user deletes contact ... THEN soft-delete | L2 | crm-contacts.actions.integration.spec.ts | UNMAPPED |
| — | WHEN user searches contacts ... THEN filtered | L2 | crm-contacts.actions.integration.spec.ts | UNMAPPED |
| — | WHEN user opens contacts page THEN DataTable | L3 | crm-contacts.authenticated.spec.ts | UNMAPPED |
| — | WHEN user fills create form THEN submits | L3 | crm-contacts-crud.authenticated.spec.ts | UNMAPPED |
| — | WHEN user edits via form THEN values shown | L3 | crm-contact-edit.authenticated.spec.ts | UNMAPPED |
Five stories move from "invisible" to visible at L2. Three L3 stories remain for browser-specific proof.
Dig Deeper
- Vitest — Primary test runner: L1 unit, L2 integration, L2 browser mode. Nx setup, file naming conventions, MSW integration
- Jest — Current runner in the monorepo. Migration path to Vitest documented
- React Testing Library — Component testing that mirrors how users interact with UI
- MSW — Mock Service Worker: shared mocking language across L1, L2, and browser tests
Smart Contracts
Smart contract testing is a separate domain with different tools and economics.
A security audit is vital, and fast, best-practice contract testing is extremely valuable.
Context
- Testing Platform — Trophy strategy, economics, Story Contract connection
- Testing Strategy — Layer model, selection rules, recovery backlog
- Dev Workflow — Build stream and fix stream
- SPEC-MAP — Traceability from Story Contract to test file to commissioning
- Validate Outcomes — L0-L4 maturity model
Questions
What does your test suite prove — that the code works, or that it delivers value?
- If L3 passes but L4 shows a 404, which layer is lying?
- When a Story Contract FORBIDDEN outcome has no test at any layer, how do you know it can't happen?
- If you had to choose between 100% L1 coverage or 5 L4 commissioning runs per week, which proves more?
- What's the cost of a test suite where every spec passes but the customer can't complete the job?