AI Products

What changes when your product thinks?

Traditional products are deterministic. Same input, same output. Test it once, ship it. AI products produce a distribution of outcomes — the same input generates different outputs every time. This changes everything about how you build, test, and ship.

The gap between "AI demo" and "AI product" is an evaluation gap. Demos impress with best-case outputs. Products must handle the full distribution — including the tail where things go wrong.
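Handling the full distribution means quality is measured over many samples, not one. A minimal sketch of what that looks like in practice, assuming a hypothetical `generate`/`score` pair (the toy model and identity rubric below are stand-ins, not a real implementation):

```python
import random

def evaluate_distribution(generate, score, n_samples=100, pass_threshold=0.7):
    """Sample the model n_samples times and score each output.

    Returns the pass rate: the fraction of outputs whose score meets
    the threshold. Quality is a property of this distribution, not of
    any single output.
    """
    scores = [score(generate()) for _ in range(n_samples)]
    passes = sum(1 for s in scores if s >= pass_threshold)
    return passes / n_samples

# Toy stand-ins: a "model" whose output quality varies run to run.
random.seed(0)
generate = lambda: random.gauss(0.8, 0.15)  # hypothetical quality signal
score = lambda output: output               # identity rubric, demo only

pass_rate = evaluate_distribution(generate, score)
failure_budget = 0.10                       # tolerate up to 10% bad outputs
ship = (1 - pass_rate) <= failure_budget
```

The ship decision compares the observed failure rate to a failure budget, which is exactly the "acceptable distribution" framing from the table below rather than a binary works/doesn't check.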

The Shift

| Deterministic Product | AI Product |
|---|---|
| Bug = broken code | Bad output = expected variance |
| Test once, ship | Evaluate continuously |
| Binary: works or doesn't | Spectrum: how often, how good |
| Reproduce every issue | Some failures are statistical |
| 100% quality possible | Quality = acceptable distribution |
| Spec defines behavior | Spec defines boundaries |

AI Product Tight Five

The same five questions applied to building with AI:

| # | Question | AI Product Translation | Where |
|---|---|---|---|
| 1 | Why does this matter? | What job does the AI do that wasn't possible before? | Principles |
| 2 | What truths guide you? | What does "good" mean for this output? | Principles |
| 3 | What do you control? | What can you measure, test, and improve? | Evaluation |
| 4 | What do you see? | Where is the model failing that users haven't reported yet? | Observability |
| 5 | How do you know? | Are eval scores improving AND users happier? | Observability |

The Loop

The VVFL applied to AI products:

DEFINE "GOOD" → BUILD EVALS → SHIP → MEASURE → LEARN → REDEFINE "GOOD"
  ↑                                                        |
  └────────────────────────────────────────────────────────┘

Every cycle tightens the distribution. Quality isn't a destination — it's a feedback loop.

| Stage | Activity | Output |
|---|---|---|
| Define | Set quality principles | Dimensions, rubrics, failure budgets |
| Build | Write requirements | AI PRD with eval criteria |
| Measure | Run evaluations | Scores across golden dataset |
| See | Analyze traces | Where it fails, why, how often |
| Learn | Close the gap | Tighter prompts, better data, updated evals |
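The Measure and See stages above can be sketched as a small eval harness: score every case in a golden dataset, keep per-case results so failures can be analyzed. All names here (`run_evals`, the toy dataset, model, and rubric) are illustrative assumptions, not a real framework:

```python
from dataclasses import dataclass, field

@dataclass
class EvalResult:
    case_id: str
    score: float
    passed: bool

@dataclass
class EvalRun:
    results: list = field(default_factory=list)

    @property
    def pass_rate(self):
        return sum(r.passed for r in self.results) / len(self.results)

def run_evals(golden_dataset, model, rubric, threshold=0.7):
    """Measure: score the model's output for every golden case.

    Per-case results are kept so the See stage can ask where it
    fails, why, and how often -- not just the aggregate score.
    """
    run = EvalRun()
    for case in golden_dataset:
        output = model(case["input"])
        score = rubric(output, case["expected"])
        run.results.append(EvalResult(case["id"], score, score >= threshold))
    return run

# Toy golden dataset, model, and exact-match rubric (all hypothetical).
golden = [
    {"id": "g1", "input": "2+2", "expected": "4"},
    {"id": "g2", "input": "capital of France", "expected": "Paris"},
]
model = lambda prompt: {"2+2": "4", "capital of France": "Lyon"}.get(prompt, "")
rubric = lambda output, expected: 1.0 if output == expected else 0.0

run = run_evals(golden, model, rubric)
failures = [r.case_id for r in run.results if not r.passed]
```

The `failures` list is the entry point for the See stage: cluster these cases, find the pattern, and feed the fix back into Define.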

The Business Loop: Agency to VSaaS

Building an AI product isn't just an engineering task — it's an economic one. We use the AI-Native Agency model to validate the product before scaling it as Vertical SaaS (VSaaS).

  1. Agency Phase: Use the AI product as an internal tool. Humans handle 50-60% of the work. Validate the "Good" definition with paying clients.
  2. Productized Phase: Automate the workflows. Target 90% AI production. Human QA handles the remaining 10% (the "Left Tail" of the distribution).
  3. VSaaS Phase: Ship the tool to the industry. Shift from outcome-based pricing to subscription-based recurring revenue.
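The economic logic of the three phases can be made concrete with a rough unit-economics sketch. The cost rates below are illustrative assumptions (not benchmarks), and `gross_margin` is a hypothetical helper:

```python
def gross_margin(revenue, human_share, human_cost_rate, ai_cost_rate=0.02):
    """Rough unit economics: cost scales with the share of work humans do.

    human_share: fraction of production done by humans (0.0-1.0).
    human_cost_rate / ai_cost_rate: cost per revenue dollar for each.
    All rates are illustrative assumptions.
    """
    cost = revenue * (human_share * human_cost_rate
                      + (1 - human_share) * ai_cost_rate)
    return (revenue - cost) / revenue

# The first two phases from the text, with hypothetical cost rates:
# agency humans handle ~55% of the work, productized humans ~10%.
agency      = gross_margin(100_000, human_share=0.55, human_cost_rate=0.60)
productized = gross_margin(100_000, human_share=0.10, human_cost_rate=0.60)
```

Under these assumptions the margin jump from the Agency phase to the Productized phase comes entirely from shifting the production share from humans to the model, which is the point of the progression.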

Work Chart

Who does what in AI product development vs. an AI-Native Agency?

| Activity | Human Role | AI Role | AI % (Dev) | AI % (Agency) |
|---|---|---|---|---|
| Define quality | Sets dimensions, judges edge cases | Generates rubric variations | 25% | 10% |
| Build golden datasets | Curates, validates, tags | Generates synthetic examples | 50% | 90% |
| Write eval rubrics | Defines scoring criteria | Scores outputs against rubric | 60% | 95% |
| Trace analysis | Pattern recognition, root cause | Surfaces anomalies, clusters failures | 45% | 85% |
| Production | Final judgment, "Taste" | Drafts, researches, formats | N/A | 90% |

Aggregate AI %: 42% (Dev) / 80%+ (Agency) — the goal of an AI product is to shift the production burden from humans to the model, enabling software-like margins in a service world.
