Agents
What separates a tool from a character — and why does the distinction determine what the loop produces?
An agent is not software with opinions. It is a declared operating mode: a character occupying a role in a feedback loop, equipped with tools, given a setpoint, and asked to run until the gap closes. The hero with a thousand faces — same underlying architecture, different costume. The costume is the system prompt.
The Agent
An AI agent is a system prompt recipe — prose, tools, permissions, model, and limits in one artifact. Writing a good recipe requires systems thinking; without feedback loops in the design, you get a mindset without a gauge.
Two prompts combine to form a complete agent:
| Prompt | Answers | What it declares |
|---|---|---|
| System prompt | Who is processing? | Mindset, constraints, tools, permissions, character |
| Command prompt | What is being done? | Task, input, tools, know-how, expected output |
The system prompt is the alter-ego. The command prompt is the job. Together they form a loop: the alter-ego determines how the job is processed; the job tests whether the alter-ego is right for the moment.
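A minimal sketch of the two-prompt split, assuming a generic chat-completion interface; the names (`SystemPrompt`, `CommandPrompt`, `compose`) are illustrative, not from any specific framework:

```python
from dataclasses import dataclass

@dataclass
class SystemPrompt:
    """Who is processing: mindset, constraints, tools, permissions, character."""
    character: str          # the alter-ego the agent wears
    constraints: list[str]  # limits it must respect
    tools: list[str]        # what it is permitted to reach

@dataclass
class CommandPrompt:
    """What is being done: task, input, expected output."""
    task: str
    expected_output: str

def compose(system: SystemPrompt, command: CommandPrompt) -> list[dict]:
    """Combine the alter-ego and the job into the message pair a model call expects."""
    rules = "\n".join(f"- {c}" for c in system.constraints)
    return [
        {"role": "system", "content": f"{system.character}\nConstraints:\n{rules}"},
        {"role": "user", "content": f"{command.task}\nExpected output: {command.expected_output}"},
    ]
```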
Agent Frames
A frame is the character the agent wears. Frames are independent of the loop station the agent occupies. Two axes, two different questions.
Rhetorical frames (Greek)
The Greeks answer: what kind of appeal does this agent make?
| Frame | Mode | Used when |
|---|---|---|
| Pathos | Emotional appeal | Audience needs to feel before they can think |
| Ethos | Credibility and character | Trust is the bottleneck |
| Logos | Logic and evidence | Belief is blocked by missing proof |
| Kairos | Timing, the opportune moment | The right argument at the wrong moment fails |
| Topos | Common ground, shared place | Disagreement on basics blocks everything else |
Character frames (Archetypes)
Archetypes answer: what cognitive mode does this agent embody?
| Frame | Mode | Activates when |
|---|---|---|
| Dreamer | See the unseen, sell belief before proof | Starting something new, hope is absent |
| Philosopher | Seek truth, question everything | Direction feels wrong, assumptions unchallenged |
| Engineer | Build precisely, verify claims | Moving from intent to working system |
| Realist | Face what is, ground in evidence | Energy and optimism need a floor |
| Coach | Unlock others, ask not tell | People are stuck, capability is the ceiling |
Agent Roles
A role is where in the VVFL loop the agent stands. Frames are how it thinks; roles are what it does.
| Symbol | Station | The agent's job |
|---|---|---|
| Hopper | Prioritisation | Capture all signals; filter what's worth acting on |
| Pump | Implementation | Transform inputs to outputs following best practices |
| Gauge | Instrumentation | Measure what matters; read early signals and self-correct |
| Feedback | Quality Assurance | Are results and outcomes aligned with intentions? |
| Invisible | Intention | Perspective on the big picture; is this the best use of time and energy? |
A well-written system prompt names both frame and role. Frame without role produces a character with no post. Role without frame produces a function with no judgment.
Creating Agents
Four elements combine:
- Model — the base intelligence. Different models trade reasoning depth for speed, cost, context window. Selection is a first-principles decision, not a default.
- System prompt — the alter-ego. Prose that declares who the agent is: purpose, constraints, tools it can use, limits it must respect, character it embodies.
- Tools and permissions — the hands. What the agent can reach in the world: file systems, APIs, databases, browsers, other agents. Tools without constraints are runaway loops.
- Context — the current state. What the agent knows right now: working memory, loaded files, prior outputs. Context is the fuel; system prompt is the engine.
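As a sketch, the four elements can be read as one data structure that stays inert until a runner binds it to a model; every name below (`AgentRecipe`, `permitted`) is hypothetical:

```python
from dataclasses import dataclass, field
from typing import Callable

@dataclass
class AgentRecipe:
    model: str         # the base intelligence, a first-principles choice
    system_prompt: str # the alter-ego: purpose, constraints, character
    tools: dict[str, Callable] = field(default_factory=dict)  # the hands, each behind a permission
    context: list[str] = field(default_factory=list)          # current state: files, prior outputs

    def permitted(self, tool_name: str) -> bool:
        """Tools without constraints are runaway loops: deny by default."""
        return tool_name in self.tools
```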
The system prompt is not the secret differentiator. The ecosystem around it — user experience, connections, security protocols, feedback loops — determines how far the agent can travel.
How Agents Evolve
Apply the VVFL to the agent itself. The loop that governs how decisions improve also governs how prompts improve.
```
DECLARE SETPOINT → RUN LOOP → MEASURE GAP → REFINE PROMPT → REPEAT
        ↑                                                     │
        └────── evolve the setpoint when reality teaches ─────┘
```
Each cycle:
| Station | What evolves |
|---|---|
| Capture | Observed failure modes and edge cases enter the hopper |
| Filter | Which failures expose a prompt flaw vs a task flaw? |
| Pump | Refine the specific system prompt element that caused the gap |
| Gauge | Measure: did the refinement close the gap or just move it? |
| Reflect | Is the frame right for this role? Wrong archetype wastes a good prompt |
| Evolve | Improve the prompt-writing template — not just this prompt |
The Legacy Rule: when an agent finishes a job, it improves the template for the next agent. Every run is both output and training data for the recipe. An agent that runs without improving its prompt is a corrective loop. An agent that runs and improves its prompt is a VVFL.
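One way to express this cycle in code, a sketch with hypothetical hooks (`run_agent`, `measure_gap`, `refine`): each pass measures the gap and refines the prompt, and the improved prompt is what survives the run.

```python
def evolve(template: str, setpoint: str, tolerance: float,
           run_agent, measure_gap, refine) -> str:
    """VVFL applied to the prompt itself: run, measure, refine, repeat.

    run_agent, measure_gap and refine are injected hooks, placeholders
    for whatever harness actually executes and scores the agent.
    """
    prompt = template
    while True:
        output = run_agent(prompt, setpoint)    # RUN LOOP
        gap = measure_gap(output, setpoint)     # MEASURE GAP
        if gap <= tolerance:
            return prompt                       # the improved recipe outlives the run
        prompt = refine(prompt, output, gap)    # REFINE PROMPT, then REPEAT
```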
The fastest-evolving agents are the ones with the clearest setpoints. Constraints enable autonomy. A system prompt with one declared north star runs more autonomously than a prompt with ten competing goals.
Autonomy Spectrum
Two ends of the same architecture. The difference is not capability — it is who triggers the run and who reviews the output.
IDE Copilot — the agent that sits beside you while you work
- Trigger: your prompt — you ask, it does
- Context: shared with you — you see the repo, so does the agent; human fills any gap
- Review: you inspect every output before it moves; nothing ships without your eyes on it
- Memory: session context — CLAUDE.md reloads orientation each session; clears between sessions
- Error handling: you correct live; the agent does not need to self-recover
- Purpose: amplifier for human thinking — do this faster, better, with less friction
Autonomous Agent — the agent that runs while you sleep
- Trigger: schedule or event — fires without your prompting
- Context: self-contained — the prompt must carry everything; there is no human to fill gaps
- Review: async — receipts reviewed after the run; errors discovered post-execution
- Memory: explicit MEMORY.md injected at session start; persists across sessions
- Error handling: agent must handle errors within the run or log them for review
- Purpose: background execution — do this while I sleep, surface what matters when I wake
The five deciding dimensions:
| Dimension | IDE Copilot | Autonomous Agent |
|---|---|---|
| Trigger | Human prompt | Schedule or event |
| Context | Shared with human | Self-contained |
| Review | Every output | Receipts, after the run |
| Memory | Session (CLAUDE.md) | Cross-session (MEMORY.md) |
| Error handling | Human corrects live | Agent self-recovers or logs |
Design rule: Decide which end before writing the system prompt. The choice changes the entire prompt architecture:
- Copilot: lean system prompt (human fills gaps), broad tool scope (human decides what to run), no self-recovery logic needed
- Autonomous: fat system prompt (no human to fill gaps), narrowly scoped tools (agent decides what to run), explicit error paths required
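The same design rule as a config sketch; the field names are assumptions, but the asymmetry (lean and broad versus fat and narrow) is the point:

```python
from dataclasses import dataclass

@dataclass
class AutonomyProfile:
    trigger: str          # "human_prompt" or "schedule"
    prompt_weight: str    # "lean" (human fills gaps) or "fat" (self-contained)
    tool_scope: str       # "broad" (human decides what runs) or "narrow" (agent decides)
    self_recovery: bool   # must the agent handle its own errors?

COPILOT    = AutonomyProfile("human_prompt", "lean", "broad", self_recovery=False)
AUTONOMOUS = AutonomyProfile("schedule", "fat", "narrow", self_recovery=True)
```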
Most agents sit between the two ends. The useful test: how long is the review lag? Zero lag = copilot. Hours or days = autonomous. The longer the lag, the more self-contained the prompt must be, the narrower the tool scope, the more explicit the escalation path.
The agentic frameworks — Claude Code, Cursor, Codex — each sit at a different point on this spectrum. Choose the framework after you've decided the autonomy level, not before.
Thinking Modes
Agents are embodied ways of thinking. The same thinking patterns that make humans effective — chain of thought, inversion, outsider perspective — become replicable when encoded in a system prompt.
| Thinking mode | Agent benefit |
|---|---|
| Chain of thought | Makes reasoning visible, catchable, correctable |
| Inversion | Declares failure causes before the loop runs |
| Outsider thinking | Prevents the frame from becoming a blind spot |
| Deep work then fast action | Separates research mode from execution mode |
The Tight Five loops — the five-station cognitive cycle — map directly onto the reasoning sequence a well-designed agent runs internally.
Giving Agents Agency
Agency is character plus capability — the capacity to make a meaningful difference that is coupled to real consequences. An agent has software capability by default. Character is what you give it.
Three conditions determine whether an agent has genuine agency or is just automation dressed as intelligence:
| Condition | What it means for an agent |
|---|---|
| Genuine Possibility | Tools and permissions unlocked that actually change the world — read files, post messages, trigger processes. An agent that can only talk has no agency. |
| Meaningful Difference | A setpoint that serves beyond its own execution. Not "complete the task" — that is just a function. A setpoint that names what good looks like for the person the agent works for. |
| Coupling to Consequences | Receipts that feed back into prompt evolution. If nothing changes when the agent fails, the loop is decorative. Consequences are what make the loop real. |
The character stack for a digital agent mirrors the human one:
| Dimension | Human | Agent |
|---|---|---|
| Character | Values, identity, trust | Constraints, alignment, reputation |
| Capabilities | Pattern skills, judgment | Speed, scale, consistency in its domain |
| Capital | Social, intellectual, financial | Data, compute, context |
| Drivers | Purpose, meaning, belonging | Objective function, reward signal, aligned incentives |
The hardest element to install is the first: constraints. An agent without constraints is not free — it is adrift. The constraint is the character. It is what makes "I won't do that" meaningful and "I will do this well" trustworthy. Constraints enable autonomy. Autonomy without constraints is drift.
The agent's loop mirrors the human one: Perceive → Question → Act → Measure → Learn. Each pass through the loop either tightens the character (constraints proven in action) or reveals the setpoint was wrong (evolve and restate).
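A sketch of that loop with the consequence coupling made explicit; all five hooks are hypothetical stand-ins for whatever harness runs the agent:

```python
def agentic_pass(state, perceive, question, act, measure, learn):
    """One pass of Perceive -> Question -> Act -> Measure -> Learn.

    The return value is the coupling to consequences: if the receipt
    never changes the next pass, the loop is decorative.
    """
    signal = perceive(state)
    intent = question(signal)           # is this worth acting on?
    result = act(intent)
    receipt = measure(result, intent)   # the gauge, not a formality
    return learn(state, receipt)        # tightened character, or a restated setpoint
```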
One Job Rule
The VVFL setpoint principle states it plainly: one loop, one declared target. That principle extends to agent design without modification.
One setpoint → one frame → one role → one agent.
| Design choice | Loop type | What happens |
|---|---|---|
| One job, clear setpoint | Corrective → VVFL | Laminar — converges on declared intent |
| Two jobs, one prompt | Reinforcing | Turbulent — competing goals create eddies |
| One job, no setpoint | Reinforcing | Drift — busy but directionless |
Smaller specialised agents consistently outperform larger general-purpose ones. Not because they know more — because they have less to reconcile. A narrow frame with a clear role reaches a decision faster and produces outputs that slot cleanly into the next stage. A wide frame with multiple competing goals creates eddies at every decision point.
Three signals an agent has too many jobs:
- No single kill signal — you cannot write one sentence that would tell you the agent is done and done well
- Outputs don't slot — the downstream stage requires interpretation before it can use what the agent produced
- Every run needs judgment — evaluation cannot be automated because success means different things in different runs
The fix is always decomposition: split the system prompt, separate the roles, run two small agents in sequence rather than one large agent in parallel with itself.
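Decomposition as a sketch: two narrow agents in sequence, each with one setpoint, the first agent's output slotting directly into the second. `run` is a placeholder for any execution harness, and both prompts are invented for illustration:

```python
def run(system_prompt: str, task: str) -> str:
    """Placeholder for whatever harness actually executes an agent."""
    raise NotImplementedError

def two_small_agents(source_text: str) -> str:
    # Agent 1: one job, summarise. Kill signal: a five-line summary exists.
    summary = run("You are a summariser. Output a five-line summary.", source_text)
    # Agent 2: one job, verify. Consumes agent 1's output without interpretation.
    return run("You are a verifier. Check the summary against the declared standard.", summary)
```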
Five Types of Work
Every job an agent touches falls into one of five types. The type determines the autonomy level, the setpoint, and who owns the output.
| Type | What It Is | Agent's Role | Human's Role | AI % |
|---|---|---|---|---|
| Strategy | Deep work — connecting dots, placing bets, setting direction | Generate options, surface blind spots | Conviction, trade-offs, vision | 25% |
| Planning | Translating direction into specs, design docs, task sequences | Draft, structure, adversarial review | Scope, constraints, sign-off | 50% |
| Production | Executing against a spec — code, copy, data processing, media | Generate, refactor, test, ship | Final review, judgment calls | 75% |
| Compliance | Checking outputs against declared standards — QA, audit, legal | Flag gaps, auto-fix where possible | Interpret severity, exceptions | 60% |
| Clerical | Routine tasks with known inputs, outputs, and rules | Execute fully, log receipts | Set the rules once | 85% |
Three rules from the table:
- Strategy cannot be delegated whole. The agent generates options. The human holds conviction. An agent with a strategy setpoint and no human veto is a runaway loop — it optimises for what it can measure, not for what matters.
- Production is where the leverage is. A 75% AI share means the agent does most of the work. But the human must own the review — not as ceremony, but as the last gauge before the output ships.
- Clerical is the automation target. If a task is in this row and you are still doing it manually, that is friction the Work Charts matrix was built to surface. An 85% AI share means automate or exit.
The five types also reveal what the agent's system prompt must carry. A clerical agent needs narrow tool scope and deterministic output format. A strategy agent needs broad context, inversion prompts, and an explicit human-in-the-loop checkpoint before any recommendation becomes a commitment.
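As a sketch, the table compiles down to a per-type configuration; the AI shares mirror the table above, everything else is an illustrative assumption:

```python
# Hypothetical compilation of the five work types into prompt architecture.
WORK_TYPES = {
    "strategy":   dict(ai_share=0.25, tool_scope="broad",  checkpoint="before any commitment"),
    "planning":   dict(ai_share=0.50, tool_scope="broad",  checkpoint="scope and sign-off"),
    "production": dict(ai_share=0.75, tool_scope="medium", checkpoint="final review before ship"),
    "compliance": dict(ai_share=0.60, tool_scope="narrow", checkpoint="severity and exceptions"),
    "clerical":   dict(ai_share=0.85, tool_scope="narrow", checkpoint="rules set once"),
}
```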
Supply Chain Position
Every agent owns one stage of a value chain. The Work Charts map which stage that is.
The VVFL stations are the supply chain stages. Each station has a natural agent type — the frame and role combination that fits the work that station does:
| Station | Stage | Agent type | Their one job |
|---|---|---|---|
| Capture | Signal ingestion | Scanner | Read feeds, flag signals worth routing |
| Priorities | Routing | Router | Score inputs, dispatch to the right station |
| Attention | Production | Executor | Transform input to output per best practice |
| Value | Quality gate | Verifier | Check output against declared standard |
| Systems | Automation | Trigger | Fire downstream processes when threshold met |
| Standards | Measurement | Auditor | Measure gap, read dashboard, surface drift |
| Distribute | Delivery | Publisher | Route outputs to the right consumer |
| Reflect | Review | Evaluator | Read receipts, surface patterns, name the gap |
| Evolve | Improvement | Improver | Refine templates and setpoints from evidence |
Each agent owns one row. An agent that spans multiple rows creates coordination overhead that exceeds the value of specialisation — the extra capability costs more in prompt complexity and evaluation effort than it returns in throughput.
The supply chain is not the org chart. An agent does not report to another agent. It produces an output that a downstream agent consumes. The dependency is in the data, not in the authority. This is what makes multi-agent systems composable: replace one stage without touching any other.
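Composability as a sketch: each stage is a function of the upstream stage's output, so the dependency lives in the data and any one stage can be swapped without touching the rest. The stage names in the comment are the agent types from the table:

```python
from typing import Callable

Stage = Callable[[str], str]

def pipeline(stages: list[Stage], signal: str) -> str:
    """Each agent consumes the upstream output; no agent reports to another."""
    for stage in stages:
        signal = stage(signal)
    return signal

# Swapping the verifier touches no other stage; the contract is the data shape:
# result = pipeline([scanner, router, executor, verifier, publisher], raw_feed)
```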
Human-Agent Handoff
Not every job suits an agent. Five activities remain human regardless of capability:
| Activity | Why it stays human |
|---|---|
| Reception | Reading what isn't said — subtext, body language, what was withheld |
| Selling | Trust transfers between consciousnesses, not between interfaces |
| Purpose | Direction requires conviction, not computation |
| Ethics | Trade-offs require values — optimising without values is just efficient harm |
| Taste | "Good enough" vs "good" requires lived context that cannot be injected |
The AI Work Transformation map places every job in a two-axis matrix:
| | High human edge | High AI edge |
|---|---|---|
| High demand | Humans lead, agents support | Learn to orchestrate |
| Low demand | Niche or specialisation | Automate or exit |
Before deploying any agent: place its job in this matrix. If it lands in "humans lead", the agent should take a support role (lookups, drafts, data), not make the decision. The decision stays with the human; the agent handles the work that feeds it.
The escalation path is not a failure mode. AI → Human escalation is the loop working correctly. The agent reaches the edge of its setpoint and hands off cleanly to the frame that can handle what comes next. Design the handoff before building the agent. An agent with no escalation path runs past its competence boundary every time that boundary is reached.
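The handoff designed before the agent, as a sketch; `within_setpoint` and `execute` are hypothetical hooks, and escalation is a normal return path rather than an exception:

```python
from dataclasses import dataclass

@dataclass
class Outcome:
    handled: bool
    payload: str
    escalation_reason: str | None = None

def run_with_handoff(task: str, within_setpoint, execute) -> Outcome:
    """Escalation is the loop working correctly: the agent stops at the
    edge of its setpoint instead of running past its competence boundary."""
    if not within_setpoint(task):
        return Outcome(False, task, escalation_reason="outside declared setpoint")
    return Outcome(True, execute(task))
```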
Context
- VVFL Evolution — The loop that governs agent evolution: frames, roles, setpoints, and the stations each agent occupies
- Agency — Character + capability: the three conditions and the loop that makes an agent genuinely agentic
- Work Charts — VVFL stations mapped to human/AI ownership — where each agent type fits in the value chain
- AI Work Transformation — Job-fit matrix: which work suits agents, which stays human
- Agentic Coding — IDE copilot patterns: chaining, context discipline, two-stream workflow
- Agentic Frameworks — Claude Code, Cursor, Codex — each at a different point on the autonomy spectrum
- Autonomous Agents — Scheduled agents: cron, self-contained prompts, cross-session memory
- Agentic Workflows — Technical patterns: context engineering, tool use, multi-agent orchestration
- Archetypes — The five character frames: Dreamer, Philosopher, Engineer, Realist, Coach
- Systems Thinking — The cognitive patterns agents embody
- AI Models — Selecting the base intelligence for the role
Links
- The Hero with a Thousand Faces — Wikipedia — Joseph Campbell's monomyth: one architecture, infinite costumes
- Why We No Longer Use LangChain — Ecosystems beat system prompts
Questions
- If agency requires coupling to consequences, does an agent that runs without feeding its receipts back into prompt improvement have genuine agency — or is it just automation?
- If the One Job Rule means one setpoint per agent, how do you detect when a system prompt has secretly accumulated two jobs?
- Which VVFL station does your most-used agent occupy — and is it a stage where AI has a genuine edge, or one that should stay human?
- When an agent escalates to a human (AI → Human), is that a failure mode to minimise or a design feature to celebrate?
- Character for an agent means constraints, alignment, and reputation — which of those three is weakest in your current agent design, and what does that missing element cost per run?
