
Agents

What separates a tool from a character — and why does the distinction determine what the loop produces?

An agent is not software with opinions. It is a declared operating mode: a character occupying a role in a feedback loop, equipped with tools, given a setpoint, and asked to run until the gap closes. The hero with a thousand faces — same underlying architecture, different costume. The costume is the system prompt.

The Agent

An AI agent is a system prompt recipe — prose, tools, permissions, model, and limits in one artifact. Writing a good recipe requires systems thinking; without feedback loops in the design, you get a mindset without a gauge.

Two prompts combine to form a complete agent:

| Prompt | Answers | What it declares |
| --- | --- | --- |
| System prompt | Who is processing? | Mindset, constraints, tools, permissions, character |
| Command prompt | What is being done? | Task, input, tools, know-how, expected output |

The system prompt is the alter-ego. The command prompt is the job. Together they form a loop: the alter-ego determines how the job is processed; the job tests whether the alter-ego is right for the moment.
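The two-prompt split can be sketched as a minimal composition step. This is an illustrative sketch, not any framework's real API: `build_request` and the prompt strings are hypothetical, and the message shape simply mirrors the common system/user chat convention.

```python
# Sketch only: compose the alter-ego (system prompt) and the job
# (command prompt) into one chat-style request. Names are hypothetical.

def build_request(system_prompt: str, command_prompt: str) -> list[dict]:
    """The system prompt answers 'who is processing?'; the command
    prompt answers 'what is being done?'."""
    return [
        {"role": "system", "content": system_prompt},  # mindset, constraints, character
        {"role": "user", "content": command_prompt},   # task, input, expected output
    ]

request = build_request(
    system_prompt="You are an Engineer-frame reviewer. Verify claims; cite evidence.",
    command_prompt="Review this diff and flag any untested change.",
)
```

Keeping the two prompts as separate arguments preserves the loop described above: the alter-ego can be swapped without rewriting the job, and vice versa.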

Agent Frames

A frame is the character the agent wears. Frames are independent of the loop station the agent occupies. Two axes, two different questions.

Rhetorical frames (Greek)

The Greeks answer: what kind of appeal does this agent make?

| Frame | Mode | Used when |
| --- | --- | --- |
| Pathos | Emotional appeal | Audience needs to feel before they can think |
| Ethos | Credibility and character | Trust is the bottleneck |
| Logos | Logic and evidence | Belief is blocked by missing proof |
| Kairos | Timing, the opportune moment | The right argument at the wrong moment fails |
| Topos | Common ground, shared place | Disagreement on basics blocks everything else |

Character frames (Archetypes)

Archetypes answer: what cognitive mode does this agent embody?

| Frame | Mode | Activates when |
| --- | --- | --- |
| Dreamer | See the unseen, sell belief before proof | Starting something new, hope is absent |
| Philosopher | Seek truth, question everything | Direction feels wrong, assumptions unchallenged |
| Engineer | Build precisely, verify claims | Moving from intent to working system |
| Realist | Face what is, ground in evidence | Energy and optimism need a floor |
| Coach | Unlock others, ask not tell | People are stuck, capability is the ceiling |

Agent Roles

A role is where in the VVFL loop the agent stands. Frames are what it thinks; roles are what it does.

| Symbol | Station | The agent's job |
| --- | --- | --- |
| Hopper | Prioritisation | Capture all signals; filter what's worth acting on |
| Pump | Implementation | Transform inputs to outputs following best practices |
| Gauge | Instrumentation | Measure what matters; read early signals and self-correct |
| Feedback | Quality Assurance | Are results and outcomes aligned with intentions? |
| Invisible | Intention | Perspective on the big picture; is this the best use of time and energy? |

A well-written system prompt names both frame and role. Frame without role produces a character with no post. Role without frame produces a function with no judgment.

Creating Agents

Four elements combine:

  1. Model — the base intelligence. Different models trade reasoning depth for speed, cost, context window. Selection is a first-principles decision, not a default.
  2. System prompt — the alter-ego. Prose that declares who the agent is: purpose, constraints, tools it can use, limits it must respect, character it embodies.
  3. Tools and permissions — the hands. What the agent can reach in the world: file systems, APIs, databases, browsers, other agents. Tools without constraints are runaway loops.
  4. Context — the current state. What the agent knows right now: working memory, loaded files, prior outputs. Context is the fuel; system prompt is the engine.
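The four elements above can be sketched as a single recipe record. This is a hedged illustration: `AgentRecipe` and its field names are hypothetical, not a real framework's types; the point is that model, system prompt, tools/permissions, and context travel together as one artifact.

```python
from dataclasses import dataclass, field

# Sketch of the four-element agent recipe (all names illustrative).
@dataclass
class AgentRecipe:
    model: str                                          # 1. the base intelligence
    system_prompt: str                                  # 2. the alter-ego
    tools: list[str] = field(default_factory=list)      # 3a. the hands
    permissions: set[str] = field(default_factory=set)  # 3b. constraints on the hands
    context: dict = field(default_factory=dict)         # 4. current state: the fuel

reviewer = AgentRecipe(
    model="some-model",  # a first-principles choice, not a default
    system_prompt="You are a Verifier. Check outputs against the declared standard.",
    tools=["read_file"],
    permissions={"read"},
)
```

Bundling permissions beside tools makes the "tools without constraints are runaway loops" rule enforceable at construction time rather than at run time.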

The system prompt is not the secret differentiator. The ecosystem around it — user experience, connections, security protocols, feedback loops — determines how far the agent can travel.

How Agents Evolve

Apply the VVFL to the agent itself. The loop that governs how decisions improve also governs how prompts improve.

DECLARE SETPOINT → RUN LOOP → MEASURE GAP → REFINE PROMPT → REPEAT
        ↑                                                      │
        └──────── evolve the setpoint when reality teaches ────┘

Each cycle:

| Station | What evolves |
| --- | --- |
| Capture | Observed failure modes and edge cases enter the hopper |
| Filter | Which failures expose a prompt flaw vs a task flaw? |
| Pump | Refine the specific system prompt element that caused the gap |
| Gauge | Measure: did the refinement close the gap or just move it? |
| Reflect | Is the frame right for this role? Wrong archetype wastes a good prompt |
| Evolve | Improve the prompt-writing template — not just this prompt |
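The run-measure-refine cycle can be sketched as a small loop. This is a toy under stated assumptions: `run_agent`, `measure`, and `refine` are hypothetical stand-ins for a real model call, a real gauge, and a real review step; only the control flow is the point.

```python
# Sketch of the evolution cycle: run, measure the gap against the
# setpoint, refine the prompt element that caused the gap, repeat.
def evolve(prompt: str, setpoint: float, run_agent, measure, refine, max_cycles: int = 5) -> str:
    for _ in range(max_cycles):
        output = run_agent(prompt)
        gap = setpoint - measure(output)
        if gap <= 0:
            return prompt                 # gap closed: setpoint reached
        prompt = refine(prompt, gap)      # refine, then run again
    return prompt                         # budget spent: surface for review

# Toy usage with trivial stand-ins (illustrative only):
tuned = evolve(
    prompt="v1",
    setpoint=3,
    run_agent=lambda p: p,               # toy: the output is the prompt itself
    measure=lambda out: out.count("+"),  # toy gauge
    refine=lambda p, gap: p + "+",       # toy refinement
)
```

The `max_cycles` budget matters: without it, a refinement that merely moves the gap becomes a reinforcing loop instead of a corrective one.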

The Legacy Rule: when an agent finishes a job, it improves the template for the next agent. Every run is both output and training data for the recipe. An agent that runs without improving its prompt is a corrective loop. An agent that runs and improves its prompt is a VVFL.

The fastest-evolving agents are the ones with the clearest setpoints. Constraints enable autonomy. A system prompt with one declared north star runs more autonomously than a prompt with ten competing goals.

Autonomy Spectrum

Two ends of the same architecture. The difference is not capability — it is who triggers the run and who reviews the output.

IDE Copilot — the agent that sits beside you while you work

  • Trigger: your prompt — you ask, it does
  • Context: shared with you — you see the repo, so does the agent; human fills any gap
  • Review: you inspect every output before it moves; nothing ships without your eyes on it
  • Memory: session context — CLAUDE.md reloads orientation each session; clears between sessions
  • Error handling: you correct live; the agent does not need to self-recover
  • Purpose: amplifier for human thinking — do this faster, better, with less friction

Autonomous Agent — the agent that runs while you sleep

  • Trigger: schedule or event — fires without your prompting
  • Context: self-contained — the prompt must carry everything; there is no human to fill gaps
  • Review: async — receipts reviewed after the run; errors discovered post-execution
  • Memory: explicit MEMORY.md injected at session start; persists across sessions
  • Error handling: agent must handle errors within the run or log them for review
  • Purpose: background execution — do this while I sleep, surface what matters when I wake

The five deciding dimensions:

| Dimension | IDE Copilot | Autonomous Agent |
| --- | --- | --- |
| Trigger | Human prompt | Schedule or event |
| Context | Shared with human | Self-contained |
| Review | Every output | Receipts, after the run |
| Memory | Session (CLAUDE.md) | Cross-session (MEMORY.md) |
| Error handling | Human corrects live | Agent self-recovers or logs |

Design rule: Decide which end before writing the system prompt. The choice changes the entire prompt architecture:

  • Copilot: lean system prompt (human fills gaps), broad tool scope (human decides what to run), no self-recovery logic needed
  • Autonomous: fat system prompt (no human to fill gaps), narrow scoped tools (agent decides what to run), explicit error paths required
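The two prompt architectures can be captured as configuration, so the choice is made explicitly before the system prompt is written. The dimension names follow the table above; the values and the dict shape are illustrative, not any tool's real config format.

```python
# Sketch: the autonomy decision as explicit configuration (illustrative).
COPILOT = {
    "trigger": "human prompt",
    "prompt_style": "lean",        # human fills gaps
    "tool_scope": "broad",         # human decides what to run
    "self_recovery": False,        # human corrects live
    "memory": "session",           # e.g. CLAUDE.md, cleared between sessions
}

AUTONOMOUS = {
    "trigger": "schedule or event",
    "prompt_style": "fat",         # no human to fill gaps
    "prompt_style_note": "must carry everything the run needs",
    "tool_scope": "narrow",        # agent decides what to run
    "self_recovery": True,         # handle errors in-run or log for review
    "memory": "cross-session",     # e.g. MEMORY.md, injected at session start
}
```

Writing the choice down like this makes the design rule auditable: a "lean" prompt paired with `self_recovery: True` is a contradiction you can catch before deployment.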

Most agents sit between the two ends. The useful test: how long is the review lag? Zero lag = copilot. Hours or days = autonomous. The longer the lag, the more self-contained the prompt must be, the narrower the tool scope, the more explicit the escalation path.

The agentic frameworks — Claude Code, Cursor, Codex — each sit at a different point on this spectrum. Choose the framework after you've decided the autonomy level, not before.

Thinking Modes

Agents are embodied ways of thinking. The same thinking patterns that make humans effective — chain of thought, inversion, outsider perspective — become replicable when encoded in a system prompt.

| Thinking mode | Agent benefit |
| --- | --- |
| Chain of thought | Makes reasoning visible, catchable, correctable |
| Inversion | Declares failure causes before the loop runs |
| Outsider thinking | Prevents the frame from becoming a blind spot |
| Deep work then fast action | Separates research mode from execution mode |

The Tight Five loops — the five-station cognitive cycle — map directly onto the reasoning sequence a well-designed agent runs internally.

Giving Agents Agency

Agency is character plus capability — the capacity to make a meaningful difference that is coupled to real consequences. An agent has software capability by default. Character is what you give it.

Three conditions determine whether an agent has genuine agency or is just automation dressed as intelligence:

| Condition | What it means for an agent |
| --- | --- |
| Genuine Possibility | Tools and permissions unlocked that actually change the world — read files, post messages, trigger processes. An agent that can only talk has no agency. |
| Meaningful Difference | A setpoint that serves beyond its own execution. Not "complete the task" — that is just a function. A setpoint that names what good looks like for the person the agent works for. |
| Coupling to Consequences | Receipts that feed back into prompt evolution. If nothing changes when the agent fails, the loop is decorative. Consequences are what make the loop real. |

The character stack for a digital agent mirrors the human one:

| Dimension | Human | Agent |
| --- | --- | --- |
| Character | Values, identity, trust | Constraints, alignment, reputation |
| Capabilities | Pattern skills, judgment | Speed, scale, consistency in its domain |
| Capital | Social, intellectual, financial | Data, compute, context |
| Drivers | Purpose, meaning, belonging | Objective function, reward signal, aligned incentives |

The hardest element to install is the first: constraints. An agent without constraints is not free — it is adrift. The constraint is the character. It is what makes "I won't do that" meaningful and "I will do this well" trustworthy. Constraints enable autonomy. Autonomy without constraints is drift.

The agent's loop mirrors the human one: Perceive → Question → Act → Measure → Learn. Each pass through the loop either tightens the character (constraints proven in action) or reveals the setpoint was wrong (evolve and restate).

One Job Rule

The VVFL setpoint principle states it plainly: one loop, one declared target. That principle extends to agent design without modification.

One setpoint → one frame → one role → one agent.

| Design choice | Loop type | What happens |
| --- | --- | --- |
| One job, clear setpoint | Corrective → VVFL | Laminar — converges on declared intent |
| Two jobs, one prompt | Reinforcing | Turbulent — competing goals create eddies |
| One job, no setpoint | Reinforcing | Drift — busy but directionless |

Smaller specialised agents consistently outperform larger general-purpose ones. Not because they know more — because they have less to reconcile. A narrow frame with a clear role reaches a decision faster and produces outputs that slot cleanly into the next stage. A wide frame with multiple competing goals creates eddies at every decision point.

Three signals an agent has too many jobs:

  1. No single kill signal — you cannot write one sentence that would tell you the agent is done and done well
  2. Outputs don't slot — the downstream stage requires interpretation before it can use what the agent produced
  3. Every run needs judgment — evaluation cannot be automated because success means different things in different runs

The fix is always decomposition: split the system prompt, separate the roles, run two small agents in sequence rather than one large agent in parallel with itself.
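The decomposition fix can be sketched as two one-job agents composed in sequence. This is an illustrative toy: `summarise` and `verify` are hypothetical stand-ins for two single-setpoint agents, each with its own kill signal.

```python
# Sketch: two small agents in sequence, each with one job and one
# testable done-condition. Names and logic are illustrative.

def summarise(text: str) -> str:
    """One job: compress. Kill signal: output is a single sentence."""
    return text.split(".")[0].strip() + "."

def verify(summary: str) -> str:
    """One job: gate. Kill signal: output passes the declared standard."""
    if len(summary) > 200:
        raise ValueError("summary exceeds standard: escalate to human")
    return summary

# Sequence, not one large agent negotiating with itself:
result = verify(summarise("Ship the release. Also update the docs."))
```

Each stage's output slots directly into the next with no interpretation step, which is exactly the second signal above turned from a symptom into a test.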

Five Types of Work

Every job an agent touches falls into one of five types. The type determines the autonomy level, the setpoint, and who owns the output.

| Type | What it is | Agent's role | Human's role | AI % |
| --- | --- | --- | --- | --- |
| Strategy | Deep work — connecting dots, placing bets, setting direction | Generate options, surface blind spots | Conviction, trade-offs, vision | 25% |
| Planning | Translating direction into specs, design docs, task sequences | Draft, structure, adversarial review | Scope, constraints, sign-off | 50% |
| Production | Executing against a spec — code, copy, data processing, media | Generate, refactor, test, ship | Final review, judgment calls | 75% |
| Compliance | Checking outputs against declared standards — QA, audit, legal | Flag gaps, auto-fix where possible | Interpret severity, exceptions | 60% |
| Clerical | Routine tasks with known inputs, outputs, and rules | Execute fully, log receipts | Set the rules once | 85% |

Three rules from the table:

  1. Strategy cannot be delegated whole. The agent generates options. The human holds conviction. An agent with a strategy setpoint and no human veto is a runaway loop — it optimises for what it can measure, not for what matters.
  2. Production is where the leverage is. 75% AI % means the agent does most of the work. But the human must own the review — not as ceremony, but as the last gauge before the output ships.
  3. Clerical is the automation target. If a task is in this row and you are still doing it manually, that is friction that the Work Charts matrix was built to surface. 85% AI % means automate or exit.

The five types also reveal what the agent's system prompt must carry. A clerical agent needs narrow tool scope and deterministic output format. A strategy agent needs broad context, inversion prompts, and an explicit human-in-the-loop checkpoint before any recommendation becomes a commitment.

Supply Chain Position

Every agent owns one stage of a value chain. The Work Charts map which stage that is.

The VVFL stations are the supply chain stages. Each station has a natural agent type — the frame and role combination that fits the work that station does:

| Station | Stage | Agent type | Their one job |
| --- | --- | --- | --- |
| Capture | Signal ingestion | Scanner | Read feeds, flag signals worth routing |
| Priorities | Routing | Router | Score inputs, dispatch to the right station |
| Attention | Production | Executor | Transform input to output per best practice |
| Value | Quality gate | Verifier | Check output against declared standard |
| Systems | Automation | Trigger | Fire downstream processes when threshold met |
| Standards | Measurement | Auditor | Measure gap, read dashboard, surface drift |
| Distribute | Delivery | Publisher | Route outputs to the right consumer |
| Reflect | Review | Evaluator | Read receipts, surface patterns, name the gap |
| Evolve | Improvement | Improver | Refine templates and setpoints from evidence |

Each agent owns one row. An agent that spans multiple rows creates coordination overhead that exceeds the value of specialisation — the extra capability costs more in prompt complexity and evaluation effort than it returns in throughput.

The supply chain is not the org chart. An agent does not report to another agent. It produces an output that a downstream agent consumes. The dependency is in the data, not in the authority. This is what makes multi-agent systems composable: replace one stage without touching any other.
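Data-coupled composition can be sketched as a list of stage functions folded over the data. The stage functions here are hypothetical stand-ins for the Scanner, Router, and Executor rows; the point is that the dependency lives in the data shape, so any stage can be replaced without touching the others.

```python
from functools import reduce

# Sketch: each agent owns one stage and consumes the upstream output.
# Stage logic is illustrative only.
def scanner(feed):   return [s for s in feed if s.get("signal")]        # flag signals
def router(items):   return sorted(items, key=lambda s: -s["score"])    # dispatch order
def executor(items): return [s["payload"].upper() for s in items]       # transform

PIPELINE = [scanner, router, executor]

def run(feed):
    # Dependency is in the data, not the authority: no agent "reports to" another.
    return reduce(lambda data, stage: stage(data), PIPELINE, feed)

outputs = run([
    {"signal": True,  "score": 2, "payload": "fix bug"},
    {"signal": False, "score": 9, "payload": "noise"},
    {"signal": True,  "score": 5, "payload": "ship release"},
])
```

Swapping `router` for a different scoring policy changes one list entry and nothing else, which is the composability claim made concrete.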

Human-Agent Handoff

Not every job suits an agent. Five activities remain human regardless of capability:

| Activity | Why it stays human |
| --- | --- |
| Reception | Reading what isn't said — subtext, body language, what was withheld |
| Selling | Trust transfers between consciousnesses, not between interfaces |
| Purpose | Direction requires conviction, not computation |
| Ethics | Trade-offs require values — optimising without values is just efficient harm |
| Taste | "Good enough" vs "good" requires lived context that cannot be injected |

The AI Work Transformation map places every job in a two-axis matrix:

| | High human edge | High AI edge |
| --- | --- | --- |
| High demand | Humans lead, agents support | Learn to orchestrate |
| Low demand | Niche or specialisation | Automate or exit |

Before deploying any agent: place its job in this matrix. If it lands in "humans lead" — the agent should be a support role (lookups, drafts, data) not a decision-maker. The decision stays with the human; the agent handles the work that feeds it.

The escalation path is not a failure mode. AI → Human escalation is the loop working correctly. The agent reaches the edge of its setpoint and hands off cleanly to the frame that can handle what comes next. Design the handoff before building the agent. An agent with no escalation path runs past its competence boundary every time that boundary is reached.
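A designed handoff can be sketched as an explicit competence boundary check. Everything here is illustrative: the `Escalate` exception and the task/competence shapes are hypothetical, showing only the pattern of stopping at the setpoint's edge instead of running past it.

```python
# Sketch: AI → Human escalation as a first-class code path (names illustrative).
class Escalate(Exception):
    """The loop working correctly: a clean handoff, not a failure mode."""

def handle(task: dict, competence: set[str]) -> str:
    # The agent checks its own boundary before acting, not after failing.
    if task["kind"] not in competence:
        raise Escalate(f"outside setpoint: {task['kind']} needs a human")
    return f"done: {task['kind']}"

try:
    handle({"kind": "pricing decision"}, competence={"data lookup", "drafting"})
except Escalate as exc:
    routed_to_human = str(exc)  # surfaced to the human, with the reason attached
```

Because the boundary is declared in data, widening the agent's competence is a reviewed change to `competence`, not a silent drift in behaviour.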

Context

  • VVFL Evolution — The loop that governs agent evolution: frames, roles, setpoints, and the stations each agent occupies
  • Agency — Character + capability: the three conditions and the loop that makes an agent genuinely agentic
  • Work Charts — VVFL stations mapped to human/AI ownership — where each agent type fits in the value chain
  • AI Work Transformation — Job-fit matrix: which work suits agents, which stays human
  • Agentic Coding — IDE copilot patterns: chaining, context discipline, two-stream workflow
  • Agentic Frameworks — Claude Code, Cursor, Codex — each at a different point on the autonomy spectrum
  • Autonomous Agents — Scheduled agents: cron, self-contained prompts, cross-session memory
  • Agentic Workflows — Technical patterns: context engineering, tool use, multi-agent orchestration
  • Archetypes — The five character frames: Dreamer, Philosopher, Engineer, Realist, Coach
  • Systems Thinking — The cognitive patterns agents embody
  • AI Models — Selecting the base intelligence for the role

Questions

  • If agency requires coupling to consequences, does an agent that runs without feeding its receipts back into prompt improvement have genuine agency — or is it just automation?

  • If the One Job Rule means one setpoint per agent, how do you detect when a system prompt has secretly accumulated two jobs?
  • Which VVFL station does your most-used agent occupy — and is it a stage where AI has a genuine edge, or one that should stay human?
  • When an agent escalates to a human (AI → Human), is that a failure mode to minimise or a design feature to celebrate?
  • Character for an agent means constraints, alignment, and reputation — which of those three is weakest in your current agent design, and what does that missing element cost per run?