
Claude Code

What makes an AI agent productive — the model, or the context you give it?

Code is a commodity. Your prediction model is the moat. Claude Code shifts your role from "writer" to "orchestrator."


Concepts

Extensions plug into different parts of the agentic loop:

  • CLAUDE.md adds persistent context Claude sees every session
  • Skills add reusable knowledge and invocable workflows
  • MCP connects Claude to external services and tools
  • Subagents run their own loops in isolated context, returning summaries
  • Agent teams coordinate multiple independent sessions with shared tasks and peer-to-peer messaging
  • Hooks run outside the loop entirely as deterministic scripts
  • Plugins and marketplaces package and distribute these features

Skills are the most flexible extension. A skill is a markdown file containing knowledge, workflows, or instructions. You can invoke skills with a slash command like /deploy, or Claude can load them automatically when relevant. Skills can run in your current conversation or in an isolated context via subagents.

CLAUDE.md

| Fact | Detail |
| --- | --- |
| What | Project instructions loaded every session. The agent's orientation file. |
| Where | Root CLAUDE.md, .claude/CLAUDE.md, CLAUDE.local.md, ~/.claude/CLAUDE.md |
| Loaded | Automatically at session start. Nested CLAUDE.md files in subdirectories load on demand when the agent reads files there. |
| Imports | @path/to/file inlines content. @.ai/rules/* imports all rules. |

Our config: Root CLAUDE.md is a thin router — orientation block plus @.ai/rules/* imports. Keeps the file under 100 lines. Uses the agent-agnostic architecture: .ai/ is source of truth, CLAUDE.md is the Claude-specific entry point.
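The shape of that router, sketched (contents are illustrative, not our actual file):

```markdown
<!-- CLAUDE.md: a thin router, not a repository -->
# Orientation
Strategy, specs, and content live here. `.ai/` is the source of truth;
this file is only the Claude-specific entry point.

# Rules
@.ai/rules/*
```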

Best practice: If Claude ignores your rules, the file is too long. Each line should answer: "Would removing this cause Claude to make mistakes?" Everything else is noise.

Rules

| Fact | Detail |
| --- | --- |
| What | Modular markdown files auto-loaded as project memory. Constraints, not procedures. |
| Where | .claude/rules/*.md (discovered recursively) |
| Loaded | Every session, automatically. No invocation needed. |
| Filtering | YAML frontmatter paths: ["src/**/*.ts"] makes rules conditional on file type. |
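A path-filtered rule might look like this; the filename and rule text are hypothetical:

```markdown
---
paths: ["src/**/*.ts"]
---
# typescript-conventions
- No `any` in exported signatures.
- Prefer named exports over default exports.
```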

Our config: 14 rules in .ai/rules/, symlinked into .claude/rules/. Each rule pairs with a hook for enforcement.

| Rule | Prevents | Paired Hook |
| --- | --- | --- |
| page-flow | Content before visual, missing context | docs-post-edit.sh |
| content-standards | Preachy voice, long headings | docs-post-edit.sh |
| mdx-patterns | Raw < > { } in prose | mdx-validator.py |
| build-process | Local build commands | PreToolUse inline block |
| design-verification | Code looks right but doesn't render | src-post-edit.sh |

Lesson: Rules without hooks are suggestions under cognitive load. The page-flow rule existed for weeks before its first violation was caught — by a hook, not by the rule. If it matters, pair it.

Engineering repo: 17 rules covering architecture (hexagonal layers), testing (TDD mandatory), security (no hardcoded secrets), and design (CDD components). Path-filtered: TypeScript rules only fire for .ts/.tsx files.

Hooks

| Fact | Detail |
| --- | --- |
| What | Shell commands or LLM prompts that fire on lifecycle events. Deterministic enforcement — not advisory. |
| Where | .claude/settings.json under the "hooks" key |
| Events | SessionStart, UserPromptSubmit, PreToolUse, PostToolUse, PostToolUseFailure, Stop, PreCompact, SubagentStart, Notification, SessionEnd |
| Matchers | Filter by tool: "Write\|Edit", "Bash(npm run *)", "ExitPlanMode" |
| Control | Exit 0 = continue, Exit 2 = block. Return JSON to modify behavior. |
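Wiring in .claude/settings.json follows the event/matcher shape above. A sketch with hypothetical script paths:

```json
{
  "hooks": {
    "PreToolUse": [
      {
        "matcher": "Bash",
        "hooks": [{ "type": "command", "command": ".claude/hooks/build-blocker.sh" }]
      }
    ],
    "PostToolUse": [
      {
        "matcher": "Write|Edit",
        "hooks": [{ "type": "command", "command": ".claude/hooks/post-edit.sh" }]
      }
    ]
  }
}
```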

Lifecycle:

```
SessionStart     → inject navigation frame
UserPromptSubmit → route /commands and /skills
PreToolUse       → block builds, gate plans, audit commits
        [Agent works]
PostToolUse      → validate headings, MDX, content, design
Stop             → warn about uncommitted changes
```

Our config (dream repo): 9 hooks across 5 events.

| Hook | Event | What |
| --- | --- | --- |
| session-context.sh | SessionStart | Injects navigation frame. No cold start. |
| user_prompt_submit.py | UserPromptSubmit | Routes /command-name to .ai/commands/*.md. Python for deterministic routing. |
| truth-seeking-gate.sh | PreToolUse (ExitPlanMode) | One question: "What did you verify?" Prevents confident wrong code from unverified plans. |
| Inline block | PreToolUse (Bash+build) | Hard-blocks build\|dev\|start. WSL dies on Docusaurus. Not preference — constraint. |
| docs-pre-commit.sh | PreToolUse (Bash+commit) | Has /content-flow been run on staged docs? |
| mdx-validator.py | PostToolUse (md/mdx) | Heading word count. Python for precision — counting, not judging. |
| docs-post-edit.sh | PostToolUse (docs/meta) | Flow, standards, quality, fact/star architecture. The comprehensive validator. |
| src-post-edit.sh | PostToolUse (src/) | Design anti-patterns: arbitrary Tailwind, div-as-button, missing alt. |
| stop-uncommitted.sh | Stop | Warns about uncommitted changes. Last safety net. |

Engineering repo: 21 hooks including resource-guard.sh (blocks OOM typecheck commands), post-edit-typecheck.sh (runs diagnostics after edits), pre-compact-backup.sh (saves state before context compaction), subagent-context.sh (loads team context for subagents).

Best practice: PostToolUse for validation (need to see what changed). PreToolUse for blocking dangerous actions (need to prevent, not repair). Keep hooks fast — they block operations.
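A PreToolUse blocker can be a few lines of shell. A minimal sketch (the function name and message are ours; a real hook script reads the tool-call JSON from stdin and exits, rather than returns, with code 2 to block):

```shell
# Sketch: decide whether a Bash tool call should be blocked.
# Code 2 blocks the call; 0 lets it through.
check_command() {
  if echo "$1" | grep -qE 'pnpm (dev|build|start)'; then
    echo "Blocked: run builds outside the agent session" >&2
    return 2
  fi
  return 0
}

if check_command "pnpm dev"; then echo "allowed"; else echo "blocked"; fi   # prints "blocked"
if check_command "pnpm test"; then echo "allowed"; else echo "blocked"; fi  # prints "allowed"
```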

Skills

| Fact | Detail |
| --- | --- |
| What | Reusable procedures with quality gates. Multi-step workflows loaded on demand. |
| Where | .claude/skills/*/SKILL.md or .agents/skills/*/SKILL.md (AAIF standard) |
| Loaded | On demand when invoked with /skill-name, or auto-triggered by a matching description. |
| Frontmatter | name, description, argument-hint, allowed-tools, model, context, agent, hooks |
| vs Commands | Skills have supporting files (templates, examples, scripts), tool restrictions, subagent config. Commands are single prompts. |
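The frontmatter in practice, using an abridged and hypothetical version of the content-flow skill:

```markdown
---
name: content-flow
description: Full content audit for docs pages. Use before committing doc changes.
argument-hint: <path to page>
allowed-tools: Read, Grep, Glob
---
# content-flow
Run the ordered checks. For each gate, quote line-number evidence
or do not claim the gate passed.
```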

Our config: 16 skills in .agents/skills/ per AAIF standard.

| Skill | Purpose | Key Gate |
| --- | --- | --- |
| content-flow | Full content audit | 5 ordered checks with line-number evidence |
| score-prds | Batch-score PRDs | Calibration examples + deterministic script |
| create-prd | New PRD with scoring | Problem-first, hierarchy check, 5P scoring |
| copywriting | Multi-voice drafts | Lennon/Naval/Wilde/Hemingway transform |
| deep-research | Investigation | Source diversity, claim verification |
| frontend | Component dev | Design system tokens, render verification |

Engineering repo: 29 skills including _core-planning (DB-native plan creation), _core-nx-generators (hexagonal scaffold), _core-typescript-expert (TS debugging protocol), plus 10 team-specific UI skills and 4 marketing skills.

Lesson: A skill invoked but not followed is worse than no skill — it creates false confidence. After declaring "all pass" while missing preachy text, wrong table order, and 7-column tables in one session, the skill-execution rule was added: quote evidence with line numbers or don't claim the gate passed.

Commands

| Fact | Detail |
| --- | --- |
| What | Simple slash commands defined as markdown files. Single prompt, no quality gates. |
| Where | .claude/commands/*.md |
| Loaded | Registered on session start. Invoked with /command-name. |
| Arguments | $ARGUMENTS variable substituted from user input after /command-name. |
| vs Skills | Commands are single prompts. If it needs gates, templates, or supporting files, make it a skill. |
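A command file is just a prompt with $ARGUMENTS substitution. A hypothetical example (the body is ours, not the real /ship command):

```markdown
<!-- .claude/commands/ship.md -->
Ship the current work: $ARGUMENTS

1. Review staged changes.
2. Write a conventional commit message.
3. Commit, push, and report the result.
```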

Our config: 23 commands in .ai/commands/, proxied through .claude/commands/.

| Prefix | Domain | Examples |
| --- | --- | --- |
| bd- | Business development | bd-icp-profile-research, bd-linkedin-post |
| mm- | Mental model | mm-alignment, mm-systems-thinking |
| vvfl- | Architecture loops | vvfl-flow, vvfl-skills |
| _ | Internal/meta | _improve, _ssc-review |
| (none) | Core workflows | ship, sync, writers-room, eng-priorities |

Engineering repo: 28 commands including /orch-plan, /orch-work, /orch-commission (orchestration flow), /activate-meta, /activate-ui, /activate-intel (team activation), and /git-feature, /git-sync, /git-pr (version control).

Agents

| Fact | Detail |
| --- | --- |
| What | Specialized subagents with custom context, tools, and model. Run in isolated context windows. |
| Where | .claude/agents/*.md or ~/.claude/agents/*.md |
| Spawned | Via the Task tool, or auto-delegated based on description match. |
| Frontmatter | name, description, tools, disallowedTools, model, maxTurns, skills, memory, isolation |
| Built-in | Explore (fast, read-only), Plan (architecture), general-purpose (full tools) |
| Isolation | isolation: worktree creates a temporary git worktree for safe parallel work. |
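An agent definition, sketched with the frontmatter keys above (values and body are illustrative, not a real agent file):

```markdown
---
name: code-reviewer
description: Audits architecture compliance before PR. Use after implementation.
tools: Read, Grep, Glob
model: opus
isolation: worktree
---
You review code you did not write. Report violations with file and line
evidence. You never edit files.
```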

Our config (dream repo): Uses built-in agents (Explore, Plan, general-purpose) for task delegation. No custom agent files — the dream repo is a single-operator system.

Engineering repo: 16 custom agents across 5 worktree teams.

| Agent | Model | Role |
| --- | --- | --- |
| orchestrator | Opus | Plan composition, team coordination |
| lead-developer | Sonnet | Implementation, bug fixes |
| algorithm-engineer | Opus | Pure algorithms in libs/agency/ |
| ui-cdd-architect | Opus | Component design |
| code-reviewer | Opus | Architecture compliance before PR |
| security-guardian | Opus | Vulnerability management |
| e2e-runner | Sonnet | Playwright E2E validation |

Assembly line pattern: Intent (test-engineer writes spec first) → Explore (architect designs) → Build (developer implements) → Verify (reviewer audits). The builder never validates their own work.

Best practice: Opus for architecture decisions (deepest reasoning). Sonnet for execution (balanced speed/quality). Haiku for quick fixes. Match cognitive demand to model capability.

Settings

| Fact | Detail |
| --- | --- |
| What | JSON config controlling permissions, hooks, plugins, environment, and sandbox. |
| Where | .claude/settings.json (team), .claude/settings.local.json (personal), ~/.claude/settings.json (global) |
| Precedence | Managed (highest) → CLI flags → local → project → user (lowest) |
| Permissions | allow, deny, ask arrays with tool patterns: "Bash(npm run *)", "Read(./src/**)" |

Our config:

| Setting | Value | Why |
| --- | --- | --- |
| alwaysThinkingEnabled | true | Architecture and multi-file refactors need thinking tokens. Cost justified by fewer rework cycles. |
| enabledPlugins | context7, feature-dev, code-review, frontend-design, typescript-lsp | context7 for live docs. feature-dev for guided implementation. typescript-lsp for type checking without builds. |
| Deny list | Edit/Write to /home/wik/code/sm/** | Engineering repo isolation. Dream repo reads but never writes to engineering. |

Engineering repo: Granular permissions — allows Read test env files, Nx/git commands, task tools; denies production .env, secrets, git push origin main. Session overrides in settings.local.json for one-off debugging (Docker, database operations).

Best practice: Separate production constraints (settings.json) from session flexibility (settings.local.json). Deny dangerous actions explicitly. Allow with patterns, not blanket permissions.
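A permissions block illustrating that split (patterns are examples, not a complete config):

```json
{
  "permissions": {
    "allow": ["Bash(npm run *)", "Read(./src/**)"],
    "ask": ["Bash(git push *)"],
    "deny": ["Read(./.env)", "Edit(/home/wik/code/sm/**)"]
  }
}
```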

MCP Servers

| Fact | Detail |
| --- | --- |
| What | Model Context Protocol — open standard for connecting to external tools, databases, APIs. |
| Where | .mcp.json (team), ~/.claude.json (personal) |
| Transports | http (cloud), stdio (local processes) |
| CLI | claude mcp add, claude mcp list, claude mcp remove |

Our config: Perplexity (search), Playwright (browser automation). Engineering repo adds GitHub (PR/issue integration).

Best practice: Use .mcp.json for team-shared servers (check into git). Use environment variable expansion (${API_KEY}) for secrets. Keep server count low — each server adds context overhead.
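A team-shared .mcp.json with environment-variable expansion. The server name, package, and key are placeholders:

```json
{
  "mcpServers": {
    "perplexity": {
      "command": "npx",
      "args": ["-y", "example-perplexity-server"],
      "env": { "PERPLEXITY_API_KEY": "${PERPLEXITY_API_KEY}" }
    }
  }
}
```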

Plugins

| Fact | Detail |
| --- | --- |
| What | Packaged extensions containing skills, agents, hooks, MCP servers. Distributed via marketplace. |
| Where | Marketplace install, or --plugin-dir ./my-plugin for local development |
| Structure | .claude-plugin/plugin.json manifest + skills/, agents/, hooks/, .mcp.json |
| Namespacing | Plugin skills accessed as /plugin-name:skill-name |
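The layout of a local plugin under development, sketched (names are illustrative):

```
my-plugin/
├── .claude-plugin/
│   └── plugin.json        # manifest: name, version, description
├── skills/
│   └── deploy/
│       └── SKILL.md       # invoked as /my-plugin:deploy
├── agents/
├── hooks/
└── .mcp.json
```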

Our config: context7 (live documentation search), feature-dev (guided implementation), code-review (PR review), frontend-design (UI generation), typescript-lsp (type checking without builds).

Engineering repo: github (PR/issue integration), ralph-loop (long-running autonomous execution), context7.

Best practice: Plugins are for capabilities you want across projects or shared with a team. If it's project-specific, use skills and hooks directly. Don't install plugins you don't actively use — each adds to context overhead.

Memory

| Fact | Detail |
| --- | --- |
| What | Persistent context that survives across sessions. Auto-memory (Claude writes) + CLAUDE.md (human writes). |
| Where | ~/.claude/projects/<project-id>/memory/MEMORY.md (auto), plus topic files in the same directory |
| Loaded | First 200 lines of MEMORY.md loaded every session. Topic files referenced from MEMORY.md. |
| Manage | /memory command to view/edit. Tell Claude "remember X" to save. |

Our config: Auto-memory captures hard-won lessons (content flow failures, design verification failures, PRD hierarchy detection, skill execution failures). Topic files organize by pattern, not chronology.

Best practice: Keep MEMORY.md under 200 lines — move details to topic files. Save stable patterns confirmed across multiple interactions, not session-specific context. Delete memories that become outdated.


Enforcement Hierarchy

Rules, hooks, and skills form a progression. Strongest guarantee first.

| Tier | Mechanism | Guarantee | Example |
| --- | --- | --- | --- |
| 1 | Hook (PreToolUse, exit 2) | Block before execution | Build blocker prevents pnpm dev |
| 2 | Hook (PostToolUse) | Catch and warn after | mdx-validator.py flags long headings |
| 3 | Rule (always loaded) | In context, agent should follow | page-flow.md defines content order |
| 4 | Skill (on demand) | Procedure with gates | content-flow audits a page |
| 5 | Memory | Soft pattern | "We use pnpm" |

Engineering repo adds a sixth tier above all: Nx generators that produce correct code structurally. The hexagonal scaffold generator makes layering violations impossible — you can't import across layers if the file structure doesn't allow it.

Principle: Push enforcement up the hierarchy. If a rule matters, pair it with a hook. If a hook isn't enough, make a generator. Cognitive enforcement fails under load. Structural enforcement doesn't.


Two-Repo Config

We run two repos with different config profiles. Same concepts, different emphasis.

| Aspect | Dream Repo (this) | Engineering Repo |
| --- | --- | --- |
| Purpose | WHY + WHAT (strategy, specs, content) | HOW (implementation, shipping) |
| Rules | 14 (content flow, MDX, design) | 17 (architecture, testing, security, design) |
| Hooks | 9 (content validation, build blocking) | 21 (typecheck, resource guard, team activation, compaction backup) |
| Skills | 16 (content, PRD, research, design) | 29 (planning, generators, testing, team-specific) |
| Commands | 23 (content pipeline, business dev) | 28 (orchestration, git, team activation) |
| Agents | Built-in only (single operator) | 16 custom (5 worktree teams, assembly line) |
| Plugins | context7, feature-dev, code-review, frontend-design, typescript-lsp | github, ralph-loop, context7 |

Cross-repo boundary: Dream repo denies Edit/Write to /home/wik/code/sm/**. Neither repo writes to the other's filesystem. Communication via Convex messages and Supabase measurements.


Scoring Pipeline

The scoring pipeline shows how rules, skills, and scripts work together. LLMs assign scores using calibration examples. node scripts/prioritise-prds.mjs does the math. The team that scores is never the team that computes the ranking.

This pattern maps to Anthropic's programmatic tool calling — same split (judgment separated from computation) at the API level. See Tools for the full mapping.


Best Practice Guide

Principles distilled from running Claude Code across two repos: 31 rules, 30 hooks, 45 skills, and the agents listed above.

1. Structure over discipline. If it matters, enforce it structurally. Rules are suggestions. Hooks are guarantees. Generators make violations impossible.

2. Pair rules with hooks. Every rule that's been violated under cognitive load now has a hook. The pattern: rule defines the standard, hook catches violations. Neither works alone.

3. PostToolUse for validation, PreToolUse for blocking. You validate what changed (Post). You block what shouldn't happen (Pre). Don't use PreToolUse for validation — you don't know what will change yet.

4. Separate judgment from computation. LLMs score. Scripts rank. Agents draft. Validators review. The builder never validates their own work. This is commissioning applied to code.

5. Skills need evidence gates. "All pass" without line numbers is a red flag. Quote the evidence or don't claim the gate passed. A skill invoked but not followed creates false confidence.

6. Keep CLAUDE.md thin. Router, not repository. Import rules with @, don't inline them. If Claude ignores instructions, the file is too long.

7. Single source of truth. .ai/rules/ is real. .claude/rules/ is symlinks. Edit once, every agent sees the change. Never maintain two copies of the same rule.

8. Match model to task. Opus for architecture. Sonnet for execution. Haiku for quick fixes. Use model in agent/skill frontmatter to override per-task.

9. Isolate dangerous work. isolation: worktree for agents that might break things. Deny lists for repos you shouldn't write to. Separate settings.json (production) from settings.local.json (session flexibility).

10. Memory is for patterns, not sessions. Save stable conventions confirmed across multiple interactions. Delete outdated memories. Keep MEMORY.md under 200 lines.


Innovators

People pushing the boundaries of agentic coding. Follow their work.

| Who | Contribution | Where to Follow |
| --- | --- | --- |
| Boris Cherny | Creator and Head of Claude Code at Anthropic. Built the prototype in two months, runs 5 parallel Claudes, 80-90% of Claude Code is written by Claude Code. Author of Programming TypeScript. | Lenny's Podcast interview, Pragmatic Engineer deep dive |
| Andrej Karpathy | Coined "vibe coding" (Feb 2025), then retired the term for "agentic engineering" (Feb 2026). The shift: from autocomplete to orchestrating agents with oversight. Co-founder OpenAI, former Tesla AI. | X @karpathy |
| Simon Willison | Independent developer, creator of Datasette. Relentless documenter of AI tools. Called skills "maybe a bigger deal than MCP." Publishes TILs, transcripts, and security analysis of every Claude Code feature. | simonwillison.net, Substack |
| Jesse Vincent | Creator of Superpowers — the most popular Claude Code plugin (93K+ GitHub stars). Enforces strict RED-GREEN-REFACTOR TDD. Simon Willison calls him "one of the most creative users of coding agents." | GitHub obra/superpowers, Blog |
| Gergely Orosz | The Pragmatic Engineer. Published the definitive deep dives on how Claude Code is built and what agentic engineering means for staff+ roles. | Pragmatic Engineer |
| Rick Hightower | Creator of SkillzWave (44K+ agentic skills). Wrote the clearest breakdown of Agent Skills 2.0 — from custom instructions to programmable agents. | Medium |

OODA Loop

Configuration is not a one-time setup. Features ship weekly. Practitioners innovate daily. A config that was optimal last month is leaving potential on the table today.

The OODA loop applied to AI coding tools:

```
OBSERVE → What shipped? What are practitioners doing?
ORIENT  → Map new capabilities against current config. Score each affordance 0-5.
DECIDE  → Adopt, reject, or investigate — with evidence, in this session.
ACT     → Change the config. Update the baseline. Start the next cycle.
```

Daily

| Step | Action | Source |
| --- | --- | --- |
| O | Check version. Scan changelog for new features since last session. | claude --version, changelog |
| O | Scan practitioner feeds for new patterns. | GitHub discussions, X, Reddit |
| O | For each new feature: are we using it? Should we be? | Feature → config comparison |
| D | Test each "should be = yes" feature NOW. | Load tool, run command, try config |
| A | Adopt or reject with evidence. Zero deferrals. Update baseline. | Config files, baseline.json |

Start an hourly background loop for the rest of the session:

```
/loop 1h check claude changelog and practitioner feeds for new features, test against current config
```

Weekly

Everything daily plus:

| Step | Action |
| --- | --- |
| O | Deep research: multi-agent patterns, hook innovations, token optimization, skill architecture |
| O | For every affordance scoring below 4, find a practitioner who scores higher |
| O | Inventory: count rules, commands, agents, skills, hooks. Diff against last week. |
| O | Check tool stack: CLIs available, MCP token tax, plugin health |
| D | Five specific fixes — prioritized: broken > redundant > missing > optimizing |
| A | Apply fixes. Update baseline. |

Monthly

Everything weekly plus:

| Step | Action |
| --- | --- |
| O | Build full capability map (20+ rows). Every feature that could change config. |
| O | Control system diagnosis: map every element to a concern and enforcement tier |
| O | Detect redundancy (rules duplicating hooks), gaps (concerns with no controller), misplacement (commands that should be skills) |
| D | Five fixes with file paths, changes, concern mapping, tier assignments |
| A | Apply fixes. Update baseline. Legacy Rule: improve the procedure itself. |

Affordance Tracking

Every AI coding tool feature is an affordance — a capability ceiling. Track not just whether you use each feature, but how much of its potential you extract.

| Affordance | Key question | Score 0-5 |
| --- | --- | --- |
| Hooks | Is every moment that matters automated? | How many of 15 event types are wired? |
| Agents | Is every agent tuned to its exact job? | Model, effort, tools, memory, maxTurns all set? |
| Skills | Is every repeatable workflow a skill with gates? | Effort frontmatter, receipts, progressive disclosure? |
| Scheduling | Does work happen without being asked? | Session loops, cloud triggers, desktop tasks? |
| Memory | Does the agent start every session smarter? | Timestamps, agent memory, validation hooks? |
| Worktrees | Is every code-writing agent isolated? | Sparse checkout, default isolation? |
| Plugins | Are community patterns adopted? | Persistent state, inline sources? |
| MCP | Are the right tools connected at justified cost? | Token tax measured? Channels? Elicitation? |
| CLI flags | Are session behaviors optimized? | --bare, --name, --worktree, --from-pr? |

The test: Could someone starting fresh with these docs reach 80%+ affordance utilization? If not, the procedure has gaps.

Three Scheduling Tiers

| Tier | Mechanism | Persists? | Needs machine? | Min interval |
| --- | --- | --- | --- | --- |
| Session | /loop, CronCreate | No — dies with session | Yes | 1 min |
| Cloud | /schedule, cloud tasks at claude.ai/code/scheduled | Yes — Anthropic infra | No | 1 hour |
| Desktop | Desktop app scheduler | Yes — your machine | Yes (no open session needed) | 1 min |

Use session scheduling for in-session monitoring (link validation, context health). Use cloud scheduling for autonomous recurring work (daily scans, weekly reindex). Use desktop scheduling when you need local file access without an open session.

CLI vs MCP

MCP servers load tool definitions into context every session — that's a token tax whether the tools are used or not. CLIs cost zero until invoked.

| Use CLI when | Use MCP when |
| --- | --- |
| Tool invoked rarely | Tool invoked frequently with structured params |
| Simple text I/O | Complex structured I/O needed |
| Already works via Bash | Agent needs to discover it exists |
| Token budget is tight | Structured output justifies context cost |

Practitioner Tracking

The best innovations come from practitioners, not changelogs. Track who's pushing boundaries and what patterns they've discovered.

See Innovators for the current list. Update during every weekly scan — the community moves fast.

Context

Questions

What percentage of your AI coding tool's affordances are you actually using — and what would 80% look like?

  • OBSERVE: What feature shipped this week that you haven't tested yet? If the answer is "I don't know" — the loop is broken.
  • ORIENT: For each affordance scoring below 4, who in the community has solved it? What can you steal?
  • DECIDE: If you could only adopt one new feature today, which one compounds the most? What's the second-order effect?
  • ACT: When was the last time you changed your config because a new feature shipped — not because something broke?