Claude Code
What makes an AI agent productive — the model, or the context you give it?
Code is a commodity. Your prediction model is the moat. Claude Code shifts the role from "writer" to "orchestrator."
Concepts
Extensions plug into different parts of the agentic loop:
- CLAUDE.md adds persistent context Claude sees every session
- Skills add reusable knowledge and invocable workflows
- MCP connects Claude to external services and tools
- Subagents run their own loops in isolated context, returning summaries
- Agent teams coordinate multiple independent sessions with shared tasks and peer-to-peer messaging
- Hooks run outside the loop entirely as deterministic scripts
- Plugins and marketplaces package and distribute these features
Skills are the most flexible extension. A skill is a markdown file containing knowledge, workflows, or instructions. You can invoke skills with a slash command like /deploy, or Claude can load them automatically when relevant. Skills can run in your current conversation or in an isolated context via subagents.
CLAUDE.md
| Fact | Detail |
|---|---|
| What | Project instructions loaded every session. The agent's orientation file. |
| Where | Root CLAUDE.md, .claude/CLAUDE.md, CLAUDE.local.md, ~/.claude/CLAUDE.md |
| Loaded | Automatically at session start. Nested CLAUDE.md files in subdirectories load on demand when the agent reads files there. |
| Imports | @path/to/file inlines content. @.ai/rules/* imports all rules. |
Our config: Root CLAUDE.md is a thin router — orientation block plus @.ai/rules/* imports. Keeps the file under 100 lines. Uses the agent-agnostic architecture: .ai/ is source of truth, CLAUDE.md is the Claude-specific entry point.
Best practice: If Claude ignores your rules, the file is too long. Each line should answer: "Would removing this cause Claude to make mistakes?" Everything else is noise.
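The thin-router pattern might look like this — a sketch, not our exact file; the @.ai/rules/* import is the pattern described above, the section names are illustrative:

```markdown
# Project orientation

Docs repo: strategy, specs, content. Source of truth lives in
.ai/ — this file is only the Claude-specific entry point.

## Rules
@.ai/rules/*

## Pointers
- Content pipeline: run /content-flow before committing docs
- Never run local builds (hook-enforced)
```

Every line earns its place: orientation, imports, and the two constraints that prevent real mistakes.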
Rules
| Fact | Detail |
|---|---|
| What | Modular markdown files auto-loaded as project memory. Constraints, not procedures. |
| Where | .claude/rules/*.md (discovered recursively) |
| Loaded | Every session, automatically. No invocation needed. |
| Filtering | YAML frontmatter paths: ["src/**/*.ts"] makes rules conditional on file type. |
Our config: 14 rules in .ai/rules/, symlinked into .claude/rules/. Each rule pairs with a hook for enforcement.
| Rule | Prevents | Paired Hook |
|---|---|---|
page-flow | Content before visual, missing context | docs-post-edit.sh |
content-standards | Preachy voice, long headings | docs-post-edit.sh |
mdx-patterns | Raw < > { } in prose | mdx-validator.py |
build-process | Local build commands | PreToolUse inline block |
design-verification | Code looks right but doesn't render | src-post-edit.sh |
Lesson: Rules without hooks are suggestions under cognitive load. The page-flow rule existed for weeks before its first violation was caught — by a hook, not by the rule. If it matters, pair it.
Engineering repo: 17 rules covering architecture (hexagonal layers), testing (TDD mandatory), security (no hardcoded secrets), and design (CDD components). Path-filtered: TypeScript rules only fire for .ts/.tsx files.
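A path-filtered rule file might look like this — the paths frontmatter key is from the table above; the rule text is an illustrative sketch drawn from the constraints this page mentions:

```markdown
---
paths: ["src/**/*.ts", "src/**/*.tsx"]
---

# TypeScript conventions

- No hardcoded secrets; read them from environment config.
- Respect hexagonal layering: domain code never imports adapters.
```

Because of the paths filter, this rule only enters context when the agent touches .ts/.tsx files.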
Hooks
| Fact | Detail |
|---|---|
| What | Shell commands or LLM prompts that fire on lifecycle events. Deterministic enforcement — not advisory. |
| Where | .claude/settings.json under "hooks" key |
| Events | SessionStart, UserPromptSubmit, PreToolUse, PostToolUse, PostToolUseFailure, Stop, PreCompact, SubagentStart, Notification, SessionEnd |
| Matchers | Filter by tool: "Write|Edit", "Bash(npm run *)", "ExitPlanMode" |
| Control | Exit 0 = continue, Exit 2 = block. Return JSON to modify behavior. |
Lifecycle:
SessionStart → inject navigation frame
UserPromptSubmit → route /commands and /skills
↓
PreToolUse → block builds, gate plans, audit commits
↓
[Agent works]
↓
PostToolUse → validate headings, MDX, content, design
↓
Stop → warn about uncommitted changes
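The wiring for these events lives under the hooks key in .claude/settings.json. A minimal sketch, assuming the standard matcher/command schema (script paths illustrative):

```json
{
  "hooks": {
    "SessionStart": [
      {
        "hooks": [
          { "type": "command", "command": ".claude/hooks/session-context.sh" }
        ]
      }
    ],
    "PostToolUse": [
      {
        "matcher": "Write|Edit",
        "hooks": [
          { "type": "command", "command": ".claude/hooks/mdx-validator.py" }
        ]
      }
    ]
  }
}
```

SessionStart takes no matcher; PostToolUse filters by tool name so the validator only fires on file changes.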
Our config (dream repo): 9 hooks across 5 events.
| Hook | Event | What |
|---|---|---|
session-context.sh | SessionStart | Injects navigation frame. No cold start. |
user_prompt_submit.py | UserPromptSubmit | Routes /command-name to .ai/commands/*.md. Python for deterministic routing. |
truth-seeking-gate.sh | PreToolUse (ExitPlanMode) | One question: "What did you verify?" Prevents confident wrong code from unverified plans. |
(inline block) | PreToolUse (Bash+build) | Hard-blocks build, dev, start. WSL dies on Docusaurus. Not preference — constraint.
docs-pre-commit.sh | PreToolUse (Bash+commit) | Has /content-flow been run on staged docs? |
mdx-validator.py | PostToolUse (md/mdx) | Heading word count. Python for precision — counting, not judging. |
docs-post-edit.sh | PostToolUse (docs/meta) | Flow, standards, quality, fact/star architecture. The comprehensive validator. |
src-post-edit.sh | PostToolUse (src/) | Design anti-patterns: arbitrary Tailwind, div-as-button, missing alt. |
stop-uncommitted.sh | Stop | Warns about uncommitted changes. Last safety net. |
Engineering repo: 21 hooks including resource-guard.sh (blocks OOM typecheck commands), post-edit-typecheck.sh (runs diagnostics after edits), pre-compact-backup.sh (saves state before context compaction), subagent-context.sh (loads team context for subagents).
Best practice: PostToolUse for validation (need to see what changed). PreToolUse for blocking dangerous actions (need to prevent, not repair). Keep hooks fast — they block operations.
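A PostToolUse validator can be sketched in a few lines of shell. This is in the spirit of mdx-validator.py, not its actual code: the payload shape (JSON on stdin with a tool_input.file_path field) and the 8-word limit are assumptions, while the exit-code semantics follow the table above.

```shell
#!/bin/sh
# Sketch of a PostToolUse validator (not the real mdx-validator.py).
# Assumptions: payload arrives on stdin as JSON with a
# tool_input.file_path field; exit code 2 blocks / flags the edit.

# Flag any markdown heading with more than 8 words (illustrative limit).
check_headings() {
  grep -E '^#{1,6} ' "$1" 2>/dev/null |
    awk 'NF - 1 > 8 { found = 1 } END { exit found ? 2 : 0 }'
}

main() {
  # Crude field extraction; a real hook would likely use jq.
  file=$(sed -n 's/.*"file_path"[[:space:]]*:[[:space:]]*"\([^"]*\)".*/\1/p')
  [ -n "$file" ] && [ -f "$file" ] || return 0
  if ! check_headings "$file"; then
    echo "Heading over 8 words in $file" >&2
    return 2
  fi
}

# Entry point when installed as a hook:
#   main
```

Exit 0 lets the edit stand; exit 2 surfaces the stderr message back to Claude, per the control row above.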
Skills
| Fact | Detail |
|---|---|
| What | Reusable procedures with quality gates. Multi-step workflows loaded on demand. |
| Where | .claude/skills/*/SKILL.md or .agents/skills/*/SKILL.md (AAIF standard) |
| Loaded | On-demand when invoked with /skill-name or auto-triggered by matching description. |
| Frontmatter | name, description, argument-hint, allowed-tools, model, context, agent, hooks |
| vs Commands | Skills have supporting files (templates, examples, scripts), tool restrictions, subagent config. Commands are single prompts. |
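A minimal SKILL.md shape, using the frontmatter fields listed above — the body here is a sketch of what such a skill might contain, not the real content-flow skill:

```markdown
---
name: content-flow
description: Audit a docs page for flow, standards, and MDX safety
argument-hint: [path-to-page]
allowed-tools: Read, Grep, Glob
---

Run the checks in order. For each gate, quote the evidence with
line numbers — or report the gate as failed. Never claim "all
pass" without quoted lines.
```

Supporting files (templates, calibration examples, scripts) live alongside SKILL.md in the same directory.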
Our config: 16 skills in .agents/skills/ per AAIF standard.
| Skill | Purpose | Key Gate |
|---|---|---|
content-flow | Full content audit | 5 ordered checks with line-number evidence |
score-prds | Batch-score PRDs | Calibration examples + deterministic script |
create-prd | New PRD with scoring | Problem-first, hierarchy check, 5P scoring |
copywriting | Multi-voice drafts | Lennon/Naval/Wilde/Hemingway transform |
deep-research | Investigation | Source diversity, claim verification |
frontend | Component dev | Design system tokens, render verification |
Engineering repo: 29 skills including _core-planning (DB-native plan creation), _core-nx-generators (hexagonal scaffold), _core-typescript-expert (TS debugging protocol), plus 10 team-specific UI skills and 4 marketing skills.
Lesson: A skill invoked but not followed is worse than no skill — it creates false confidence. After one session in which Claude declared "all pass" while missing preachy text, wrong table order, and 7-column tables, the skill-execution rule was added: quote evidence with line numbers or don't claim the gate passed.
Commands
| Fact | Detail |
|---|---|
| What | Simple slash commands defined as markdown files. Single prompt, no quality gates. |
| Where | .claude/commands/*.md |
| Loaded | Registered on session start. Invoked with /command-name. |
| Arguments | $ARGUMENTS variable substituted from user input after /command-name. |
| vs Skills | Commands are single prompts. If it needs gates, templates, or supporting files, make it a skill. |
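A command file is just a prompt with optional $ARGUMENTS substitution. A hypothetical example (the filename and prompt are illustrative):

```markdown
<!-- .claude/commands/commit-draft.md -->
Review the staged changes, draft a conventional commit message
scoped to "$ARGUMENTS", and show it for approval before
committing. Never push.
```

Invoked as /commit-draft docs, the word "docs" replaces $ARGUMENTS.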
Our config: 23 commands in .ai/commands/, proxied through .claude/commands/.
| Prefix | Domain | Examples |
|---|---|---|
bd- | Business development | bd-icp-profile-research, bd-linkedin-post |
mm- | Mental model | mm-alignment, mm-systems-thinking |
vvfl- | Architecture loops | vvfl-flow, vvfl-skills |
_ | Internal/meta | _improve, _ssc-review |
(none) | Core workflows | ship, sync, writers-room, eng-priorities
Engineering repo: 28 commands including /orch-plan, /orch-work, /orch-commission (orchestration flow), /activate-meta, /activate-ui, /activate-intel (team activation), and /git-feature, /git-sync, /git-pr (version control).
Agents
| Fact | Detail |
|---|---|
| What | Specialized subagents with custom context, tools, and model. Run in isolated context windows. |
| Where | .claude/agents/*.md or ~/.claude/agents/*.md |
| Spawned | Via Task tool or auto-delegated based on description match. |
| Frontmatter | name, description, tools, disallowedTools, model, maxTurns, skills, memory, isolation |
| Built-in | Explore (fast, read-only), Plan (architecture), general-purpose (full tools) |
| Isolation | isolation: worktree creates a temporary git worktree for safe parallel work. |
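An agent definition file, using the frontmatter fields from the table above — the contents are an illustrative sketch, not the engineering repo's actual code-reviewer:

```markdown
---
name: code-reviewer
description: Audits diffs for architecture compliance before PR
tools: Read, Grep, Glob
model: opus
maxTurns: 15
isolation: worktree
---

You review code you did not write. Check layering and naming
against the architecture rules and report each violation with
file and line. Never edit files.
```

Read-only tools plus worktree isolation mean this agent can inspect anything and break nothing.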
Our config (dream repo): Uses built-in agents (Explore, Plan, general-purpose) for task delegation. No custom agent files — the dream repo is a single-operator system.
Engineering repo: 16 custom agents across 5 worktree teams.
| Agent | Model | Role |
|---|---|---|
orchestrator | Opus | Plan composition, team coordination |
lead-developer | Sonnet | Implementation, bug fixes |
algorithm-engineer | Opus | Pure algorithms in libs/agency/ |
ui-cdd-architect | Opus | Component design |
code-reviewer | Opus | Architecture compliance before PR |
security-guardian | Opus | Vulnerability management |
e2e-runner | Sonnet | Playwright E2E validation |
Assembly line pattern: Intent (test-engineer writes spec first) → Explore (architect designs) → Build (developer implements) → Verify (reviewer audits). The builder never validates their own work.
Best practice: Opus for architecture decisions (deepest reasoning). Sonnet for execution (balanced speed/quality). Haiku for quick fixes. Match cognitive demand to model capability.
Settings
| Fact | Detail |
|---|---|
| What | JSON config controlling permissions, hooks, plugins, environment, and sandbox. |
| Where | .claude/settings.json (team), .claude/settings.local.json (personal), ~/.claude/settings.json (global) |
| Precedence | Managed (highest) → CLI flags → local → project → user (lowest) |
| Permissions | allow, deny, ask arrays with tool patterns: "Bash(npm run *)", "Read(./src/**)" |
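Pattern-based permissions, sketched with entries drawn from this page (the git push rule is an illustrative addition):

```json
{
  "permissions": {
    "allow": ["Bash(npm run *)", "Read(./src/**)"],
    "ask": ["Bash(git push *)"],
    "deny": [
      "Edit(/home/wik/code/sm/**)",
      "Write(/home/wik/code/sm/**)"
    ]
  }
}
```

Deny wins over allow, so the cross-repo boundary holds even if a broad allow pattern would otherwise match.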
Our config:
| Setting | Value | Why |
|---|---|---|
alwaysThinkingEnabled | true | Architecture and multi-file refactors need thinking tokens. Cost justified by fewer rework cycles. |
enabledPlugins | context7, feature-dev, code-review, frontend-design, typescript-lsp | context7 for live docs. feature-dev for guided implementation. typescript-lsp for type checking without builds. |
Deny list | Edit/Write to /home/wik/code/sm/** | Engineering repo isolation. Dream repo reads but never writes to engineering.
Engineering repo: Granular permissions — allows Read test env files, Nx/git commands, task tools; denies production .env, secrets, git push origin main. Session overrides in settings.local.json for one-off debugging (Docker, database operations).
Best practice: Separate production constraints (settings.json) from session flexibility (settings.local.json). Deny dangerous actions explicitly. Allow with patterns, not blanket permissions.
MCP Servers
| Fact | Detail |
|---|---|
| What | Model Context Protocol — open standard for connecting to external tools, databases, APIs. |
| Where | .mcp.json (team), ~/.claude.json (personal) |
| Transports | http (cloud), stdio (local processes) |
| CLI | claude mcp add, claude mcp list, claude mcp remove |
Our config: Perplexity (search), Playwright (browser automation). Engineering repo adds GitHub (PR/issue integration).
Best practice: Use .mcp.json for team-shared servers (check into git). Use environment variable expansion (${API_KEY}) for secrets. Keep server count low — each server adds context overhead.
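A .mcp.json sketch with one local and one cloud server. The server names, package, and URL are placeholders; the ${...} expansion is what keeps the key out of git:

```json
{
  "mcpServers": {
    "search": {
      "type": "stdio",
      "command": "npx",
      "args": ["-y", "example-search-mcp"],
      "env": { "SEARCH_API_KEY": "${SEARCH_API_KEY}" }
    },
    "tracker": {
      "type": "http",
      "url": "https://mcp.example.com/"
    }
  }
}
```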
Plugins
| Fact | Detail |
|---|---|
| What | Packaged extensions containing skills, agents, hooks, MCP servers. Distributed via marketplace. |
| Where | Marketplace install or --plugin-dir ./my-plugin for local development |
| Structure | .claude-plugin/plugin.json manifest + skills/, agents/, hooks/, .mcp.json |
| Namespacing | Plugin skills accessed as /plugin-name:skill-name |
Our config: context7 (live documentation search), feature-dev (guided implementation), code-review (PR review), frontend-design (UI generation), typescript-lsp (type checking without builds).
Engineering repo: github (PR/issue integration), ralph-loop (long-running autonomous execution), context7.
Best practice: Plugins are for capabilities you want across projects or shared with a team. If it's project-specific, use skills and hooks directly. Don't install plugins you don't actively use — each adds to context overhead.
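A local plugin during development (--plugin-dir ./my-plugin) follows the structure above; all names here are illustrative:

```
my-plugin/
├── .claude-plugin/
│   └── plugin.json        # manifest: {"name": "my-plugin", "version": "0.1.0"}
├── skills/
│   └── review/SKILL.md    # invoked as /my-plugin:review
├── hooks/
└── .mcp.json
```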
Memory
| Fact | Detail |
|---|---|
| What | Persistent context that survives across sessions. Auto-memory (Claude writes) + CLAUDE.md (human writes). |
| Where | ~/.claude/projects/<project-id>/memory/MEMORY.md (auto), plus topic files in same directory |
| Loaded | MEMORY.md first 200 lines loaded every session. Topic files referenced from MEMORY.md. |
| Manage | /memory command to view/edit. Tell Claude "remember X" to save. |
Our config: Auto-memory captures hard-won lessons (content flow failures, design verification failures, PRD hierarchy detection, skill execution failures). Topic files organize by pattern, not chronology.
Best practice: Keep MEMORY.md under 200 lines — move details to topic files. Save stable patterns confirmed across multiple interactions, not session-specific context. Delete memories that become outdated.
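A MEMORY.md organized by pattern rather than chronology might look like this (topic file names illustrative):

```markdown
# MEMORY.md

## Content
- Run /content-flow before committing docs. Details: content-failures.md

## Tooling
- We use pnpm. Never run local builds (WSL constraint).

## Skills
- Gates need quoted line-number evidence. Details: skill-execution.md
```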
Enforcement Hierarchy
Rules, hooks, and skills form a progression. Strongest guarantee first.
| Tier | Mechanism | Guarantee | Example |
|---|---|---|---|
| 1 | Hook (PreToolUse, exit 2) | Block before execution | Build blocker prevents pnpm dev |
| 2 | Hook (PostToolUse) | Catch and warn after | mdx-validator.py flags long headings |
| 3 | Rule (always loaded) | In context, agent should follow | page-flow.md defines content order |
| 4 | Skill (on demand) | Procedure with gates | content-flow audits a page |
| 5 | Memory | Soft pattern | "We use pnpm" |
Engineering repo adds a sixth tier above all: Nx generators that produce correct code structurally. The hexagonal scaffold generator makes layering violations impossible — you can't import across layers if the file structure doesn't allow it.
Principle: Push enforcement up the hierarchy. If a rule matters, pair it with a hook. If a hook isn't enough, make a generator. Cognitive enforcement fails under load. Structural enforcement doesn't.
Two-Repo Config
We run two repos with different config profiles. Same concepts, different emphasis.
| Aspect | Dream Repo (this) | Engineering Repo |
|---|---|---|
| Purpose | WHY + WHAT (strategy, specs, content) | HOW (implementation, shipping) |
| Rules | 14 (content flow, MDX, design) | 17 (architecture, testing, security, design) |
| Hooks | 9 (content validation, build blocking) | 21 (typecheck, resource guard, team activation, compaction backup) |
| Skills | 16 (content, PRD, research, design) | 29 (planning, generators, testing, team-specific) |
| Commands | 23 (content pipeline, business dev) | 28 (orchestration, git, team activation) |
| Agents | Built-in only (single operator) | 16 custom (5 worktree teams, assembly line) |
| Plugins | context7, feature-dev, code-review, frontend-design, typescript-lsp | github, ralph-loop, context7 |
Cross-repo boundary: Dream repo denies Edit/Write to /home/wik/code/sm/**. Neither repo writes to the other's filesystem. Communication via Convex messages and Supabase measurements.
Scoring Pipeline
The scoring pipeline shows how rules, skills, and scripts work together. LLMs assign scores using calibration examples. node scripts/prioritise-prds.mjs does the math. The team that scores is never the team that computes the ranking.
This pattern maps to Anthropic's programmatic tool calling — same split (judgment separated from computation) at the API level. See Tools for the full mapping.
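The deterministic half can be sketched in a few lines — this is not the actual prioritise-prds.mjs, and the input shape is an assumption; the multiplicative 5P factors follow the Priorities page:

```javascript
// Deterministic ranking: LLMs produce the factor scores, this does the math.
// Factor names follow the 5P model (Pain x Demand x Edge x Trend x Conversion).
const FACTORS = ["pain", "demand", "edge", "trend", "conversion"];

function score(prd) {
  // Multiplicative: a zero in any factor zeroes the whole score.
  return FACTORS.reduce((acc, f) => acc * prd.scores[f], 1);
}

function rank(prds) {
  return [...prds]
    .map((p) => ({ id: p.id, total: score(p) }))
    .sort((a, b) => b.total - a.total);
}

// Example input, shaped as an LLM scoring pass might emit it:
const ranked = rank([
  { id: "prd-a", scores: { pain: 4, demand: 3, edge: 2, trend: 3, conversion: 2 } },
  { id: "prd-b", scores: { pain: 5, demand: 4, edge: 1, trend: 2, conversion: 3 } },
]);
console.log(ranked);
```

The split matters: the LLM never does arithmetic, and the script never exercises judgment.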
Best Practice Guide
Principles distilled from running Claude Code across two repos: 31 rules, 30 hooks, 45 skills, 51 commands, and 16 custom agents.
1. Structure over discipline. If it matters, enforce it structurally. Rules are suggestions. Hooks are guarantees. Generators make violations impossible.
2. Pair rules with hooks. Every rule that's been violated under cognitive load now has a hook. The pattern: rule defines the standard, hook catches violations. Neither works alone.
3. PostToolUse for validation, PreToolUse for blocking. You validate what changed (Post). You block what shouldn't happen (Pre). Don't use PreToolUse for validation — you don't know what will change yet.
4. Separate judgment from computation. LLMs score. Scripts rank. Agents draft. Validators review. The builder never validates their own work. This is commissioning applied to code.
5. Skills need evidence gates. "All pass" without line numbers is a red flag. Quote the evidence or don't claim the gate passed. A skill invoked but not followed creates false confidence.
6. Keep CLAUDE.md thin. Router, not repository. Import rules with @, don't inline them. If Claude ignores instructions, the file is too long.
7. Single source of truth. .ai/rules/ is real. .claude/rules/ is symlinks. Edit once, every agent sees the change. Never maintain two copies of the same rule.
8. Match model to task. Opus for architecture. Sonnet for execution. Haiku for quick fixes. Use model in agent/skill frontmatter to override per-task.
9. Isolate dangerous work. isolation: worktree for agents that might break things. Deny lists for repos you shouldn't write to. Separate settings.json (production) from settings.local.json (session flexibility).
10. Memory is for patterns, not sessions. Save stable conventions confirmed across multiple interactions. Delete outdated memories. Keep MEMORY.md under 200 lines.
Context
- Plans — Task DAGs with phases, quality gates, mindsets, and token budgets
- Tools — Tool use, programmatic tool calling, scoring pipeline, extended thinking
- Config Architecture — Agent-agnostic setup, decision log, per-agent guides
- Gemini CLI — The broad-analysis complement
- AI Agent Config Standards — AAIF, Agent Skills spec, MCP adoption
- Work Prioritisation — The deterministic scoring algorithm
- Commissioning — What's specified, built, proven
- Priorities — Pain x Demand x Edge x Trend x Conversion
- Data Flow — The fuel for prediction models
- Clean Architecture — Structure AI can navigate