Claude Code Tools
How does Claude Code connect judgment to computation?
Tool Use
Claude Code's power comes from tool calling — the agent reasons about what to do, then calls tools to execute.
| Fact | Detail |
|---|---|
| What | Claude selects and calls tools (Read, Write, Edit, Bash, Grep, Glob, Task, WebFetch) to accomplish tasks. |
| Pattern | Reasoning → tool selection → execution → result → next reasoning step. |
| Docs | Tool use |
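The Pattern row can be sketched as a dispatch cycle. This is a minimal illustration with stub tools, not Claude Code internals — the `TOOLS` registry, `execute` function, and stub return values are all hypothetical:

```python
# Minimal sketch of the reason → select tool → execute → observe cycle.
# The tools here are stubs; real Read/Grep do filesystem work.

def read_file(path: str) -> str:
    """Stub standing in for the Read tool."""
    return f"<contents of {path}>"

def grep(pattern: str, path: str) -> list[str]:
    """Stub standing in for the Grep tool."""
    return [f"{path}:1: line matching {pattern!r}"]

TOOLS = {"Read": read_file, "Grep": grep}

def execute(tool_name: str, **kwargs):
    """Execution step: look up the selected tool, run it, return the result.

    The result is what feeds back into the next reasoning step.
    """
    return TOOLS[tool_name](**kwargs)

# A hypothetical two-step plan the agent might produce:
print(execute("Grep", pattern="TODO", path="src/app.py"))
print(execute("Read", path="src/app.py"))
```

Each round trip through `execute` is one reasoning → tool → result hop; the next section shows how programmatic tool calling collapses many such hops into one.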
input_examples
Concrete examples that anchor tool-calling consistency. Instead of describing what a tool does abstractly, you show the model real input/output pairs.
| Fact | Detail |
|---|---|
| What | Concrete input/output examples attached to tool definitions. They make LLM tool calling consistent across sessions. |
| Pattern | "Is this more like Pain=5 ($12K/month lost) or Pain=1 (calendar-could-be-better)?" |
| Docs | input_examples |
Our use: The score-prds skill uses calibration examples as input_examples — concrete PRD anchors at each score level so different agents produce convergent scores.
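A tool definition carrying such anchors might look like the sketch below. The `score_prd` tool, its schema, and the exact anchor wording are hypothetical; only the `input_examples` idea — concrete pairs at each score level — comes from the pattern above:

```python
# Sketch of a tool definition with concrete input examples attached.
# The score_prd tool and its fields are illustrative, not a real API surface.
score_prd_tool = {
    "name": "score_prd",
    "description": "Score a PRD's Pain dimension from 1-5.",
    "input_schema": {
        "type": "object",
        "properties": {
            "pain": {"type": "integer", "minimum": 1, "maximum": 5},
            "rationale": {"type": "string"},
        },
        "required": ["pain", "rationale"],
    },
    # Concrete anchors at each end of the scale: different agents seeing
    # the same anchors converge on the same scores.
    "input_examples": [
        {"pain": 5, "rationale": "$12K/month actively lost until shipped"},
        {"pain": 1, "rationale": "calendar could be a bit nicer"},
    ],
}
```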
Programmatic Tool Calling
| Fact | Detail |
|---|---|
| What | Claude writes code that calls tools in a sandboxed container. Reduces round-trips between reasoning and execution. |
| Pattern | Instead of Claude → tool → result → Claude → tool → result, it becomes Claude → writes code that calls N tools → single result. |
| Sandbox | The code_execution_20260120 tool runs Python in an isolated container that exposes a $TOOL_CALL function. |
| Docs | Programmatic tool calling |
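The kind of code Claude writes inside the sandbox can be sketched like this. `tool_call` is a local stand-in for the container's $TOOL_CALL function, and the tool names and `summarise_repo` helper are illustrative:

```python
# Sketch of model-written sandbox code composing N tool calls into one result.
# tool_call is a stub standing in for the container's $TOOL_CALL function.
def tool_call(name: str, **kwargs) -> str:
    return f"{name} result"  # the real function dispatches to actual tools

def summarise_repo(paths: list[str]) -> dict:
    """One round trip: N tool calls composed in code, a single result returned."""
    contents = {p: tool_call("Read", path=p) for p in paths}  # N Read calls
    todos = tool_call("Grep", pattern="TODO", path=".")       # 1 Grep call
    return {"files": len(contents), "todos": todos}

# Instead of Claude → tool → result repeated N+1 times, the whole
# composition runs in the sandbox and returns once:
result = summarise_repo(["a.py", "b.py", "c.py"])
print(result)
```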
How It Maps
| Layer | Manual (our system) | Programmatic |
|---|---|---|
| Who writes code | Human wrote prioritise-prds.mjs | Claude writes Python at runtime |
| Who calls it | Skill tells Claude to run node scripts/... | Claude's code calls tools via $TOOL_CALL |
| Execution | Local Node process | Sandboxed container |
| Determinism | Script does pure math, zero judgment | Code does pure computation, zero judgment |
| The split | LLM scores → script ranks | LLM reasons → code executes |
The core principle is identical: separate judgment from computation.
Where Each Wins
Local scripts (prioritise-prds.mjs): Version-controlled, testable, reviewable by humans, runs without API calls. Better when the computation is stable and reusable.
Programmatic tool calling: Better when computation needs to happen inside a Claude API session — agent platforms where orchestrators compose tool calls dynamically. Relevant for the Agent Platform PRD and BOaaS dispatch.
Scoring Pipeline
The scoring pipeline is the proof-of-concept for both patterns:
```
JUDGMENT (skill)                      MATH (script)
─────────────────                     ─────────────────
1. Skill loads calibration            4. Script reads frontmatter
   examples (Pain=5 → $12K/mo)        5. Geometric mean, weighted sum
2. Agent reads PRD, scores 1-5        6. Gate checks (BLOCKED, NO PLAYERS)
3. Writes to frontmatter              7. Sort by final score + diff
```
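The math half (steps 4-7) can be sketched in Python. The weights, dimension names, and gate logic below are illustrative, not the actual prioritise-prds.mjs implementation — the point is only that every line is pure computation with zero judgment:

```python
import math

# Illustrative sketch of the deterministic half of the pipeline.
# Weights, dimensions, and gates are hypothetical, not the real script's.
WEIGHTS = {"pain": 0.5, "reach": 0.3, "confidence": 0.2}

def rank(prds: list[dict]) -> list[dict]:
    ranked = []
    for prd in prds:
        scores = prd["scores"]                                  # step 4: read frontmatter
        geo = math.prod(scores.values()) ** (1 / len(scores))   # step 5: geometric mean
        weighted = sum(WEIGHTS[k] * v for k, v in scores.items())
        if prd.get("blocked") or not prd.get("players"):        # step 6: gate checks
            final = 0.0                                         # binary pass/fail
        else:
            final = round(geo + weighted, 2)
        ranked.append({**prd, "final": final})
    return sorted(ranked, key=lambda p: p["final"], reverse=True)  # step 7: sort

prds = [
    {"name": "billing-fix", "players": ["core"],
     "scores": {"pain": 5, "reach": 4, "confidence": 3}},
    {"name": "calendar-polish", "players": ["ux"],
     "scores": {"pain": 1, "reach": 2, "confidence": 4}},
]
print([p["name"] for p in rank(prds)])
```

Run the same inputs twice and the ranking is identical — that determinism is what makes the output reviewable as a diff.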
| Layer | What | Deterministic? |
|---|---|---|
| Calibration | input_examples pattern anchors scoring | No — but constrained by concrete examples |
| Scores | 10 dimensions per PRD in frontmatter | No — human/LLM judgment |
| Algorithm | node scripts/prioritise-prds.mjs | Yes — pure math |
| Enforcement | Gate checks, dependency bonuses | Yes — binary pass/fail |
The script is the commissioning principle applied to code: the team that scores (LLM) is never the team that computes the ranking (script).
Extended Thinking
| Fact | Detail |
|---|---|
| What | Claude uses additional "thinking tokens" before responding — visible reasoning that improves complex tasks. |
| When | Architecture decisions, multi-file refactors, ambiguous requirements. |
| Config | alwaysThinkingEnabled: true in settings.json |
| Docs | Extended thinking |
Our config: Always on. Cost justified by fewer rework cycles on complex tasks.
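In settings.json this is a single flag (other keys omitted):

```json
{
  "alwaysThinkingEnabled": true
}
```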
Token-Efficient Tools
| Fact | Detail |
|---|---|
| What | Optimized tool result formats that reduce token consumption without losing information. |
| Pattern | Tools return structured, compact results instead of verbose output. |
| Docs | Token-efficient tools |
Relevant when building API-level agent orchestration — every token saved compounds across tool calls in long-running agent sessions.
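The verbose-vs-compact contrast can be illustrated with invented payloads; the whitespace-word count below is only a crude proxy for tokens:

```python
import json

# Illustrative contrast between a verbose tool result and a compact one.
# Both payloads are invented; word count stands in for token count.
verbose = (
    "The grep tool searched the repository and found that the pattern 'TODO' "
    "appears in the file src/app.py on line 14, and also in src/db.py on line 72."
)
compact = json.dumps({"pattern": "TODO", "hits": ["src/app.py:14", "src/db.py:72"]})

def approx_tokens(s: str) -> int:
    return len(s.split())

print(approx_tokens(verbose), approx_tokens(compact))
```

Same information, a fraction of the tokens — and in a long agent session the saving recurs on every tool call.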
Context
- Claude Code — Concepts, config, best practices
- Config Architecture — Agent-agnostic setup
- Work Prioritisation — The deterministic scoring algorithm
- Commissioning — What's specified, built, proven
- Agent Platform PRD — Where programmatic tool calling applies