# Tools
Which tool handles which modality?
The answer depends on the layer: what model generates the output, what framework orchestrates the pipeline, which MCP server connects it to external data, and which CLI drives it from the terminal.
## Modality Stack Matrix
| Modality | Model | Framework | MCP | CLI |
|---|---|---|---|---|
| Text | Claude, GPT-4 | Claude Code, Cursor | Perplexity, Context7, GitHub | drmg, gh |
| Vision | Claude, GPT-4V, Gemini | Claude Code | Claude in Chrome | — |
| Voice | Whisper, ElevenLabs, Deepgram | Claude Code (pipeline) | — | — |
| Audio | Suno, Udio, ElevenLabs | — | — | — |
| Video | Runway, Kling, Sora | — | — | — |
| 3D | Tripo, Meshy, Rodin | — | — | — |
| Code | Claude, GPT-4, Gemini | Claude Code, Codex | GitHub, Supabase, Context7 | drmg, gh |
**Reading the matrix:** Empty cells are gaps — no current tool or no integration exists for that combination. For the full tool list per modality (every provider, quality ratings, use cases), see AI Modalities. For the MCP server adoption radar, see MCP Tools.
## Three Layers
Knowing which layer a tool lives at stops you from comparing incompatible things.
| Layer | What it is | Choose when |
|---|---|---|
| Model | Generates the output — text, image, speech, video, 3D | You're choosing what produces the capability |
| Framework | Orchestrates agents, tools, and pipelines | You're choosing how to coordinate and extend models |
| Protocol | Connects frameworks to external tools and data | You're choosing how agents and tools communicate |
A production setup uses all three: Claude Code (framework) + Claude (model) + MCP servers (protocol).
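As a concrete example of the protocol layer in that stack, Claude Code can read MCP server definitions from a project-level `.mcp.json`. The entry below is a sketch only — the GitHub server package name and the token environment variable are assumptions to verify against current MCP documentation, and the token value is a placeholder:

```json
{
  "mcpServers": {
    "github": {
      "command": "npx",
      "args": ["-y", "@modelcontextprotocol/server-github"],
      "env": { "GITHUB_PERSONAL_ACCESS_TOKEN": "<your-token>" }
    }
  }
}
```

With this in place, the framework (Claude Code) launches the server process and the model (Claude) calls its tools over MCP — the three layers stay separate but composed.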
## What's Here
### MCP Servers
Protocol reference and server catalog. Twenty servers across six categories: filesystem, database, web, memory, APIs, execution. Start here to understand MCP before choosing servers.
### MCP Tools
Adoption radar for curated MCP servers — which to use, which to trial, which to hold. Organized by team, with token cost estimates, decision checklists, and a governance protocol.
### CLI Tools
Design patterns for agent-grade CLIs — structured I/O, dry-run safety, runtime introspection, input hardening. Use this to evaluate any CLI before wiring it to an agent.
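These patterns are easier to evaluate with a concrete shape in hand. A minimal Python sketch follows — the tool name `renametool` and its flags are illustrative, not any real CLI. It emits one JSON object per invocation (structured I/O), gates execution behind `--dry-run` (dry-run safety), and rejects path traversal (input hardening):

```python
import argparse
import json
import sys


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(prog="renametool")
    parser.add_argument("path", help="file to act on")
    parser.add_argument("--dry-run", action="store_true",
                        help="report the planned action without executing it")
    return parser


def run(argv: list) -> dict:
    args = build_parser().parse_args(argv)
    # Input hardening: reject paths that try to escape the working tree.
    if ".." in args.path.split("/"):
        return {"ok": False, "error": "path traversal rejected"}
    # Dry-run safety: the plan is always reported; execution is gated.
    return {"ok": True,
            "action": {"op": "rename", "target": args.path,
                       "executed": not args.dry_run}}


def main() -> int:
    result = run(sys.argv[1:])
    print(json.dumps(result))  # structured I/O: one JSON object per call
    return 0 if result["ok"] else 1
```

An agent driving this CLI can parse every response the same way — success, plan, or refusal all arrive as JSON on stdout, with the exit code as a secondary signal.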
### Apps
AI-powered applications by modality.
## Context
- AI Modalities — What each modality can do and which models provide it
- Models — LLM provider comparison (text modality depth)
- Agent Protocols — MCP, A2A, Verifiable Intent at the protocol layer
- Agents — When to use subagents and agent teams
## Questions
- Which cell in the modality matrix above represents the highest-value gap for your current workflow?
- If you had to wire one new modality into your agent stack this week, which row would deliver the most value — and what's blocking the MCP column from filling in?
- When the model IS the matrix (omnimodal, any-in to any-out), which layer becomes redundant — framework, protocol, or CLI?
- How do you decide when a direct API call beats an MCP server for the same capability?