Tools

Which tool handles which modality?

The answer depends on the layer: which model generates the output, which framework orchestrates the pipeline, which MCP server connects it to external data, and which CLI drives it from the terminal.

Modality Stack Matrix

| Modality | Model | Framework | MCP | CLI |
|---|---|---|---|---|
| Text | Claude, GPT-4 | Claude Code, Cursor | Perplexity, Context7, GitHub | drmg, gh |
| Vision | Claude, GPT-4V, Gemini | Claude Code | Claude in Chrome | |
| Voice | Whisper, ElevenLabs, Deepgram | Claude Code (pipeline) | | |
| Audio | Suno, Udio, ElevenLabs | | | |
| Video | Runway, Kling, Sora | | | |
| 3D | Tripo, Meshy, Rodin | | | |
| Code | Claude, GPT-4, Gemini | Claude Code, Codex | GitHub, Supabase, Context7 | drmg, gh |

Reading the matrix: Empty cells are gaps — no current tool or no integration exists for that combination. For the full tool list per modality (every provider, quality ratings, use cases), see AI Modalities. For the MCP server adoption radar, see MCP Tools.
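One way to make the gaps concrete is to hold the matrix as data and list its empty cells. The sketch below is illustrative only: the cell contents are copied from the matrix, `row` and `find_gaps` are hypothetical helper names, and placing "Claude in Chrome" in the MCP column of the Vision row is an assumption about how that row reads.

```python
from typing import Dict, List, Sequence, Tuple

LAYERS = ("Model", "Framework", "MCP", "CLI")


def row(model: Sequence[str], framework: Sequence[str] = (),
        mcp: Sequence[str] = (), cli: Sequence[str] = ()) -> Dict[str, List[str]]:
    """Build one matrix row; layers that are not passed become empty cells."""
    return dict(zip(LAYERS, (list(model), list(framework), list(mcp), list(cli))))


# Cell contents copied from the matrix. The Vision row's MCP entry is an
# assumed placement (see the note above the code).
MATRIX: Dict[str, Dict[str, List[str]]] = {
    "Text": row(["Claude", "GPT-4"], ["Claude Code", "Cursor"],
                ["Perplexity", "Context7", "GitHub"], ["drmg", "gh"]),
    "Vision": row(["Claude", "GPT-4V", "Gemini"], ["Claude Code"],
                  ["Claude in Chrome"]),
    "Voice": row(["Whisper", "ElevenLabs", "Deepgram"], ["Claude Code (pipeline)"]),
    "Audio": row(["Suno", "Udio", "ElevenLabs"]),
    "Video": row(["Runway", "Kling", "Sora"]),
    "3D": row(["Tripo", "Meshy", "Rodin"]),
    "Code": row(["Claude", "GPT-4", "Gemini"], ["Claude Code", "Codex"],
                ["GitHub", "Supabase", "Context7"], ["drmg", "gh"]),
}


def find_gaps(matrix: Dict[str, Dict[str, List[str]]]) -> List[Tuple[str, str]]:
    """Return every (modality, layer) pair whose cell is empty."""
    return [
        (modality, layer)
        for modality, cells in matrix.items()
        for layer in LAYERS
        if not cells.get(layer)
    ]


gaps = find_gaps(MATRIX)
for modality, layer in gaps:
    print(f"{modality}: no {layer} entry")
```

Under these assumptions the Text and Code rows are complete, while Audio, Video, and 3D each lack everything past the Model column.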

Three Layers

Knowing which layer a tool belongs to stops you from comparing incompatible things.

| Layer | What it is | Choose when |
|---|---|---|
| Model | Generates the output: text, image, speech, video, 3D | You're choosing what produces the capability |
| Framework | Orchestrates agents, tools, and pipelines | You're choosing how to coordinate and extend models |
| Protocol | Connects frameworks to external tools and data | You're choosing how agents and tools communicate |

A production setup uses all three: Claude Code (framework) + Claude (model) + MCP servers (protocol).
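The three-layer split can be written down as a small record with a completeness check. This is a minimal sketch, not a real configuration format: the `Stack` class and its `missing_layers` method are hypothetical names, and the MCP server names are simply borrowed from the matrix.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Stack:
    """One choice per layer: model, framework, and protocol (MCP servers)."""
    model: str
    framework: str
    mcp_servers: List[str] = field(default_factory=list)

    def missing_layers(self) -> List[str]:
        """Name each layer that has no entry yet."""
        missing = []
        if not self.model:
            missing.append("Model")
        if not self.framework:
            missing.append("Framework")
        if not self.mcp_servers:
            missing.append("Protocol")
        return missing


# The production setup described above, with two example servers.
production = Stack(model="Claude", framework="Claude Code",
                   mcp_servers=["GitHub", "Context7"])
print(production.missing_layers())

# A stack with no MCP servers is missing the protocol layer.
prototype = Stack(model="Claude", framework="Claude Code")
print(prototype.missing_layers())
```

Keeping the layers as separate fields makes the comparison rule explicit: swap a model for a model or a server for a server, never one for the other.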

What's Here

MCP Servers

Protocol reference and server catalog. Twenty servers across six categories: filesystem, database, web, memory, APIs, execution. Start here to understand MCP before choosing servers.

MCP Tools

Adoption radar for curated MCP servers: which to use, which to trial, which to hold. Organized by team, with token cost estimates, decision checklists, and a governance protocol.

CLI Tools

Design patterns for agent-grade CLIs — structured I/O, dry-run safety, runtime introspection, input hardening. Use this to evaluate any CLI before wiring it to an agent.
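As a rough illustration of those four patterns, the sketch below builds a toy CLI with Python's standard `argparse`. Everything specific to it is hypothetical: the `demo-tool` name, the `describe` and `delete` subcommands, the in-memory `STORE`, and the name-validation rule are assumptions for the example, not part of any tool listed on this page.

```python
import argparse
import json
import re
from typing import Any, Dict, List

# Hypothetical in-memory state standing in for real resources.
STORE = {"report-2024", "draft"}

# Input hardening: accept only short, plain identifiers.
SAFE_NAME = re.compile(r"[A-Za-z0-9_-]{1,64}")


def safe_name(value: str) -> str:
    """argparse type hook that rejects anything outside the allow-list."""
    if not SAFE_NAME.fullmatch(value):
        raise argparse.ArgumentTypeError("name must match [A-Za-z0-9_-]{1,64}")
    return value


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(prog="demo-tool")
    sub = parser.add_subparsers(dest="command", required=True)
    # Runtime introspection: the tool can describe its own commands.
    sub.add_parser("describe", help="list commands and flags as JSON")
    delete = sub.add_parser("delete", help="delete a named item")
    delete.add_argument("name", type=safe_name)
    # Dry-run safety: report what would happen without changing state.
    delete.add_argument("--dry-run", action="store_true")
    return parser


def run(argv: List[str]) -> Dict[str, Any]:
    """Parse argv and return a JSON-serializable result (structured I/O)."""
    args = build_parser().parse_args(argv)
    if args.command == "describe":
        return {"ok": True, "commands": ["describe", "delete"],
                "flags": {"delete": ["--dry-run"]}}
    if args.name not in STORE:
        return {"ok": False, "error": "not_found", "target": args.name}
    applied = not args.dry_run
    if applied:
        STORE.discard(args.name)
    return {"ok": True, "action": "delete", "target": args.name,
            "dry_run": args.dry_run, "applied": applied}


# An agent would call the tool and parse one JSON object from stdout.
print(json.dumps(run(["delete", "report-2024", "--dry-run"])))
```

Returning a dictionary from `run` and serializing it in one place keeps the output machine-readable for every command, including errors the tool detects itself.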

Apps

AI-powered applications by modality.

Context

  • AI Modalities — What each modality can do and which models provide it
  • Models — LLM provider comparison (text modality depth)
  • Agent Protocols — MCP, A2A, Verifiable Intent at the protocol layer
  • Agents — When to use subagents and agent teams

Questions

Which cell in the modality matrix above represents the highest-value gap for your current workflow?

  • If you had to wire one new modality into your agent stack this week, which row would deliver the most value — and what's blocking the MCP column from filling in?
  • When the model IS the matrix (omnimodal, any-in to any-out), which layer becomes redundant — framework, protocol, or CLI?
  • How do you decide when a direct API call beats an MCP server for the same capability?