Large Language Models

Analysis | Diagrams | Thinkers

Who is winning the race to the bottom?

LLM Vendor | All Purpose Model | Multi | Open Source | Voice/TTS | Image Gen | TACO Agent | Code Agent | Deep Research | Strengths
Anthropic | Claude 3.7 | MCP | TRUE | Creative and Socially Engaging, AI Coding
DeepSeek | R1 V3 | TRUE | Cheap, Scientific Research
ElevenLabs | Turbo v2.5 | Voice cloning, TTS leader
Google | Gemini 2.0 | Chirp STT | Imagen-3 | Cheap, Rounded, In-depth option
Meta | TRUE
Microsoft
Nous Research | TRUE | Decentralized
OpenAI | GPT-4o | TTS-1-HD, Whisper | DALL-E3 | Operator
OpenAI o3 | o3
Perplexity
Qwen | Qwen 2.5 | TRUE | Qwen3-TTS | Open source, multilingual
Venice | TRUE
XAI | Aurora

See analysis workbook

  • Multi: Multimodal; can interpret voice and images.
  • Voice/TTS: Text-to-speech and speech-to-text capabilities.
  • Image Gen: Image generation for graphic design.
  • Strengths: Strongest use cases.
  • TACO Agent:
  • Code Agent:
  • Deep Research: Whether a deep research mode is available.

See AI Modalities for detailed breakdown of capability types (voice, vision, video, audio, 3D).

Model Selection

Review your Minimum Viable Toolkit regularly to gain maximum leverage, focusing on one critical job to be done at a time.

  • Identify a recurring need
  • Search for the best tool
    • Cost
    • Speed
    • Accuracy
  • Master functionality
  • Glue to workflows

If the tool does not exist, investigate building it.
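
The selection steps above can be sketched as a simple weighted score over cost, speed, and accuracy. This is only an illustration under stated assumptions: the `Tool` class, the `pick_tool` function, the 0-to-1 rating scale, the weights, and the tool names are all hypothetical and do not come from the analysis workbook.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass(frozen=True)
class Tool:
    """A candidate tool rated 0..1 on each criterion (higher is better).

    Hypothetical scale: a cost rating of 1.0 means cheapest, not most expensive.
    """
    name: str
    cost: float
    speed: float
    accuracy: float


def score(tool: Tool, weights: Dict[str, float]) -> float:
    """Weighted sum of the three criteria named in the list above."""
    return (
        weights["cost"] * tool.cost
        + weights["speed"] * tool.speed
        + weights["accuracy"] * tool.accuracy
    )


def pick_tool(
    candidates: List[Tool], weights: Dict[str, float], min_score: float
) -> Optional[Tool]:
    """Return the best-scoring tool, or None if nothing clears min_score.

    A None result corresponds to "investigate building it".
    """
    if not candidates:
        return None
    best = max(candidates, key=lambda t: score(t, weights))
    return best if score(best, weights) >= min_score else None


# Illustrative ratings and weights only; both are assumptions.
weights = {"cost": 0.2, "speed": 0.3, "accuracy": 0.5}
candidates = [
    Tool("tool_a", cost=0.9, speed=0.8, accuracy=0.5),
    Tool("tool_b", cost=0.5, speed=0.6, accuracy=0.9),
]

choice = pick_tool(candidates, weights, min_score=0.7)
print(choice.name if choice else "no suitable tool: investigate building one")
```

Weighting accuracy most heavily is an arbitrary choice here; the weights should reflect the specific job to be done, and the threshold decides when no existing tool is good enough.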

Subject Expertise

Context