3D Generation

Text-to-3D, image-to-3D, and 3D reconstruction.

Capability Matrix

| Provider | Input | Output | Quality | Use Case |
|---|---|---|---|---|
| Tripo AI | Image/Text | Mesh | Good | Quick prototypes |
| Meshy | Image/Text | Mesh | Good | Game assets |
| Rodin | Image | Mesh | Very Good | Characters |
| Luma Genie | Text | 3D | Good | Concepts |
| NeRF | Images | Radiance field | Excellent | Reconstruction |

Stages of Maturity

| Capability | Stage | Notes |
|---|---|---|
| Single object from image | Maturing | Production-viable |
| Text-to-3D objects | Early | Quality improving |
| Scene reconstruction | Maturing | NeRF, Gaussian splatting |
| Animated characters | Emerging | Limited options |
| Full environments | Early | Not yet reliable |

Tool Selection

| Need | 1st Choice | 2nd Choice | Why |
|---|---|---|---|
| Text/image to 3D | Tripo | Meshy | Speed vs quality |
| Game-ready assets | Rodin | — | Topology optimization |
| Scene reconstruction | NeRF / Gaussian splatting | — | Photorealistic capture |

Stack

| Layer | Tool |
|---|---|
| Model | Tripo (speed), Meshy (game assets), Rodin (characters), NeRF (reconstruction) |
| Framework | — (direct API; no framework integration yet) |
| MCP | — |
| CLI | — |

3D generation has no framework or MCP integration today. Image-to-3D is the most mature input path — use vision models to preprocess images before passing to 3D tools.
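With no framework or MCP layer, calling these tools means hitting each provider's REST API directly: submit the source image, poll an async task until it completes, then download the resulting mesh. The sketch below shows that workflow; the client methods (`submit`, `status`, `result`) and the task states are hypothetical stand-ins, not any real provider SDK — check Tripo's, Meshy's, or Rodin's own API docs for actual endpoints and field names. A stub client is included so the flow is self-contained.

```python
import time

def generate_mesh(client, image_url, poll_interval=0.01, timeout=5.0):
    """Submit an image-to-3D task and block until the mesh is ready.

    Mirrors the typical async pattern of image-to-3D APIs:
    POST the task, poll its status, fetch the result when done.
    """
    task_id = client.submit(image_url)           # create the generation task
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        state = client.status(task_id)           # e.g. "queued" -> "done"
        if state == "done":
            return client.result(task_id)        # mesh URL + format metadata
        if state == "failed":
            raise RuntimeError(f"task {task_id} failed")
        time.sleep(poll_interval)                # back off between polls
    raise TimeoutError(f"task {task_id} did not finish in {timeout}s")

class StubClient:
    """Fake provider that 'finishes' after two polls (illustration only)."""
    def __init__(self):
        self.polls = 0
    def submit(self, image_url):
        return "task-1"
    def status(self, task_id):
        self.polls += 1
        return "done" if self.polls >= 2 else "queued"
    def result(self, task_id):
        return {"mesh_url": "https://example.com/mesh.glb", "format": "glb"}

mesh = generate_mesh(StubClient(), "https://example.com/chair.png")
print(mesh["format"])  # → glb
```

Swapping the stub for a thin wrapper over a real provider's HTTP API keeps the polling logic reusable across tools, which matters while the ecosystem has no shared framework layer.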

Context

  • AI Modalities — All capability types
  • AI Tools — Where framework and MCP integrations exist
  • Vision — VLMs and image understanding

Questions

  • When single-object-from-image hits production quality — which industry gets disrupted first?
  • At what point does AI-generated 3D replace hand-modeled assets for game studios — cost threshold, quality threshold, or both?
  • What separates "good enough for a prototype" from "good enough for production" in 3D generation?
  • How does Gaussian splatting change the economics of 3D capture compared to traditional photogrammetry?