Text-to-3D, image-to-3D, and 3D reconstruction.
Capability Matrix
| Provider / Technique | Input | Output | Quality | Use Case |
|---|---|---|---|---|
| Tripo AI | Image/Text | Mesh | Good | Quick prototypes |
| Meshy | Image/Text | Mesh | Good | Game assets |
| Rodin | Image | Mesh | Very Good | Characters |
| Luma Genie | Text | 3D | Good | Concepts |
| NeRF | Images | Radiance field | Excellent | Reconstruction |
Stages of Maturity
| Capability | Stage | Notes |
|---|---|---|
| Single object from image | Maturing | Production-viable |
| Text-to-3D objects | Early | Quality improving |
| Scene reconstruction | Maturing | NeRF, Gaussian splatting |
| Animated characters | Emerging | Limited options |
| Full environments | Early | Not yet reliable |
| Need | 1st Choice | 2nd Choice | Why |
|---|---|---|---|
| Text/image to 3D | Tripo | Meshy | Speed vs quality |
| Game-ready assets | Rodin | — | Topology optimization |
| Scene reconstruction | NeRF / Gaussian Splatting | — | Photorealistic capture |
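The choices above can be read as a simple lookup with a fallback. A minimal sketch in Python, assuming the rows are copied verbatim from the table; the dictionary keys, the `Recommendation` type, and the `recommend` function are hypothetical names made up for illustration, not part of any provider's API:

```python
from typing import NamedTuple, Optional


class Recommendation(NamedTuple):
    first_choice: str
    second_choice: Optional[str]  # None where the table lists no fallback
    why: str


# Rows copied from the table above. The keys are hypothetical identifiers
# chosen for this sketch, not names used by any provider.
RECOMMENDATIONS = {
    "text_or_image_to_3d": Recommendation("Tripo", "Meshy", "Speed vs quality"),
    "game_ready_assets": Recommendation("Rodin", None, "Topology optimization"),
    "scene_reconstruction": Recommendation(
        "NeRF / Gaussian Splatting", None, "Photorealistic capture"
    ),
}


def recommend(need: str, first_choice_available: bool = True) -> Optional[str]:
    """Return the tool to try for a need, or None if the table has no option."""
    rec = RECOMMENDATIONS[need]  # KeyError for needs the table does not cover
    if first_choice_available:
        return rec.first_choice
    return rec.second_choice
```

For example, `recommend("text_or_image_to_3d", first_choice_available=False)` falls back to Meshy, while the same call for `"game_ready_assets"` returns `None` because the table lists no second choice.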
Questions
- When single-object-from-image generation hits production quality, which industry gets disrupted first?
- At what point does AI-generated 3D replace hand-modeled assets for game studios — cost threshold, quality threshold, or both?
- What separates "good enough for a prototype" from "good enough for production" in 3D generation?
- How does Gaussian splatting change the economics of 3D capture compared to traditional photogrammetry?