1. Meta 3D Gen
- Authors
Raphael Bensadoun, Tom Monnier, Yanir Kleiman, Filippos Kokkinos, Yawar Siddiqui, Mahendra Kariya, Omri Harosh, Roman Shapovalov, Benjamin Graham, Emilien Garreau, Animesh Karnewar, Ang Cao, Idan Azuri, Iurii Makarov, Eric-Tuan Le, Antoine Toisoul, David Novotny, Oran Gafni, Natalia Neverova, and Andrea Vedaldi
- Subjects
Computer Science - Computer Vision and Pattern Recognition, Computer Science - Artificial Intelligence, Computer Science - Graphics, Computer Science - Machine Learning
- Abstract
We introduce Meta 3D Gen (3DGen), a new state-of-the-art, fast pipeline for text-to-3D asset generation. 3DGen offers 3D asset creation with high prompt fidelity and high-quality 3D shapes and textures in under a minute. It supports physically-based rendering (PBR), necessary for 3D asset relighting in real-world applications. Additionally, 3DGen supports generative retexturing of previously generated (or artist-created) 3D shapes using additional textual inputs provided by the user. 3DGen integrates key technical components, Meta 3D AssetGen and Meta 3D TextureGen, that we developed for text-to-3D and text-to-texture generation, respectively. By combining their strengths, 3DGen represents 3D objects simultaneously in three ways: in view space, in volumetric space, and in UV (or texture) space. The integration of these two techniques achieves a win rate of 68% with respect to the single-stage model. We compare 3DGen to numerous industry baselines, and show that it outperforms them in terms of prompt fidelity and visual quality for complex textual prompts, while being significantly faster.
- Published
2024
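
The abstract describes 3DGen as a two-stage pipeline: Meta 3D AssetGen produces an initial textured shape from text, and Meta 3D TextureGen then refines or replaces the texture, optionally on an existing artist-created mesh. The sketch below is a minimal, hypothetical illustration of that orchestration; the class names (`AssetGen`, `TextureGen`, `Mesh`) and method signatures are assumptions for illustration, not the paper's actual API.

```python
# Hypothetical sketch of the two-stage 3DGen pipeline described in the abstract.
# All names here are illustrative assumptions, not the released interface.

from dataclasses import dataclass


@dataclass
class Mesh:
    """Placeholder for a textured 3D asset with PBR material maps."""
    vertices: list
    faces: list
    pbr_maps: dict  # e.g. {"albedo": ..., "metallic": ..., "roughness": ...}


class AssetGen:
    """Stage I: text-to-3D generation of an initial shape with a base texture."""
    def generate(self, prompt: str) -> Mesh:
        raise NotImplementedError  # stand-in for the learned model


class TextureGen:
    """Stage II: text-to-texture refinement or retexturing of a given mesh."""
    def retexture(self, mesh: Mesh, prompt: str) -> Mesh:
        raise NotImplementedError  # stand-in for the learned model


def text_to_3d(prompt: str, texture_prompt: str | None = None) -> Mesh:
    """Two-stage pipeline: generate a draft asset, then refine or swap its texture."""
    draft = AssetGen().generate(prompt)
    # Retexturing can reuse the original prompt (texture refinement) or take a
    # new one (generative retexturing, also applicable to artist-created shapes).
    return TextureGen().retexture(draft, texture_prompt or prompt)
```

The 68% win rate quoted in the abstract compares this integrated two-stage pipeline against running the first-stage model alone.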