Magic123: One Image to High-Quality 3D Object Generation Using Both 2D and 3D Diffusion Priors

Guocheng Qian, Jinjie Mai, Abdullah Hamdi, Jian Ren, Aliaksandr Siarohin, Bing Li, Hsin-Ying Lee, Ivan Skorokhodov, Peter Wonka, Sergey Tulyakov, Bernard Ghanem

January 2024


Abstract

We present “Magic123”, a two-stage coarse-to-fine approach for generating high-quality, textured 3D meshes from a single unposed image in the wild using both 2D and 3D priors. In the first stage, we optimize a coarse neural radiance field and focus on learning geometry. In the second stage, a memory-efficient differentiable mesh representation is adopted to yield a high-resolution mesh with a visually appealing texture. In both stages, the 3D content is learned through reference view supervision and novel views guided by both 2D and 3D diffusion priors. A tradeoff parameter between the 2D and 3D priors controls the exploration (more imaginative) and exploitation (more precise) of the generated geometry. We further leverage textual inversion to encourage consistent appearances across views. Monocular depth estimation is used to constrain the 3D reconstruction and avoid collapsed solutions, e.g., flat geometry. Our Magic123 approach outperforms prior image-to-3D techniques by a large margin, as demonstrated through extensive experiments on various real images in the wild and on synthetic benchmarks. Our code, models, and generated 3D assets are available at https://guochengqian.github.io/project/magic123/.
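To make the role of the tradeoff parameter concrete, the following is a minimal sketch of how guidance from the two priors might be blended into one training objective. It is an illustration only, not the released implementation: the function names, the convex-combination form, the range of `tradeoff`, and the `depth_weight` argument are all assumptions that the abstract does not specify.

```python
def combined_prior_loss(loss_2d: float, loss_3d: float, tradeoff: float) -> float:
    """Blend the 2D-prior and 3D-prior guidance losses.

    ASSUMPTION: a convex combination with tradeoff in [0, 1]. The abstract
    only states that a tradeoff parameter exists, not its exact form.
    tradeoff = 0 relies on the 2D prior alone (more imaginative geometry);
    tradeoff = 1 relies on the 3D prior alone (more precise geometry).
    """
    if not 0.0 <= tradeoff <= 1.0:
        raise ValueError("tradeoff must lie in [0, 1]")
    return (1.0 - tradeoff) * loss_2d + tradeoff * loss_3d


def total_loss(
    loss_reference: float,
    loss_depth: float,
    loss_2d: float,
    loss_3d: float,
    tradeoff: float,
    depth_weight: float = 1.0,  # hypothetical weight, not taken from the paper
) -> float:
    """Sum the supervision terms named in the abstract.

    loss_reference: reconstruction loss on the single reference view.
    loss_depth: monocular-depth constraint that discourages flat geometry.
    loss_2d, loss_3d: novel-view guidance from the 2D and 3D diffusion priors.
    """
    prior_term = combined_prior_loss(loss_2d, loss_3d, tradeoff)
    return loss_reference + depth_weight * loss_depth + prior_term
```

Under this assumed form, sweeping `tradeoff` from 0 to 1 moves the objective from relying entirely on the 2D prior to relying entirely on the 3D prior, which mirrors the exploration versus exploitation behavior described above.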

Type

Conference paper

Publication

International Conference on Learning Representations, 2024

Generative Models 3D