Text-to-3D using 2D Diffusion
DreamFusion is a text-to-3D system utilizing an AI-based text-to-image generative model known as Imagen. It uses a technique called Score Distillation Sampling (SDS) to create 3D scenes from text input. This method enables them to generate samples from a diffusion model by optimizing a loss function, allowing for the construction of scenes in any parameter space. They have employed a 3D scene parameterization like Neural Radiance Fields (NeRF) to generate a differentiable mapping. SDS produces fairly realistic scenes, however DreamFusion has incorporated further regularizers and optimization tactics to refine the geometry. The trained NeRFs boast high-quality surface geometry, depth and normal maps, and can be re-lit using a Lambertian shading model.