Get 3D scenes from monocular images by distilling the knowledge in Stable Diffusion with GeoWizard
GeoWizard: Unleashing the Diffusion Priors for 3D Geometry Estimation from a Single Image
arXiv paper abstract https://arxiv.org/abs/2403.12013
arXiv PDF paper https://arxiv.org/pdf/2403.12013.pdf
Project page https://fuxiao0719.github.io/projects/geowizard
... introduce GeoWizard, a new generative foundation model designed for estimating geometric attributes, e.g., depth and normals, from single images.
... demonstrate that generative models, as opposed to traditional discriminative models (e.g., CNNs and Transformers), can effectively address the inherently ill-posed problem.
... further show that leveraging diffusion priors can markedly improve generalization, detail preservation, and efficiency in resource usage.
... extend the original stable diffusion model to jointly predict depth and normal, allowing mutual information exchange and high consistency between the two representations.
... propose ... to segregate the complex data distribution of various scenes into distinct sub-distributions ... enables ... model to recognize different scene layouts, capturing 3D geometry with remarkable fidelity.
GeoWizard sets new benchmarks for zero-shot depth and normal prediction ... enhancing ... downstream applications such as 3D reconstruction, 2D content creation, and novel viewpoint synthesis.
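To make the "high consistency" between depth and normals mentioned above more concrete, here is a minimal sketch of the geometric link between the two: a surface normal can be derived from a depth map by finite differences, and compared with a separately predicted normal map. This is not GeoWizard's code or method. The function names normals_from_depth and consistency_error are hypothetical, and the sketch assumes a simple orthographic height-field view of depth (no camera intrinsics), which is only an approximation for real images.

```python
import numpy as np


def normals_from_depth(depth: np.ndarray) -> np.ndarray:
    """Derive unit surface normals from an (H, W) depth map.

    Assumption: depth is treated as a height field z(x, y) under an
    orthographic camera, so the normal is proportional to
    (-dz/dx, -dz/dy, 1). Real pipelines would use camera intrinsics.
    """
    # np.gradient returns derivatives along axis 0 (rows, y) then axis 1 (cols, x).
    dz_dy, dz_dx = np.gradient(depth.astype(np.float64))
    normals = np.stack([-dz_dx, -dz_dy, np.ones_like(dz_dx)], axis=-1)
    # Normalize each per-pixel vector to unit length.
    normals /= np.linalg.norm(normals, axis=-1, keepdims=True)
    return normals


def consistency_error(depth: np.ndarray, normals: np.ndarray) -> float:
    """Mean angle in degrees between depth-derived and given normals."""
    derived = normals_from_depth(depth)
    # Dot product of unit vectors is the cosine of the angle between them.
    cosine = np.clip(np.sum(derived * normals, axis=-1), -1.0, 1.0)
    return float(np.degrees(np.arccos(cosine)).mean())


# Toy example: a tilted plane has the same normal at every pixel.
ys, xs = np.mgrid[0:8, 0:10]
plane_depth = 0.5 * xs + 0.25 * ys + 2.0
plane_normals = normals_from_depth(plane_depth)
```

A low consistency_error between a predicted depth map and a predicted normal map would suggest the two outputs describe the same surface, which is the property the joint prediction is meant to encourage.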
Please like and share this post if you enjoyed it using the buttons at the bottom!
Stay up to date. Subscribe to my posts https://morrislee1234.wixsite.com/website/contact
Web site with my other posts by category https://morrislee1234.wixsite.com/website