Get 3D scene from monocular video without camera pose using frozen depth models with FrozenRecon
FrozenRecon: Pose-free 3D Scene Reconstruction with Frozen Depth Models
arXiv paper abstract https://arxiv.org/abs/2308.05733
arXiv PDF paper https://arxiv.org/pdf/2308.05733.pdf
Project page https://aim-uofa.github.io/FrozenRecon
3D scene reconstruction is a long-standing vision task.
... learning-based methods ... learning 2D or 3D representation directly. However, without large-scale video or 3D training data, they can hardly generalize to diverse real-world scenarios.
... monocular depth estimation models ... possess a weak 3D geometry prior, but they are insufficient for reconstruction on their own because of the unknown camera parameters, the affine-invariant property (depth is predicted only up to an unknown scale and shift), and inter-frame inconsistency.
... propose a novel test-time optimization approach that can transfer the robustness of affine-invariant depth models such as LeReS to challenging diverse scenes while ensuring inter-frame consistency, with only dozens of parameters to optimize per video frame.
... approach involves freezing the pre-trained affine-invariant depth model's depth predictions, rectifying them by optimizing the unknown scale-shift values with a geometric consistency alignment module, and employing the resulting scale-consistent depth maps to robustly obtain camera poses and achieve dense scene reconstruction, even in low-texture regions.
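To make the "unknown scale-shift values" concrete, here is a minimal sketch of what rectifying an affine-invariant depth map means: finding a scale and a shift such that scale * prediction + shift gives metrically consistent depth. This is not FrozenRecon's geometric consistency alignment module. The paper optimizes these values from consistency between video frames, without any reference depth. The sketch swaps in a plain least-squares fit against a synthetic reference depth map, purely as an illustration. The function name fit_scale_shift and all values are hypothetical.

```python
import numpy as np

def fit_scale_shift(pred, ref):
    """Least-squares (scale, shift) such that scale * pred + shift ~= ref.

    Illustrative stand-in only: FrozenRecon has no reference depth and
    instead optimizes scale and shift via inter-frame geometric consistency.
    """
    pred = np.asarray(pred, dtype=np.float64).ravel()
    ref = np.asarray(ref, dtype=np.float64).ravel()
    # Design matrix [pred, 1] so the solution vector is (scale, shift).
    A = np.stack([pred, np.ones_like(pred)], axis=1)
    (scale, shift), *_ = np.linalg.lstsq(A, ref, rcond=None)
    return float(scale), float(shift)

# Synthetic example (assumed values): a "true" depth map and an
# affine-invariant prediction that differs by an unknown scale and shift.
rng = np.random.default_rng(0)
true_depth = rng.uniform(1.0, 5.0, size=(4, 6))
affine_pred = (true_depth - 0.7) / 2.5  # hidden scale = 2.5, shift = 0.7

scale, shift = fit_scale_shift(affine_pred, true_depth)
rectified = scale * affine_pred + shift  # depth with scale and shift restored
```

Only two numbers per frame are recovered here, which matches the spirit of the abstract's claim that few parameters need optimizing per frame while the depth model itself stays frozen.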
... method achieves state-of-the-art cross-dataset reconstruction on five zero-shot testing datasets.
Please like and share this post if you enjoyed it using the buttons at the bottom!
Stay up to date. Subscribe to my posts https://morrislee1234.wixsite.com/website/contact
Web site with my other posts by category https://morrislee1234.wixsite.com/website