Get 3D scene from monocular images using diffusion to generate RGBD samples for training with D4D
D4D: An RGBD diffusion model to boost monocular depth estimation
arXiv paper abstract: https://arxiv.org/abs/2403.07516
arXiv paper PDF: https://arxiv.org/pdf/2403.07516.pdf
Ground-truth RGBD data are fundamental for a wide range of computer vision applications; however, those labeled samples are difficult to collect and time-consuming to produce.
A common solution to this lack of data is to employ graphics engines to produce synthetic proxies; however, such data often do not reflect real-world images, resulting in poor performance of the trained models at inference.
... propose a novel training pipeline that incorporates Diffusion4D (D4D), a customized 4-channel diffusion model able to generate realistic RGBD samples (see the first sketch below).
... show the effectiveness of the developed solution in improving the performance of deep learning models on the monocular depth estimation task, where the correspondence between RGB and depth map is crucial to achieving accurate measurements.
... supervised training pipeline, enriched with the generated samples, outperforms training on synthetic or original data alone (see the second sketch below) ...
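
To make the 4-channel idea concrete, here is a minimal sketch (in PyTorch with the diffusers library) of jointly sampling RGB and depth from a diffusion model whose input and output both carry four channels. This is not the authors' code: the resolution, model size, and step count are assumptions, and the model here is randomly initialized, whereas in practice it would first be trained on RGBD data.

# Minimal sketch (not the authors' code): a DDPM that denoises four
# channels jointly, three for RGB and one for a normalized depth map
# treated as a fourth image channel.
import torch
from diffusers import UNet2DModel, DDPMScheduler

model = UNet2DModel(sample_size=64, in_channels=4, out_channels=4)
scheduler = DDPMScheduler(num_train_timesteps=1000)
scheduler.set_timesteps(50)  # shortened schedule for a quick sample

# Start from pure Gaussian noise over all four channels and denoise.
x = torch.randn(1, 4, 64, 64)
with torch.no_grad():
    for t in scheduler.timesteps:
        eps = model(x, t).sample                 # predicted noise
        x = scheduler.step(eps, t, x).prev_sample

rgb, depth = x[:, :3], x[:, 3:]                  # split the joint RGBD sample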
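
And a rough illustration of the enrichment step: generated RGBD pairs are simply pooled with real ones before supervised depth training. The random tensors and the tiny network below are placeholders for real data and a real depth estimator, not the paper's setup.

# Rough illustration (assumptions throughout): enriching a supervised
# depth-estimation training set with generated RGBD pairs.
import torch
import torch.nn as nn
from torch.utils.data import ConcatDataset, DataLoader, TensorDataset

# Toy stand-ins for real and generated (RGB, depth) pairs.
real_rgb, real_depth = torch.rand(32, 3, 64, 64), torch.rand(32, 1, 64, 64)
gen_rgb, gen_depth = torch.rand(32, 3, 64, 64), torch.rand(32, 1, 64, 64)

# The core idea: real and generated pairs go into one training set.
dataset = ConcatDataset([
    TensorDataset(real_rgb, real_depth),
    TensorDataset(gen_rgb, gen_depth),
])
loader = DataLoader(dataset, batch_size=8, shuffle=True)

# Placeholder depth estimator; any monocular depth model fits here.
net = nn.Sequential(
    nn.Conv2d(3, 32, 3, padding=1), nn.ReLU(),
    nn.Conv2d(32, 1, 3, padding=1),
)
opt = torch.optim.Adam(net.parameters(), lr=1e-4)

for rgb, depth in loader:                        # one pass over the mixed set
    loss = nn.functional.l1_loss(net(rgb), depth)  # common depth regression loss
    opt.zero_grad()
    loss.backward()
    opt.step()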
Please like and share this post if you enjoyed it using the buttons at the bottom!
Stay up to date. Subscribe to my posts https://morrislee1234.wixsite.com/website/contact
Web site with my other posts by category https://morrislee1234.wixsite.com/website