Get 3D objects and poses from a single RGB-D image without real-world 3D labels, using FSD
FSD: Fast Self-Supervised Single RGB-D to Categorical 3D Objects
arXiv paper abstract https://arxiv.org/abs/2310.12974
arXiv PDF paper https://arxiv.org/pdf/2310.12974.pdf
Project page https://fsd6d.github.io
... address the challenging task of 3D object recognition without the reliance on real-world 3D labeled data.
... goal is to predict the 3D shape, size, and 6D pose of objects within a single RGB-D image, operating at the category level and eliminating the need for CAD models during inference.
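To make the predicted quantities concrete: a 6D pose is a 3D rotation plus a 3D translation, and together with a per-axis size it places an object-aligned bounding box in the camera frame. The sketch below is only an illustration of that geometry, not code from the paper; the helper names `rotation_z` and `box_corners` are hypothetical, and the rotation is restricted to the z axis to keep the example short.

```python
import math
from itertools import product


def rotation_z(angle_rad):
    """3x3 rotation matrix about the z axis (illustrative choice of axis)."""
    c, s = math.cos(angle_rad), math.sin(angle_rad)
    return [[c, -s, 0.0],
            [s, c, 0.0],
            [0.0, 0.0, 1.0]]


def box_corners(size, rotation, translation):
    """Return the 8 corners of a box of the given size after applying the pose.

    size:        (dx, dy, dz) extents of the object along its own axes
    rotation:    3x3 rotation matrix (3 degrees of freedom of the 6D pose)
    translation: (tx, ty, tz) offset (the other 3 degrees of freedom)
    """
    corners = []
    for signs in product((-0.5, 0.5), repeat=3):
        # Corner in the object's own frame, centered at the origin.
        local = [sign * extent for sign, extent in zip(signs, size)]
        # Rotate, then translate into the camera frame.
        placed = [
            sum(rotation[r][c] * local[c] for c in range(3)) + translation[r]
            for r in range(3)
        ]
        corners.append(placed)
    return corners
```

For example, `box_corners((2.0, 4.0, 6.0), rotation_z(0.0), (1.0, 1.0, 1.0))` gives the corners of an axis-aligned box centered at (1, 1, 1).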
... existing self-supervised methods ... often ... inefficiencies ... from non-end-to-end processing, reliance on separate models for different object categories, and slow surface extraction during the training of implicit reconstruction models
... proposed method leverages a multi-stage training pipeline, designed to efficiently transfer synthetic performance to the real-world domain.
... achieved through ... 2D and 3D supervised losses during the synthetic domain training, followed by the incorporation of 2D supervised and 3D self-supervised losses on real-world data in two additional learning stages.
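The staged schedule described in the excerpts can be written down as a small configuration sketch. This is an assumption-laden illustration, not the authors' code: the names `Stage`, `SCHEDULE`, `total_loss`, and the loss keys are hypothetical, loss terms are summed without weights, and because the excerpt does not say how the two real-world stages differ, both are given the same loss set here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Stage:
    name: str
    domain: str    # "synthetic" or "real"
    losses: tuple  # names of the loss terms active in this stage


# Hypothetical encoding of the three stages named in the abstract.
SCHEDULE = (
    Stage("stage_1", "synthetic", ("2d_supervised", "3d_supervised")),
    Stage("stage_2", "real", ("2d_supervised", "3d_self_supervised")),
    Stage("stage_3", "real", ("2d_supervised", "3d_self_supervised")),
)


def total_loss(stage, loss_values):
    """Sum only the loss terms that are active in the given stage."""
    missing = [name for name in stage.losses if name not in loss_values]
    if missing:
        raise KeyError(f"missing loss terms for {stage.name}: {missing}")
    return sum(loss_values[name] for name in stage.losses)


def uses_real_3d_labels(schedule):
    """True if any real-world stage relies on a 3D supervised loss."""
    return any(
        stage.domain == "real" and "3d_supervised" in stage.losses
        for stage in schedule
    )
```

Under this encoding, `uses_real_3d_labels(SCHEDULE)` is `False`, which matches the stated goal of avoiding real-world 3D labeled data: 3D supervision appears only in the synthetic stage.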
... method ... overcomes the aforementioned limitations and outperforms existing self-supervised 6D pose and size estimation baselines ... while running in near real-time at 5 Hz.