Recognize 3D objects when only trained on 2D image and text pairs with PointCLIP
PointCLIP: Point Cloud Understanding by CLIP
arXiv paper abstract https://arxiv.org/abs/2112.02413v1
arXiv PDF paper https://arxiv.org/pdf/2112.02413v1.pdf
... explored whether CLIP, pre-trained on large-scale image-text pairs in 2D, can be generalized to 3D recognition.
... proposing PointCLIP, which conducts alignment between CLIP-encoded point clouds and 3D category texts.
... encode a point cloud by projecting it into multi-view depth maps without rendering, and aggregate the view-wise zero-shot prediction to achieve knowledge transfer from 2D to 3D.
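The projection and aggregation steps in the excerpt above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the paper's implementation: the orthographic projection, the four viewing angles, the equal view weights, and the helper names `rotate_y`, `depth_map`, and `aggregate_views` are all hypothetical, and random matrices stand in for CLIP's visual and text encoders so the example runs with NumPy alone.

```python
import numpy as np

def rotate_y(points, angle_deg):
    """Rotate an (N, 3) point cloud about the vertical (y) axis."""
    a = np.deg2rad(angle_deg)
    rot = np.array([[np.cos(a), 0.0, np.sin(a)],
                    [0.0, 1.0, 0.0],
                    [-np.sin(a), 0.0, np.cos(a)]])
    return points @ rot.T

def depth_map(points, size=32):
    """Project points along the z axis onto a size x size grid (no rendering).

    Assumption: a simple orthographic projection where each pixel keeps
    the nearest point and nearer points are brighter.
    """
    mins = points.min(axis=0)
    span = (points.max(axis=0) - mins).max() + 1e-8
    p = (points - mins) / span                      # all coordinates in [0, 1)
    cols = (p[:, 0] * (size - 1)).round().astype(int)
    rows = ((1.0 - p[:, 1]) * (size - 1)).round().astype(int)
    nearness = 1.0 - p[:, 2]                        # small z means near
    img = np.zeros((size, size))
    np.maximum.at(img, (rows, cols), nearness)      # keep the nearest point per pixel
    return img

def aggregate_views(view_features, text_features, view_weights):
    """Fuse view-wise zero-shot predictions into one class distribution.

    view_features: (V, D), text_features: (K, D), view_weights: (V,)
    """
    v = view_features / np.linalg.norm(view_features, axis=1, keepdims=True)
    t = text_features / np.linalg.norm(text_features, axis=1, keepdims=True)
    logits = v @ t.T                                # (V, K) cosine similarities
    fused = view_weights @ logits                   # weighted sum over views
    e = np.exp(fused - fused.max())
    return e / e.sum()

rng = np.random.default_rng(0)
cloud = rng.normal(size=(1024, 3))                  # toy point cloud
views = [depth_map(rotate_y(cloud, a)) for a in (0, 90, 180, 270)]

# Placeholders (assumptions): random projections instead of CLIP encoders.
visual_stub = rng.normal(size=(32 * 32, 64))
view_feats = np.stack([v.ravel() @ visual_stub for v in views])
text_feats = rng.normal(size=(10, 64))              # one row per category prompt

probs = aggregate_views(view_feats, text_feats, np.full(4, 0.25))
print(probs.shape, probs.sum())
```

In a real setup, each depth map would be passed through CLIP's image encoder and each category name through its text encoder; only the projection and the weighted fusion of per-view scores are the point of this sketch.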
... By simple ensembling, PointCLIP boosts the baseline's performance and even surpasses state-of-the-art models.
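The excerpt does not say how the ensembling is done. One simple possibility, assumed here purely for illustration, is to add the class logits of two models and take the argmax. The function name `ensemble_logits` and all numbers below are made up.

```python
import numpy as np

def ensemble_logits(logits_a, logits_b):
    """Assumed ensembling rule: add per-class logits, then pick the top class."""
    return np.argmax(logits_a + logits_b, axis=-1)

# Hypothetical logits for 2 samples and 3 classes (not real results).
clip_logits = np.array([[1.0, 0.9, 0.0],
                        [2.0, 0.5, 0.1]])
net3d_logits = np.array([[0.0, 0.9, 1.0],
                         [1.5, 0.2, 0.3]])

pred = ensemble_logits(clip_logits, net3d_logits)
print(pred)
```

For the first sample the two models disagree (classes 0 and 2), and the summed logits favor class 1, which both models ranked second. This shows how two models with complementary errors can change the final decision.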
Therefore, PointCLIP is a promising alternative for effective 3D point cloud understanding via CLIP at low resource cost and in low-data regimes.
... experiments on the widely adopted ModelNet10, ModelNet40 and the challenging ScanObjectNN to demonstrate the effectiveness of PointCLIP. ...
Stay up to date. Subscribe to my posts https://morrislee1234.wixsite.com/website/contact
Web site with my other posts by category https://morrislee1234.wixsite.com/website
#ComputerVision #3D #AINewsClips #AI #ML #ArtificialIntelligence #MachineLearning