News to help your R&D in artificial intelligence, machine learning, robotics, computer vision, smart hardware

As an Amazon Associate I earn from qualifying purchases.

morrislee

One model for 2D and 3D perception using positional encodings of 2D and 3D tokens with ODIN


ODIN: A Single Model for 2D and 3D Perception



... models on contemporary 3D perception ... consume and label dataset-provided 3D point clouds, obtained through post-processing of sensed multiview RGB-D images.


They are typically trained in-domain, forgo large-scale 2D pre-training, and outperform alternatives that featurize the posed RGB-D multiview images instead.
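To make "posed RGB-D multiview images" concrete, here is a minimal sketch of how one depth image can be lifted to 3D points with a standard pinhole camera model. It is an illustration only, not the paper's pipeline: the function name `unproject_depth`, the intrinsics `fx, fy, cx, cy` and the 4x4 `cam_to_world` pose matrix are all assumptions made for this example.

```python
import numpy as np

def unproject_depth(depth, fx, fy, cx, cy, cam_to_world):
    """Lift a depth image to world-space 3D points (pinhole model, hypothetical helper).

    depth:        (H, W) array of depths along the camera z axis
    fx, fy:       focal lengths in pixels (assumed known)
    cx, cy:       principal point in pixels (assumed known)
    cam_to_world: (4, 4) camera-to-world pose matrix (assumed known)
    """
    h, w = depth.shape
    # v indexes rows (image y), u indexes columns (image x)
    v, u = np.meshgrid(np.arange(h), np.arange(w), indexing="ij")
    z = depth
    x = (u - cx) * z / fx
    y = (v - cy) * z / fy
    # Homogeneous camera-space points, one row per pixel
    pts_cam = np.stack([x, y, z, np.ones_like(z)], axis=-1).reshape(-1, 4)
    # Apply the pose to every row: (T @ p)^T == p^T @ T^T
    pts_world = pts_cam @ cam_to_world.T
    return pts_world[:, :3]

# Toy example: a 2x2 depth map, camera shifted by +1 along world x
depth = np.full((2, 2), 2.0)
pose = np.eye(4)
pose[0, 3] = 1.0
points = unproject_depth(depth, fx=1.0, fy=1.0, cx=0.5, cy=0.5, cam_to_world=pose)
print(points.shape)
```

Each pixel ends up with both a 2D position (its pixel coordinates) and a 3D position (its unprojected point), which is the kind of information a model working on RGB-D views can draw on.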


... propose ODIN (Omni-Dimensional INstance segmentation), a model that can segment and label both 2D RGB images and 3D point clouds, using a transformer architecture that alternates between 2D within-view and 3D cross-view information fusion.


... model differentiates 2D and 3D feature operations through the positional encodings of the tokens involved.
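The idea of alternating 2D within-view and 3D cross-view fusion, with the two told apart only by the positional encodings of the tokens, can be sketched roughly as below. This is a toy NumPy illustration under stated assumptions, not ODIN's actual architecture: the sinusoidal encoding, the single-head attention without learned weights, and the names `sinusoidal_encoding`, `attention` and `alternating_fusion_block` are all hypothetical choices for this example.

```python
import numpy as np

def sinusoidal_encoding(coords, dim):
    """Sin/cos features per coordinate axis, zero-padded to `dim` (assumed scheme)."""
    n, k = coords.shape
    half = (dim // k) // 2
    freqs = 1.0 / (100.0 ** (np.arange(half) / max(half, 1)))
    angles = coords[:, :, None] * freqs[None, None, :]          # (n, k, half)
    enc = np.concatenate([np.sin(angles), np.cos(angles)], axis=-1).reshape(n, -1)
    out = np.zeros((n, dim))
    out[:, :enc.shape[1]] = enc
    return out

def attention(feats, pos, mask=None):
    """Single-head self-attention with a residual; no learned weights (toy version)."""
    q = k = feats + pos                      # positions enter through queries and keys
    scores = q @ k.T / np.sqrt(feats.shape[1])
    if mask is not None:
        scores = np.where(mask, scores, -np.inf)   # block disallowed token pairs
    scores -= scores.max(axis=1, keepdims=True)
    weights = np.exp(scores)
    weights /= weights.sum(axis=1, keepdims=True)
    return feats + weights @ feats

def alternating_fusion_block(feats, pixel_xy, xyz, view_id):
    dim = feats.shape[1]
    # 2D within-view fusion: tokens carry 2D pixel encodings, attend only inside their view
    same_view = view_id[:, None] == view_id[None, :]
    feats = attention(feats, sinusoidal_encoding(pixel_xy, dim), mask=same_view)
    # 3D cross-view fusion: the same tokens carry 3D encodings and attend across all views
    feats = attention(feats, sinusoidal_encoding(xyz, dim))
    return feats

# Toy input: 2 views of 2x2 pixels each, random features and random 3D positions
rng = np.random.default_rng(0)
dim = 12
view_id = np.repeat([0, 1], 4)
pixel_xy = np.tile(np.array([[0, 0], [1, 0], [0, 1], [1, 1]]), (2, 1))
xyz = rng.random((8, 3))
feats = rng.normal(size=(8, dim))
out = alternating_fusion_block(feats, pixel_xy, xyz, view_id)
print(out.shape)  # (8, 12)
```

In this sketch the same attention routine serves both stages. Only the coordinates fed to the encoding (2D pixels versus 3D points) and the view mask change.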


... ODIN achieves state-of-the-art performance on ... 3D instance segmentation benchmarks, and competitive performance on ScanNet, S3DIS and COCO.


It outperforms all previous works by a wide margin when the sensed 3D point cloud is used in place of the point cloud sampled from 3D mesh ...



If you enjoyed this post, please like and share it using the buttons at the bottom!


Stay up to date. Subscribe to my posts https://morrislee1234.wixsite.com/website/contact 

Web site with my other posts by category https://morrislee1234.wixsite.com/website 


