top of page

News to help your R&D in artificial intelligence, machine learning, robotics, computer vision, smart hardware

As an Amazon Associate I earn

from qualifying purchases

Segment 3D scene with unknown objects by train with foundation model text descriptions with 3D-OVS

Segment 3D scene with unknown objects by train with foundation model text descriptions with 3D-OVS


Weakly Supervised 3D Open-vocabulary Segmentation



Open-vocabulary segmentation of 3D scenes is a fundamental function of human perception and thus a crucial objective in computer vision research.


However, this task is heavily impeded by the lack of ... 3D open-vocabulary segmentation datasets for training robust and generalizable models.


... tackle the challenges in 3D open-vocabulary segmentation by exploiting pre-trained foundation models CLIP and DINO in a weakly supervised manner.


... given only the open-vocabulary text descriptions of the objects in a scene, ... distill the open-vocabulary multimodal knowledge and object reasoning capability of CLIP and DINO into a neural radiance field (NeRF), which effectively lifts 2D features into view-consistent 3D segmentation.


... approach is that it does not require any manual segmentation annotations for either the foundation models or the distillation process.


... outperforms fully supervised models trained with segmentation annotations in certain scenes, suggesting that 3D open-vocabulary segmentation can be effectively learned from 2D images and text-image pairs ...



Please like and share this post if you enjoyed it using the buttons at the bottom!


Stay up to date. Subscribe to my posts https://morrislee1234.wixsite.com/website/contact 

Web site with my other posts by category https://morrislee1234.wixsite.com/website 



50 views0 comments

ClickBank paid link

bottom of page