Segment 3D scenes with unknown objects using NeRF and region ranking with the CLIP foundation model: OV-NeRF

morrislee
Feb 8, 2024

OV-NeRF: Open-vocabulary Neural Radiance Fields with Vision and Language Foundation Models for 3D Semantic Understanding

arXiv paper abstract https://arxiv.org/abs/2402.04648

arXiv PDF paper https://arxiv.org/pdf/2402.04648.pdf

The development of Neural Radiance Fields (NeRFs) has provided ... open-vocabulary 3D semantic perception ... However ... methods that extract semantics ... from Contrastive Language-Image Pretraining (CLIP) for ... learning encounter difficulties

... propose OV-NeRF, which exploits the potential of pre-trained vision and language foundation models to enhance semantic field learning through proposed single-view and cross-view strategies.

First, from the single-view perspective, ... introduce Region Semantic Ranking (RSR) regularization by leveraging 2D mask proposals derived from SAM to rectify the noisy semantics of each training view
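To make the single-view idea concrete, here is a minimal sketch of how region-level ranking could clean up noisy per-pixel semantics. It is an illustration under stated assumptions, not the paper's RSR implementation: the abstract does not give the exact ranking rule, so this sketch assumes that each region's class scores are averaged and the top-ranked class is assigned to the whole region. The function name `region_rank_rectify`, the `relevancy` array (standing in for per-pixel CLIP class scores), and the `masks` list (standing in for SAM mask proposals) are all hypothetical.

```python
import numpy as np

def region_rank_rectify(relevancy, masks):
    """Assign one label per region by ranking region-averaged class scores.

    relevancy: (H, W, C) float array of per-pixel class scores
               (assumed to stand in for CLIP-derived relevancy maps).
    masks:     list of (H, W) boolean arrays
               (assumed to stand in for SAM mask proposals).
    Returns an (H, W) integer label map. Pixels outside every mask keep
    their original per-pixel argmax label.
    """
    # Start from the noisy per-pixel prediction.
    labels = relevancy.argmax(axis=-1)
    for mask in masks:
        if not mask.any():
            continue  # skip empty proposals
        # Average the class scores over the region: shape (C,).
        region_scores = relevancy[mask].mean(axis=0)
        # Rank classes from highest to lowest region score.
        ranking = np.argsort(region_scores)[::-1]
        # Assumption: the top-ranked class overrides every pixel in the region.
        labels[mask] = ranking[0]
    return labels

# Toy view: 2x4 pixels, 2 classes, one noisy pixel inside the left region.
relevancy = np.zeros((2, 4, 2))
relevancy[..., 0] = 0.8
relevancy[..., 1] = 0.2
relevancy[0, 1] = [0.3, 0.7]          # noisy pixel that prefers class 1
left_region = np.zeros((2, 4), dtype=bool)
left_region[:, :2] = True

noisy_labels = relevancy.argmax(axis=-1)
rectified = region_rank_rectify(relevancy, [left_region])
```

In this toy case the noisy pixel is outvoted by the rest of its region, so the rectified map is uniform inside the mask. A real pipeline would also have to handle overlapping proposals, which this sketch resolves simply by letting later masks overwrite earlier ones.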

... Second, from the cross-view perspective, ... propose a Cross-view Self-enhancement (CSE) strategy to address the challenge raised by view-inconsistent semantics.

Rather than invariably utilizing the 2D inconsistent semantics from CLIP, CSE leverages the 3D consistent semantics generated from the well-trained semantic field itself for semantic field training, aiming to ... enhance overall semantic consistency across different views.
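The switch in supervision described here can be sketched in a few lines. This is a hedged illustration, not the authors' CSE procedure: the abstract does not say when or how the field's own renderings replace the CLIP-derived labels, so the warm-up threshold `warmup_steps`, the helper names `supervision_labels` and `cross_entropy`, and the array shapes are all assumptions made for the example.

```python
import numpy as np

def supervision_labels(clip_labels, rendered_probs, step, warmup_steps):
    """Pick the training target for one view.

    clip_labels:    (H, W) int array of 2D labels derived from CLIP
                    (possibly inconsistent across views).
    rendered_probs: (H, W, C) class probabilities rendered from the
                    semantic field for the same view.
    Assumption: before `warmup_steps` the field is not yet reliable, so the
    CLIP labels are used; afterwards the field's own rendering is used as a
    pseudo-label.
    """
    if step < warmup_steps:
        return clip_labels
    return rendered_probs.argmax(axis=-1)

def cross_entropy(pred_probs, labels, eps=1e-8):
    """Mean per-pixel cross-entropy between (H, W, C) probabilities and (H, W) labels."""
    h, w, _ = pred_probs.shape
    rows = np.arange(h)[:, None]
    cols = np.arange(w)[None, :]
    picked = pred_probs[rows, cols, labels]  # probability of the target class
    return float(-np.log(picked + eps).mean())

# Toy view: 1x2 pixels, 2 classes. CLIP disagrees with the field on pixel 1.
clip_labels = np.array([[0, 1]])
rendered_probs = np.array([[[0.9, 0.1], [0.8, 0.2]]])

early_target = supervision_labels(clip_labels, rendered_probs, step=0, warmup_steps=10)
late_target = supervision_labels(clip_labels, rendered_probs, step=10, warmup_steps=10)
late_loss = cross_entropy(rendered_probs, late_target)
```

Because the pseudo-labels come from a single 3D field, renderings of the same surface point from different views agree by construction, which is the consistency property the paragraph above attributes to CSE.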

... OV-NeRF outperforms current state-of-the-art methods ... approach exhibits consistent superior results across various CLIP configurations, further verifying its robustness.


Stay up to date. Subscribe to my posts https://morrislee1234.wixsite.com/website/contact

Web site with my other posts by category https://morrislee1234.wixsite.com/website

LinkedIn https://www.linkedin.com/in/morris-lee-47877b7b

#ComputerVision #Segmentation #AINewsClips #AI #ML #ArtificialIntelligence #MachineLearning

