Segment scenes with unknown objects by enhancing the localization capabilities of CLIP with NACLIP

morrislee
Apr 15, 2024

Pay Attention to Your Neighbours: Training-Free Open-Vocabulary Semantic Segmentation

arXiv paper abstract https://arxiv.org/abs/2404.08181

arXiv PDF paper https://arxiv.org/pdf/2404.08181.pdf

GitHub https://github.com/sinahmr/NACLIP

... vision-language ... models, such as CLIP, have ... effectiveness in ... zero-shot image-level tasks ... work has investigated ... these models in open-vocabulary semantic segmentation (OVSS).

However, existing approaches often rely on impractical supervised pre-training or access to additional pre-trained networks.

... propose a strong baseline for training-free OVSS, termed Neighbour-Aware CLIP (NACLIP), representing a straightforward adaptation of CLIP tailored for this scenario.
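To make the training-free setting concrete, here is a minimal sketch of how a CLIP-style model can label image patches with no extra training: each patch embedding is compared against the text embedding of every class name, and the closest class wins. The function names and the tiny 2-D embeddings are made up for illustration; real CLIP embeddings have hundreds of dimensions and come from the model's image and text encoders.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def label_patches(patch_embeddings, text_embeddings):
    """Assign each patch the class name whose text embedding is most similar."""
    labels = []
    for patch in patch_embeddings:
        best = max(text_embeddings, key=lambda name: cosine(patch, text_embeddings[name]))
        labels.append(best)
    return labels


# Toy, hand-written embeddings (hypothetical values, not real CLIP outputs).
text_embeddings = {"cat": [1.0, 0.0], "grass": [0.0, 1.0]}
patch_embeddings = [[0.9, 0.1], [0.2, 0.8], [0.7, 0.3]]

labels = label_patches(patch_embeddings, text_embeddings)
print(labels)
```

Because the class names are just text, the label set can be changed at inference time, which is what makes the segmentation open-vocabulary.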

... enforces localization of patches in the self-attention of CLIP's vision transformer which, despite being crucial for dense prediction tasks, has been overlooked in the OVSS literature.
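One way to picture "enforcing localization" in self-attention is to add a spatial bias to the attention logits so that each patch attends more strongly to its neighbours on the patch grid. The sketch below assumes a Gaussian bias over patch distance, chosen here only for illustration; the excerpt above does not spell out the exact formulation, so see the paper and the GitHub repository for the authors' version. The names `gaussian_bias`, `localized_attention`, and `sigma` are hypothetical.

```python
import math


def gaussian_bias(rows, cols, sigma):
    """Bias matrix over all patch pairs: largest for a patch and itself,
    decaying with squared grid distance between the two patches."""
    coords = [(r, c) for r in range(rows) for c in range(cols)]
    return [
        [
            math.exp(-((r1 - r2) ** 2 + (c1 - c2) ** 2) / (2 * sigma ** 2))
            for (r2, c2) in coords
        ]
        for (r1, c1) in coords
    ]


def softmax(row):
    """Numerically stable softmax over a list of floats."""
    m = max(row)
    exps = [math.exp(x - m) for x in row]
    total = sum(exps)
    return [e / total for e in exps]


def localized_attention(logits, rows, cols, sigma=1.0):
    """Add the spatial bias to raw attention logits, then normalize each row."""
    bias = gaussian_bias(rows, cols, sigma)
    return [
        softmax([l + b for l, b in zip(logit_row, bias_row)])
        for logit_row, bias_row in zip(logits, bias)
    ]


# Uniform logits on a 3x3 patch grid: any structure in the result comes from the bias.
logits = [[0.0] * 9 for _ in range(9)]
attn = localized_attention(logits, rows=3, cols=3, sigma=1.0)
print([round(w, 3) for w in attn[4]])  # attention weights of the centre patch
```

With uniform logits, the centre patch puts its largest weight on itself and progressively less on patches farther away, which is the neighbour-aware behaviour that dense prediction benefits from.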

By ... choices favouring segmentation, ... improves performance without ... additional data, auxiliary pre-trained networks, or extensive hyperparameter tuning.

... Experiments are performed on 8 popular semantic segmentation benchmarks, yielding state-of-the-art performance on most scenarios ...

If you enjoyed this post, please like and share it using the buttons at the bottom!

Stay up to date. Subscribe to my posts https://morrislee1234.wixsite.com/website/contact

Web site with my other posts by category https://morrislee1234.wixsite.com/website

LinkedIn https://www.linkedin.com/in/morris-lee-47877b7b

#ComputerVision #Segmentation #AINewsClips #AI #ML #ArtificialIntelligence #MachineLearning

News to help your R&D in artificial intelligence, machine learning, robotics, computer vision, smart hardware
