Segment untrained objects by grouping visual features and enhancing descriptions with O3S

morrislee
Jul 6, 2023

Multi-Modal Prototypes for Open-Set Semantic Segmentation

arXiv paper abstract https://arxiv.org/abs/2307.02003

arXiv PDF paper https://arxiv.org/pdf/2307.02003.pdf

In semantic segmentation, adapting a visual system to novel object categories at inference time has always been both valuable and challenging.

... existing methods rely on ... support examples as visual cues or class names as textual cues ... these ... two ... studied in isolation, neglecting the complementary nature of low-level visual and high-level language information

... define ... open-set semantic segmentation (O3S), which aims to learn seen and unseen semantics from both visual examples and textual names.

... extracts multi-modal prototypes for the segmentation task, first through single-modal self-enhancement and aggregation, then through multi-modal complementary fusion.

... aggregate visual features into several tokens as visual prototypes, and enhance the class name with detailed descriptions for textual prototype generation. The two modalities are then fused to generate multi-modal prototypes
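
The prototype idea described above can be illustrated with a rough sketch. This is not the paper's implementation. It rests on several assumptions: plain k-means stands in for the learned aggregation of visual features into tokens, averaging the class-name embedding with description embeddings stands in for the textual enhancement, a fixed weight alpha stands in for the learned fusion, and the toy 3-D vectors stand in for real image and text encoder outputs. All function names here are hypothetical.

```python
import math

def normalize(v):
    # Scale a vector to unit length (a zero vector is returned unchanged).
    n = math.sqrt(sum(x * x for x in v))
    return [x / n for x in v] if n > 0 else list(v)

def mean(vectors):
    # Element-wise mean of a non-empty list of equal-length vectors.
    d = len(vectors[0])
    return [sum(v[i] for v in vectors) / len(vectors) for i in range(d)]

def cosine(a, b):
    a, b = normalize(a), normalize(b)
    return sum(x * y for x, y in zip(a, b))

def visual_prototypes(features, num_tokens, iters=10):
    # Assumption: k-means with cosine similarity replaces the learned aggregation.
    centers = [list(f) for f in features[:num_tokens]]
    for _ in range(iters):
        groups = [[] for _ in centers]
        for f in features:
            best = max(range(len(centers)), key=lambda k: cosine(f, centers[k]))
            groups[best].append(f)
        # Keep the old center when a group receives no features.
        centers = [mean(g) if g else c for g, c in zip(groups, centers)]
    return centers

def text_prototype(name_emb, description_embs):
    # Assumption: "enhancing" the class name = averaging it with description embeddings.
    return normalize(mean([name_emb] + description_embs))

def fuse(visual_protos, text_proto, alpha=0.5):
    # Assumption: fixed-weight sum instead of a learned multi-modal fusion module.
    return [
        normalize([alpha * a + (1 - alpha) * b for a, b in zip(normalize(v), text_proto)])
        for v in visual_protos
    ]

def segment(pixel_features, class_protos):
    # Label each pixel with the class whose best-matching prototype is most similar.
    return [
        max(class_protos, key=lambda c: max(cosine(f, p) for p in class_protos[c]))
        for f in pixel_features
    ]

# Toy data: support-image features and text embeddings for two classes.
support = {
    "cat": [[1.0, 0.1, 0.0], [0.9, 0.0, 0.1]],
    "dog": [[0.0, 1.0, 0.1], [0.1, 0.9, 0.0]],
}
text = {
    "cat": ([1.0, 0.0, 0.0], [[0.8, 0.2, 0.0]]),
    "dog": ([0.0, 1.0, 0.0], [[0.2, 0.8, 0.0]]),
}
class_protos = {
    name: fuse(visual_protos=visual_prototypes(support[name], num_tokens=1),
               text_proto=text_prototype(*text[name]))
    for name in support
}
labels = segment([[0.95, 0.05, 0.0], [0.05, 0.95, 0.0]], class_protos)
print(labels)  # ['cat', 'dog']
```

Because a class is represented only by its prototypes, a new category can be added at inference time by supplying a few support features and a name with descriptions, without retraining the pixel encoder.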

... State-of-the-art results are achieved even on more detailed part-segmentation, Pascal-Animals, by only training on coarse-grained datasets ...

If you enjoyed this post, please like and share it using the buttons at the bottom!

Stay up to date. Subscribe to my posts https://morrislee1234.wixsite.com/website/contact

Web site with my other posts by category https://morrislee1234.wixsite.com/website

LinkedIn https://www.linkedin.com/in/morris-lee-47877b7b

#ComputerVision #Segmentation #AINewsClips #AI #ML #ArtificialIntelligence #MachineLearning

News to help your R&D in artificial intelligence, machine learning, robotics, computer vision, smart hardware
