Segment an image into an open set of categories using cost maps from a hierarchical encoder with SED
SED: A Simple Encoder-Decoder for Open-Vocabulary Semantic Segmentation
arXiv paper abstract https://arxiv.org/abs/2311.15537
arXiv PDF paper https://arxiv.org/pdf/2311.15537.pdf
Open-vocabulary semantic segmentation strives to distinguish pixels into different semantic groups from an open set of categories.
... propose a simple encoder-decoder, named SED, for open-vocabulary semantic segmentation, which comprises a hierarchical encoder-based cost map generation and a gradual fusion decoder with category early rejection.
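To make the "cost map" idea concrete, here is a minimal sketch, assuming the cost map is the cosine similarity between each pixel's image embedding and each category's text embedding. The abstract quoted here does not spell out the formula, so treat that choice, the function names, and the toy shapes as illustrative assumptions rather than the authors' implementation.

```python
import math

def l2_normalize(vec):
    """Scale a vector to unit length (zero vectors are returned unchanged)."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm > 0 else list(vec)

def cost_map(pixel_features, text_embeddings):
    """Hypothetical helper: cosine similarity of every pixel to every category.

    pixel_features:  H x W x D nested lists (one D-dim embedding per pixel)
    text_embeddings: N x D nested lists (one D-dim embedding per category)
    returns:         H x W x N nested lists of similarities
    """
    texts = [l2_normalize(t) for t in text_embeddings]
    result = []
    for row in pixel_features:
        out_row = []
        for feat in row:
            f = l2_normalize(feat)
            # Dot product of unit vectors equals cosine similarity.
            out_row.append([sum(a * b for a, b in zip(f, t)) for t in texts])
        result.append(out_row)
    return result

# Toy input: a 1 x 2 image with 2-dim features and two categories.
features = [[[1.0, 0.0], [0.0, 2.0]]]
texts = [[3.0, 0.0], [0.0, 1.0]]
cost = cost_map(features, texts)
```

In this toy case the first pixel aligns with the first category and the second pixel with the second, so each pixel gets a similarity of 1.0 for its matching category and 0.0 for the other.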
... Compared to plain transformer, hierarchical backbone better captures local spatial information and has linear computational complexity with respect to input size.
... gradual fusion decoder employs a top-down structure to combine cost map and the feature maps of different backbone levels for segmentation.
To accelerate inference speed, ... introduce a category early rejection scheme in the decoder that rejects many non-existing categories at the early layer of decoder, resulting in at most 4.7 times acceleration without accuracy degradation.
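One way to picture category early rejection is sketched below. It assumes a top-k rule: a category survives only if it ranks among the k highest-scoring categories for at least one pixel, and later decoder layers then work on the surviving categories alone. The rule, the value of k, and the helper names are assumptions made for illustration; the abstract quoted here does not describe the exact selection criterion.

```python
def select_categories(pixel_scores, k):
    """Hypothetical helper: indices of categories in the top-k of any pixel.

    pixel_scores: list of pixels, each a list of N per-category scores
    k:            number of top categories kept per pixel
    """
    keep = set()
    for scores in pixel_scores:
        ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
        keep.update(ranked[:k])
    return sorted(keep)

def prune(pixel_scores, kept):
    """Drop the score columns of rejected categories."""
    return [[scores[i] for i in kept] for scores in pixel_scores]

# Toy input: 3 pixels scored against 4 candidate categories.
scores = [
    [0.9, 0.1, 0.3, 0.0],
    [0.2, 0.1, 0.8, 0.05],
    [0.7, 0.0, 0.6, 0.1],
]
kept = select_categories(scores, k=1)
pruned = prune(scores, kept)
```

Here categories 1 and 3 never rank first for any pixel, so they are rejected and the later layers would process two categories instead of four. The saving grows with the size of the vocabulary, which is consistent with the speed-up the abstract reports.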
Experiments ... demonstrates the efficacy of ... SED method ...
Stay up to date. Subscribe to my posts https://morrislee1234.wixsite.com/website/contact
Web site with my other posts by category https://morrislee1234.wixsite.com/website