Segment scenes with unknown objects using transferable sparse models and efficient tuning with OpenTrans
Transferable and Principled Efficiency for Open-Vocabulary Segmentation
arXiv paper abstract https://arxiv.org/abs/2404.07448
arXiv PDF paper https://arxiv.org/pdf/2404.07448.pdf
Project page https://github.com/Xujxyang/OpenTrans
... pre-trained foundation vision-language models makes Open-Vocabulary Segmentation (OVS) possible.
... this approach introduces heavy computational overheads for two challenges: 1) large model sizes of the backbone; 2) expensive costs during the fine-tuning.
... Although traditional methods such as model compression and efficient fine-tuning can address these challenges, they often rely on heuristics.
... target achieving performance that is ... better than prior OVS works based on large vision-language foundation models, by utilizing smaller models that incur lower training costs.
... make ... efficiency principled and thus seamlessly transferable from one OVS framework to others without further customization.
... demonstrate ... superior trade-off between segmentation accuracy and computation costs over previous works ...