Segment scenes in unknown domains by efficiently fine-tuning Vision Foundation Models with Rein
Stronger, Fewer, & Superior: Harnessing Vision Foundation Models for Domain Generalized Semantic Segmentation
arXiv paper abstract https://arxiv.org/abs/2312.04265
arXiv PDF paper https://arxiv.org/pdf/2312.04265.pdf
... first assess and harness various Vision Foundation Models (VFMs) in the context of Domain Generalized Semantic Segmentation (DGSS).
Driven by the motivation that Leveraging Stronger pre-trained models and Fewer trainable parameters for Superior generalizability, ... introduce a robust fine-tuning approach, namely Rein, to parameter-efficiently harness VFMs for DGSS.
Built upon a set of trainable tokens, each linked to distinct instances, Rein precisely refines and forwards the feature maps from each layer to the next layer within the backbone.
This process produces diverse refinements for different categories within a single image.
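The refinement step described above can be illustrated with a minimal sketch. It is not the authors' implementation: the attention-style softmax similarity between patches and tokens, the additive update, and the names `refine` and `forward` are all assumptions made for illustration. The only ideas taken from the abstract are that a set of trainable tokens adjusts the feature map after each backbone layer, and that the backbone itself stays frozen.

```python
import math

def softmax(row):
    # Numerically stable softmax over a list of scores.
    m = max(row)
    exps = [math.exp(x - m) for x in row]
    total = sum(exps)
    return [e / total for e in exps]

def refine(features, tokens):
    """Adjust each patch feature using a set of tokens (hypothetical scheme).

    features: list of n patch vectors, each of length c
    tokens:   list of m token vectors, each of length c (the trainable part)
    """
    c = len(tokens[0])
    refined = []
    for f in features:
        # Similarity of this patch to every token, scaled as in dot-product attention.
        scores = [sum(a * b for a, b in zip(f, t)) / math.sqrt(c) for t in tokens]
        weights = softmax(scores)
        # The per-patch adjustment is a similarity-weighted mix of the tokens,
        # so different patches in the same image receive different refinements.
        delta = [sum(w * t[j] for w, t in zip(weights, tokens)) for j in range(c)]
        refined.append([x + d for x, d in zip(f, delta)])
    return refined

def forward(features, frozen_layers, tokens_per_layer):
    # Each frozen backbone layer is followed by its own token-based refinement;
    # only the tokens would receive gradient updates during fine-tuning.
    for layer, tokens in zip(frozen_layers, tokens_per_layer):
        features = layer(features)
        features = refine(features, tokens)
    return features

if __name__ == "__main__":
    patches = [[1.0, 2.0], [3.0, 4.0]]
    identity = lambda x: x  # stand-in for a frozen backbone layer
    tokens = [[0.5, -1.0], [0.0, 0.0]]
    print(forward(patches, [identity, identity], [tokens, tokens]))
```

Because each patch weights the tokens by its own similarity scores, patches belonging to different categories receive different adjustments, which is the intuition behind the "diverse refinements" in the abstract. In this sketch the trainable state is just the token lists, which are far smaller than the weights of a real backbone.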
With fewer trainable parameters, Rein efficiently fine-tunes VFMs for DGSS tasks, surprisingly surpassing full parameter fine-tuning.
Extensive experiments across various settings demonstrate that Rein significantly outperforms state-of-the-art methods ...