Segment unknown scene with unsupervised learning using UNet diffusion features with DiffCut
Segment unknown scene with unsupervised learning using UNet diffusion features with DiffCut
Zero-Shot Image Segmentation via Recursive Normalized Cut on Diffusion Features
arXiv paper abstract https://arxiv.org/abs/2406.02842
arXiv PDF paper https://arxiv.org/pdf/2406.02842
Project page https://diffcut-segmentation.github.io
Foundation models have emerged as powerful tools across various domains including language, vision, and multimodal tasks.
While prior works have addressed unsupervised image segmentation, they significantly lag behind supervised models.
... use a diffusion UNet encoder as ... vision encoder and introduce DiffCut, an unsupervised zero-shot segmentation method that solely harnesses the output features from the final self-attention block.
... demonstrate that the utilization of these diffusion features in a graph based segmentation algorithm, significantly outperforms ... state-of-the-art methods on zero-shot segmentation.
... leverage a recursive Normalized Cut algorithm that softly regulates the granularity of detected objects and produces ... segmentation maps that ... capture intricate image details.
... work highlights the remarkably accurate semantic knowledge embedded within diffusion UNet encoders that could then serve as foundation vision encoders for downstream tasks ...
Please like and share this post if you enjoyed it using the buttons at the bottom!
Stay up to date. Subscribe to my posts https://morrislee1234.wixsite.com/website/contact
Web site with my other posts by category https://morrislee1234.wixsite.com/website
תגובות