Segment objects into parts, unsupervised, using knowledge inside Stable Diffusion with EmerDiff
EmerDiff: Emerging Pixel-level Semantic Knowledge in Diffusion Models
arXiv abstract https://arxiv.org/abs/2401.11739
arXiv PDF https://arxiv.org/pdf/2401.11739.pdf
Project page https://kmcode1.github.io/Projects/EmerDiff
Diffusion models have recently received increasing research attention for their remarkable transfer abilities in semantic segmentation tasks.
... generating fine-grained segmentation masks with diffusion models ... requires ... training on annotated datasets ... unclear to what extent pre-trained diffusion models ... understand the semantic relations
... leverage the semantic knowledge extracted from Stable Diffusion (SD) and aim to develop an image segmentor capable of generating fine-grained segmentation maps without any additional training.
... difficulty stems from ... semantically meaningful feature maps typically exist only in the spatially lower-dimensional layers, which poses a challenge in directly extracting pixel-level semantic relations
... framework identifies semantic correspondences between image pixels and spatial locations of low-dimensional feature maps by exploiting SD's generation process and utilizes them for constructing image-resolution segmentation maps.
... segmentation maps are demonstrated to be well delineated and capture detailed parts of the images, indicating the existence of highly accurate pixel-level semantic knowledge in diffusion models.
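The pipeline described in the excerpts above can be sketched at a high level. The toy code below is an illustrative assumption, not the authors' implementation: the diffusion model is replaced by random features, and the pixel-level correspondence step (which EmerDiff actually derives from SD's generation process by modulating segment features and observing the change in the generated image) is approximated here by nearest-neighbor upsampling of the low-resolution cluster labels.

```python
import numpy as np

def kmeans(feats, k, iters=10, seed=0):
    # Simple k-means over (N, D) feature vectors (stand-in for clustering
    # SD's semantically meaningful low-dimensional feature maps).
    rng = np.random.default_rng(seed)
    centers = feats[rng.choice(len(feats), k, replace=False)]
    for _ in range(iters):
        dists = ((feats[:, None, :] - centers[None, :, :]) ** 2).sum(-1)
        labels = dists.argmin(1)
        for j in range(k):
            if (labels == j).any():
                centers[j] = feats[labels == j].mean(0)
    return labels

def segment(low_res_feats, image_size, k=5):
    # low_res_feats: (h, w, D) low-resolution feature map.
    h, w, d = low_res_feats.shape
    labels = kmeans(low_res_feats.reshape(-1, d), k).reshape(h, w)
    # EmerDiff instead builds pixel-to-feature correspondences through the
    # generation process; here we simply replicate each low-res label over
    # its corresponding image patch as a crude stand-in.
    scale = image_size // h
    return np.kron(labels, np.ones((scale, scale), dtype=int))

# Toy example: 16x16 feature grid upsampled to a 512x512 segmentation map.
feats = np.random.default_rng(1).normal(size=(16, 16, 8))
mask = segment(feats, 512, k=5)
print(mask.shape)
```

The resulting `mask` assigns every image pixel one of `k` segment labels; in the real method, the delineation at image resolution comes from the diffusion model itself rather than naive upsampling.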
If you enjoyed this post, please like and share it using the buttons at the bottom!
Stay up to date. Subscribe to my posts https://morrislee1234.wixsite.com/website/contact
Web site with my other posts by category https://morrislee1234.wixsite.com/website