Segment untrained objects from text descriptions using text-to-image diffusion with Peekaboo

morrislee
Jun 22, 2023

Peekaboo: Text to Image Diffusion Models are Zero-Shot Segmentors

arXiv paper abstract https://arxiv.org/abs/2211.13224

arXiv PDF paper https://arxiv.org/pdf/2211.13224.pdf

GitHub https://github.com/RyannDaGreat/Peekaboo

Project page https://ryanndagreat.github.io/peekaboo

Recently, text-to-image diffusion models have shown remarkable capabilities in creating realistic images from natural language prompts.

However, few works have explored using these models for semantic localization or grounding.

... explore how an off-the-shelf text-to-image diffusion model, trained without exposure to localization information, can ground various semantic phrases without segmentation-specific re-training.

... introduce an inference time optimization process capable of generating segmentation masks conditioned on natural language prompts.
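As a rough illustration of what "inference time optimization" means here, the sketch below keeps a scoring signal frozen and updates only the mask by gradient descent. It is a toy under stated assumptions, not the paper's method: `optimize_mask`, the `relevance` grid, and the `sparsity` penalty are hypothetical stand-ins, and no diffusion model is involved. See the paper and the GitHub repository for the actual loss.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def optimize_mask(relevance, steps=200, lr=0.5, sparsity=0.1):
    """Optimize per-pixel mask logits by gradient descent; nothing else is trained.

    relevance: 2D list of floats. This is a HYPOTHETICAL frozen per-pixel score
    (higher = pixel agrees with the text prompt). It stands in for whatever
    signal a real pretrained model would provide.
    """
    h, w = len(relevance), len(relevance[0])
    logits = [[0.0] * w for _ in range(h)]  # mask starts at 0.5 everywhere
    for _ in range(steps):
        for i in range(h):
            for j in range(w):
                a = sigmoid(logits[i][j])
                # Toy loss: sum over pixels of a * (sparsity - relevance).
                # Chain rule through the sigmoid gives the a * (1 - a) factor.
                grad = (sparsity - relevance[i][j]) * a * (1.0 - a)
                logits[i][j] -= lr * grad
    return [[sigmoid(v) for v in row] for row in logits]


# Made-up relevance grid: 1.0 where the prompt "matches", 0.0 elsewhere.
relevance = [[0.0, 1.0, 1.0],
             [0.0, 1.0, 0.0]]
mask = optimize_mask(relevance)
binary = [[1 if a > 0.5 else 0 for a in row] for row in mask]
print(binary)  # [[0, 1, 1], [0, 1, 0]]
```

The point of the sketch is only the shape of the procedure: the scorer is fixed, and the mask is the sole variable being optimized for each new image and prompt.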

... Peekaboo is a first-of-its-kind zero-shot, open-vocabulary, unsupervised semantic grounding technique that leverages diffusion models without any training.

... evaluate Peekaboo on the Pascal VOC dataset for unsupervised semantic segmentation and the RefCOCO dataset for referring segmentation, showing competitive and promising results ...
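Segmentation benchmarks like these are commonly scored with intersection over union (IoU) between a predicted mask and a ground-truth mask. The helper below is a minimal sketch of that metric, assuming binary masks stored as nested lists; `mask_iou` is a hypothetical name, and the exact evaluation protocol used in the paper may differ.

```python
def mask_iou(pred, target):
    """Intersection over union of two binary masks (equal-shaped 2D lists of 0/1)."""
    inter = 0
    union = 0
    for pred_row, target_row in zip(pred, target):
        for p, t in zip(pred_row, target_row):
            inter += 1 if (p and t) else 0
            union += 1 if (p or t) else 0
    # Two empty masks agree perfectly, so treat that case as IoU = 1.
    return inter / union if union else 1.0


# Made-up example masks for illustration only.
predicted = [[0, 1, 1],
             [0, 1, 0]]
ground_truth = [[0, 1, 1],
                [0, 0, 0]]
score = mask_iou(predicted, ground_truth)
print(round(score, 3))  # 0.667
```

Here two pixels overlap and three pixels are covered by either mask, so the score is 2/3.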


Stay up to date. Subscribe to my posts https://morrislee1234.wixsite.com/website/contact

Web site with my other posts by category https://morrislee1234.wixsite.com/website

LinkedIn https://www.linkedin.com/in/morris-lee-47877b7b

#ComputerVision #Segmentation #AINewsClips #AI #ML #ArtificialIntelligence #MachineLearning

News to help your R&D in artificial intelligence, machine learning, robotics, computer vision, smart hardware
