Unsupervised multi-label image classification using many image snippet embeddings with Abdelfattah

morrislee
Aug 1, 2023

CDUL: CLIP-Driven Unsupervised Learning for Multi-Label Image Classification

arXiv paper abstract https://arxiv.org/abs/2307.16634

arXiv PDF paper https://arxiv.org/pdf/2307.16634.pdf

... presents a CLIP-based unsupervised learning method for annotation-free multi-label image classification, including three stages: initialization, training, and inference.

At the initialization stage, ... take full advantage of the powerful CLIP model and propose a novel approach to extend CLIP for multi-label predictions based on global-local image-text similarity aggregation.

... split each image into snippets and leverage CLIP to generate the similarity vector for the whole image (global) as well as each snippet (local). Then a similarity aggregator is introduced to leverage the global and local similarity vectors.
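To make the global-local idea concrete, here is a minimal sketch in plain Python. It is not the authors' code. The image is a toy 2D grid, `embed_image` is a hypothetical stand-in for the CLIP image encoder, the text embeddings are plain lists, and the `temperature` and `threshold` values are assumptions. The aggregation rule (keep the highest snippet score for a class when it clears a threshold, otherwise the lowest, then average with the global score) is one plausible choice used here for illustration only.

```python
import math

def split_into_snippets(image, size):
    """Cut a 2D grid (list of rows) into non-overlapping size x size snippets."""
    height, width = len(image), len(image[0])
    return [
        [row[c:c + size] for row in image[r:r + size]]
        for r in range(0, height - size + 1, size)
        for c in range(0, width - size + 1, size)
    ]

def cosine(u, v):
    """Cosine similarity between two vectors given as lists."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0

def similarity_vector(image_vec, text_vecs, temperature=0.1):
    """Softmax over image-text cosine similarities, one score per class."""
    logits = [cosine(image_vec, t) / temperature for t in text_vecs]
    top = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - top) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def aggregate_local(local_vectors, threshold):
    """Per class: keep the max snippet score if it clears the threshold, else the min."""
    aggregated = []
    for k in range(len(local_vectors[0])):
        scores = [vec[k] for vec in local_vectors]
        high, low = max(scores), min(scores)
        aggregated.append(high if high >= threshold else low)
    return aggregated

def combine(global_vec, local_vec):
    """Average the global and aggregated local scores for each class."""
    return [(g + l) / 2 for g, l in zip(global_vec, local_vec)]

def global_local_scores(image, size, embed_image, text_vecs, threshold=0.5):
    """Full pipeline: global vector, per-snippet vectors, then aggregation."""
    global_vec = similarity_vector(embed_image(image), text_vecs)
    local_vecs = [
        similarity_vector(embed_image(snippet), text_vecs)
        for snippet in split_into_snippets(image, size)
    ]
    return combine(global_vec, aggregate_local(local_vecs, threshold))
```

The resulting vector has one score per class and could serve as the initial pseudo label for an image. With a real model, `embed_image` and the text embeddings would come from CLIP's image and text encoders.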

... propose an optimization framework to train the parameters of the classification network and refine pseudo labels for unobserved labels.
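The joint training and pseudo-label refinement can be pictured as an alternating loop. The sketch below is a toy illustration under stated assumptions, not the paper's framework: the "classification network" is a single linear layer with sigmoid outputs, the training loss is binary cross-entropy against soft pseudo labels, and the refinement step simply moves each pseudo label a fraction `rate` of the way toward the current prediction. All names, the learning rate, and the update rules are hypothetical.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def predict(weights, features):
    """One linear logit per label, squashed to a probability."""
    return [
        sigmoid(sum(w * x for w, x in zip(row, features)))
        for row in weights
    ]

def train_step(weights, data, pseudo, lr):
    """Gradient step on binary cross-entropy with the pseudo labels held fixed."""
    for features, targets in zip(data, pseudo):
        probs = predict(weights, features)
        for k, (p, y) in enumerate(zip(probs, targets)):
            for j, x in enumerate(features):
                # d(BCE)/d(w_kj) = (p - y) * x for a sigmoid output
                weights[k][j] -= lr * (p - y) * x

def refine_step(weights, data, pseudo, rate):
    """Move each pseudo label toward the prediction with the weights held fixed."""
    for i, features in enumerate(data):
        probs = predict(weights, features)
        pseudo[i] = [(1 - rate) * y + rate * p for y, p in zip(pseudo[i], probs)]

def alternate(weights, data, pseudo, rounds=100, lr=0.5, rate=0.1):
    """Alternate between updating the network and refining the pseudo labels.

    Both `weights` and `pseudo` are modified in place.
    """
    for _ in range(rounds):
        train_step(weights, data, pseudo, lr)
        refine_step(weights, data, pseudo, rate)
    return weights, pseudo
```

In this toy setup the pseudo labels would start from the CLIP-derived scores, and after training only `predict` is needed to label a new input.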

During inference, only the classification network is used to predict the labels of the input image.

... method outperforms state-of-the-art unsupervised methods ... and even achieves comparable results to weakly supervised classification methods.

Stay up to date. Subscribe to my posts https://morrislee1234.wixsite.com/website/contact

Web site with my other posts by category https://morrislee1234.wixsite.com/website

LinkedIn https://www.linkedin.com/in/morris-lee-47877b7b

#ComputerVision #ObjectDetection #AINewsClips #AI #ML #ArtificialIntelligence #MachineLearning

News to help your R&D in artificial intelligence, machine learning, robotics, computer vision, smart hardware
