Detect new objects with RAM by using a model to automatically generate more annotations

morrislee
Jun 7, 2023
Recognize Anything: A Strong Image Tagging Model

arXiv paper abstract https://arxiv.org/abs/2306.03514

arXiv PDF paper https://arxiv.org/pdf/2306.03514.pdf

GitHub https://github.com/xinyu1205/Recognize_Anything-Tag2Text

Project page https://recognize-anything.github.io

... present the Recognize Anything Model (RAM): a strong foundation model for image tagging ... can recognize any common category with high accuracy.

RAM introduces a new paradigm for image tagging, leveraging large-scale image-text pairs for training instead of manual annotations.

... RAM comprises four key steps. Firstly, annotation-free image tags are obtained at scale through automatic text semantic parsing.

Subsequently, a preliminary model is trained for automatic annotation by unifying the caption and tagging tasks, supervised by the original texts and parsed tags, respectively.

Thirdly, a data engine is employed to generate additional annotations and clean incorrect ones. Lastly, the model is retrained with the processed data and fine-tuned using a smaller but higher-quality dataset.
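As a rough illustration of the first and third steps above, here is a minimal Python sketch. It is not the authors' code: the tiny tag vocabulary, the function names `parse_tags` and `refine_tags`, the naive word matching, and the two thresholds are all assumptions made for this example. The paper's semantic parsing and data engine are considerably more involved.

```python
import re

# Hypothetical tag vocabulary, chosen only for this example.
TAG_VOCAB = {"dog", "cat", "ball", "grass", "sofa", "frisbee"}


def parse_tags(caption, vocab=TAG_VOCAB):
    """Step 1 (toy version): pull tags out of a caption without manual labels.

    Uses a naive lowercase word match plus a trailing-"s" plural check,
    standing in for real text semantic parsing.
    """
    words = re.findall(r"[a-z]+", caption.lower())
    tags = set()
    for word in words:
        if word in vocab:
            tags.add(word)
        elif word.endswith("s") and word[:-1] in vocab:
            tags.add(word[:-1])
    return sorted(tags)


def refine_tags(parsed, scores, add_threshold=0.9, drop_threshold=0.1):
    """Step 3 (toy version): add confident tags and drop unlikely ones.

    `scores` maps tag -> confidence from a preliminary model. Both
    thresholds are assumed values, not figures from the paper.
    """
    kept = {tag for tag in parsed if scores.get(tag, 0.0) >= drop_threshold}
    added = {tag for tag, score in scores.items() if score >= add_threshold}
    return sorted(kept | added)


if __name__ == "__main__":
    parsed = parse_tags("A dog chases two balls on the grass.")
    # Made-up confidences, as if produced by a preliminary tagging model.
    model_scores = {"dog": 0.97, "ball": 0.05, "grass": 0.6, "frisbee": 0.93}
    print(parsed)
    print(refine_tags(parsed, model_scores))
```

In this sketch the caption supplies the initial tags for free, and the model's own confidences are then used both to add a tag the caption never mentioned and to remove one the model considers unlikely. That is the general idea of generating additional annotations and cleaning incorrect ones.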

... observe impressive zero-shot performance, significantly outperforming CLIP and BLIP ... surpasses the fully supervised manners and exhibits competitive performance with the Google API ...

Stay up to date. Subscribe to my posts https://morrislee1234.wixsite.com/website/contact

Web site with my other posts by category https://morrislee1234.wixsite.com/website

LinkedIn https://www.linkedin.com/in/morris-lee-47877b7b

#ComputerVision #ObjectDetection #AINewsClips #AI #ML #ArtificialIntelligence #MachineLearning

News to help your R&D in artificial intelligence, machine learning, robotics, computer vision, smart hardware
