Detect unknown objects and attributes better by jointly training on both tasks with OvarNet

morrislee
Jan 24, 2023

OvarNet: Towards Open-vocabulary Object Attribute Recognition

arXiv paper abstract: https://arxiv.org/abs/2301.09506

arXiv PDF paper: https://arxiv.org/pdf/2301.09506.pdf

... consider ... detecting objects and inferring their visual attributes in an image, even for those with no manual annotations provided at the training stage, resembling an open-vocabulary scenario.

... make the following contributions: (i) ... start with a naive two-stage approach for open-vocabulary object detection and attribute classification, termed CLIP-Attr.

The candidate objects are first proposed with an offline RPN and later classified for semantic category and attributes;

(ii) ... combine all available datasets and train with a federated strategy to finetune the CLIP model, aligning the visual representation with attributes ...

(iii) ... train a Faster-RCNN type model end-to-end with knowledge distillation, that performs class-agnostic object proposals and classification on semantic categories and attributes ...

(iv) ... show that recognition of semantic category and attributes ... largely outperforms existing approaches that treat the two tasks independently, demonstrating strong generalization ability to novel attributes and categories.
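
To make the two-stage idea in contribution (i) concrete, here is a minimal sketch of the classification step: a region embedding is compared against text embeddings for category names and attribute names. Everything in it is an assumption for illustration: the vectors are toy numbers rather than real CLIP features, and the function names, temperature, and attribute threshold are hypothetical and not taken from the paper.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def classify_region(region_vec, category_vecs, attribute_vecs,
                    temperature=0.1, attr_threshold=0.5):
    """Pick one category (softmax) and any number of attributes (threshold).

    temperature and attr_threshold are illustrative values, not the paper's.
    """
    names = list(category_vecs)
    probs = softmax([cosine(region_vec, category_vecs[n]) / temperature
                     for n in names])
    category = names[probs.index(max(probs))]
    # Attributes are multi-label: each one is accepted or rejected on its own.
    attributes = [name for name, vec in attribute_vecs.items()
                  if cosine(region_vec, vec) >= attr_threshold]
    return category, attributes


# Toy stand-ins for text embeddings of category and attribute names.
category_vecs = {"dog": [1.0, 0.0, 0.0], "car": [0.0, 1.0, 0.0]}
attribute_vecs = {
    "furry": [0.9, 0.0, 0.4],
    "metallic": [0.0, 0.9, 0.4],
    "red": [0.0, 0.3, 0.9],
}

# Toy stand-in for the visual embedding of one proposed region.
region = [0.8, 0.1, 0.3]

category, attributes = classify_region(region, category_vecs, attribute_vecs)
print(category, attributes)  # dog ['furry']
```

Because the label sets are just dictionaries of name embeddings, adding a new category or attribute at test time only means adding another entry, which is the sense in which this kind of matching is open-vocabulary.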

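Contribution (iii) mentions knowledge distillation. This post does not say which distillation loss the paper uses, so the sketch below shows only the generic technique: a student is penalized by the KL divergence between its temperature-softened predictions and a teacher's. The function names and the temperature value are assumptions for illustration.

```python
import math


def softmax(logits, temperature=1.0):
    """Softmax over logits divided by a temperature (higher = softer)."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def distillation_loss(student_logits, teacher_logits, temperature=2.0):
    """KL(teacher || student) computed on temperature-softened distributions."""
    p = softmax(teacher_logits, temperature)  # teacher targets
    q = softmax(student_logits, temperature)  # student predictions
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q))


teacher = [4.0, 1.0, 0.5]

# A student that reproduces the teacher's logits incurs zero loss.
matched = distillation_loss([4.0, 1.0, 0.5], teacher)

# A student that ranks the classes the other way round is penalized.
mismatched = distillation_loss([0.5, 1.0, 4.0], teacher)

print(matched, mismatched)
```

In a detector of the kind described here, the teacher would be the finetuned CLIP model and the student the end-to-end detector, but that mapping is read from the abstract and is not spelled out in code anywhere in this post.
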

