Detect unknown objects and attributes better by jointly training on both tasks with OvarNet
OvarNet: Towards Open-vocabulary Object Attribute Recognition
arXiv paper abstract https://arxiv.org/abs/2301.09506
arXiv PDF paper https://arxiv.org/pdf/2301.09506.pdf
... consider ... detecting objects and inferring their visual attributes in an image, even for those with no manual annotations provided at the training stage, resembling an open-vocabulary scenario.
... make the following contributions: (i) ... start with a naive two-stage approach for open-vocabulary object detection and attribute classification, termed CLIP-Attr.
The candidate objects are first proposed with an offline RPN and later classified for semantic category and attributes;
(ii) ... combine all available datasets and train with a federated strategy to finetune the CLIP model, aligning the visual representation with attributes ...
(iii) ... train a Faster-RCNN type model end-to-end with knowledge distillation, that performs class-agnostic object proposals and classification on semantic categories and attributes ...
(iv) ... show that recognition of semantic category and attributes ... largely outperform existing approaches that treat the two tasks independently, demonstrating strong generalization ability to novel attributes and categories.
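To make the open-vocabulary idea concrete, here is a minimal sketch of how one proposed region could be scored against free-form text for both its category and its attributes. It is not the paper's code. It assumes that a region and each text prompt have already been embedded into a shared space (as a CLIP-style model would do), that the category is picked with a softmax over cosine similarities, and that attributes are scored independently with a sigmoid because several can hold at once. The function names, the temperature, the threshold, and the toy 3-dimensional embeddings are all hypothetical.

```python
import math


def l2_normalize(v):
    """Scale a vector to unit length so dot products become cosine similarities."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


def cosine_scores(region_emb, text_embs):
    """Cosine similarity between one region embedding and each text embedding."""
    r = l2_normalize(region_emb)
    return [sum(a * b for a, b in zip(r, l2_normalize(t))) for t in text_embs]


def softmax(scores, temperature=0.1):
    """Single-label distribution over categories (temperature is an assumed value)."""
    scaled = [s / temperature for s in scores]
    m = max(scaled)
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def sigmoid(scores, temperature=0.1):
    """Independent per-attribute probabilities, since attributes are multi-label."""
    return [1.0 / (1.0 + math.exp(-s / temperature)) for s in scores]


def classify_region(region_emb, category_embs, category_names,
                    attribute_embs, attribute_names, attr_threshold=0.5):
    """Return the best category and every attribute scoring above the threshold."""
    cat_probs = softmax(cosine_scores(region_emb, category_embs))
    best = max(range(len(cat_probs)), key=cat_probs.__getitem__)
    attr_probs = sigmoid(cosine_scores(region_emb, attribute_embs))
    attrs = [n for n, p in zip(attribute_names, attr_probs) if p > attr_threshold]
    return category_names[best], attrs


# Toy embeddings standing in for real image/text encoder outputs (made up).
region = [0.9, -0.2, 0.8]
category_names = ["cat", "car"]
category_embs = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]
attribute_names = ["furry", "metallic"]
attribute_embs = [[0.0, 0.0, 1.0], [0.0, 1.0, 0.0]]

label, attrs = classify_region(region, category_embs, category_names,
                               attribute_embs, attribute_names)
print(label, attrs)
```

Because the category and attribute vocabularies are just lists of text embeddings, adding a new category or attribute at test time only means appending another embedding, which is what makes the setup open-vocabulary.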
Web site with my other posts by category https://morrislee1234.wixsite.com/website
LinkedIn https://www.linkedin.com/in/morris-lee-47877b7b
#ComputerVision #ObjectDetection #AINewsClips #AI #ML #ArtificialIntelligence #MachineLearning