Train a new object detector without bounding-box annotations using captioned images
Towards Open Vocabulary Object Detection without Human-provided Bounding Boxes
arXiv paper abstract https://arxiv.org/abs/2111.09452
arXiv PDF paper https://arxiv.org/pdf/2111.09452.pdf
... in object detection, most existing methods are limited to a small set of object categories, due to the tremendous human effort needed for instance-level bounding-box annotation.
... recent open vocabulary and zero-shot detection methods attempt to detect object categories not seen during training.
... still rely on manually provided bounding-box annotations on a set of base classes.
... propose ... framework that can be trained without manually provided bounding-box annotations.
... by leveraging the localization ability of pre-trained vision-language models and generating pseudo bounding-box labels that can be used directly for training object detectors.
... outperforms the state-of-the-arts (SOTA) that are trained using human annotated bounding-boxes by 3% AP on COCO novel categories even though our training source is not equipped with manual bounding-box labels. ...
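To make the pseudo bounding-box idea more concrete, here is a minimal sketch in Python. It is not the authors' implementation. It assumes the vision-language model's localization signal is already available as a 2D activation map for one caption word, and that a list of candidate boxes is supplied from elsewhere; the function name, the threshold, and the overlap score are all hypothetical choices made for illustration.

```python
import numpy as np


def pseudo_box_from_activation(activation, proposals, threshold=0.5):
    """Return the candidate box that best matches the activated region.

    activation: 2D array of scores for one caption word (assumed input).
    proposals:  list of (x0, y0, x1, y1) integer boxes, exclusive at x1/y1.
    threshold:  cut-off applied after min-max normalization (assumed value).
    """
    act = np.asarray(activation, dtype=float)
    span = act.max() - act.min()
    if span == 0:
        # A flat map carries no localization signal.
        return None

    # Normalize to [0, 1] and keep only the strongly activated pixels.
    mask = (act - act.min()) / span >= threshold
    mask_area = mask.sum()

    best_box, best_score = None, -1.0
    for x0, y0, x1, y1 in proposals:
        box_area = (x1 - x0) * (y1 - y0)
        if box_area <= 0:
            continue
        inside = mask[y0:y1, x0:x1].sum()
        # Intersection over union between the box and the activated mask.
        score = inside / (box_area + mask_area - inside)
        if score > best_score:
            best_box, best_score = (x0, y0, x1, y1), score
    return best_box


# Toy example: a 3x3 activated patch inside an 8x8 map.
activation = np.zeros((8, 8))
activation[2:5, 3:6] = 1.0
proposals = [(0, 0, 8, 8), (3, 2, 6, 5), (0, 0, 3, 3)]
print(pseudo_box_from_activation(activation, proposals))  # (3, 2, 6, 5)
```

The selected box, paired with the caption word that produced the activation map, could then serve as one pseudo label for training a standard detector.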
Stay up to date. Subscribe to my posts https://morrislee1234.wixsite.com/website/contact
Web site with my other posts by category https://morrislee1234.wixsite.com/website