Real-time unknown object detection using pre-training on large-scale datasets with YOLO-World
YOLO-World: Real-Time Open-Vocabulary Object Detection
arXiv paper abstract https://arxiv.org/abs/2401.17270
arXiv PDF paper https://arxiv.org/pdf/2401.17270.pdf
The You Only Look Once (YOLO) series of detectors have established themselves as efficient and practical tools.
However, their reliance on predefined and trained object categories limits their applicability in open scenarios.
... introduce YOLO-World ... that enhances YOLO with open-vocabulary detection capabilities through vision-language modeling and pre-training on large-scale datasets.
... propose a new Re-parameterizable Vision-Language Path Aggregation Network (RepVL-PAN) and region-text contrastive loss to facilitate the interaction between visual and linguistic information.
... method excels in detecting a wide range of objects in a zero-shot manner with high efficiency.
... outperforms many state-of-the-art methods in terms of both accuracy and speed ... remarkable performance on several downstream tasks, including object detection and open-vocabulary instance segmentation.
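To make the idea of a region-text contrastive loss and zero-shot matching more concrete, here is a minimal sketch in plain Python. It is not the paper's implementation: the embeddings are hand-made toy vectors, the function names and the temperature value are hypothetical, and the loss is an ordinary softmax cross-entropy over cosine similarities, which is only one simple way such a loss could be written.

```python
import math

def l2_normalize(vec):
    # Scale a vector to unit length (guard against an all-zero vector).
    norm = math.sqrt(sum(x * x for x in vec)) or 1.0
    return [x / norm for x in vec]

def similarity(region_embs, text_embs):
    # Cosine similarity between every region embedding and every text embedding.
    regions = [l2_normalize(r) for r in region_embs]
    texts = [l2_normalize(t) for t in text_embs]
    return [[sum(a * b for a, b in zip(r, t)) for t in texts] for r in regions]

def region_text_contrastive_loss(sim, targets, temperature=0.1):
    # Softmax cross-entropy per region: pull each region toward its matching
    # text and away from the others. The temperature is an assumed value.
    total = 0.0
    for row, target in zip(sim, targets):
        logits = [s / temperature for s in row]
        m = max(logits)
        log_sum = m + math.log(sum(math.exp(l - m) for l in logits))
        total += log_sum - logits[target]
    return total / len(sim)

def zero_shot_labels(sim, vocabulary):
    # At inference, each region takes the vocabulary entry with the highest similarity.
    return [vocabulary[max(range(len(row)), key=row.__getitem__)] for row in sim]

# Toy data (assumed): three text prompts and two detected regions in a 3-D embedding space.
vocabulary = ["dog", "bicycle", "traffic cone"]
text_embs = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
region_embs = [[0.9, 0.1, 0.0], [0.1, 0.2, 0.95]]

sim = similarity(region_embs, text_embs)
labels = zero_shot_labels(sim, vocabulary)
loss = region_text_contrastive_loss(sim, targets=[0, 2])
print(labels, round(loss, 4))
```

Because the vocabulary is just a list of text embeddings, changing what the detector looks for means swapping the prompts rather than retraining on a fixed category list, which is the limitation the abstract points to.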
Stay up to date. Subscribe to my posts https://morrislee1234.wixsite.com/website/contact
Web site with my other posts by category https://morrislee1234.wixsite.com/website