Count objects in an image using point prompts from object localization and CLIP classification with PseCo
Point, Segment and Count: A Generalized Framework for Object Counting
arXiv paper abstract https://arxiv.org/abs/2311.12386
arXiv PDF paper https://arxiv.org/pdf/2311.12386.pdf
Class-agnostic object counting aims to count all objects in an image with respect to example boxes or class names, a.k.a. few-shot and zero-shot counting. Current state-of-the-art methods rely heavily on density maps to predict object counts, an approach that lacks model interpretability.
... propose a generalized framework for both few-shot and zero-shot object counting based on detection.
... framework combines the superior advantages of two foundation models without compromising their zero-shot capability: (i) SAM to segment all possible objects as mask proposals, and (ii) CLIP to classify proposals to obtain accurate object counts.
... framework, termed PseCo, follows three steps: point, segment, and count ... propose a class-agnostic object localization to provide accurate but minimal point prompts for SAM, which ... reduces computation ... avoids missing small objects.
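To make the "point" step more concrete, here is a minimal sketch of one way to turn a localization heatmap into a small set of point prompts: keep only the local maxima above a score threshold. This is an illustration under assumptions, not the paper's implementation. The function name `pick_point_prompts`, the 3x3 neighborhood, the threshold value, and the toy heatmap are all hypothetical.

```python
import numpy as np

def pick_point_prompts(heatmap, threshold=0.5):
    """Return (row, col) points that are 3x3 local maxima above `threshold`.

    Hypothetical helper: the heatmap stands in for the output of some
    class-agnostic localization model (an assumption for this sketch).
    """
    heatmap = np.asarray(heatmap, dtype=float)
    h, w = heatmap.shape
    # Pad with -inf so border pixels can still qualify as peaks.
    padded = np.pad(heatmap, 1, mode="constant", constant_values=-np.inf)
    # Stack the 9 shifted views covering each pixel's 3x3 neighborhood.
    neighborhood = np.stack(
        [padded[dy:dy + h, dx:dx + w] for dy in range(3) for dx in range(3)]
    )
    is_peak = (heatmap >= neighborhood.max(axis=0)) & (heatmap > threshold)
    return [(int(r), int(c)) for r, c in np.argwhere(is_peak)]

# Toy heatmap with two clear peaks and one weak response next to the first.
toy = np.zeros((8, 8))
toy[2, 2] = 0.9
toy[5, 6] = 0.8
toy[2, 3] = 0.4  # below threshold and not a local maximum
points = pick_point_prompts(toy)
print(points)
```

Only the two peaks survive, so a segmenter would be prompted twice rather than on a dense grid of points. Note that a flat plateau of equal values would yield several points in this sketch; a real system would need some form of de-duplication.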
... propose ... object classification that leverages CLIP image/text embeddings as the classifier, following a hierarchical knowledge distillation to obtain discriminative classifications among ... mask proposals.
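The "count" step can be sketched as scoring each mask proposal's embedding against a class embedding and counting the proposals that score high enough. The sketch below uses plain cosine similarity on toy vectors; it does not run CLIP and does not reproduce the hierarchical knowledge distillation described above. The function name `count_by_similarity`, the threshold, and the embeddings are assumptions for illustration only.

```python
import numpy as np

def count_by_similarity(proposal_embeds, class_embed, threshold=0.5):
    """Count proposals whose cosine similarity to `class_embed` exceeds `threshold`.

    Hypothetical helper: in a real pipeline the embeddings would come from an
    image/text encoder such as CLIP; here they are toy vectors.
    """
    proposals = np.asarray(proposal_embeds, dtype=float)
    target = np.asarray(class_embed, dtype=float)
    # L2-normalize so the dot product equals cosine similarity.
    proposals = proposals / np.linalg.norm(proposals, axis=1, keepdims=True)
    target = target / np.linalg.norm(target)
    scores = proposals @ target
    keep = scores > threshold
    return int(keep.sum()), scores

# Three toy proposals: two resemble the target class, one does not.
toy_proposals = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0]]
toy_class = [1.0, 0.0]
count, scores = count_by_similarity(toy_proposals, toy_class)
print(count, scores)
```

Because every counted object corresponds to a kept proposal, the result can be inspected proposal by proposal, which is the interpretability advantage over summing a density map.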
... demonstrate that PseCo achieves state-of-the-art performance in both few-shot/zero-shot object counting/detection ...