Explain a fine-grained image classification result by having each class search the image for itself with INTR
A Simple Interpretable Transformer for Fine-Grained Image Classification and Analysis
arXiv paper abstract https://arxiv.org/abs/2311.04157
arXiv PDF paper https://arxiv.org/pdf/2311.04157.pdf
... present a novel usage of Transformers to make image classification interpretable.
Unlike mainstream classifiers that wait until the last fully-connected layer to incorporate class information to make predictions, ... investigate a proactive approach, asking each class to search for itself in an image.
... realize this idea via a Transformer encoder-decoder inspired by DEtection TRansformer (DETR).
... learn "class-specific" queries (one for each class) as input to the decoder, enabling each class to localize its patterns in an image via cross-attention.
... show that INTR intrinsically encourages each class to attend distinctively; the cross-attention weights thus provide a faithful interpretation of the prediction.
... INTR could identify different "attributes" of a class, making it particularly suitable for fine-grained classification and analysis, which ...
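To make the "each class searches for itself" idea concrete, here is a minimal pure-Python sketch of class-specific queries cross-attending over image patch features. It is a toy illustration, not the paper's implementation: the function names (`cross_attend`, `classify`), the two-dimensional features, the hand-picked numbers, and the single shared weight vector `shared_w` used to score each class's attended output are all assumptions made for this example. A real model would use learned projections, multiple heads, and a full Transformer encoder-decoder.

```python
import math


def softmax(scores):
    # Numerically stable softmax over a list of floats.
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def cross_attend(query, patches):
    """Single-head cross-attention of one class query over patch features.

    Keys and values are both the raw patch features (an assumption that
    keeps the sketch short; real models apply learned projections).
    """
    scale = math.sqrt(len(query))
    weights = softmax([dot(query, p) / scale for p in patches])
    dim = len(patches[0])
    output = [sum(w * p[i] for w, p in zip(weights, patches)) for i in range(dim)]
    return output, weights


def classify(class_queries, patches, shared_w):
    """Score every class from its own attended output; keep its attention map."""
    results = {}
    for name, query in class_queries.items():
        output, weights = cross_attend(query, patches)
        # Hypothetical scoring step: one vector shared by all classes.
        results[name] = (dot(shared_w, output), weights)
    predicted = max(results, key=lambda n: results[n][0])
    return predicted, results


# Toy data: three image patches and one query per class (made-up values).
patches = [[4.0, 0.0], [0.0, 4.0], [0.1, 0.1]]
class_queries = {"class_a": [3.0, 0.0], "class_b": [0.0, 1.0]}
shared_w = [1.0, 1.0]

predicted, results = classify(class_queries, patches, shared_w)
for name, (logit, weights) in results.items():
    print(name, round(logit, 3), [round(w, 3) for w in weights])
print("predicted:", predicted)
```

In this toy setup each class's attention weights peak on a different patch, and the weights of the predicted class are what one would visualize as the explanation of the result.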