Identify the activity in an image, box the entities, and name their roles with CoFormer
Collaborative Transformers for Grounded Situation Recognition
arXiv paper abstract https://arxiv.org/abs/2203.16518v1
arXiv PDF paper https://arxiv.org/pdf/2203.16518v1.pdf
Grounded situation recognition is the task of predicting the main activity, entities playing certain roles within the activity, and bounding-box groundings of the entities in the given image.
... introduce a novel approach where the two processes for activity classification and entity estimation are interactive and complementary.
... propose Collaborative Glance-Gaze TransFormer (CoFormer)
... Glance transformer predicts the main activity with the help of Gaze transformer that analyzes entities and their relations,
... Gaze transformer estimates the grounded entities by focusing only on the entities relevant to the activity predicted by Glance transformer.
... CoFormer achieves the state of the art in all evaluation metrics on the SWiG dataset. ...
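To make the excerpts above concrete, here is a minimal Python sketch of the task's output structure and of the data flow between the two parts. It is not the authors' implementation. Every name in it (`GroundedEntity`, `Situation`, `ROLES_BY_ACTIVITY`, `recognize`, and the `stub_*` functions) is hypothetical, the role table and boxes are made-up illustration values, and the stubs stand in for the transformers. Using `None` for an entity without a bounding box is also an assumption.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple

# Assumed box convention for this sketch: (x_min, y_min, x_max, y_max) in pixels.
Box = Tuple[float, float, float, float]


@dataclass
class GroundedEntity:
    role: str            # role the entity plays in the activity
    noun: str            # predicted entity name
    box: Optional[Box]   # bounding-box grounding, None if there is no box


@dataclass
class Situation:
    activity: str
    entities: List[GroundedEntity]


# Hypothetical activity-to-roles table, for illustration only.
ROLES_BY_ACTIVITY: Dict[str, List[str]] = {
    "riding": ["agent", "vehicle", "place"],
    "feeding": ["agent", "food", "eater", "place"],
}


def recognize(image, analyze_entities, predict_activity, ground_entity) -> Situation:
    """Wire the three steps together in the order the excerpts describe."""
    # Gaze side: analyze entities and their relations.
    context = analyze_entities(image)
    # Glance side: predict the activity with the help of that analysis.
    activity = predict_activity(image, context)
    # Gaze side: estimate grounded entities only for roles of the predicted activity.
    entities = [ground_entity(image, activity, role)
                for role in ROLES_BY_ACTIVITY[activity]]
    return Situation(activity, entities)


# Stub components with fixed outputs; real models would replace these.
def stub_analyze(image):
    return {"hints": ["person", "horse"]}


def stub_predict(image, context):
    return "riding" if "horse" in context["hints"] else "feeding"


def stub_ground(image, activity, role):
    lookup = {
        "agent": ("person", (10.0, 20.0, 110.0, 220.0)),
        "vehicle": ("horse", (30.0, 90.0, 260.0, 300.0)),
        "place": ("field", None),
    }
    noun, box = lookup.get(role, ("unknown", None))
    return GroundedEntity(role, noun, box)


result = recognize("example.jpg", stub_analyze, stub_predict, stub_ground)
print(result.activity)
for entity in result.entities:
    print(entity.role, entity.noun, entity.box)
```

The point of the sketch is the ordering: the activity prediction consumes the entity analysis, and the entity grounding is restricted to the roles of the predicted activity.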
Stay up to date. Subscribe to my posts https://morrislee1234.wixsite.com/website/contact
Web site with my other posts by category https://morrislee1234.wixsite.com/website