Identify the activity in an image, box the entities, and name their roles with CoFormer

morrislee
Apr 5, 2022

Collaborative Transformers for Grounded Situation Recognition

arXiv paper abstract https://arxiv.org/abs/2203.16518v1

arXiv paper PDF https://arxiv.org/pdf/2203.16518v1.pdf

GitHub https://github.com/jhcho99/coformer

Grounded situation recognition is the task of predicting the main activity, entities playing certain roles within the activity, and bounding-box groundings of the entities in the given image.
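To make the task output concrete, here is a minimal sketch of what one prediction could look like as a data structure: one activity, a set of roles, and for each role a noun plus an optional bounding box. The class names, the box convention, the use of None for an entity without a box, and the example values are all assumptions for illustration. They are not taken from the CoFormer code or the SWiG annotation format.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Tuple

# Assumed box convention for this sketch: (x_min, y_min, x_max, y_max) in pixels.
Box = Tuple[float, float, float, float]


@dataclass
class GroundedEntity:
    """An entity filling one role (hypothetical container, not the paper's API)."""
    noun: str
    box: Optional[Box]  # None stands for "no box given" in this sketch


@dataclass
class GroundedSituation:
    """One prediction: the main activity plus role -> grounded entity."""
    activity: str
    roles: Dict[str, GroundedEntity]

    def describe(self) -> str:
        # Render as activity(role=noun, ...) in insertion order of the roles.
        parts = [f"{role}={ent.noun}" for role, ent in self.roles.items()]
        return f"{self.activity}(" + ", ".join(parts) + ")"


# Made-up example values, only to show the shape of the output.
example = GroundedSituation(
    activity="riding",
    roles={
        "agent": GroundedEntity("woman", (40.0, 30.0, 180.0, 260.0)),
        "vehicle": GroundedEntity("bicycle", (35.0, 150.0, 210.0, 300.0)),
        "place": GroundedEntity("street", None),
    },
)
print(example.describe())  # riding(agent=woman, vehicle=bicycle, place=street)
```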

... introduce a novel approach where the two processes for activity classification and entity estimation are interactive and complementary.

... propose Collaborative Glance-Gaze TransFormer (CoFormer)

... Glance transformer predicts the main activity with the help of Gaze transformer that analyzes entities and their relations,

... Gaze transformer estimates the grounded entities by focusing only on the entities relevant to the activity predicted by Glance transformer.
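The idea of estimating only the entities relevant to the predicted activity can be sketched with plain lookups. This is a toy illustration, not the CoFormer implementation: the paper uses transformers, while this sketch replaces them with a dictionary. The FRAMES table, the function names, and the scores are hypothetical, and the activity scores are simply given as input, standing in for a prediction that has already been informed by the entity analysis.

```python
from typing import Dict, List, Optional, Tuple

# Hypothetical activity -> roles table, made up for this sketch.
FRAMES: Dict[str, List[str]] = {
    "riding": ["agent", "vehicle", "place"],
    "cooking": ["agent", "food", "tool", "place"],
}


def predict_activity(verb_scores: Dict[str, float]) -> str:
    """Pick the highest-scoring activity (stand-in for the activity classifier)."""
    return max(verb_scores, key=verb_scores.get)


def recognize(
    verb_scores: Dict[str, float],
    entity_guesses: Dict[str, str],
) -> Tuple[str, Dict[str, Optional[str]]]:
    """Keep only the entity guesses whose role belongs to the predicted activity."""
    activity = predict_activity(verb_scores)
    roles = FRAMES[activity]
    # Roles outside the predicted activity's frame are ignored;
    # roles in the frame with no guess map to None.
    return activity, {role: entity_guesses.get(role) for role in roles}


# Made-up inputs: "food" is guessed but is not a role of "riding", so it is dropped.
scores = {"riding": 0.7, "cooking": 0.3}
guesses = {"agent": "woman", "vehicle": "bicycle", "place": "street", "food": "soup"}
activity, filled = recognize(scores, guesses)
print(activity, filled)
```

The design point is the ordering: the role set is chosen after the activity is known, so entity estimation never spends effort on roles that the predicted activity does not have.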

... CoFormer achieves the state of the art in all evaluation metrics on the SWiG dataset. ...

Stay up to date. Subscribe to my posts https://morrislee1234.wixsite.com/website/contact

Web site with my other posts by category https://morrislee1234.wixsite.com/website

LinkedIn https://www.linkedin.com/in/morris-lee-47877b7b

#ComputerVision #ObjectDetection #ActionRecognition #AINewsClips #AI #ML #ArtificialIntelligence #MachineLearning
