Action recognition by turning videos into an image mosaic
An Image Classifier Can Suffice For Video Understanding
arXiv paper abstract https://arxiv.org/abs/2106.14104
arXiv PDF paper https://arxiv.org/pdf/2106.14104.pdf
... casting the video recognition problem as an image recognition task.
... composes input frames into a super image to train an image classifier to fulfill the task of action recognition, in exactly the same way as classifying an image.
... demonstrating strong and promising performance on four public datasets including Kinetics400, Something-to-something (V2), MiT and Jester, using a recently developed vision transformer.
... also experiment with the prevalent ResNet image classifiers
... results on Kinetics400 are comparable to some of the best-performed CNN approaches based on spatio-temporal modeling. ...
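To make the "super image" idea in the excerpts above concrete, here is a minimal sketch of tiling sampled frames into one large image. It is an illustration under assumptions, not the authors' code: the function name make_super_image, the near-square row-major grid, and the black padding of unused cells are all choices made for this example.

```python
import math
from typing import Optional

import numpy as np


def make_super_image(frames: np.ndarray, cols: Optional[int] = None) -> np.ndarray:
    """Tile frames of shape (T, H, W, C) into one (rows*H, cols*W, C) image.

    Hypothetical helper: the grid layout and zero padding are assumptions
    for illustration, not details taken from the paper.
    """
    t, h, w, c = frames.shape
    if cols is None:
        # Assumed default: a near-square grid, e.g. 3 columns for 8 frames.
        cols = math.ceil(math.sqrt(t))
    rows = math.ceil(t / cols)

    # Fill unused grid cells with black frames so the grid is complete.
    pad = rows * cols - t
    if pad:
        blank = np.zeros((pad, h, w, c), dtype=frames.dtype)
        frames = np.concatenate([frames, blank], axis=0)

    # (rows, cols, H, W, C) -> (rows, H, cols, W, C) -> (rows*H, cols*W, C)
    grid = frames.reshape(rows, cols, h, w, c)
    return grid.transpose(0, 2, 1, 3, 4).reshape(rows * h, cols * w, c)


if __name__ == "__main__":
    # Eight dummy 224x224 RGB frames stand in for frames sampled from a video.
    video = np.random.randint(0, 256, size=(8, 224, 224, 3), dtype=np.uint8)
    print(make_super_image(video).shape)
```

The resulting array is an ordinary image, so it could then be resized and fed to an image classifier such as a vision transformer or a ResNet, with the action label as the target.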
Stay up to date. Subscribe to my posts https://morrislee1234.wixsite.com/website/contact
Web site with my other posts by category https://morrislee1234.wixsite.com/website