Multiscale Vision Transformers Outperform Prior Work
Multiscale Vision Transformers
arXiv paper abstract https://arxiv.org/abs/2104.11227
arXiv PDF paper https://arxiv.org/pdf/2104.11227.pdf
... Multiscale Transformers have several channel-resolution scale stages. Starting from the input resolution and a small channel dimension, the stages hierarchically expand the channel capacity while reducing the spatial resolution. This creates a multiscale pyramid of features with early layers operating at high spatial resolution to model simple low-level visual information, and deeper layers at spatially coarse, but complex, high-dimensional features. ... We evaluate ... for a variety of video recognition tasks where it outperforms concurrent vision transformers ...
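To make the pyramid idea in the excerpt concrete, here is a minimal sketch of how channel width and spatial resolution could trade off from stage to stage. Everything numeric is an assumption chosen for illustration, not a value taken from the post or confirmed from the paper: the 224x224 input, the patch stride of 4, the 96 starting channels, the four stages, and the factor of 2 applied at each stage transition. The names `Stage` and `multiscale_schedule` are hypothetical helpers, not part of any library.

```python
from dataclasses import dataclass


@dataclass
class Stage:
    index: int      # 1-based stage number
    channels: int   # feature (channel) dimension in this stage
    height: int     # spatial height of the token grid
    width: int      # spatial width of the token grid

    @property
    def tokens(self) -> int:
        # Number of spatial positions attended over in this stage.
        return self.height * self.width


def multiscale_schedule(
    input_hw=(224, 224),   # assumed input resolution
    patch_stride=4,        # assumed stride of the initial patch embedding
    base_channels=96,      # assumed starting channel dimension
    num_stages=4,          # assumed number of scale stages
    channel_mult=2,        # assumed channel expansion per stage transition
    spatial_div=2,         # assumed per-axis resolution reduction per transition
):
    """Return the (channels, resolution) schedule of a hypothetical pyramid."""
    height = input_hw[0] // patch_stride
    width = input_hw[1] // patch_stride
    channels = base_channels
    stages = []
    for i in range(num_stages):
        stages.append(Stage(i + 1, channels, height, width))
        # Moving to the next stage: widen channels, coarsen the grid.
        channels *= channel_mult
        height //= spatial_div
        width //= spatial_div
    return stages


if __name__ == "__main__":
    for s in multiscale_schedule():
        print(f"stage {s.index}: {s.channels} channels, "
              f"{s.height}x{s.width} grid, {s.tokens} tokens")
```

Under these assumed settings the early stages carry few channels over a large token grid, and the later stages carry many channels over a small one, which is the "high resolution, simple features first; coarse resolution, complex features later" pattern the abstract describes.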