Vision transformer morphed to CNN works better
Visformer: The Vision-friendly Transformer
arXiv paper abstract https://arxiv.org/abs/2104.12533
arXiv paper PDF https://arxiv.org/pdf/2104.12533.pdf
... rapid development ... Transformer module to vision ...
... there are still growing number of evidences showing that these models suffer over-fitting especially when the training data is limited.
... gradually transit a Transformer-based model to a convolution-based model.
... With the same computational complexity, Visformer outperforms both the Transformer-based and convolution-based models in terms of ImageNet classification accuracy, and the advantage becomes more significant when the model complexity is lower or the training set is smaller. ...
Stay up to date. Subscribe to my posts https://morrislee1234.wixsite.com/website/contact
Web site with my other posts by category https://morrislee1234.wixsite.com/website