1.5x faster vision transformers by using activation sparsity with SparseViT
SparseViT: Revisiting Activation Sparsity for Efficient High-Resolution Vision Transformer
arXiv paper abstract https://arxiv.org/abs/2303.17605
arXiv paper PDF https://arxiv.org/pdf/2303.17605.pdf
High-resolution images ... improved performance comes at the cost of growing computational complexity.
... introduce SparseViT that revisits activation sparsity for recent window-based vision transformers (ViTs).
As window attentions are naturally batched over blocks, actual speedup with window activation pruning becomes possible: i.e., ~50% latency reduction with 60% sparsity.
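To make the idea concrete, here is a minimal PyTorch sketch of window activation pruning, assuming non-overlapping square windows and L2 activation magnitude as the window importance score; `window_attention` is a hypothetical stand-in for any window attention operator, and this is an illustration of the technique rather than the authors' implementation.

```python
# Minimal sketch of window activation pruning (not the authors' code).
# Assumption: windows are scored by L2 activation magnitude, and only
# the top-(1 - sparsity) fraction of windows is processed.
import torch

def prune_windows(x, sparsity, window_size, window_attention):
    """x: (B, H, W, C) feature map; skip the `sparsity` fraction of windows."""
    B, H, W, C = x.shape
    ws = window_size
    # Partition into non-overlapping windows: (B, N, ws*ws, C).
    wins = x.view(B, H // ws, ws, W // ws, ws, C).permute(0, 1, 3, 2, 4, 5)
    wins = wins.reshape(B, -1, ws * ws, C)
    N = wins.shape[1]
    keep = max(1, int(N * (1.0 - sparsity)))       # number of windows to keep
    # Score each window by its L2 activation magnitude.
    scores = wins.flatten(2).norm(dim=-1)          # (B, N)
    idx = scores.topk(keep, dim=1).indices         # (B, keep)
    gathered = torch.gather(
        wins, 1, idx[..., None, None].expand(-1, -1, ws * ws, C))
    # Attention runs only on the kept windows, so compute scales with `keep`.
    processed = window_attention(gathered)         # (B, keep, ws*ws, C)
    # Scatter results back; pruned windows pass through unchanged.
    out = wins.clone()
    out.scatter_(1, idx[..., None, None].expand(-1, -1, ws * ws, C), processed)
    out = out.view(B, H // ws, W // ws, ws, ws, C).permute(0, 1, 3, 2, 4, 5)
    return out.reshape(B, H, W, C)
```

Because the kept windows stay batched as a dense tensor, attention cost drops in proportion to the number of windows actually processed, which is how 60% sparsity can translate into the ~50% latency reduction cited above.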
Different layers should be assigned different pruning ratios due to their diverse sensitivities and computational costs.
... introduce sparsity-aware adaptation and apply the evolutionary search to efficiently find the optimal layerwise sparsity configuration within the vast search space.
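The sketch below illustrates what such an evolutionary search could look like: each candidate is a vector of per-layer sparsity ratios, evolved by mutation and crossover under a resource constraint. The helpers `eval_accuracy` (scores a candidate on a validation set using the sparsity-aware adapted model) and `meets_budget` (checks a latency/FLOPs constraint) are hypothetical stand-ins, not functions from the paper.

```python
# Minimal sketch of evolutionary search over layerwise sparsity ratios.
# `eval_accuracy` and `meets_budget` are assumed, user-supplied callables.
import random

CHOICES = [0.0, 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8]  # per-layer ratios

def search(num_layers, eval_accuracy, meets_budget,
           pop_size=50, generations=20, mutate_prob=0.1):
    def sample():
        # Rejection-sample a random candidate that satisfies the budget.
        while True:
            cand = tuple(random.choice(CHOICES) for _ in range(num_layers))
            if meets_budget(cand):
                return cand

    def mutate(cand):
        # Resample each layer's ratio with probability `mutate_prob`.
        while True:
            child = tuple(random.choice(CHOICES)
                          if random.random() < mutate_prob else r
                          for r in cand)
            if meets_budget(child):
                return child

    def crossover(a, b):
        # Pick each layer's ratio from one of the two parents.
        while True:
            child = tuple(random.choice(pair) for pair in zip(a, b))
            if meets_budget(child):
                return child

    population = [sample() for _ in range(pop_size)]
    for _ in range(generations):
        ranked = sorted(population, key=eval_accuracy, reverse=True)
        parents = ranked[:pop_size // 4]            # keep the top quarter
        n_mut = pop_size // 2
        n_cross = pop_size - len(parents) - n_mut
        population = (parents
                      + [mutate(random.choice(parents)) for _ in range(n_mut)]
                      + [crossover(random.choice(parents),
                                   random.choice(parents))
                         for _ in range(n_cross)])
    return max(population, key=eval_accuracy)
```

Sparsity-aware adaptation matters here because it lets one trained model be evaluated at many sparsity configurations, so each candidate can be scored without retraining from scratch.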
SparseViT achieves speedups of 1.5x, 1.4x, and 1.3x compared to its dense counterpart in monocular 3D object detection, 2D instance segmentation, and 2D semantic segmentation, respectively, with negligible to no loss of accuracy.
If you enjoyed this post, please like and share it using the buttons at the bottom!
Stay up to date. Subscribe to my posts https://morrislee1234.wixsite.com/website/contact
Web site with my other posts by category https://morrislee1234.wixsite.com/website