
News to help your R&D in artificial intelligence, machine learning, robotics, computer vision, smart hardware


morrislee

Vision transformer beats CNN on mobile devices for accuracy and speed with ElasticViT



ElasticViT: Conflict-aware Supernet Training for Deploying Fast Vision Transformer on Diverse Mobile Devices

arXiv paper abstract https://arxiv.org/abs/2303.09730



... designing lightweight and low-latency ViT models for diverse mobile devices remains a big challenge.


... propose ElasticViT, a two-stage NAS approach that trains a high-quality ViT supernet over a very large search space that supports a wide range of mobile devices, and then searches an optimal sub-network (subnet) for direct deployment.
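The second stage, picking a subnet for a given device, can be pictured as a constrained selection. The sketch below is only an illustration under assumptions: the candidate list, the field names `latency_ms` and `accuracy`, and all the numbers are made up for the example, and the brute-force scan stands in for whatever search procedure the paper actually uses.

```python
def best_under_latency(candidates, latency_budget_ms):
    """Return the most accurate candidate whose latency fits the budget, or None."""
    feasible = [c for c in candidates if c["latency_ms"] <= latency_budget_ms]
    if not feasible:
        return None
    return max(feasible, key=lambda c: c["accuracy"])


# Hypothetical subnets with invented latency and accuracy values.
toy_candidates = [
    {"name": "subnet_small", "latency_ms": 12.0, "accuracy": 0.70},
    {"name": "subnet_medium", "latency_ms": 20.0, "accuracy": 0.75},
    {"name": "subnet_large", "latency_ms": 35.0, "accuracy": 0.78},
]

choice = best_under_latency(toy_candidates, latency_budget_ms=25.0)
print(choice["name"])
```

Because the subnets inherit weights from the supernet, the selected one can be deployed directly, which is why the selection step only needs latency and accuracy estimates.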


... Complexity-aware sampling limits the FLOPs difference among the subnets sampled across adjacent training steps, while covering different-sized subnets in the search space.
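One way to picture complexity-aware sampling is a schedule that walks through FLOPs levels one neighbor at a time, so consecutive training steps never jump between very small and very large subnets, yet every level is visited. The following is a minimal sketch under assumptions, not the paper's implementation: the level edges, the triangle-wave schedule, and the helper names `bucket_by_flops`, `level_sweep`, and `sample_subnets` are all hypothetical.

```python
import random


def bucket_by_flops(subnets, level_edges):
    """Group (name, flops) pairs into levels given ascending upper edges.

    Subnets above the last edge are dropped in this toy version.
    """
    buckets = [[] for _ in level_edges]
    for name, flops in subnets:
        for i, edge in enumerate(level_edges):
            if flops <= edge:
                buckets[i].append((name, flops))
                break
    return buckets


def level_sweep(num_levels, num_steps):
    """Triangle-wave schedule: adjacent steps differ by at most one level."""
    if num_levels == 1:
        return [0] * num_steps
    period = 2 * (num_levels - 1)
    schedule = []
    for t in range(num_steps):
        p = t % period
        schedule.append(p if p < num_levels else period - p)
    return schedule


def sample_subnets(subnets, level_edges, num_steps, seed=0):
    """Pick one subnet per step from the level chosen by the sweep.

    Assumes every level contains at least one subnet.
    """
    rng = random.Random(seed)
    buckets = bucket_by_flops(subnets, level_edges)
    levels = level_sweep(len(buckets), num_steps)
    return [rng.choice(buckets[level]) for level in levels]
```

With three levels, the sweep visits levels 0, 1, 2, 1, 0, 1, 2 and so on, which bounds the FLOPs gap between adjacent steps to roughly two level widths while still covering small and large subnets.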


Performance-aware sampling further selects subnets that have good accuracy, which can reduce gradient conflicts and improve supernet quality.
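A rough way to picture performance-aware sampling is to draw a few candidate subnets and train on the one that currently looks most accurate. This is a hedged sketch, not the authors' procedure: the function name `performance_aware_pick`, the parameter `k`, and the score table are assumptions made up for the example, and the scores stand in for whatever accuracy estimate is available during training.

```python
import random


def performance_aware_pick(candidates, score_fn, k, rng):
    """Draw up to k candidates at random and return the best-scoring one."""
    pool = rng.sample(candidates, min(k, len(candidates)))
    return max(pool, key=score_fn)


# Hypothetical accuracy estimates for three subnets (invented values).
toy_scores = {"s1": 0.61, "s2": 0.74, "s3": 0.69}

picked = performance_aware_pick(
    list(toy_scores), toy_scores.get, k=2, rng=random.Random(0)
)
print(picked)
```

A larger `k` biases training more strongly toward good subnets, while `k = 1` falls back to plain uniform sampling.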


... discovered models, ElasticViT models, achieve top-1 accuracy ... without extra retraining, outperforming all prior CNNs and ViTs in terms of accuracy and latency.


... the first ViT models that surpass state-of-the-art CNNs with significantly lower latency on mobile devices.



If you enjoyed this post, please like and share it using the buttons at the bottom!


Stay up to date. Subscribe to my posts https://morrislee1234.wixsite.com/website/contact

Web site with my other posts by category https://morrislee1234.wixsite.com/website


