
News to help your R&D in artificial intelligence, machine learning, robotics, computer vision, smart hardware


morrislee

Real-time scene segmentation on mobile devices using feature pyramids with TopFormer



TopFormer: Token Pyramid Transformer for Mobile Semantic Segmentation



Although vision transformers (ViTs) have achieved great success ... the heavy computational cost hampers their applications to dense prediction tasks such as semantic segmentation on mobile devices.


... present a mobile-friendly architecture named Token Pyramid Vision Transformer (TopFormer).


... TopFormer takes Tokens from various scales as input to produce scale-aware semantic features, which are then injected into the corresponding tokens to augment the representation.
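To make the idea of pooling tokens from several scales, mixing them, and injecting the result back a little more concrete, here is a toy sketch in plain Python. It is not the TopFormer implementation: the helper names (`avg_pool`, `upsample_nearest`, `mix_scales`, `token_pyramid_forward`) are hypothetical, each scale is reduced to a single-channel square grid, the transformer is replaced by a simple blend with the cross-scale mean, and "injection" is assumed to be element-wise addition.

```python
# Toy sketch of the token-pyramid idea; NOT the TopFormer implementation.
# Each "scale" is a single-channel square grid of token values (list of rows).

def avg_pool(grid, out_size):
    """Average-pool a square grid down to out_size x out_size."""
    step = len(grid) // out_size
    pooled = []
    for i in range(out_size):
        row = []
        for j in range(out_size):
            block = [grid[i * step + di][j * step + dj]
                     for di in range(step) for dj in range(step)]
            row.append(sum(block) / len(block))
        pooled.append(row)
    return pooled

def upsample_nearest(grid, out_size):
    """Nearest-neighbour upsample a square grid to out_size x out_size."""
    scale = out_size // len(grid)
    return [[grid[i // scale][j // scale] for j in range(out_size)]
            for i in range(out_size)]

def mix_scales(pooled):
    """Stand-in for the transformer (assumption): blend each pooled scale
    with the mean over all scales so every scale sees the others."""
    n = len(pooled)
    size = len(pooled[0])
    mean = [[sum(p[i][j] for p in pooled) / n for j in range(size)]
            for i in range(size)]
    return [[[0.5 * p[i][j] + 0.5 * mean[i][j] for j in range(size)]
             for i in range(size)] for p in pooled]

def token_pyramid_forward(scales, common_size):
    # 1. Bring tokens from every scale to one small common resolution.
    pooled = [avg_pool(g, common_size) for g in scales]
    # 2. Produce scale-aware semantics from the pooled tokens.
    semantics = mix_scales(pooled)
    # 3. Inject each semantic map back into the tokens of its own scale.
    outputs = []
    for grid, sem in zip(scales, semantics):
        size = len(grid)
        up = upsample_nearest(sem, size)
        # Injection by element-wise addition is an assumption of this sketch.
        outputs.append([[grid[i][j] + up[i][j] for j in range(size)]
                        for i in range(size)])
    return outputs

# Three scales (8x8, 4x4, 2x2) filled with constant values for illustration.
scales = [
    [[1.0] * 8 for _ in range(8)],
    [[2.0] * 4 for _ in range(4)],
    [[3.0] * 2 for _ in range(2)],
]
augmented = token_pyramid_forward(scales, common_size=2)
```

Each output grid keeps the resolution of its input scale, so the augmented tokens could be handed to a segmentation head per scale. A real model would carry many channels per token and use self-attention in place of `mix_scales`.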


... significantly outperforms CNN- and ViT-based networks across several semantic segmentation datasets and achieves a good trade-off between accuracy and latency.


... TopFormer achieves 5% higher accuracy in mIoU than MobileNetV3 with lower latency on an ARM-based mobile device.
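For readers unfamiliar with the metric, mIoU (mean intersection over union) averages, over classes, the overlap between predicted and ground-truth pixels divided by their union. The sketch below is a minimal illustration on flat label lists; the function name `mean_iou` and the choice to skip classes absent from both prediction and ground truth are assumptions of this example, not details taken from the paper.

```python
def mean_iou(pred, target, num_classes):
    """Mean intersection-over-union over classes for flat label lists."""
    ious = []
    for c in range(num_classes):
        # Pixels where both prediction and ground truth are class c.
        inter = sum(1 for p, t in zip(pred, target) if p == c and t == c)
        # Pixels where either prediction or ground truth is class c.
        union = sum(1 for p, t in zip(pred, target) if p == c or t == c)
        if union > 0:  # skip classes absent from both (assumption)
            ious.append(inter / union)
    return sum(ious) / len(ious) if ious else 0.0

# Six "pixels", three classes.
pred = [0, 0, 1, 1, 2, 2]
target = [0, 1, 1, 1, 2, 0]
score = mean_iou(pred, target, num_classes=3)
```

Here the per-class IoUs are 1/3, 2/3 and 1/2, so the mean is about 0.5. Benchmark toolkits usually accumulate intersections and unions over the whole dataset before dividing, which can give slightly different numbers than averaging per image.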


... tiny version of TopFormer achieves real-time inference on an ARM-based mobile device with competitive results. ...



Please like and share this post if you enjoyed it using the buttons at the bottom! Stay up to date.

Subscribe to my posts: https://morrislee1234.wixsite.com/website/contact
Web site with my other posts by category: https://morrislee1234.wixsite.com/website
LinkedIn: https://www.linkedin.com/in/morris-lee-47877b7b

#ComputerVision #Segmentation #AINewsClips #AI #ML #ArtificialIntelligence #MachineLearning

