Scene segmentation 7.3 times faster with 3D transformer using patch attention
PatchFormer: A Versatile 3D Transformer Based on Patch Attention
arXiv paper abstract https://arxiv.org/abs/2111.00207
arXiv paper PDF https://arxiv.org/pdf/2111.00207.pdf
... 3D vision community ... shift from CNNs to ... pure Transformer architectures have attained top accuracy on the major 3D learning benchmarks.
... 3D Transformers ... have quadratic complexity (both in space and time) with respect to input size.
To solve ... introduce patch-attention to adaptively learn a much smaller set of bases upon which the attention maps are computed.
... patch-attention not only captures the global shape context but also achieves linear complexity with respect to input size.
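To make the idea concrete, here is a minimal PyTorch sketch of the patch-attention idea described above: the N input features are pooled into a much smaller set of M learned bases, and each input attends to those bases instead of to all N inputs, so the cost grows as O(N·M) rather than O(N²). The module and parameter names (PatchAttentionSketch, num_bases) are my own assumptions for illustration, not the authors' implementation.

```python
# Illustrative sketch of patch attention (not the authors' code).
import torch
import torch.nn as nn

class PatchAttentionSketch(nn.Module):
    def __init__(self, dim, num_bases=32):
        super().__init__()
        self.to_q = nn.Linear(dim, dim)
        self.to_kv = nn.Linear(dim, 2 * dim)
        # Learned scores used to adaptively pool the N inputs into num_bases bases.
        self.base_scores = nn.Linear(dim, num_bases)

    def forward(self, x):                              # x: (B, N, dim)
        q = self.to_q(x)                               # (B, N, dim)
        w = self.base_scores(x).softmax(dim=1)         # (B, N, M): soft assignment of points to bases
        bases = torch.einsum('bnm,bnd->bmd', w, x)     # (B, M, dim): M << N learned bases
        k, v = self.to_kv(bases).chunk(2, dim=-1)      # (B, M, dim) each
        attn = (q @ k.transpose(-2, -1)) / k.size(-1) ** 0.5  # (B, N, M) attention map
        attn = attn.softmax(dim=-1)
        return attn @ v                                # (B, N, dim), cost O(N*M) instead of O(N^2)

if __name__ == "__main__":
    x = torch.randn(2, 1024, 64)                       # 1024 points with 64-dim features
    print(PatchAttentionSketch(dim=64)(x).shape)       # torch.Size([2, 1024, 64])
```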
... propose a lightweight Multi-scale Attention (MSA) block to build attentions among features of different scales, providing the model with multi-scale features.
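The post only summarizes the MSA block, so the snippet below is a rough guess at the general shape of such a module: features from a fine scale and a coarse scale cross-attend so the high-resolution branch picks up multi-scale context. The module and argument names (MultiScaleAttentionSketch, fine, coarse) are assumptions, not the paper's API.

```python
# Rough sketch of a multi-scale attention block (assumed design, not the paper's code).
import torch
import torch.nn as nn

class MultiScaleAttentionSketch(nn.Module):
    def __init__(self, dim, heads=4):
        super().__init__()
        # Fine-scale features query coarse-scale features (cross-attention).
        self.cross = nn.MultiheadAttention(dim, heads, batch_first=True)

    def forward(self, fine, coarse):       # fine: (B, N, dim), coarse: (B, M, dim)
        mixed, _ = self.cross(fine, coarse, coarse)
        return fine + mixed                # residual keeps fine-scale detail

x = torch.randn(2, 1024, 64)               # fine scale: 1024 points
y = torch.randn(2, 256, 64)                # coarse scale: 256 points
print(MultiScaleAttentionSketch(64)(x, y).shape)  # torch.Size([2, 1024, 64])
```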
... network achieves strong accuracy on general 3D recognition tasks with a 7.3x speed-up over previous 3D Transformers.
If you enjoyed this post, please like and share it using the buttons at the bottom!
Stay up to date. Subscribe to my posts https://morrislee1234.wixsite.com/website/contact
Web site with my other posts by category https://morrislee1234.wixsite.com/website