News to help your R&D in artificial intelligence, machine learning, robotics, computer vision, smart hardware

morrislee

3D object detection boxes directly from image and point data using multi-modal features with CMT


Cross Modal Transformer: Towards Fast and Robust 3D Object Detection

arXiv paper abstract https://arxiv.org/abs/2301.01283



... propose a robust 3D detector, named Cross Modal Transformer (CMT), for end-to-end 3D multi-modal detection.


Without explicit view transformation, CMT takes the image and point clouds tokens as inputs and directly outputs accurate 3D bounding boxes.
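To make the "tokens in, boxes out" idea concrete, here is a minimal NumPy sketch, not the authors' implementation. It assumes single-head attention, random weights, and hypothetical token counts, and the names `cross_attend`, `object_queries`, and `box_head` are made up for illustration. Image tokens and point cloud tokens are simply concatenated, a few object queries attend to all of them, and a linear head maps each query to seven box parameters.

```python
import numpy as np

def softmax(x, axis=-1):
    # Subtract the max for numerical stability before exponentiating.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def cross_attend(queries, tokens):
    """Single-head scaled dot-product attention: queries read from tokens."""
    d = queries.shape[-1]
    weights = softmax(queries @ tokens.T / np.sqrt(d))  # (num_queries, num_tokens)
    return weights @ tokens, weights

rng = np.random.default_rng(0)
dim = 32

# Hypothetical token sets: flattened camera features and flattened LiDAR features.
image_tokens = rng.normal(size=(60, dim))
point_tokens = rng.normal(size=(40, dim))

# No view transformation: both modalities go into one token sequence.
tokens = np.concatenate([image_tokens, point_tokens], axis=0)  # (100, dim)

# Hypothetical object queries, one per candidate box.
object_queries = rng.normal(size=(5, dim))
attended, weights = cross_attend(object_queries, tokens)

# Hypothetical linear box head: (x, y, z, w, l, h, yaw) per query.
box_head = rng.normal(size=(dim, 7)) * 0.1
boxes = attended @ box_head
print(boxes.shape)
```

Each query can draw on either modality in the same attention step, which is why no separate image-to-BEV projection appears in this sketch.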


The spatial alignment of multi-modal tokens is performed by encoding the 3D points into multi-modal features.
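One way to picture this alignment step is the sketch below. It is an illustration under stated assumptions, not the paper's code: each token is paired with a single 3D point, one small two-layer MLP with random weights is shared by both modalities, and the helper name `encode_points` is hypothetical. The resulting position embedding is added to the token feature, so tokens tied to the same 3D location receive the same positional signal whichever sensor they came from.

```python
import numpy as np

def encode_points(points, w1, b1, w2, b2):
    """Two-layer MLP mapping 3D coordinates (N, 3) to embeddings (N, C)."""
    hidden = np.maximum(points @ w1 + b1, 0.0)  # ReLU
    return hidden @ w2 + b2

rng = np.random.default_rng(0)
dim, hidden_dim = 32, 64

# Assumption: random, untrained MLP weights shared by both modalities.
params = (
    rng.normal(size=(3, hidden_dim)) * 0.1,
    np.zeros(hidden_dim),
    rng.normal(size=(hidden_dim, dim)) * 0.1,
    np.zeros(dim),
)

# Hypothetical token features and the 3D point associated with each token.
image_tokens = rng.normal(size=(4, dim))
point_tokens = rng.normal(size=(3, dim))
image_points = rng.uniform(-50.0, 50.0, size=(4, 3))
lidar_points = rng.uniform(-50.0, 50.0, size=(3, 3))
lidar_points[0] = image_points[0]  # one location seen by both sensors

image_pe = encode_points(image_points, *params)
lidar_pe = encode_points(lidar_points, *params)

# Position embeddings are added to the features of each modality.
aligned_image_tokens = image_tokens + image_pe
aligned_point_tokens = point_tokens + lidar_pe

# The shared location gets an identical position embedding in both modalities.
print(np.allclose(image_pe[0], lidar_pe[0]))
```

Sharing one encoder is a simplification chosen here to make the alignment visible; a real model could use a separate encoder per modality and learn the correspondence during training.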


The core design of CMT is quite simple while its performance is impressive.


It achieves 74.1% NDS (state-of-the-art with single model) on nuScenes test set while maintaining faster inference speed.
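For readers unfamiliar with the metric, NDS is the nuScenes detection score: a weighted mix of mean average precision (mAP) and five true-positive error terms (translation, scale, orientation, velocity, attribute), each clamped to 1 and converted to a score. The sketch below follows the public nuScenes definition; the function name and the input values are hypothetical and are not results from the paper.

```python
def nuscenes_detection_score(mean_ap, tp_errors):
    """NDS = (5 * mAP + sum(1 - min(1, err) for each TP error)) / 10."""
    if len(tp_errors) != 5:
        raise ValueError("expected five true-positive error metrics")
    # Each error is clamped to at most 1, then turned into a score in [0, 1].
    tp_scores = [1.0 - min(1.0, err) for err in tp_errors.values()]
    return (5.0 * mean_ap + sum(tp_scores)) / 10.0

# Hypothetical inputs for illustration only.
example_errors = {
    "mATE": 0.2,  # translation error
    "mASE": 0.2,  # scale error
    "mAOE": 0.2,  # orientation error
    "mAVE": 0.2,  # velocity error
    "mAAE": 0.2,  # attribute error
}
print(nuscenes_detection_score(0.70, example_errors))
```

Because mAP carries half the weight, a detector can raise its NDS either by finding more objects or by estimating position, size, heading, velocity, and attributes more precisely for the objects it already finds.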


Moreover, CMT has a strong robustness even if the LiDAR is missing ...



Please like and share this post if you enjoyed it using the buttons at the bottom!


Stay up to date. Subscribe to my posts https://morrislee1234.wixsite.com/website/contact

Web site with my other posts by category https://morrislee1234.wixsite.com/website


