Answering spatial questions about a scene using images and 3D scans with ScanQA

morrislee
Dec 31, 2021
1 min read

ScanQA: 3D Question Answering for Spatial Scene Understanding

arXiv paper abstract https://arxiv.org/abs/2112.10482v1

arXiv PDF paper https://arxiv.org/pdf/2112.10482v1.pdf

We propose a new 3D spatial understanding task of 3D Question Answering (3D-QA).

In the 3D-QA task, models receive visual information from the entire 3D scene of the rich RGB-D indoor scan and answer the given textual questions about the 3D scene.

... propose a baseline model for 3D-QA, named ScanQA model, where the model learns a fused descriptor from 3D object proposals and encoded sentence embeddings.
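
To make the idea of a "fused descriptor" a bit more concrete, here is a minimal sketch in plain Python. It is not the paper's architecture: the elementwise-product fusion, the names `fuse` and `fuse_scene`, and the toy vectors are all assumptions chosen for illustration. It only shows the general pattern of combining one feature vector per 3D object proposal with a single question embedding.

```python
from typing import List

Vector = List[float]


def fuse(proposal_feat: Vector, question_emb: Vector) -> Vector:
    """Hypothetical fusion: elementwise product of two same-sized vectors."""
    if len(proposal_feat) != len(question_emb):
        raise ValueError("feature sizes must match")
    return [p * q for p, q in zip(proposal_feat, question_emb)]


def fuse_scene(proposal_feats: List[Vector], question_emb: Vector) -> List[Vector]:
    """Fuse every object proposal in a scene with the same question embedding."""
    return [fuse(feat, question_emb) for feat in proposal_feats]


# Toy data (made up): two object proposals and one question, all 3-dimensional.
proposals = [[1.0, 0.0, 2.0], [0.5, 1.0, 0.0]]
question = [2.0, 3.0, 0.5]

fused = fuse_scene(proposals, question)
print(fused)
```

A learned model would replace the fixed product with trainable layers; the sketch only fixes the shapes: one fused vector per proposal, each conditioned on the question.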

... correlates the language expressions with the underlying geometric features of the 3D scan and facilitates the regression of 3D bounding boxes to determine described objects in textual questions.
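
One simplified way to picture how a model "determines described objects" is to score each candidate 3D box against the question and keep the best one. The sketch below does only that selection step, not the box regression the abstract mentions. The `Proposal` class, its `logit` field, and the numbers are hypothetical and stand in for whatever the real model predicts.

```python
import math
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Proposal:
    center: Tuple[float, float, float]  # box center (x, y, z), assumed units: meters
    size: Tuple[float, float, float]    # box extent along each axis
    logit: float                        # hypothetical question-relevance score


def softmax(logits: List[float]) -> List[float]:
    """Numerically stable softmax over a list of scores."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def ground(proposals: List[Proposal]) -> Tuple[Proposal, float]:
    """Return the proposal with the highest relevance and its probability."""
    probs = softmax([p.logit for p in proposals])
    best = max(range(len(proposals)), key=lambda i: probs[i])
    return proposals[best], probs[best]


# Toy scene (made-up values): three candidate boxes with relevance scores.
scene = [
    Proposal((0.0, 0.0, 0.5), (1.0, 1.0, 1.0), 0.1),
    Proposal((2.0, 1.0, 0.4), (0.6, 0.6, 0.8), 2.0),
    Proposal((4.0, 3.0, 1.0), (2.0, 0.5, 2.0), -1.0),
]

box, prob = ground(scene)
print(box.center, box.size)
```

Tying the answer to a specific box is what makes the question answering "object-grounded": the selected box shows which object the answer refers to.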

... ScanQA is the first large-scale effort to perform object-grounded question-answering in 3D environments.

Stay up to date. Subscribe to my posts https://morrislee1234.wixsite.com/website/contact

Web site with my other posts by category https://morrislee1234.wixsite.com/website

#ComputerVision #VisualQuestionAnswering #AINewsClips #AI #ML #ArtificialIntelligence #MachineLearning
