Answering spatial questions about a scene using images and 3D scans with ScanQA
ScanQA: 3D Question Answering for Spatial Scene Understanding
arXiv paper abstract https://arxiv.org/abs/2112.10482v1
arXiv PDF paper https://arxiv.org/pdf/2112.10482v1.pdf
We propose a new 3D spatial understanding task of 3D Question Answering (3D-QA).
In the 3D-QA task, models receive visual information from the entire 3D scene of the rich RGB-D indoor scan and answer the given textual questions about the 3D scene.
... propose a baseline model for 3D-QA, named ScanQA model, where the model learns a fused descriptor from 3D object proposals and encoded sentence embeddings.
... correlates the language expressions with the underlying geometric features of the 3D scan and facilitates the regression of 3D bounding boxes to determine described objects in textual questions.
... ScanQA is the first large-scale effort to perform object-grounded question-answering in 3D environments.
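To make the "fused descriptor" idea in the excerpts above more concrete, here is a minimal sketch in plain Python. It is not the authors' architecture: the dot-product attention, the concatenation step, the randomly initialized heads, and every name and size (`fuse`, `DIM`, `NUM_PROPOSALS`, `NUM_ANSWERS`) are assumptions chosen for illustration. It only shows the general pattern of combining per-object proposal features with a question embedding, then reading out an answer distribution and a 3D box from the fused vector.

```python
import math
import random


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def softmax(scores):
    # Subtract the max for numerical stability.
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def linear(weights, vec):
    # weights is a list of rows; returns the matrix-vector product.
    return [dot(row, vec) for row in weights]


def fuse(proposal_feats, sentence_emb):
    """Pool proposal features with question-conditioned attention,
    then concatenate the pooled vector with the sentence embedding."""
    scale = math.sqrt(len(sentence_emb))
    scores = [dot(p, sentence_emb) / scale for p in proposal_feats]
    attn = softmax(scores)
    dim = len(proposal_feats[0])
    pooled = [sum(a * p[i] for a, p in zip(attn, proposal_feats))
              for i in range(dim)]
    return pooled + list(sentence_emb), attn


def random_matrix(rows, cols, rng):
    return [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]


# Hypothetical sizes and random stand-in features (assumptions, not real data).
rng = random.Random(0)
DIM, NUM_PROPOSALS, NUM_ANSWERS = 8, 5, 4
proposals = [[rng.gauss(0, 1) for _ in range(DIM)] for _ in range(NUM_PROPOSALS)]
question = [rng.gauss(0, 1) for _ in range(DIM)]

fused, attn = fuse(proposals, question)

# Untrained heads: one scores candidate answers, one regresses a box
# as (cx, cy, cz, w, h, d). A real model would learn these weights.
answer_head = random_matrix(NUM_ANSWERS, 2 * DIM, rng)
box_head = random_matrix(6, 2 * DIM, rng)

answer_probs = softmax(linear(answer_head, fused))
box = linear(box_head, fused)
grounded = max(range(NUM_PROPOSALS), key=lambda i: attn[i])

print("most attended proposal:", grounded)
print("answer distribution:", [round(p, 3) for p in answer_probs])
```

In a trained system the proposal features would come from a 3D object detector run on the scan, and the question embedding from a language encoder. Here both are random placeholders, so the printed values carry no meaning beyond showing the data flow.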