ReferFormer improves segmentation of video objects referred to by a text query

morrislee
Jan 6, 2022
Language as Queries for Referring Video Object Segmentation

arXiv paper abstract https://arxiv.org/abs/2201.00487v1

arXiv PDF paper https://arxiv.org/pdf/2201.00487v1.pdf

GitHub https://github.com/wjn922/referformer

Referring video object segmentation (R-VOS) is an emerging cross-modal task that aims to segment the target object referred by a language expression in all video frames.

... propose a simple and unified framework built upon Transformer, termed ReferFormer.

It views the language as queries and directly attends to the most relevant regions in the video frames.
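To make the "language as queries" idea concrete, here is a minimal sketch of one query vector attending over the region features of a frame with scaled dot-product attention. This is not the ReferFormer code: the function name `attend`, the 3-dimensional toy vectors, and the use of a single attention head are all assumptions made for illustration.

```python
import math

def attend(query, regions):
    """Scaled dot-product attention of one query vector over region features.

    Hypothetical helper for illustration only, not part of ReferFormer.
    Returns the attention weights and the weighted sum of the regions.
    """
    scale = math.sqrt(len(query))
    scores = [sum(q * r for q, r in zip(query, region)) / scale
              for region in regions]
    # Softmax, with the max subtracted for numerical stability.
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]
    dim = len(regions[0])
    pooled = [sum(w * region[d] for w, region in zip(weights, regions))
              for d in range(dim)]
    return weights, pooled

# Toy data (assumed): a query derived from the text and three region features.
text_query = [1.0, 0.0, 1.0]
frame_regions = [
    [0.1, 0.9, 0.0],
    [0.9, 0.1, 0.8],
    [0.0, 0.2, 0.1],
]

weights, pooled = attend(text_query, frame_regions)
best_region = max(range(len(weights)), key=weights.__getitem__)
print("most relevant region:", best_region)
```

The region whose features align best with the query receives the largest weight, which is the sense in which a language-derived query "attends to the most relevant regions".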

... all the queries are obligated to find the referred objects only.

... The object tracking is achieved naturally by linking the corresponding queries across frames.
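Linking queries across frames can be sketched as follows, assuming each frame yields one (score, mask) pair per query and the same query index refers to the same object in every frame. The function `link_queries`, the toy scores and masks, and the rule of picking the query with the highest mean score are illustrative assumptions, not details taken from this post.

```python
def link_queries(frame_outputs):
    """Build a track by following one query index through all frames.

    frame_outputs[t][q] is a (score, mask) pair for query q at frame t.
    Hypothetical helper for illustration only, not part of ReferFormer.
    """
    num_frames = len(frame_outputs)
    num_queries = len(frame_outputs[0])
    # Average each query's score over the frames (assumed selection rule).
    mean_scores = [
        sum(frame[q][0] for frame in frame_outputs) / num_frames
        for q in range(num_queries)
    ]
    best = max(range(num_queries), key=mean_scores.__getitem__)
    # The track is simply that query's mask in every frame.
    track = [frame[best][1] for frame in frame_outputs]
    return best, track

# Toy data (assumed): 2 frames, 2 queries, 2x2 binary masks.
frames = [
    [(0.2, [[0, 0], [0, 1]]), (0.9, [[1, 1], [0, 0]])],
    [(0.3, [[0, 0], [1, 0]]), (0.8, [[0, 1], [0, 1]])],
]

best_query, track = link_queries(frames)
print("selected query:", best_query)
```

Because the query index itself carries identity, no separate matching step between frames is needed in this sketch.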

... On Ref-Youtube-VOS, ReferFormer ... exceeds the previous state-of-the-art performance by 8.4 points. ...

Stay up to date. Subscribe to my posts https://morrislee1234.wixsite.com/website/contact

Web site with my other posts by category https://morrislee1234.wixsite.com/website

#ComputerVision #Segmentation #AINewsClips #AI #ML #ArtificialIntelligence #MachineLearning

News to help your R&D in artificial intelligence, machine learning, robotics, computer vision, smart hardware
