Segment objects in videos by generating an auxiliary frame between adjacent frames with Chen
Space-time Reinforcement Network for Video Object Segmentation
arXiv paper abstract https://arxiv.org/abs/2405.04042
arXiv PDF paper https://arxiv.org/pdf/2405.04042
Recent video object segmentation (VOS) networks typically use memory-based methods: for each query frame, the mask is predicted by space-time matching against memory frames.
Despite their strong performance, these methods suffer from two issues: 1) Challenging data can destroy the space-time coherence between adjacent video frames.
2) Pixel-level matching leads to undesired mismatches caused by noise or distractors.
... first propose to generate an auxiliary frame between adjacent frames, serving as an implicit short-term reference for the query frame.
... learn a prototype for each video object, so that prototype-level matching can be performed between the query and memory.
... network outperforms the state-of-the-art methods ... the network exhibits a high inference speed of 32+ FPS.
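The paper's auxiliary frame is produced by a learned generator; as a minimal stand-in, the idea of synthesizing an in-between frame from two neighbors can be sketched with simple linear blending (the function name and the blend weight `alpha` are illustrative assumptions, not the paper's method):

```python
import numpy as np

def auxiliary_frame(frame_prev, frame_next, alpha=0.5):
    """Synthesize an intermediate frame between two adjacent frames.

    Here a plain linear blend stands in for the paper's learned
    generator; alpha=0.5 places the auxiliary frame midway in time.
    """
    return ((1.0 - alpha) * frame_prev.astype(np.float64)
            + alpha * frame_next.astype(np.float64))

# Example: blend a dark frame with a bright frame.
prev_f = np.zeros((4, 4, 3))
next_f = np.full((4, 4, 3), 10.0)
aux = auxiliary_frame(prev_f, next_f)  # every pixel is 5.0
```

The auxiliary frame then serves as a short-term reference that sits temporally closer to the query frame than any stored memory frame.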
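Prototype-level matching replaces noisy pixel-to-pixel comparison with a comparison against one compact vector per object. A common way to form such a prototype is masked average pooling over an object's features, then assigning each query pixel to the most similar prototype by cosine similarity; the sketch below assumes this formulation, which may differ from the paper's exact design:

```python
import numpy as np

def object_prototype(features, mask):
    """Masked average pooling: collapse one object's pixel features
    (features: H x W x C, mask: H x W binary) into a C-dim prototype."""
    w = mask.astype(np.float64)
    return (features * w[..., None]).sum(axis=(0, 1)) / max(w.sum(), 1e-8)

def prototype_match(query_features, prototypes):
    """Assign each query pixel to the object whose prototype is most
    similar under cosine similarity.

    query_features: H x W x C, prototypes: N x C -> labels H x W.
    """
    q = query_features / (np.linalg.norm(query_features, axis=-1,
                                         keepdims=True) + 1e-8)
    p = prototypes / (np.linalg.norm(prototypes, axis=-1,
                                     keepdims=True) + 1e-8)
    sim = q @ p.T          # H x W x N similarity map
    return sim.argmax(axis=-1)

# Example: two objects with distinct feature directions.
feats = np.array([[[1.0, 0.0, 0.0], [1.0, 0.0, 0.0]],
                  [[0.0, 1.0, 0.0], [0.0, 1.0, 0.0]]])
mask_a = np.array([[1, 1], [0, 0]])  # top row = object 0
mask_b = np.array([[0, 0], [1, 1]])  # bottom row = object 1
protos = np.stack([object_prototype(feats, mask_a),
                   object_prototype(feats, mask_b)])
labels = prototype_match(feats, protos)
```

Because each object contributes a single averaged vector, a few noisy or distractor pixels in memory have far less influence on the match than they would under pixel-level matching.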
Please like and share this post if you enjoyed it using the buttons at the bottom!
Stay up to date. Subscribe to my posts https://morrislee1234.wixsite.com/website/contact
Web site with my other posts by category https://morrislee1234.wixsite.com/website