Segment objects in videos using only 2 labeled frames with Two-shot-Video-Object-Segmentation
Two-shot Video Object Segmentation
arXiv paper abstract https://arxiv.org/abs/2303.12078
arXiv PDF paper https://arxiv.org/pdf/2303.12078.pdf
Previous works on video object segmentation (VOS) are trained on densely annotated videos ... acquiring annotations in pixel level is expensive and time-consuming.
... demonstrate ... two labeled frames per training video ... idea is to generate pseudo labels for unlabeled frames during training and to optimize the model on the combination of labeled and pseudo-labeled data ... approach ... can be applied to a majority of existing frameworks.
... first pre-train a VOS model on sparsely annotated videos in a semi-supervised manner, with the first frame always being a labeled one.
... adopt the pre-trained VOS model to generate pseudo labels for all unlabeled frames, which are subsequently stored in a pseudo-label bank.
... retrain a VOS model on both labeled and pseudo-labeled data without any restrictions on the first frame.
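The three phases above can be sketched as a data flow in plain Python. This is only a toy illustration under stated assumptions, not the paper's code: `sample_clip`, `build_pseudo_label_bank`, `lookup_mask`, and `toy_predict` are hypothetical names, frames are tiny lists of numbers, and a fixed threshold stands in for the pre-trained VOS model. It shows the two things the text describes: phase 1 clips always start on a labeled frame, while phase 3 clips can start anywhere because every unlabeled frame has an entry in the pseudo-label bank.

```python
import random

def sample_clip(num_frames, labeled, clip_len, first_must_be_labeled, rng):
    """Pick clip_len distinct frame indices for one training clip.

    Phase 1 (first_must_be_labeled=True): the first frame is a labeled one.
    Phase 3 (first_must_be_labeled=False): any frame may come first.
    """
    if first_must_be_labeled:
        first = rng.choice(sorted(labeled))
    else:
        first = rng.randrange(num_frames)
    others = [i for i in range(num_frames) if i != first]
    return [first] + rng.sample(others, clip_len - 1)

def build_pseudo_label_bank(videos, predict):
    """Phase 2: store a predicted mask for every unlabeled frame."""
    bank = {}
    for vid, video in videos.items():
        for idx, frame in enumerate(video["frames"]):
            if idx not in video["labels"]:
                bank[(vid, idx)] = predict(frame)
    return bank

def lookup_mask(videos, bank, vid, idx):
    """Return the ground-truth mask if one exists, else the pseudo label."""
    labels = videos[vid]["labels"]
    return labels[idx] if idx in labels else bank[(vid, idx)]

def toy_predict(frame):
    # Hypothetical stand-in for the pre-trained VOS model: a fixed threshold.
    return [1 if p > 0.5 else 0 for p in frame]

rng = random.Random(0)

# One toy video: 6 frames of 4 "pixels" each, with only 2 labeled frames.
videos = {
    "v0": {
        "frames": [[rng.random() for _ in range(4)] for _ in range(6)],
        "labels": {0: [1, 0, 0, 1], 3: [0, 1, 1, 0]},
    }
}

# Phase 1: clips for pre-training start on a labeled frame.
phase1_clip = sample_clip(6, videos["v0"]["labels"], 3, True, rng)

# Phase 2: fill the pseudo-label bank for all unlabeled frames.
bank = build_pseudo_label_bank(videos, toy_predict)

# Phase 3: clips for retraining may start on any frame; masks come from
# ground truth where available and from the bank otherwise.
phase3_clip = sample_clip(6, videos["v0"]["labels"], 3, False, rng)
masks = [lookup_mask(videos, bank, "v0", i) for i in phase3_clip]
```

In a real setup the threshold would be replaced by the model pre-trained in phase 1, and the bank would hold full-resolution masks rather than short lists.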
... approach achieves comparable results in contrast to the counterparts trained on fully labeled set ...
Stay up to date. Subscribe to my posts https://morrislee1234.wixsite.com/website/contact
Web site with my other posts by category https://morrislee1234.wixsite.com/website