Get a summary video from a text search

morrislee
Apr 28, 2021
1 min read

GPT2MVS: Generative Pre-trained Transformer-2 for Multi-modal Video Summarization

arXiv paper abstract https://arxiv.org/abs/2104.12465v1

arXiv PDF paper https://arxiv.org/pdf/2104.12465v1.pdf

... a text-based query is considered as one of the main drivers of video summary generation, as it is user-defined. ... The proposed model consists of a contextualized video summary controller, multi-modal attention mechanisms, an interactive attention network, and a video summary generator. Based on the evaluation of the existing multi-modal video summarization benchmark, experimental results show that the proposed model is effective with the increase of +5.88% in accuracy and +4.06% increase of F1-score, compared with the state-of-the-art method.
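The abstract reports its gains as accuracy and F1-score. As a rough illustration of what those two numbers measure, here is a minimal sketch that assumes a summary is scored frame by frame, with each frame labeled 1 (kept in the summary) or 0 (dropped). The function name `frame_metrics` and the frame-level labeling are assumptions made for illustration; the excerpt above does not spell out the benchmark's exact evaluation protocol.

```python
def frame_metrics(predicted, reference):
    """Return (accuracy, f1) for two equal-length lists of 0/1 frame labels.

    Hypothetical helper: assumes frame-level binary labels, which may differ
    from the protocol used in the paper.
    """
    if len(predicted) != len(reference):
        raise ValueError("label lists must have the same length")
    if not predicted:
        raise ValueError("label lists must not be empty")

    pairs = list(zip(predicted, reference))
    tp = sum(1 for p, r in pairs if p == 1 and r == 1)  # kept, should be kept
    tn = sum(1 for p, r in pairs if p == 0 and r == 0)  # dropped, should be dropped
    fp = sum(1 for p, r in pairs if p == 1 and r == 0)  # kept, should be dropped
    fn = sum(1 for p, r in pairs if p == 0 and r == 1)  # dropped, should be kept

    accuracy = (tp + tn) / len(pairs)
    # F1 is the harmonic mean of precision and recall: 2TP / (2TP + FP + FN).
    denom = 2 * tp + fp + fn
    f1 = 2 * tp / denom if denom else 0.0
    return accuracy, f1


if __name__ == "__main__":
    # Made-up labels for a six-frame clip, not data from the paper.
    predicted = [1, 0, 1, 1, 0, 0]
    reference = [1, 0, 0, 1, 1, 0]
    acc, f1 = frame_metrics(predicted, reference)
    print(f"accuracy={acc:.3f} f1={f1:.3f}")
```

Accuracy counts every correctly labeled frame, while F1 only rewards correctly kept frames, so F1 tends to be the more telling number when most frames are dropped.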


Stay up to date. Subscribe to my posts https://morrislee1234.wixsite.com/website/contact

Web site with my other posts by category https://morrislee1234.wixsite.com/website

#ComputerVision #AINewsClips #AI #ML #ArtificialIntelligence #MachineLearning

News to help your R&D in artificial intelligence, machine learning, robotics, computer vision, smart hardware
