top of page

News to help your R&D in artificial intelligence, machine learning, robotics, computer vision, smart hardware

As an Amazon Associate I earn

from qualifying purchases

Writer's picturemorrislee

Find location in video matching a sentence with TAN

Find location in video matching a sentence with TAN


Temporal Alignment Networks for Long-term Video

arXiv paper abstract https://arxiv.org/abs/2204.02968



The objective ... is a temporal alignment network that ingests long term video sequences, and associated text sentences, in order to:


(1) determine if a sentence is alignable with the video; and (2) if it is alignable, then determine its alignment.


The challenge is to train such networks from large-scale datasets ... where the associated text sentences have significant noise, and are only weakly aligned when relevant.


... describe a novel co-training method that enables to denoise and train on raw instructional videos without using manual annotation, despite the considerable noise


... proposed model ... outperforms strong baselines (CLIP, MIL-NCE) on this alignment dataset by a significant margin


... apply the trained model in the zero-shot settings to multiple downstream video understanding tasks and achieve state-of-the-art results ...



Please like and share this post if you enjoyed it using the buttons at the bottom!


Stay up to date. Subscribe to my posts https://morrislee1234.wixsite.com/website/contact

Web site with my other posts by category https://morrislee1234.wixsite.com/website



67 views0 comments

Comments


ClickBank paid link

bottom of page