Section 01
DeMUL: Introduction to a New Video Moment Retrieval Method
DeMUL is a method for moment retrieval in video corpora: given a natural-language query, it locates the matching moment within a large collection of videos. It combines decoupled multimodal modeling with unified localization. Its core contributions are (1) independent encoding of the visual and language modalities followed by progressive fusion, (2) a unified localization framework that jointly models moment position and content relevance, and (3) optimized indexing and transfer for video corpora. It reports leading performance on multiple benchmark datasets, including ActivityNet, and can be applied to scenarios such as video search and intelligent editing.
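To make the task concrete, the following is a minimal sketch of a generic retrieve-then-localize pipeline for video corpus moment retrieval. It is not DeMUL's implementation, whose architecture is not detailed here. It assumes clip and query features have already been produced by separate encoders (the "decoupled" part), and it stands in for learned localization with a simple heuristic: a video-level relevance score is added to a span score so that position and relevance are ranked together. All names (`cosine`, `best_span`, `retrieve_moment`), the `margin` parameter, and the toy feature vectors are hypothetical.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0


def best_span(sims, margin=0.5):
    """Find the contiguous clip span maximizing sum(sim - margin).

    Returns (start, end, score) with an inclusive end index.
    `margin` is an assumed threshold: clips below it lower the span score.
    """
    best = (0, 0, sims[0] - margin)
    for start in range(len(sims)):
        total = 0.0
        for end in range(start, len(sims)):
            total += sims[end] - margin
            if total > best[2]:
                best = (start, end, total)
    return best


def retrieve_moment(query_vec, corpus, top_k=2, margin=0.5):
    """Rank videos by relevance, then localize a moment in the top_k videos.

    `corpus` maps a video id to a non-empty list of precomputed clip features.
    """
    # Stage 1: video-level relevance from query-independent clip features.
    scored = []
    for vid, clips in corpus.items():
        sims = [cosine(query_vec, c) for c in clips]
        scored.append((max(sims), vid, sims))
    scored.sort(key=lambda t: t[0], reverse=True)

    # Stage 2: localize within the shortlisted videos and combine both scores.
    results = []
    for video_score, vid, sims in scored[:top_k]:
        start, end, span_score = best_span(sims, margin)
        results.append(
            {"video": vid, "start": start, "end": end,
             "score": video_score + span_score}
        )
    results.sort(key=lambda r: r["score"], reverse=True)
    return results


# Toy corpus: 2-D vectors stand in for real clip embeddings (illustrative only).
corpus = {
    "cooking": [[1.0, 0.0], [0.0, 1.0], [0.0, 1.0], [1.0, 0.0]],
    "sports": [[1.0, 0.0], [1.0, 0.0], [0.7, 0.7]],
}
query = [0.0, 1.0]
top = retrieve_moment(query, corpus)[0]
print(top["video"], top["start"], top["end"])
```

Because the clip features do not depend on the query, they can be computed and indexed once per corpus; only the query encoding and the scoring run at search time. A real system would replace the heuristic span score with a learned localization head.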