Section 01
[Main Post/Introduction] MAVEN: A Multi-Stage Agentic Video Annotation Pipeline
MAVEN is an automated annotation system for video reasoning tasks. It converts raw videos into high-quality, structured training data through a multi-stage pipeline of collaborating agents. Its core advantages are domain adaptation (it adapts to new domains without manual redesign), continuous quality improvement, and an efficient "one annotation, multi-task reuse" design. The system targets bottlenecks in video understanding annotation such as high labor cost, poor consistency, and limited scale, and it has shown strong results in the traffic video domain.
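To make the "one annotation, multi-task reuse" idea concrete, here is a minimal Python sketch. It is not MAVEN's actual schema or code: the `Event` and `VideoAnnotation` classes, the field names, and the two derived tasks (captioning and temporal grounding) are all assumptions chosen for illustration. The only point it shows is that a video is annotated once into a structured record, and several task-specific training samples are then derived from that record without re-annotating.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Event:
    """One annotated event in a video (hypothetical schema, not MAVEN's)."""
    start_s: float
    end_s: float
    description: str


@dataclass
class VideoAnnotation:
    """A single structured annotation, produced once per video."""
    video_id: str
    events: list[Event] = field(default_factory=list)

    def ordered_events(self) -> list[Event]:
        """Return events sorted by start time."""
        return sorted(self.events, key=lambda e: e.start_s)


def to_caption_sample(ann: VideoAnnotation) -> dict:
    """Derive a captioning sample: event descriptions joined in time order."""
    caption = " ".join(e.description for e in ann.ordered_events())
    return {"video_id": ann.video_id, "task": "caption", "target": caption}


def to_grounding_samples(ann: VideoAnnotation) -> list[dict]:
    """Derive temporal-grounding samples: a text query mapped to a time span."""
    return [
        {
            "video_id": ann.video_id,
            "task": "grounding",
            "query": e.description,
            "target": (e.start_s, e.end_s),
        }
        for e in ann.ordered_events()
    ]


if __name__ == "__main__":
    # Illustrative traffic clip; the identifier and events are made up.
    ann = VideoAnnotation(
        video_id="clip_001",
        events=[
            Event(2.5, 4.0, "A pedestrian starts crossing."),
            Event(0.0, 2.5, "A car enters the intersection."),
        ],
    )
    print(to_caption_sample(ann))
    for sample in to_grounding_samples(ann):
        print(sample)
```

Under these assumptions, supporting another task means adding one more `to_..._sample` function over the same `VideoAnnotation`, which is where the reuse saving would come from.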