Section 01
【Introduction】InternVideo: An Overview of the Open-Source Video Foundation Model Series
InternVideo is an open-source series of video foundation models developed by OpenGVLab, the General Vision Team at Shanghai Artificial Intelligence Laboratory. The series focuses on video understanding, multimodal learning, and large-scale video data processing, and reports strong results on multiple video understanding benchmarks. Published in 2024 and accepted at ECCV 2024, the project provides the complete model architecture, pre-trained weights, data processing tools, and downstream task support, making it one of the more recent advances in video multimodal learning.