Section 01
V-CAST: Curvature-Aware Spatiotemporal Pruning Technology—A New Path for Efficient Video Large Models
V-CAST is a pruning method for video large language models that targets the computational cost introduced by the spatiotemporal structure of video data. A curvature-aware mechanism identifies the key spatiotemporal regions, so that less informative content can be pruned; this reduces computation while maintaining model performance and offers a feasible path toward real-time video understanding. At its core is a three-layer collaborative pruning architecture that combines lightweight curvature computation with dynamic pruning strategies, and the method's effectiveness has been verified experimentally.
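To make the idea of curvature-guided pruning concrete, here is a minimal sketch. It is not V-CAST's actual implementation: the overview above does not define how curvature is computed, so the sketch assumes curvature is the norm of the second-order temporal difference of each token's feature trajectory. The function names `trajectory_curvature` and `prune_tokens` and the `keep_ratio` parameter are hypothetical, and the three-layer architecture is not modeled.

```python
import numpy as np


def trajectory_curvature(feats: np.ndarray) -> np.ndarray:
    """Score each token by how sharply its feature trajectory bends over time.

    feats: array of shape (T, N, D) with T frames, N tokens per frame, D channels.
    Returns an array of shape (T, N). ASSUMPTION: curvature is approximated by
    the norm of the second-order finite difference along the time axis.
    """
    num_frames = feats.shape[0]
    curv = np.zeros(feats.shape[:2])
    if num_frames >= 3:
        second_diff = feats[2:] - 2.0 * feats[1:-1] + feats[:-2]
        curv[1:-1] = np.linalg.norm(second_diff, axis=-1)
        # Boundary frames have no two-sided neighbours; reuse the nearest value.
        curv[0] = curv[1]
        curv[-1] = curv[-2]
    return curv


def prune_tokens(feats: np.ndarray, keep_ratio: float = 0.25):
    """Keep the highest-curvature fraction of all (frame, token) positions.

    Returns the kept features, shape (k, D), and their sorted flat indices
    into the (T * N) token grid.
    """
    num_frames, num_tokens, dim = feats.shape
    scores = trajectory_curvature(feats).reshape(-1)
    k = max(1, int(round(keep_ratio * num_frames * num_tokens)))
    keep = np.sort(np.argpartition(scores, -k)[-k:])
    return feats.reshape(num_frames * num_tokens, dim)[keep], keep
```

Under this assumption, a token whose features stay constant or drift linearly scores zero and is pruned first, while a token whose features change direction or accelerate is retained. A fixed global `keep_ratio` is the simplest possible policy; the dynamic strategies mentioned above would presumably adapt the budget rather than fix it.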