Section 01
[Introduction] LMM-Track4D: A Large Multimodal Model for 4D Object Tracking and Trajectory Reasoning
The NeurIPS 2026 open-source project LMM-Track4D combines large language models with multi-view vision to perform end-to-end 4D object tracking and trajectory reasoning, pointing to a new direction for multimodal spatiotemporal understanding. Where traditional approaches are limited to 3D detection and tracking, LMM-Track4D uses a vision-language-geometry fusion architecture that also lets the system reason about object trajectories. The project has potential applications in fields such as autonomous driving and robot navigation.