Section 01
[Overview] VLM2VLA and Catastrophic Forgetting: Research on Knowledge Retention of Vision-Language Models in Autonomous Driving
This study addresses the catastrophic forgetting that vision-language models (VLMs) suffer when fine-tuned for autonomous driving. The core innovation is to represent low-level driving actions as natural-language descriptions rather than traditional numerical labels, and to use LoRA for lightweight fine-tuning. This lets the model acquire driving-action prediction capability while preserving its general reasoning, semantic understanding, and language abilities, offering a new approach to training VLA models for autonomous driving.
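The two ideas above can be sketched concretely. The snippet below is a minimal illustration, not the paper's implementation: `action_to_language` is a hypothetical mapping from numeric controls to phrases (the sign convention and thresholds are assumptions), and `LoRALinear` shows the standard LoRA form of a frozen weight plus a scaled low-rank trainable update, with `B` zero-initialized so fine-tuning starts from the pretrained behavior.

```python
import numpy as np

def action_to_language(steer_deg, accel_mps2):
    """Hypothetical mapping: discretize low-level controls into phrases.
    Assumes positive steering = left; thresholds are illustrative."""
    turn = ("go straight" if abs(steer_deg) < 5
            else "turn left" if steer_deg > 0 else "turn right")
    speed = ("maintain speed" if abs(accel_mps2) < 0.5
             else "accelerate" if accel_mps2 > 0 else "brake")
    return f"{turn} and {speed}"

class LoRALinear:
    """Frozen pretrained weight W plus a trainable low-rank update:
    forward(x) = W x + (alpha / r) * B A x."""
    def __init__(self, W, r=4, alpha=8, seed=0):
        rng = np.random.default_rng(seed)
        self.W = W                                      # frozen (d_out, d_in)
        self.A = rng.normal(0, 0.01, (r, W.shape[1]))   # trainable
        self.B = np.zeros((W.shape[0], r))              # trainable, zero-init
        self.scale = alpha / r                          # so the update is a no-op at start

    def forward(self, x):
        return self.W @ x + self.scale * (self.B @ (self.A @ x))
```

Because only `A` and `B` are updated, the trainable parameter count is `r * (d_in + d_out)` per layer instead of `d_in * d_out`, which is what makes the fine-tuning lightweight enough to leave the base VLM intact.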