Section 01
MultiSmolVLA: Enhancing VLA Model Robustness for Robots via Modality Dropout Training
EPFL's MultiSmolVLA project addresses the fragility of vision-language-action (VLA) models that rely on a single RGB input in real-world robot scenarios. It combines the 4M-21 multimodal encoder with SmolVLA and introduces a modality dropout training strategy to improve robustness against sensor failures. The goal is more reliable perception for robot applications.
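The summary above does not say how modalities are dropped during training, so the following is only a minimal sketch of the general idea, not the project's implementation. It assumes each modality is withheld independently with a fixed probability and that at least one modality is always kept. The function name `apply_modality_dropout`, the `drop_prob` parameter, and the modality keys are all hypothetical.

```python
import random


def apply_modality_dropout(sample, drop_prob=0.3, rng=None):
    """Return a copy of `sample` with some modalities replaced by None.

    `sample` maps a modality name to its data. Each modality is dropped
    independently with probability `drop_prob` (an assumed scheme). If
    every modality would be dropped, one is kept at random so the model
    always receives some input.
    """
    rng = rng or random.Random()
    names = list(sample)
    kept = [name for name in names if rng.random() >= drop_prob]
    if not kept:
        kept = [rng.choice(names)]
    # Build a new dict; the caller's sample is left untouched.
    return {name: (sample[name] if name in kept else None) for name in names}


# Hypothetical modality names and placeholder data, for illustration only.
sample = {"rgb": [0.1, 0.2], "depth": [1.5, 1.7], "normals": [0.0, 1.0]}
masked = apply_modality_dropout(sample, drop_prob=0.5, rng=random.Random(0))
print({name: value is not None for name, value in masked.items()})
```

The same masking can be reused at evaluation time to simulate a sensor failure, for example by setting one entry to `None` and checking whether the policy's behaviour degrades.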