Section 01
DOSE: A Training-Free Method for Selecting High-Quality Multimodal Data [Main Post Overview]
DOSE is a method for selecting multimodal training data using off-the-shelf pre-trained models, with no fine-tuning on the target data. It constructs a joint quality-alignment distribution over the candidate samples and applies an adaptive weighted sampling strategy, which selects information-rich samples while preserving long-tail diversity. Models trained on the selected subset match or surpass models trained on the full data on VQA and math benchmarks.
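To make the idea of sampling from a joint quality-alignment distribution more concrete, here is a minimal sketch. It is not the DOSE implementation. It assumes each sample already has a quality score and an image-text alignment score produced by some pre-trained model, and it assumes one plausible weighting scheme: favor samples with high scores, and up-weight samples that fall in sparsely populated regions of the joint distribution so that the long tail is not discarded. The function name select_subset and the parameters bins and alpha are hypothetical, introduced only for illustration.

```python
import numpy as np


def _minmax(x):
    """Rescale scores to [0, 1]; a constant array maps to all zeros."""
    span = x.max() - x.min()
    return (x - x.min()) / span if span > 0 else np.zeros_like(x)


def select_subset(quality, alignment, k, bins=10, alpha=0.5, seed=0):
    """Pick k sample indices by weighted sampling without replacement.

    ASSUMPTION: this weighting (score divided by density**alpha) is an
    illustrative choice, not the scheme described by the DOSE authors.
    """
    quality = np.asarray(quality, dtype=float)
    alignment = np.asarray(alignment, dtype=float)
    n = quality.size
    if quality.shape != alignment.shape or not 0 < k <= n:
        raise ValueError("scores must have equal shape and 0 < k <= n")

    q = _minmax(quality)
    a = _minmax(alignment)

    # Assign every sample to a cell of a bins x bins grid over (q, a).
    qi = np.minimum((q * bins).astype(int), bins - 1)
    ai = np.minimum((a * bins).astype(int), bins - 1)

    # Empirical joint distribution: fraction of samples in each cell.
    counts = np.zeros((bins, bins))
    np.add.at(counts, (qi, ai), 1)
    density = counts[qi, ai] / n  # always > 0: each sample is in its own cell

    # High joint score raises the weight; high local density lowers it.
    # The small constant keeps every sample selectable.
    score = q * a + 1e-6
    weights = score / density ** alpha
    probs = weights / weights.sum()

    rng = np.random.default_rng(seed)
    return rng.choice(n, size=k, replace=False, p=probs)
```

Calling select_subset(quality_scores, alignment_scores, k=1000) returns an array of 1000 distinct indices into the dataset. In this sketch, alpha=0 reduces to pure score-weighted sampling, while larger values of alpha shift more of the selection budget toward rare regions of the joint distribution.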