Section 01
Introduction: DeltaPrompts Breaks the Zero-Increment Trap in Multimodal Distillation, Achieving a 15% Performance Improvement
This article reports that 69% of the prompts used in multimodal distillation are "zero-increment" samples, which are effectively invalid for training. It introduces DeltaPrompts, a dataset of 200,000 synthetic, high-divergence reasoning problems whose prompts are selected by answer divergence, and reports a 15% performance improvement on benchmark tests. The core innovation is a return to the first principles of distillation: quantify the value of each prompt by its answer divergence, then proactively generate training data targeted at high-value prompts.
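To make the selection idea concrete, here is a minimal sketch of divergence-based prompt filtering. The summary above does not say how answer divergence is computed, so everything in this block is an assumption made for illustration: divergence is taken to be the fraction of sampled student answers that disagree with the teacher's answer, and the names `PromptRecord`, `answer_divergence`, `select_high_divergence`, and the `threshold` parameter are hypothetical, not part of DeltaPrompts.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class PromptRecord:
    """One candidate distillation prompt (hypothetical structure)."""
    prompt: str
    teacher_answer: str
    student_answers: List[str]  # several sampled answers from the student


def normalize(answer: str) -> str:
    """Lowercase and collapse whitespace so trivial differences do not count."""
    return " ".join(answer.strip().lower().split())


def answer_divergence(record: PromptRecord) -> float:
    """Assumed metric: share of student samples that differ from the teacher.

    0.0 means the student already matches the teacher on every sample,
    which this sketch treats as a zero-increment prompt.
    """
    if not record.student_answers:
        raise ValueError("need at least one student answer")
    target = normalize(record.teacher_answer)
    mismatches = sum(normalize(a) != target for a in record.student_answers)
    return mismatches / len(record.student_answers)


def select_high_divergence(
    records: List[PromptRecord], threshold: float = 0.5
) -> List[PromptRecord]:
    """Keep prompts whose divergence is at or above the (assumed) threshold."""
    return [r for r in records if answer_divergence(r) >= threshold]


# Toy data, invented purely to exercise the functions above.
records = [
    PromptRecord("What is 6 x 7?", "42", ["42", " 42 ", "42"]),
    PromptRecord("How many sides are shown?", "7", ["7", "9", "11", "9"]),
]
selected = select_high_divergence(records, threshold=0.5)
```

Under this assumed metric, the first toy prompt has divergence 0 and is dropped, while the second is kept because the student disagrees with the teacher on most samples. A real pipeline would likely need a more robust answer comparison than exact string matching.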