Section 01
From Imitation to Collaboration: CoSpec Redefines the Speculative Decoding Paradigm
CoSpec is a collaborative speculative decoding method that trains an arbitration policy with reinforcement learning. When the draft model and the target model diverge, the policy selects the token that is more likely to lead to the correct answer. This preserves the inference speedup of speculative decoding while outperforming the target model used alone. The method removes the limitation of traditional speculative decoding, which treats the target model as the sole authority, so that the two models collaborate instead of the draft model merely imitating the target.
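
To make the contrast concrete, the following is a minimal Python sketch of one verification step, not CoSpec's actual implementation. It assumes greedy decoding, where target[i] is the target model's prediction given the draft prefix draft[:i]. The names verify_block, target_always_wins, and toy_arbiter are hypothetical, and the hand-written toy_arbiter only stands in for the learned arbitration policy described above.

```python
from typing import Callable, List, Sequence

# Hypothetical arbiter signature: (position, draft_token, target_token) -> chosen token.
Arbiter = Callable[[int, str, str], str]


def target_always_wins(pos: int, draft_tok: str, target_tok: str) -> str:
    """Traditional speculative decoding: the target model is the sole authority."""
    return target_tok


def toy_arbiter(pos: int, draft_tok: str, target_tok: str) -> str:
    """Stand-in for a learned policy (assumption): keep the draft token
    whenever it is a digit, otherwise defer to the target model."""
    return draft_tok if draft_tok.isdigit() else target_tok


def verify_block(
    draft: Sequence[str], target: Sequence[str], arbiter: Arbiter
) -> List[str]:
    """Accept draft tokens while both models agree; at a divergence, ask the arbiter."""
    accepted: List[str] = []
    for pos, (d, t) in enumerate(zip(draft, target)):
        if d == t:
            accepted.append(d)
            continue
        choice = arbiter(pos, d, t)
        accepted.append(choice)
        if choice != d:
            # The remaining draft tokens were conditioned on d, so they are discarded.
            break
        # If the draft token is kept, target[pos + 1:] was conditioned on the
        # same prefix, so verification of the block can continue.
    return accepted


draft = ["x", "=", "4", ";"]
target = ["x", "=", "five", ";"]

baseline = verify_block(draft, target, target_always_wins)
collaborative = verify_block(draft, target, toy_arbiter)
print(baseline)       # ['x', '=', 'five']
print(collaborative)  # ['x', '=', '4', ';']
```

In this sketch the baseline stops at the first divergence and takes the target's token, while the arbitrated version may keep the draft token and continue through the rest of the block. Whether a real arbiter should prefer one model at a given position is exactly what the reinforcement learning stage is meant to learn.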