Section 01
Introduction to Mistletoe: A Stealthy Acceleration Collapse Attack on Speculative Decoding
Mistletoe is a new, stealthy attack on speculative decoding. It exploits the imperfect agreement between the draft model and the target model to significantly lower the draft-token acceptance rate while leaving output quality intact, which collapses the inference speedup that speculative decoding is meant to provide. This article covers the attack's background, method, effects, and security implications.
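To make the link between acceptance rate and acceleration concrete, the sketch below uses the standard analysis of speculative decoding rather than anything specific to Mistletoe. It assumes each draft token is accepted independently with probability `alpha`, that the draft model proposes `gamma` tokens per step, and that one draft pass costs `cost_ratio` times one target pass. The function names and the numeric values are illustrative assumptions, not measurements from the attack.

```python
def expected_tokens_per_step(alpha: float, gamma: int) -> float:
    """Expected tokens emitted per target-model verification pass.

    Assumes (simplification) that each of the gamma draft tokens is
    accepted independently with probability alpha.
    """
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must be in [0, 1]")
    if gamma < 1:
        raise ValueError("gamma must be at least 1")
    if alpha == 1.0:
        # Every draft token is accepted, plus one token from the target pass.
        return float(gamma + 1)
    # Geometric series: 1 + alpha + alpha^2 + ... + alpha^gamma
    return (1.0 - alpha ** (gamma + 1)) / (1.0 - alpha)


def expected_speedup(alpha: float, gamma: int, cost_ratio: float) -> float:
    """Expected wall-time speedup over plain autoregressive decoding.

    cost_ratio is the (assumed) cost of one draft-model pass relative to
    one target-model pass. Each step pays gamma draft passes plus one
    target pass.
    """
    step_cost = gamma * cost_ratio + 1.0
    return expected_tokens_per_step(alpha, gamma) / step_cost


if __name__ == "__main__":
    # Illustrative values only: sweep the acceptance rate downward to see
    # how the expected speedup shrinks toward (and below) 1x.
    for alpha in (0.9, 0.7, 0.5, 0.3, 0.1):
        speedup = expected_speedup(alpha, gamma=4, cost_ratio=0.05)
        print(f"alpha={alpha:.1f}  expected speedup={speedup:.2f}x")
```

Under these assumptions the speedup falls steadily as `alpha` drops, and at `alpha = 0` it is below 1x because the draft passes become pure overhead. This is the sense in which an attack that only lowers the acceptance rate, without changing the output, can erase the acceleration.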