Section 01
Core Introduction to the SpecBlock Framework: A Block-iterative Solution to the Dilemma of Speculative Decoding
Title: SpecBlock: Block-iterative Speculative Decoding Combining Path Dependency and Low-cost Drafting
This paper proposes the SpecBlock framework, which aims to resolve a dilemma in speculative decoding: autoregressive drafters incur a high drafting cost, while parallel drafters suffer a high rejection rate. Through a block-iterative drafting mechanism and a dynamic tree construction strategy, the framework significantly reduces drafting cost while maintaining path dependency. Experiments show that, compared to EAGLE-3, SpecBlock achieves an 8-13% speedup with only 44-52% of the drafting cost; when cost-aware adaptation is enabled, the advantage widens to 11-19%.
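To make the cost-versus-rejection dilemma concrete, the sketch below uses the standard chain-drafting cost model from the general speculative decoding literature, not SpecBlock's own analysis. It assumes each drafted token is accepted independently with probability alpha, that gamma tokens are drafted per verification step, and that draft_cost is the total drafting time per step measured in units of one target-model forward pass. The function names and all numeric inputs are hypothetical and chosen only for illustration; the model ignores tree-structured drafts.

```python
def expected_tokens(alpha: float, gamma: int) -> float:
    """Expected tokens emitted per verification step for a chain of gamma
    drafted tokens, assuming an i.i.d. per-token acceptance rate alpha."""
    if alpha >= 1.0:
        # Every drafted token is accepted, plus one token from the target model.
        return float(gamma + 1)
    # Geometric series: 1 + alpha + alpha^2 + ... + alpha^gamma
    return (1.0 - alpha ** (gamma + 1)) / (1.0 - alpha)


def speedup(alpha: float, gamma: int, draft_cost: float) -> float:
    """Estimated speedup over plain autoregressive decoding.

    draft_cost is the total drafting time per step, relative to one
    target-model forward pass (an assumption of this sketch).
    """
    return expected_tokens(alpha, gamma) / (draft_cost + 1.0)


if __name__ == "__main__":
    gamma = 4
    # Hypothetical drafters: the autoregressive one pays per drafted token
    # but is accepted more often; the parallel one pays once but is
    # rejected more often.
    autoregressive = speedup(alpha=0.8, gamma=gamma, draft_cost=gamma * 0.1)
    parallel = speedup(alpha=0.6, gamma=gamma, draft_cost=0.1)
    print(f"autoregressive drafter: {autoregressive:.2f}x")
    print(f"parallel drafter:       {parallel:.2f}x")
```

Under this simplified model, lowering draft_cost while holding alpha fixed raises the estimated speedup, which is the direction of improvement the framework targets.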