Section 01
[Original Post] SpecGuard: A New Framework Balancing Speed and Accuracy in Large Model Reasoning
SpecGuard is a verification-aware speculative decoding framework. Its core innovation is a step-level verification mechanism that relies only on signals internal to the model, namely an attention grounding score and log-probability confidence, and requires no external components. Compared with traditional speculative decoding, it improves reasoning accuracy by 3.6% and reduces latency by approximately 11%, addressing the error accumulation that traditional token-level verification causes.
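To make the step-level idea concrete, here is a minimal sketch of how a drafted reasoning step might be accepted or rejected using the two internal signals. The post gives no formulas, so everything specific here is an assumption for illustration: the `DraftStep` structure, the use of simple means for both scores, the thresholds, and the `propose` / `regenerate` callables that stand in for the draft and target models. It is not SpecGuard's actual implementation.

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence


@dataclass
class DraftStep:
    """One reasoning step proposed by the draft model (hypothetical structure)."""
    text: str
    token_logprobs: Sequence[float]   # log-probability of each generated token
    grounding_mass: Sequence[float]   # per token: attention mass on the prompt and prior steps, in [0, 1]


def logprob_confidence(step: DraftStep) -> float:
    """Mean token log-probability of the step (assumed aggregation)."""
    return sum(step.token_logprobs) / len(step.token_logprobs)


def grounding_score(step: DraftStep) -> float:
    """Mean attention mass placed on earlier context (assumed aggregation)."""
    return sum(step.grounding_mass) / len(step.grounding_mass)


def accept_step(step: DraftStep,
                min_logprob: float = -1.0,     # assumed threshold
                min_grounding: float = 0.3     # assumed threshold
                ) -> bool:
    """Accept the whole step only if both internal signals clear their thresholds."""
    if not step.token_logprobs or not step.grounding_mass:
        return False  # nothing to verify, so do not trust the step
    return (logprob_confidence(step) >= min_logprob
            and grounding_score(step) >= min_grounding)


def guarded_decode(propose: Callable[[List[str]], DraftStep],
                   regenerate: Callable[[List[str]], str],
                   max_steps: int) -> List[str]:
    """Verify step by step: keep accepted draft steps, otherwise fall back to the target model."""
    context: List[str] = []
    for _ in range(max_steps):
        step = propose(context)
        if accept_step(step):
            context.append(step.text)            # cheap path: draft step is kept
        else:
            context.append(regenerate(context))  # rejected: target model rewrites this step
    return context
```

Because the decision is made once per reasoning step rather than once per token, a weak step is replaced before later steps build on it, which is the intuition behind the error-accumulation claim above.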