Section 01
[Introduction] EntropyInfer: An Entropy-Guided Adaptive Inference Framework for Large Models on Long Texts
Core Information
- Project Name: EntropyInfer (Entropy-Guided Adaptive Inference Framework for Large Models on Long Texts)
- Core Method: Uses attention entropy to identify rigid and dynamic attention heads at inference time, then allocates computation adaptively at both the head level and the segment level
- Main Results: Achieves a 2.39x end-to-end speedup on long texts of more than 100,000 tokens, with minimal quality loss
- Paper & Code: arXiv paper published June 8, 2026 (http://arxiv.org/abs/2606.09508v1); code is open source at https://github.com/SHA-4096/EntropyInfer
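
To make the Core Method bullet more concrete, here is a minimal sketch of how attention entropy could be used to sort heads into two groups. It is not taken from the EntropyInfer code. The function names (`attention_entropy`, `mean_head_entropy`, `classify_heads`), the fixed threshold, and the rule that low-entropy heads count as "rigid" are all assumptions made for illustration; the paper may define and use these categories differently.

```python
import math

def attention_entropy(weights):
    """Shannon entropy (in nats) of one attention distribution over keys."""
    return -sum(p * math.log(p) for p in weights if p > 0.0)

def mean_head_entropy(head_rows):
    """Average entropy over all query rows of a single head."""
    return sum(attention_entropy(row) for row in head_rows) / len(head_rows)

def classify_heads(attn, threshold):
    """Label each head by mean entropy.

    Assumption (hypothetical rule): heads below the threshold are
    called 'rigid', all others 'dynamic'.
    """
    labels = []
    for head_rows in attn:
        h = mean_head_entropy(head_rows)
        labels.append("rigid" if h < threshold else "dynamic")
    return labels

# Toy input: 2 heads, 2 queries per head, 4 keys per query.
# Each inner list is one softmax-normalized attention row.
attn = [
    [[0.97, 0.01, 0.01, 0.01], [0.01, 0.97, 0.01, 0.01]],  # sharply peaked head
    [[0.25, 0.25, 0.25, 0.25], [0.25, 0.25, 0.25, 0.25]],  # uniform head
]

# Threshold set to half of the maximum possible entropy for 4 keys.
labels = classify_heads(attn, threshold=0.5 * math.log(4))
print(labels)
```

In this toy case the peaked head falls below the threshold and the uniform head reaches the maximum entropy of ln 4, so the two heads receive different labels. A real system would compute these statistics from the model's attention tensors and would then choose a cheaper or fuller computation path per head and per segment.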