Section 01
Introduction / Original Post: DASH-KV: Asymmetric Hashing Accelerates Long-Context LLM Inference, Reducing Complexity from Quadratic to Linear
DASH-KV reframes attention as an approximate nearest neighbor search using asymmetric deep hashing. This reduces the cost of attention from quadratic, O(N^2), to linear, O(N), in the context length N, while keeping generation quality comparable to full attention.
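To make the "attention as nearest neighbor search" idea concrete, here is a minimal sketch in plain Python. It is not DASH-KV's implementation, and it rests on several assumptions. Random hyperplanes (sign random projection) stand in for the learned deep hash encoders. The asymmetry is modeled by binarizing only the keys and scoring them against the real-valued projected query. All function names (`make_planes`, `encode_key`, `select_top_k`, `sparse_attention`) are hypothetical. The sketch still scores every key code for each query; it only restricts the exact softmax attention to the k selected keys.

```python
import math
import random

def make_planes(num_bits, dim, seed=0):
    """Random hyperplanes; an assumed stand-in for a learned hash encoder."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(num_bits)]

def project(vec, planes):
    """Dot product of vec with each hyperplane."""
    return [sum(p * v for p, v in zip(plane, vec)) for plane in planes]

def encode_key(key, planes):
    """Key side: keep only the sign of each projection as a +1/-1 code."""
    return [1 if x >= 0 else -1 for x in project(key, planes)]

def select_top_k(query, key_codes, planes, k):
    """Query side stays real-valued (the assumed asymmetry).

    Each key code is scored by its inner product with the projected query,
    and the indices of the k best-scoring keys are returned in position order.
    """
    q_proj = project(query, planes)
    scores = [sum(q * b for q, b in zip(q_proj, code)) for code in key_codes]
    order = sorted(range(len(key_codes)), key=lambda i: scores[i], reverse=True)
    return sorted(order[:k])

def sparse_attention(query, keys, values, selected):
    """Exact scaled dot-product attention over the selected keys only."""
    scale = 1.0 / math.sqrt(len(query))
    logits = [scale * sum(q * x for q, x in zip(query, keys[i])) for i in selected]
    top = max(logits)  # subtract the max for numerical stability
    weights = [math.exp(l - top) for l in logits]
    total = sum(weights)
    return [
        sum(w * values[i][j] for w, i in zip(weights, selected)) / total
        for j in range(len(values[0]))
    ]

# Toy usage: 64 cached keys of dimension 16, 32-bit codes, keep 8 keys.
rng = random.Random(1)
dim, n = 16, 64
keys = [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(n)]
values = [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(n)]
planes = make_planes(32, dim)
key_codes = [encode_key(key, planes) for key in keys]  # computed once per key

query = keys[5]  # a query identical to a cached key
selected = select_top_k(query, key_codes, planes, k=8)
out = sparse_attention(query, keys, values, selected)
```

In this toy setup the key codes are computed once when a key enters the cache, so the per-query work is one projection of the query, one pass over the compact codes, and exact attention over only k keys. Whether DASH-KV selects candidates this way, or avoids the full pass over the codes, is not stated in the summary above.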