Section 01
BlockQuant: A New Block Vector Quantization Method Based on Spherical Geometry (Introduction)
Key Takeaways
- Unified theoretical analysis: the advantages of rotational quantization methods such as EDEN and RaBitQ are not absolute; they depend on which distortion criterion is used (e.g., MSE, inner product distortion, high-probability control).
- Proposes BlockQuant: block-level spherical quantization that preserves the geometric structure of rotated embeddings more faithfully and outperforms baselines such as EDEN and RaBitQ on both MSE and inner product distortion.
- Applicable scenarios: Long-context LLM inference (KV cache compression), vector database retrieval, edge device deployment, etc.
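
To make the first takeaway concrete, the sketch below shows how the choice of distortion criterion can change which quantizer looks better. It is not BlockQuant, EDEN, or RaBitQ. It is a toy 1-bit sign quantizer applied after a random rotation, and the helper names (`random_rotation`, `sign_quantize`, `compare_scales`) and both scale rules are assumptions introduced here for illustration only.

```python
import numpy as np

def random_rotation(dim, rng):
    """Random orthogonal matrix from the QR factorization of a Gaussian matrix."""
    q, r = np.linalg.qr(rng.standard_normal((dim, dim)))
    # Flip column signs so the diagonal of r is positive (standard Haar correction).
    return q * np.sign(np.diag(r))

def sign_quantize(x, scale):
    """Toy 1-bit quantizer: keep only the sign of each coordinate, times one scale."""
    return scale * np.sign(x)

def compare_scales(dim=64, seed=0):
    """Compare two scale choices for the same sign quantizer (illustrative only)."""
    rng = np.random.default_rng(seed)
    x = rng.standard_normal(dim)
    y = random_rotation(dim, rng) @ x  # rotated embedding

    # Scale that minimizes ||y - s * sign(y)||^2, namely s = ||y||_1 / dim.
    s_mse = np.abs(y).sum() / dim
    # Scale that makes <y_hat, y> equal ||y||^2, namely s = ||y||^2 / ||y||_1.
    s_ip = np.dot(y, y) / np.abs(y).sum()

    results = {}
    for name, s in (("mse_scale", s_mse), ("ip_scale", s_ip)):
        y_hat = sign_quantize(y, s)
        results[name] = {
            # Mean squared reconstruction error per coordinate.
            "mse": float(np.mean((y - y_hat) ** 2)),
            # Error in the inner product of the reconstruction with the original.
            "self_ip_error": float(abs(np.dot(y_hat, y) - np.dot(y, y))),
        }
    return results

if __name__ == "__main__":
    for name, metrics in compare_scales().items():
        print(name, metrics)
```

In this toy setting, the first scale gives the lower MSE while the second preserves the inner product with the original vector exactly, up to floating-point error. Neither is better in an absolute sense, which is the kind of criterion dependence the analysis describes.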