Section 01
BPDQ: A Breakthrough Method for High-Performance 2-Bit Inference of Large Models
BPDQ, a paper accepted at ICML 2026, proposes a post-training quantization (PTQ) method built on a variable quantization grid derived from bit-plane decomposition. The method significantly outperforms traditional PTQ methods in the 2-3 bit range: with it, Qwen2.5-72B reaches 83.85% accuracy on GSM8K while running on a single RTX 3090, opening a new path for deploying large models in low-resource settings.
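
To make the idea of bit-plane decomposition concrete, here is a minimal Python sketch. It is not the paper's implementation. It only shows the general principle that a k-bit integer code can be split into k binary planes, and that weighting those planes with powers of two gives the usual uniform grid, while weighting them with other coefficients gives a non-uniform ("variable") grid. The function names `to_bit_planes` and `from_bit_planes`, the coefficient values, and the offset are illustrative assumptions; how BPDQ actually fits its grid is not described in this section.

```python
import numpy as np

def to_bit_planes(q, bits):
    """Split unsigned integer codes into `bits` binary planes (LSB first)."""
    q = np.asarray(q, dtype=np.uint8)
    return np.stack([(q >> i) & 1 for i in range(bits)], axis=0)

def from_bit_planes(planes, coeffs, offset=0.0):
    """Reconstruct values as offset + sum_i coeffs[i] * planes[i]."""
    coeffs = np.asarray(coeffs, dtype=np.float64)
    return offset + np.tensordot(coeffs, planes.astype(np.float64), axes=1)

# All four possible 2-bit codes.
codes = np.array([0, 1, 2, 3])
planes = to_bit_planes(codes, bits=2)

# Power-of-two coefficients recover the ordinary uniform grid.
uniform = from_bit_planes(planes, coeffs=[1.0, 2.0])

# Hypothetical non-power-of-two coefficients yield unevenly spaced levels.
variable = from_bit_planes(planes, coeffs=[0.6, 2.5], offset=-1.5)

print(planes)
print(uniform)
print(variable)
```

With power-of-two coefficients the four levels are evenly spaced; with the hypothetical coefficients they are not, which is the sense in which the grid becomes variable. As rough arithmetic on the deployment claim, about 72 billion weights at 2 bits each come to roughly 18 GB before scales and other overhead, which is below the 24 GB of memory on an RTX 3090.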