Section 01
DuQuant++: A New Fine-Grained Rotational Quantization Method to Solve MXFP4 Activation Outliers (Introduction)
Researchers propose DuQuant++, a method that addresses activation outliers in the MXFP4 format. It applies a single round of outlier-aware rotation to make W4A4 (4-bit weight, 4-bit activation) quantization more efficient. The authors report state-of-the-art results on LLaMA-3 models, a halving of online computation cost, and compatibility with the NVIDIA Blackwell architecture.
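
To make the outlier problem concrete, the following is a minimal sketch of MXFP4-style block quantization in plain Python. It assumes the OCP Microscaling layout (32 elements sharing one power-of-two scale, with FP4 E2M1 elements) and a simplified nearest-value rounding. The function name `mxfp4_quantize_block` and the toy data are illustrative and not taken from DuQuant++. The sketch does not implement the rotation itself, only the effect that motivates it.

```python
import math

# Magnitudes representable by an FP4 E2M1 element (sign is handled separately).
E2M1_GRID = [0.0, 0.5, 1.0, 1.5, 2.0, 3.0, 4.0, 6.0]
E2M1_EMAX = 2    # exponent of the largest E2M1 value: 6 = 1.5 * 2**2
BLOCK_SIZE = 32  # MXFP4 shares one scale across 32 elements


def mxfp4_quantize_block(block):
    """Quantize then dequantize one block with a shared power-of-two scale.

    Simplified sketch: rounds to the nearest grid value (ties go to the
    smaller magnitude) and saturates at the largest E2M1 value.
    """
    max_abs = max(abs(x) for x in block)
    if max_abs == 0.0:
        return [0.0] * len(block)
    # The shared scale is a power of two derived from the largest magnitude.
    scale = 2.0 ** (math.floor(math.log2(max_abs)) - E2M1_EMAX)
    out = []
    for x in block:
        mag = min(abs(x) / scale, E2M1_GRID[-1])        # saturate at 6
        q = min(E2M1_GRID, key=lambda g: abs(g - mag))  # nearest grid value
        out.append(math.copysign(q * scale, x))
    return out


# Toy activations (hypothetical data): small values in [-0.15, 0.15].
clean = [0.05 * ((i % 7) - 3) for i in range(BLOCK_SIZE)]

# Same block with a single large outlier at index 5.
with_outlier = list(clean)
with_outlier[5] = 50.0

q_clean = mxfp4_quantize_block(clean)
q_outlier = mxfp4_quantize_block(with_outlier)

# Count how many non-outlier elements were rounded all the way to zero.
zeroed = sum(1 for i, v in enumerate(q_outlier) if i != 5 and v == 0.0)
print("first clean value:", clean[0], "->", q_clean[0])
print("outlier value:", with_outlier[5], "->", q_outlier[5])
print("non-outlier elements rounded to zero:", zeroed, "of", BLOCK_SIZE - 1)
```

In this toy block the single large value forces a shared scale of 8, so every other element rounds to zero. This is the kind of information loss that rotation-based methods aim to reduce by redistributing outlier magnitude before quantization.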