Section 01
EdgeRazor Framework Overview: An Efficient Solution for Lightweight LLM Deployment on Edge Devices
The EdgeRazor framework, open-sourced by the Nanjing University team, enables efficient deployment of large language models (LLMs) on edge devices via mixed-precision quantization-aware distillation. It supports quantization precisions from 1.58-bit (ternary weights) to 4-bit and achieves a maximum compression ratio of 7.03x on the Qwen3-0.6B model, balancing extreme compression against model capability.
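To make the 1.58-bit end of the precision range concrete, here is a minimal sketch of ternary weight quantization in the style popularized by BitNet b1.58 (absmean scaling). This is an illustrative assumption, not EdgeRazor's documented scheme: each weight is scaled by the mean absolute value of the tensor, rounded, and clipped to {-1, 0, +1}, which costs log2(3) ≈ 1.58 bits per weight.

```python
import math

def ternary_quantize(weights):
    """Absmean ternary quantization (BitNet-b1.58-style sketch; an
    assumption — EdgeRazor's exact algorithm is not documented here)."""
    # Scale by the mean absolute weight, then round and clip to {-1, 0, +1}.
    scale = sum(abs(w) for w in weights) / len(weights)
    q = [max(-1, min(1, round(w / scale))) for w in weights]
    return q, scale

def dequantize(q, scale):
    """Recover approximate real-valued weights from the ternary codes."""
    return [v * scale for v in q]

# Toy weight vector standing in for one layer's parameters.
w = [0.42, -1.3, 0.05, 0.9, -0.02, -0.7]
q, s = ternary_quantize(w)

# Storage cost per weight drops from 16 bits (FP16) to log2(3) ≈ 1.58 bits.
bits_per_weight = math.log2(3)
```

In quantization-aware distillation, a dequantized forward pass like `dequantize(q, s)` would be used inside training so the student learns under the quantization error, with a full-precision teacher supplying the distillation targets.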