Section 01
[Introduction] SigmaScale: An LLM Compression Method Based on SVD and Learned Scaling Matrices
SigmaScale is a compression method for large language models (LLMs). It builds on truncated singular value decomposition (SVD) and improves it by learning auxiliary scaling matrices. Guided by an activation-aware compression loss, it optimizes row and column scaling transformations that reduce the intrinsic rank of the weight matrices, so the model can be compressed efficiently while maintaining its performance. This article covers the background, the method, and the experiments.
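To make the idea concrete, the following is a minimal NumPy sketch of truncated SVD applied in a rescaled space and scored with an activation-aware loss. It is not SigmaScale's actual implementation. It assumes the scaling matrices are diagonal, and instead of learning them it uses a simple heuristic (per-channel activation RMS) as a stand-in for the learned column scaling. The function names `truncated_svd`, `compress_with_scaling`, and `activation_loss`, the matrix shapes, and the random data are all hypothetical and chosen only for illustration.

```python
import numpy as np

def truncated_svd(M, k):
    """Return the top-k singular triplets of M."""
    U, s, Vt = np.linalg.svd(M, full_matrices=False)
    return U[:, :k], s[:k], Vt[:k, :]

def compress_with_scaling(W, row_scale, col_scale, k):
    """Rank-k approximation of W computed in a diagonally rescaled space.

    Assumption: the scaling matrices are diagonal, stored as 1-D vectors.
    """
    # Apply diag(row_scale) @ W @ diag(col_scale) via broadcasting.
    W_scaled = row_scale[:, None] * W * col_scale[None, :]
    U, s, Vt = truncated_svd(W_scaled, k)
    low_rank = (U * s) @ Vt
    # Undo the scaling to return to the original weight space.
    return low_rank / row_scale[:, None] / col_scale[None, :]

def activation_loss(W, W_hat, X):
    """Squared Frobenius error of the layer output on activations X."""
    return float(np.linalg.norm((W - W_hat) @ X) ** 2)

rng = np.random.default_rng(0)
W = rng.standard_normal((64, 32))            # weights: (out_features, in_features)
# Activations with very uneven per-channel magnitudes: (in_features, n_samples)
X = rng.standard_normal((32, 256)) * rng.uniform(0.1, 10.0, size=(32, 1))

ones_rows = np.ones(W.shape[0])
ones_cols = np.ones(W.shape[1])
# Heuristic stand-in for a learned column scaling: per-channel activation RMS.
col_scale = np.sqrt(np.mean(X ** 2, axis=1))

W_plain = compress_with_scaling(W, ones_rows, ones_cols, k=8)
W_scaled = compress_with_scaling(W, ones_rows, col_scale, k=8)

print("plain SVD loss: ", activation_loss(W, W_plain, X))
print("scaled SVD loss:", activation_loss(W, W_scaled, X))
```

With identity scaling, the function reduces to ordinary truncated SVD. A learned variant would treat `row_scale` and `col_scale` as trainable parameters and minimize `activation_loss` with a gradient-based optimizer rather than fixing them by a heuristic.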