Section 01
Introduction: BigSmall, Lossless Neural Network Weight Compression That Lets Large Models Run in Small Memory
BigSmall losslessly compresses the weights of large language models, reducing model size by 65-83%. Combined with a streaming loader, it keeps peak memory usage below 2 GB, so users can run complete, unquantized models on consumer-grade hardware.
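To make the two ideas concrete, here is a minimal sketch of per-layer lossless compression paired with a streaming loader that decompresses one layer at a time. It is an illustration only: `zlib` stands in for whatever codec BigSmall actually uses, the names `compress_layers` and `stream_layers` are hypothetical, and the toy weights are far more compressible than real model weights, so the sizes here say nothing about the 65-83% figure.

```python
import zlib
from array import array
from typing import Dict, Iterator, Tuple


def compress_layers(layers: Dict[str, array]) -> Dict[str, bytes]:
    """Compress each layer's raw bytes independently.

    zlib is only a stand-in codec (assumption); any lossless codec fits here.
    """
    return {name: zlib.compress(w.tobytes(), 9) for name, w in layers.items()}


def stream_layers(blobs: Dict[str, bytes],
                  typecode: str = "f") -> Iterator[Tuple[str, array]]:
    """Yield one decompressed layer at a time.

    Only the layer currently being yielded is held in decompressed form,
    which is what bounds peak memory in a streaming loader.
    """
    for name, blob in blobs.items():
        weights = array(typecode)
        weights.frombytes(zlib.decompress(blob))
        yield name, weights


# Toy "model": two small float32 layers (hypothetical data).
layers = {
    "layer0": array("f", [0.5] * 1024),
    "layer1": array("f", [(i % 7) * 0.25 for i in range(1024)]),
}

blobs = compress_layers(layers)

# Stream the layers back and check that every byte survives the round trip.
for name, weights in stream_layers(blobs):
    assert weights.tobytes() == layers[name].tobytes()

raw_size = sum(len(w.tobytes()) for w in layers.values())
packed_size = sum(len(b) for b in blobs.values())
print(f"raw: {raw_size} bytes, compressed: {packed_size} bytes")
```

Because each layer is compressed independently, the loader never needs the whole decompressed model in memory at once. That is the property that lets lossless compression and streaming work together, unlike quantization, which changes the weight values themselves.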