Section 01
SubFit: Introduction to the New Paradigm of LLM Compression at the Submodule Level
SubFit is a new paradigm for replacement-based LLM compression that operates at the submodule level. Traditional layer-level compression works on whole layers and restricts selection to contiguous blocks. SubFit drops both constraints: it selects individual submodules, which need not be contiguous, and applies a lightweight residual replacement strategy. At 25% sparsity, it retains 84.6% downstream accuracy, significantly outperforming traditional layer-level compression methods and offering an efficient option for deploying large models.
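The difference in selection granularity can be illustrated with a toy sketch. It is not taken from the SubFit code: the `Submodule` record, the importance scores, the parameter counts, and the greedy rule are all hypothetical, chosen only to contrast picking whole contiguous layers with picking scattered submodules under the same 25% parameter budget. The replacement step itself is not modeled here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Submodule:
    layer: int    # index of the transformer layer
    kind: str     # "attn" or "mlp"
    params: int   # parameter count (toy units)
    score: float  # hypothetical importance score, higher = more important


def select_submodules(mods, sparsity):
    """Greedily pick the least important submodules within the parameter budget.

    The picked submodules may come from any layers; nothing forces contiguity.
    """
    budget = sparsity * sum(m.params for m in mods)
    chosen, removed = [], 0
    for m in sorted(mods, key=lambda m: m.score):
        if removed + m.params <= budget:
            chosen.append(m)
            removed += m.params
    return chosen


def select_contiguous_layers(mods, n_drop):
    """Layer-level baseline: pick the contiguous block of n_drop whole layers
    whose submodules have the lowest total score."""
    layers = sorted({m.layer for m in mods})

    def block_score(start):
        block = set(layers[start:start + n_drop])
        return sum(m.score for m in mods if m.layer in block)

    best = min(range(len(layers) - n_drop + 1), key=block_score)
    block = set(layers[best:best + n_drop])
    return [m for m in mods if m.layer in block]


# Toy model: 4 layers, each with one attention and one MLP submodule.
mods = [
    Submodule(0, "attn", 4, 0.90), Submodule(0, "mlp", 8, 0.80),
    Submodule(1, "attn", 4, 0.10), Submodule(1, "mlp", 8, 0.70),
    Submodule(2, "attn", 4, 0.65), Submodule(2, "mlp", 8, 0.20),
    Submodule(3, "attn", 4, 0.50), Submodule(3, "mlp", 8, 0.95),
]

picked = select_submodules(mods, sparsity=0.25)    # submodule-level selection
baseline = select_contiguous_layers(mods, n_drop=1)  # one whole layer = 25%

print("submodule-level:", [(m.layer, m.kind) for m in picked])
print("layer-level:    ", [(m.layer, m.kind) for m in baseline])
```

In this toy setup both strategies select the same number of parameters, but the submodule-level choice combines the attention block of one layer with the MLP of another, so the total importance of what it selects is lower than that of any single whole layer.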
Basic Information:
- Authors: original author team (arXiv submission)
- Source: arXiv, original title: From Layers to Submodules: Rethinking Granularity in Replacement-Based LLM Compression
- Release date: June 1, 2026
- Open-source code: https://github.com/eliacunegatti/SubFit
- Original link: http://arxiv.org/abs/2606.02559v1