Section 01
PocketLLM: Guide to Meta-Network-Driven Extreme Compression of Large Language Models
PocketLLM is a compression method for large language models built on meta-networks. An encoder projects model weights into a discrete latent space, and a lightweight decoder reconstructs them, reaching compression ratios of up to 10x with minimal accuracy loss. The paper, by Ye Tian, Chengcheng Wang, and co-authors, was submitted in November 2025 and accepted by AAAI 2026 in March 2026; the project is open-sourced on GitHub. Its main contribution is applying discrete latent representations to LLM weight compression, which makes deploying large models on edge devices more practical.
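To make the idea of a discrete latent space concrete, the sketch below stores each group of weights as an index into a small shared codebook and reconstructs the weights by table lookup. It is not PocketLLM's method: the learned encoder and decoder meta-networks are replaced here by plain k-means vector quantization, and every function name and parameter (`compress`, `decompress`, `compression_ratio`, `dim`, `num_codes`) is hypothetical. The ratio it computes follows only from the chosen sizes and says nothing about the figures reported in the paper.

```python
import random


def split_into_subvectors(weights, dim):
    """Flatten a 2-D weight matrix (list of rows) into sub-vectors of length dim."""
    flat = [w for row in weights for w in row]
    assert len(flat) % dim == 0, "weight count must be divisible by dim"
    return [flat[i:i + dim] for i in range(0, len(flat), dim)]


def nearest(vec, codebook):
    """Index of the codebook entry closest to vec (squared Euclidean distance)."""
    return min(
        range(len(codebook)),
        key=lambda k: sum((a - b) ** 2 for a, b in zip(vec, codebook[k])),
    )


def fit_codebook(vectors, num_codes, iters=10, seed=0):
    """Plain k-means; an assumed stand-in for a learned discrete latent space."""
    rng = random.Random(seed)
    codebook = [list(v) for v in rng.sample(vectors, num_codes)]
    for _ in range(iters):
        assignment = [nearest(v, codebook) for v in vectors]
        for k in range(num_codes):
            members = [v for v, a in zip(vectors, assignment) if a == k]
            if members:  # leave a code unchanged if nothing was assigned to it
                codebook[k] = [sum(col) / len(members) for col in zip(*members)]
    return codebook


def compress(weights, dim, num_codes):
    """Return (indices, codebook): one small integer per sub-vector."""
    vectors = split_into_subvectors(weights, dim)
    codebook = fit_codebook(vectors, num_codes)
    indices = [nearest(v, codebook) for v in vectors]
    return indices, codebook


def decompress(indices, codebook, num_cols):
    """Rebuild an approximate weight matrix by codebook lookup."""
    flat = [x for i in indices for x in codebook[i]]
    return [flat[r:r + num_cols] for r in range(0, len(flat), num_cols)]


def compression_ratio(num_weights, dim, num_codes, weight_bits=16):
    """Original bits divided by (index bits + codebook bits)."""
    index_bits = max(1, (num_codes - 1).bit_length())
    original = num_weights * weight_bits
    compressed = (num_weights // dim) * index_bits + num_codes * dim * weight_bits
    return original / compressed


if __name__ == "__main__":
    rng = random.Random(1)
    weights = [[rng.gauss(0.0, 1.0) for _ in range(16)] for _ in range(16)]
    indices, codebook = compress(weights, dim=4, num_codes=8)
    restored = decompress(indices, codebook, num_cols=16)
    mse = sum(
        (a - b) ** 2 for ra, rb in zip(weights, restored) for a, b in zip(ra, rb)
    ) / 256
    print("reconstruction MSE:", mse)
    print("ratio:", compression_ratio(256, dim=4, num_codes=8))
```

In this toy setup the codebook itself takes up a large share of the compressed size because the matrix is tiny; on a real layer with millions of weights the index stream dominates, which is why grouping more weights per code (larger `dim`) raises the ratio at the cost of reconstruction error.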