Section 01
SOL: A New Self-Optimization Paradigm for Dynamic Resource Allocation in Large Language Models (Introduction)
SOL: A New Self-Optimization Paradigm for Dynamic Computational Resource Allocation in Large Language Models
Abstract: Self-Optimizing Language Models (SOL) introduce a mechanism for dynamically allocating the computational budget. A lightweight policy network selects the computational configuration best suited to each token during decoding, achieving a Pareto improvement in inference efficiency and quality while keeping the model parameters unchanged.
Key Points: SOL does not modify the weights of the base model. Instead, it introduces a policy network that dynamically adjusts computational resources (attention sparsity, MLP pruning, and quantization bit-width), addressing the resource mismatch of static optimization, which applies one fixed configuration to every token.
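To make the per-token selection concrete, here is a minimal Python sketch of how a lightweight policy might map a token's hidden state to one setting for each of the three resources named above. It is an illustration under stated assumptions, not SOL's actual implementation: the option values, the names ComputeConfig, PolicyNetwork, and decode, and the choice of a linear scorer with argmax are all hypothetical, and the policy weights are random rather than trained.

```python
import random
from dataclasses import dataclass

# Hypothetical option sets: the source names the three resources
# but does not say which values each one can take.
ATTENTION_SPARSITY = (0.0, 0.5, 0.9)  # fraction of attention entries skipped
MLP_PRUNING = (0.0, 0.25, 0.5)        # fraction of MLP units skipped
QUANT_BITS = (16, 8, 4)               # bit-width used for this token


@dataclass(frozen=True)
class ComputeConfig:
    attention_sparsity: float
    mlp_pruning: float
    quant_bits: int


class PolicyNetwork:
    """Toy linear policy (an assumption): score each option, take the argmax."""

    def __init__(self, hidden_size: int, seed: int = 0):
        rng = random.Random(seed)
        self.knobs = (ATTENTION_SPARSITY, MLP_PRUNING, QUANT_BITS)
        # weights[k][o] is the weight vector that scores option o of knob k.
        # Random here; a real policy would be trained.
        self.weights = [
            [[rng.gauss(0.0, 1.0) for _ in range(hidden_size)] for _ in options]
            for options in self.knobs
        ]

    def select(self, hidden_state):
        """Pick one option per knob from the current token's hidden state."""
        chosen = []
        for options, knob_weights in zip(self.knobs, self.weights):
            scores = [
                sum(w * h for w, h in zip(row, hidden_state))
                for row in knob_weights
            ]
            chosen.append(options[scores.index(max(scores))])
        return ComputeConfig(*chosen)


def decode(hidden_states, policy):
    """Return the configuration the policy picks for each token in turn."""
    return [policy.select(h) for h in hidden_states]


# Usage: three random hidden states stand in for three decoding steps.
rng = random.Random(1)
states = [[rng.gauss(0.0, 1.0) for _ in range(4)] for _ in range(3)]
for config in decode(states, PolicyNetwork(hidden_size=4)):
    print(config)
```

The policy only reads the hidden state and returns a configuration. The base model's weights never appear in it, which mirrors the claim that SOL leaves them unmodified.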