Section 01
A New Method for LLM Activation Steering Based on Linear Optimal Control
Researchers found that large language models (LLMs) exhibit locally linear inter-layer dynamics: within a small neighborhood, the map from one layer's activations to the next is well approximated by a linear function. Building on this, they proposed a closed-loop activation steering method based on the Linear Quadratic Regulator (LQR). The method intervenes in model behavior at inference time without fine-tuning, outperforms existing baselines on tasks such as toxicity control and factuality adjustment, and offers both theoretical guarantees and practical deployment value.
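To make the closed-loop idea concrete, here is a minimal sketch of a finite-horizon, discrete-time LQR applied across layers. It is an illustration under stated assumptions, not the researchers' implementation. It assumes the inter-layer dynamics are linearized as x_{l+1} = A_l x_l + B_l u_l, where x_l is the deviation of the layer-l activation from a desired target and u_l is the steering vector added at that layer. The matrices `A_list` and `B_list` are random stand-ins rather than Jacobians taken from a real model, and the names `lqr_gains` and `rollout` and the weights `Q`, `R`, `Q_final` are hypothetical choices made for this example.

```python
import numpy as np

def lqr_gains(A_list, B_list, Q, R, Q_final):
    """Backward Riccati recursion for finite-horizon discrete-time LQR.

    Returns one feedback gain K_l per layer, so that u_l = -K_l @ x_l.
    """
    P = Q_final
    gains = []
    for A, B in zip(reversed(A_list), reversed(B_list)):
        # K = (R + B^T P B)^{-1} B^T P A
        K = np.linalg.solve(R + B.T @ P @ B, B.T @ P @ A)
        # Riccati update: P = Q + A^T P (A - B K)
        P = Q + A.T @ P @ (A - B @ K)
        gains.append(K)
    return gains[::-1]  # reorder from first layer to last

def rollout(A_list, B_list, x0, Q, R, Q_final, gains=None):
    """Propagate the state through all layers; return final state and total cost.

    With gains=None no steering is applied (u_l = 0 at every layer).
    """
    x, cost = x0, 0.0
    for l, (A, B) in enumerate(zip(A_list, B_list)):
        # Closed loop: the control depends on the current state x, not a fixed vector.
        u = -gains[l] @ x if gains is not None else np.zeros(B.shape[1])
        cost += x @ Q @ x + u @ R @ u
        x = A @ x + B @ u
    return x, cost + x @ Q_final @ x

# Toy system (assumption): 6 "layers", 4-dimensional state, near-identity dynamics.
rng = np.random.default_rng(0)
d, n_layers = 4, 6
A_list = [np.eye(d) + 0.1 * rng.standard_normal((d, d)) for _ in range(n_layers)]
B_list = [np.eye(d) for _ in range(n_layers)]
Q, R, Q_final = np.eye(d), 0.1 * np.eye(d), 10.0 * np.eye(d)
x0 = rng.standard_normal(d)

gains = lqr_gains(A_list, B_list, Q, R, Q_final)
x_open, cost_open = rollout(A_list, B_list, x0, Q, R, Q_final)
x_closed, cost_closed = rollout(A_list, B_list, x0, Q, R, Q_final, gains)
print("unsteered final deviation:", np.linalg.norm(x_open))
print("LQR-steered final deviation:", np.linalg.norm(x_closed))
```

The point of the sketch is the feedback structure: each layer's steering vector is computed from the activation actually observed at that layer, which is what distinguishes closed-loop steering from adding a fixed vector. In this toy setup `R` trades off how strongly the controller intervenes against how closely the state tracks the target.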