Section 01
Guide to an In-Depth Study of How Political Stances Affect the Reasoning Ability of Large Language Models
This study examines how the reasoning ability of large language models changes after a left or right political stance is induced through three methods: role-play prompting, activation steering, and LoRA fine-tuning. Three findings stand out. First, political alignment affects the model's performance on neutral reasoning tasks. Second, on value-laden tasks, the model tends to handle controversial topics from its aligned stance. Third, there is a "collapse threshold": once alignment intensity exceeds it, reasoning ability drops sharply. The study also provides an interactive results browser for exploring these effects in detail.
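Of the three methods, activation steering is the one where "alignment intensity" maps most directly onto a single number. The sketch below shows the general idea only: a hidden-state vector is shifted along a fixed direction, scaled by a coefficient. The function name `steer`, the parameter `alpha`, and the choice to normalize the direction are assumptions made for illustration; this summary does not describe how the study itself implements steering.

```python
from typing import List


def steer(hidden: List[float], direction: List[float], alpha: float) -> List[float]:
    """Shift a hidden-state vector along a steering direction.

    Hypothetical helper, not the study's code. `alpha` plays the role of
    an alignment intensity: 0 leaves the vector unchanged, and larger
    magnitudes push it further along `direction`.
    """
    if len(hidden) != len(direction):
        raise ValueError("hidden and direction must have the same length")

    # Normalize the direction so alpha is the size of the shift (assumption).
    norm = sum(d * d for d in direction) ** 0.5
    if norm == 0:
        raise ValueError("direction must be non-zero")

    return [h + alpha * d / norm for h, d in zip(hidden, direction)]


if __name__ == "__main__":
    h = [1.0, 2.0]
    v = [3.0, 4.0]  # toy direction, not taken from any model
    for alpha in (0.0, 5.0):
        print(alpha, steer(h, v, alpha))
```

In a setup like this, sweeping `alpha` upward while measuring task accuracy is one way a collapse threshold of the kind described above could be located.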