Section 01
[Introduction] Psychological Concept Neurons: Possibilities and Limitations of Manipulating Large Models' 'Personality'
Key Research Points
This study examines how large language models (LLMs) relate to the Big Five personality theory from psychology. Its core findings are:
- LLMs contain internal "psychological concept neurons" that correspond to the Big Five personality dimensions;
- Intervening on these neurons can causally change the model's internal representations (success rate for some targeted directions exceeds 80%);
- However, manipulating internal representations transfers only partially to generative behavior (e.g., weaker effects and cross-trait spillover).
The study provides a scientific basis for work on AI interpretability, alignment, and personality engineering.
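To make the idea of "intervening on neurons" concrete, here is a minimal sketch on toy data. It is not the study's method: the activations are random, and the names `intervene`, `trait_score`, `probe`, and `concept_neurons` are hypothetical, introduced only for illustration. It assumes a trait can be read out as a linear projection of hidden activations, and it shows how scaling a few selected neurons changes that readout.

```python
import numpy as np

def intervene(hidden, neuron_idx, scale):
    """Return a copy of `hidden` with the selected neurons multiplied by `scale`."""
    patched = hidden.copy()
    patched[:, neuron_idx] *= scale
    return patched

def trait_score(hidden, probe):
    """Mean projection of hidden states onto a (hypothetical) linear trait probe."""
    return float((hidden @ probe).mean())

# Toy stand-in for model activations: 8 token positions x 16 neurons.
rng = np.random.default_rng(0)
hidden = rng.normal(size=(8, 16))

# Assumption: neurons 2 and 5 are the "concept neurons" for one trait,
# and the probe reads the trait from exactly those neurons.
concept_neurons = [2, 5]
probe = np.zeros(16)
probe[concept_neurons] = 1.0

before = trait_score(hidden, probe)
ablated = intervene(hidden, concept_neurons, scale=0.0)   # switch the neurons off
amplified = intervene(hidden, concept_neurons, scale=2.0) # double their activation

print("before:   ", before)
print("ablated:  ", trait_score(ablated, probe))
print("amplified:", trait_score(amplified, probe))
```

In this toy setup the readout responds directly to the intervention, which corresponds to a change at the representation level only. Whether such a change also shifts generated text is a separate question, and the study reports that this transfer is limited.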