Section 01
[Main Floor] BrainInsideTheMachine: A Guide to the Study of Mechanistic Interpretability in Transformer Multilingual Reasoning
BrainInsideTheMachine is an open-source research project that probes the internal workings of Transformer models on multilingual reasoning tasks through more than 170 causal intervention experiments spanning 4 model families. The project focuses on mechanistic interpretability: opening the black box of LLMs to understand their internal computation, such as the roles played by individual neurons, attention heads, and layers. It relies on causal analysis methods such as activation patching and ablation, and all experimental code and data are fully open.
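To make the idea of activation patching concrete, here is a minimal toy sketch (not the project's actual code, which targets real Transformer models): we run a tiny network on a "clean" input, cache its hidden activations, then overwrite the hidden activations of a "corrupted" run with the cached ones and observe how the output changes. All names and the toy model itself are illustrative assumptions.

```python
import numpy as np

# Toy 2-layer network: x -> hidden -> output. Illustrative only; the
# project's real experiments intervene inside full Transformer models.
rng = np.random.default_rng(0)
W1 = rng.normal(size=(4, 3))
W2 = rng.normal(size=(3, 1))

def forward(x, patch_hidden=None):
    """Run the toy model; optionally overwrite ("patch") the hidden
    activations with ones cached from another run."""
    h = np.tanh(x @ W1)
    if patch_hidden is not None:
        h = patch_hidden  # the causal intervention: swap in cached activations
    return h @ W2, h

clean_x = np.array([1.0, 0.5, -0.2, 0.3])
corrupt_x = np.zeros(4)

clean_out, clean_h = forward(clean_x)      # clean run, cache activations
corrupt_out, _ = forward(corrupt_x)        # corrupted baseline run
patched_out, _ = forward(corrupt_x, patch_hidden=clean_h)

# If the patched component carries the causal signal, the corrupted
# output should move back toward the clean output after patching.
print(float(abs(patched_out - clean_out).sum()))
```

In this toy case the output depends only on the hidden layer, so patching it fully restores the clean output; in a real model, repeating this at each layer and head localizes which components are causally responsible. Ablation is the complementary intervention: zeroing or mean-replacing a component's activations instead of swapping in cached ones.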