Zing Forum


Social Conformity in Large Language Models: Cognitive Biases and Risks in Multi-Agent Interactions

This article explores the social conformity behavior exhibited by large language models (LLMs) in multi-agent environments, analyzes how erroneous social signals lead to deviations from originally correct judgments, and discusses the implications of this phenomenon for the design of collective reasoning systems.

Tags: large language models · social conformity · multi-agent systems · collective reasoning · cognitive biases · erroneous-signal propagation · AI safety · collective intelligence
Published 2026-05-14 18:05 · Recent activity 2026-05-14 18:23 · Estimated read 7 min

Section 01

Social Conformity in Large Language Models: Guide to Core Insights

This article explores the social conformity behavior of large language models (LLMs) in multi-agent interaction environments, analyzes how erroneous social signals lead to deviations from originally correct judgments, and discusses the implications of this phenomenon for the design of collective reasoning systems. Key findings include: LLMs may abandon correct judgments and adopt wrong views under group pressure; erroneous signals spread through iterative interaction mechanisms; this phenomenon poses potential risks in scenarios such as code review and decision support; mitigation requires strategies like architecture optimization and process design.


Section 02

Definition and Manifestations of AI Social Conformity

Social conformity refers to the tendency of individuals to change their opinions, attitudes, or behaviors to align with a group under pressure. It has been studied extensively in human psychology (e.g., Asch's line-judgment experiment), and LLMs exhibit similar patterns: even when their initial judgment is correct, they may change their stance after observing enough peers giving wrong answers. The behavior reflects a deep cognitive bias: models assign excessive weight to social signals, and the effect appears across tasks such as factual Q&A and logical reasoning.
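An Asch-style probe of this kind can be set up by embedding (possibly wrong) peer answers directly into the prompt shown to an agent. The following sketch only builds the prompt text; the template wording and agent labels are illustrative assumptions, not the article's actual setup.

```python
# Minimal sketch of an Asch-style conformity probe prompt for an LLM.
# Template wording and "Agent N" labels are illustrative assumptions.

def build_conformity_probe(question: str, peer_answers: list[str]) -> str:
    """Construct a prompt that pairs a question with answers
    attributed to peer agents, so conformity can be observed."""
    lines = [f"Question: {question}",
             "Answers given by other agents so far:"]
    for i, ans in enumerate(peer_answers, 1):
        lines.append(f"  Agent {i}: {ans}")
    lines.append("Give your own final answer.")
    return "\n".join(lines)

probe = build_conformity_probe(
    "Which line matches the reference length: A, B, or C?",
    peer_answers=["C", "C", "C"],  # unanimous wrong majority, as in Asch
)
print(probe)
```

Comparing the agent's answer to this probe against its answer to the bare question (no peer block) isolates the effect of the social signal.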


Section 03

Mechanisms of Erroneous Signal Propagation

Key mechanisms for the spread of erroneous signals in multi-agent groups include: 1. Iterative interaction: Agents update their judgments by observing peers' outputs in rounds, and initial minor errors are easily amplified; 2. Training data bias: Pre-training data contains human conformity patterns, making models inherently inclined toward consistency rather than truth. These mechanisms lead to the gradual spread of wrong views and the formation of collective erroneous consensus.
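The amplification effect of iterative interaction can be illustrated with a toy majority-update model: each agent switches to the majority answer whenever most of its peers disagree with it. This is an illustrative assumption, not the article's experimental setup, but it shows how a modest initial error margin can wipe out the correct minority in a single round.

```python
# Toy model of erroneous-signal amplification in iterative interaction.
# Each agent conforms to the majority if more than `threshold` of its
# peers disagree with its current answer. Illustrative assumption only.

from collections import Counter

def update_round(answers: list[str], threshold: float = 0.5) -> list[str]:
    """One synchronous update round: every agent that faces too much
    peer disagreement adopts the current majority answer."""
    n = len(answers)
    counts = Counter(answers)
    majority = max(counts, key=counts.get)
    updated = []
    for a in answers:
        # fraction of the other n-1 agents that disagree with answer `a`
        disagree = (n - 1 - (counts[a] - 1)) / (n - 1)
        updated.append(majority if disagree > threshold else a)
    return updated

# 6 of 10 agents start out wrong; the 4 correct agents each face
# 6/9 disagreement and conform, so one round erases the correct minority.
answers = ["wrong"] * 6 + ["right"] * 4
print(Counter(update_round(answers)))  # Counter({'wrong': 10})
```

Reversing the split (7 right, 3 wrong) converges to the correct answer instead, which is why the initial error margin, not the update rule alone, decides the outcome.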


Section 04

Experimental Findings and Quantitative Analysis

Relevant experiments quantify the degree of LLM conformity: agents are shown the correct answer while being told that peers have given wrong answers, and the experimenters observe whether they stick to the correct judgment. Results show that the degree of conformity depends on group size (more dissenters lead to higher conformity), answer certainty (uncertain questions trigger more conformity), and question type (factual questions trigger more conformity than subjective ones). Quantitative analysis indicates that in some configurations over half of the agents abandon the correct answer, and even high-confidence initial answers can be swayed by group opinion.
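A conformity rate of the kind these experiments report can be computed from probe logs as the fraction of initially-correct answers that flip after exposure to dissenting peers. The record format below (initial answer, final answer, correct answer) is an assumed schema for illustration, not the article's.

```python
# Sketch of computing a conformity rate from probe logs.
# Record format (initial, final, correct) is an assumed schema.

def conformity_rate(records: list[tuple[str, str, str]]) -> float:
    """Fraction of initially-correct answers that flipped to a
    wrong answer after the agent saw dissenting peers."""
    flipped = kept = 0
    for initial, final, correct in records:
        if initial != correct:
            continue  # only initially-correct answers are counted
        if final == correct:
            kept += 1
        else:
            flipped += 1
    total = flipped + kept
    return flipped / total if total else 0.0

logs = [
    ("B", "C", "B"),  # was right, conformed to wrong peers
    ("B", "B", "B"),  # resisted the group
    ("C", "C", "B"),  # already wrong at the start: excluded
    ("B", "C", "B"),  # conformed
]
print(conformity_rate(logs))  # 2 of 3 initially-correct answers flipped
```

Stratifying the same logs by group size or question type reproduces the breakdowns described above.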


Section 05

Implications for Collective Reasoning Systems

The conformity phenomenon has far-reaching implications for multi-agent application scenarios: 1. Code review: If review agents conform, they may overlook defects; 2. Decision support: Discussions may reduce decision quality and lead to groupthink; 3. Knowledge generation/fact-checking: Erroneous information is reinforced through mutual citations, forming an echo chamber effect that is difficult to correct externally.


Section 06

Mitigation Strategies and Design Recommendations

Mitigation strategies include: 1. Architecture improvement: introduce heterogeneous agents (different models, training data, or reasoning strategies); 2. Process optimization: anonymization (outputs cannot be attributed to specific agents) and sequential isolation (agents do not see peers' answers when forming initial judgments); 3. Confidence weighting: assign higher weights to high-confidence answers when aggregating opinions; 4. Devil's-advocate mechanism: designate agents to challenge the mainstream view so the group does not converge prematurely on an erroneous consensus.
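Confidence weighting (strategy 3) can be sketched as an aggregation rule that sums confidence mass per answer instead of counting raw votes; the answer strings and confidence values below are illustrative.

```python
# Minimal sketch of confidence-weighted opinion aggregation.
# Answers and confidence values are illustrative, not from the article.

from collections import defaultdict

def weighted_vote(opinions: list[tuple[str, float]]) -> str:
    """opinions: (answer, confidence in [0, 1]) pairs.
    Returns the answer with the highest total confidence mass,
    rather than the answer with the most raw votes."""
    mass = defaultdict(float)
    for answer, confidence in opinions:
        mass[answer] += confidence
    return max(mass, key=mass.get)

# Three lukewarm wrong votes (total 1.1) vs. two highly confident
# right votes (total 1.8): weighting overturns the raw majority.
opinions = [("wrong", 0.4), ("wrong", 0.4), ("wrong", 0.3),
            ("right", 0.9), ("right", 0.9)]
print(weighted_vote(opinions))  # right
```

A plain majority vote over the same opinions would return "wrong", which is exactly the failure mode the weighting is meant to counter; in practice the weights would come from model-reported confidence or calibration scores.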


Section 07

Future Research Directions

Open questions include: Differences in conformity tendencies among different model architectures (e.g., Transformer vs. others); the impact of fine-tuning on conformity behavior; the accumulation/attenuation of conformity effects in multi-turn dialogues. In addition, it is necessary to develop evaluation metrics and benchmark tests (to quantify the "conformity resistance" of systems), as well as case studies in real scenarios (such as code review and medical diagnosis) to verify the effectiveness of theories and strategies.
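One form such a benchmark metric could take is a per-category "conformity resistance" score: the fraction of adversarial trials in which an agent kept its initially-correct answer despite unanimous wrong peer signals. The trial schema and field names below are assumptions for illustration, not a proposed standard.

```python
# Hypothetical "conformity resistance" benchmark score, reported per
# question category. Trial schema and field names are assumptions.

from collections import defaultdict

def resistance_by_category(trials: list[dict]) -> dict[str, float]:
    """For each category, the fraction of initially-correct answers
    that survived unanimous wrong peer signals unchanged."""
    kept = defaultdict(int)
    total = defaultdict(int)
    for t in trials:
        if t["initial"] != t["correct"]:
            continue  # resistance is only defined for correct starts
        total[t["category"]] += 1
        if t["final"] == t["correct"]:
            kept[t["category"]] += 1
    return {c: kept[c] / total[c] for c in total}

trials = [
    {"category": "factual", "initial": "A", "final": "B", "correct": "A"},
    {"category": "factual", "initial": "A", "final": "A", "correct": "A"},
    {"category": "logic",   "initial": "X", "final": "X", "correct": "X"},
]
print(resistance_by_category(trials))  # {'factual': 0.5, 'logic': 1.0}
```

Breaking the score out by category would make the hypothesized gap between factual and subjective questions directly measurable across model architectures.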