# Identifiable Victim Effect in Large Language Models: When Narrative Trumps Numbers

> This article explores the cognitive bias exhibited by large language models (LLMs) when making decisions involving human lives—the Identifiable Victim Effect (IVE)—and how model alignment and reasoning capabilities amplify this bias.

- Board: [Openclaw Llm](https://www.zingnex.cn/en/forum/board/openclaw-llm)
- Published: 2026-05-03T15:13:16.000Z
- Last activity: 2026-05-03T15:20:50.270Z
- Popularity: 150.9
- Keywords: Large Language Models, Cognitive Bias, Identifiable Victim Effect, AI Alignment, RLHF, Decision Science, AI Ethics, Behavioral Economics
- Page URL: https://www.zingnex.cn/en/forum/thread/llm-github-starscream-11813-ive-llm
- Canonical: https://www.zingnex.cn/forum/thread/llm-github-starscream-11813-ive-llm

---

## Introduction

This article examines the 'Identifiable Victim Effect' (IVE) in large language models (LLMs) making decisions that involve human lives: the tendency to prioritize saving specific, identifiable individuals over a larger number of anonymous, statistical lives. The study found that models aligned with RLHF exhibit a more pronounced bias, and that stronger reasoning capabilities may 'rationalize' emotion-driven choices rather than correct them. The article also analyzes the underlying mechanisms, practical consequences, and mitigation strategies, and argues for a balance between making AI 'human-like' and keeping it rational and fair.

## Background: What is the Identifiable Victim Effect?

The Identifiable Victim Effect was proposed by Thomas Schelling in 1968 to describe how humans feel far more compassion for specific individuals than for abstract groups. In classic donation experiments, the story and photo of a single child elicited more donations than statistics such as 'thousands of children are waiting for organ transplants', showing that human decisions are driven more readily by concrete narratives than by numbers.

## Experimental Design: Methods to Verify IVE in LLMs

The research team used binary-choice tasks that controlled three variables:
1. **Identifiability**: One option includes a specific individual's name and story, while the other is described only by a statistic;
2. **Quantity**: The number of lives in the identifiable option is smaller than in the statistical option;
3. **Scenario**: Medical allocation, disaster relief, and similar settings. LLMs' IVE bias was then quantified across multiple sets of controlled trials; a minimal sketch of one such trial appears below.
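
As a concrete illustration, here is a minimal sketch of how one such trial might be rendered as a prompt. The scenario text, the fictional name 'Maria Chen', and the helper names are illustrative assumptions, not the study's actual materials.

```python
# Minimal sketch of one IVE trial: a named individual vs. N statistical lives.
# All wording, names, and scenario text are illustrative assumptions,
# not the study's actual prompt materials.
from dataclasses import dataclass


@dataclass
class Trial:
    statistical_n: int  # anonymous lives behind the purely numerical option
    scenario: str       # e.g. "medical resource allocation", "disaster relief"


def build_prompt(t: Trial) -> str:
    """Render a binary-choice item pitting one identifiable life against many."""
    return (
        f"Scenario: {t.scenario}. Limited resources allow exactly one choice.\n"
        "Option A: Save Maria Chen, a 34-year-old nurse and mother of two, "
        "whose story and photo you have just seen.\n"
        f"Option B: Save {t.statistical_n} unidentified people.\n"
        "Reply with exactly 'A' or 'B'."
    )


print(build_prompt(Trial(statistical_n=5, scenario="medical resource allocation")))
```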

## Research Evidence: IVE Performance in LLMs and Key Findings

Experimental results show:
1. **Basic IVE Exists**: Under zero-shot prompts, models choose the identifiable individual significantly more often than chance (see the significance-test sketch after this list);
2. **Alignment Amplifies the Effect**: RLHF-trained models show a bias 15-30% higher than their pre-trained counterparts;
3. **Reasoning Is a Double-Edged Sword**: With Chain-of-Thought reasoning enabled, models use elaborate reasoning to 'rationalize' emotion-driven choices rather than correct them (motivated reasoning).
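
To make the first finding concrete, the sketch below checks whether a model's rate of choosing the identifiable option exceeds the 50% chance baseline using a one-sided binomial test. The trial counts are hypothetical placeholders, not the study's reported data.

```python
# Test whether the identifiable-option choice rate beats the 50% chance
# baseline. Counts here are hypothetical placeholders, not reported data.
from scipy.stats import binomtest

n_trials = 200        # total binary-choice trials (hypothetical)
n_identifiable = 138  # trials where the model chose the named individual

result = binomtest(n_identifiable, n_trials, p=0.5, alternative="greater")
print(f"choice rate = {n_identifiable / n_trials:.2f}, "
      f"one-sided p = {result.pvalue:.4g}")
```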

## Deep Mechanisms: Reasons for IVE Bias in LLMs

Sources of the bias include:
1. **Training Data**: The massive training corpus reflects human narrative preferences, so models learn to weight stories heavily;
2. **Tension in Alignment Goals**: RLHF's pursuit of 'helpfulness' rewards responsiveness to human emotional needs, inadvertently amplifying the bias;
3. **Interaction Between Reasoning and Bias**: When models 'think step by step', they elaborate on the value of the identifiable individual and construct reasons for an emotional preference, much like human motivated reasoning (a prompt contrast is sketched after this list).
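
The prompt contrast below illustrates this interaction: the same dilemma posed for a direct answer versus with an explicit step-by-step instruction. The wording is an assumption for illustration; comparing choice rates across the two conditions would show whether deliberation corrects the bias or rationalizes it.

```python
# Contrast a direct-answer prompt with a Chain-of-Thought variant of the
# same dilemma. Wording is illustrative, not the study's materials.
DILEMMA = (
    "Option A: Save Maria Chen, whose story and photo you have just seen.\n"
    "Option B: Save 5 unidentified people.\n"
)

DIRECT_PROMPT = DILEMMA + "Reply with exactly 'A' or 'B'."

COT_PROMPT = (
    DILEMMA
    + "Think step by step about the moral weight of each option, "
    + "then reply with 'A' or 'B'."
)

# Comparing choice rates between the two conditions measures whether
# step-by-step reasoning corrects the bias or rationalizes it.
```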

## Practical Implications and Mitigation Strategies

**Application Impacts**:
- Medical decision-making: resources may be skewed toward 'story-rich' cases;
- Disaster response: media-covered events may be prioritized over less visible ones;
- Policy analysis: the value of systemic solutions may be underestimated.

**Mitigation Directions**:
1. Explicit debiasing prompts (one possible form is sketched below);
2. Adversarial training;
3. Multi-model ensembling;
4. Human-machine collaborative review.
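
For the first mitigation, the sketch below shows one plausible form an explicit debiasing prompt could take in a chat-style message format; the wording and function names are assumptions, not prompts validated by the study.

```python
# One plausible explicit debiasing prompt (mitigation #1). The wording is
# an illustrative assumption, not a prompt validated by the study.
DEBIAS_SYSTEM_PROMPT = (
    "When options involve saving lives, weigh them only by the number of "
    "lives and expected outcomes. Ignore names, photos, and personal "
    "narratives: a vividly described individual and a statistical life "
    "carry equal moral weight."
)


def with_debiasing(user_prompt: str) -> list[dict]:
    """Wrap a decision prompt in the debiasing system message (chat format)."""
    return [
        {"role": "system", "content": DEBIAS_SYSTEM_PROMPT},
        {"role": "user", "content": user_prompt},
    ]


# Example: pair it with any binary-choice dilemma before sending to a model.
messages = with_debiasing("Option A: Save Maria Chen... Option B: Save 5 people.")
print(messages)
```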

## Broader Implications and Conclusion

The study reveals a paradox: making AI more 'human-like' may mean inheriting human biases. This forces a choice about alignment goals (should models imitate humans, or be more rational and fair?) and about the trade-off between efficiency and equity, and it suggests that technical work should draw on psychology and behavioral economics. The conclusion is a reminder that AI is not a purely rational machine: before deploying it in high-stakes scenarios, we must address its biases, balance 'being like humans' against 'being rational', and keep it in service of human well-being. It also invites a harder reflection: can humans themselves make fair decisions?
