Zing Forum

Representational Harm in Large Language Model Narratives: Stereotypes and Erasure Faced by Global Majority National Identities

Recent research finds that mainstream LLMs systematically cause representational harm to Global Majority national identities when generating narratives, through stereotyping, identity erasure, and one-dimensional portrayal. Minoritized national identities are underrepresented in power-neutral stories and overrepresented in subordinate roles, where they are more than fifty times as likely to appear as in dominant roles.

Representational harm, large language models, bias, Global Majority, AI ethics, stereotypes, nationality bias
Published 2026-04-25 01:49 | Recent activity 2026-04-27 13:53 | Estimated read 7 min

Section 01

[Introduction] Core Findings: Systemic Representational Harm to Global Majority National Identities in Mainstream LLMs

Recent research finds that mainstream large language models (LLMs) systematically cause representational harm to Global Majority national identities when generating narratives, through stereotyping, identity erasure, and one-dimensional portrayal. Minoritized national identities are underrepresented in power-neutral stories and are more than fifty times as likely to appear in subordinate roles as in dominant roles. American nationality cues in the prompt amplify these harms further. The findings are a warning for AI ethics and for high-stakes applications.

Section 02

Research Background and Motivation

Large language models (LLMs) are now widely used in everyday conversation and in high-stakes enterprise and government settings (e.g., simulating asylum seeker interviews). While their potential has received wide attention, the risk that they encode and perpetuate harmful biases against non-dominant communities around the world is often overlooked. Assessing and mitigating that harm requires a close study of how LLMs portray people of different identities.

Section 03

What is Representational Harm? Definition and Manifestations

Representational harm refers to the phenomenon where AI systems distort, stereotype, or erase specific groups when generating content. Unlike direct discrimination, it reinforces social biases through repeated presentation of stereotypes. Its manifestations in LLM narratives include: fixing people of certain nationalities into specific roles, ignoring cultural complexity, and marginalizing minority groups.

Section 04

Research Methods and Design

The research team examined how mainstream LLMs portray characters of different national origins using open-ended narrative generation prompts. The design covers several dimensions: how often each identity appears in power-neutral stories, how identities are distributed across role types, and the social status that the narratives imply. Systematically comparing how different national identities are presented allowed the team to quantify the degree and patterns of representational harm.
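The two quantities named above (frequency in power-neutral stories, and the balance of subordinate versus dominant roles) could be computed along the following lines. This is a minimal sketch, not the study's actual pipeline: the record format, the role labels, the function names, and the toy data are all assumptions made for illustration.

```python
from collections import Counter

# Hypothetical annotation format: one (nationality, role) pair per character,
# where role is "neutral" (power-neutral story), "dominant", or "subordinate".
# The labels "A" and "B" and all counts are made-up toy data, not study results.
records = (
    [("A", "neutral")] * 8 + [("B", "neutral")] * 2
    + [("A", "dominant")] * 5 + [("A", "subordinate")] * 5
    + [("B", "dominant")] * 1 + [("B", "subordinate")] * 4
)


def role_counts(records):
    """Tally how often each nationality appears in each role."""
    counts = {}
    for nationality, role in records:
        counts.setdefault(nationality, Counter())[role] += 1
    return counts


def neutral_share(counts, nationality):
    """Fraction of all power-neutral appearances that go to this nationality."""
    total = sum(c["neutral"] for c in counts.values())
    if total == 0:
        return 0.0
    return counts.get(nationality, Counter())["neutral"] / total


def subordination_ratio(counts, nationality):
    """Subordinate appearances divided by dominant appearances."""
    c = counts.get(nationality, Counter())
    if c["dominant"] == 0:
        return float("inf")  # no dominant appearances at all
    return c["subordinate"] / c["dominant"]


counts = role_counts(records)
share_b = neutral_share(counts, "B")        # 2 of 10 neutral appearances
ratio_b = subordination_ratio(counts, "B")  # 4 subordinate vs 1 dominant
```

In this framing, a low `neutral_share` relative to a chosen baseline would indicate underrepresentation, and a `subordination_ratio` well above 1 would indicate overrepresentation in subordinate roles. How roles are labeled in the generated stories is left open here.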

Section 05

Key Findings: Patterns of Nationality-Related Representational Bias

The study found persistent nationality-based representational harm, including harmful stereotypes, identity erasure, and one-dimensional portrayal of Global Majority identities. Specifically, minoritized national identities are underrepresented in power-neutral stories and overrepresented in subordinate roles, where they are more than fifty times as likely to appear as in dominant roles. This imbalance repeatedly places certain groups in subordinate positions and reinforces real-world structures of inequality.

Section 06

Amplifying Effect of American-Centric Bias

When input prompts contain American nationality cues (e.g., "American"), the degree of harm is amplified, indicating that LLMs are sensitive to such cues and reinforce an American-centric narrative perspective. These harms also cannot be explained by sycophancy (the model flattering the presumed user): when the American cues are replaced with non-American identities, the American-centric bias persists, suggesting that it is embedded in the model rather than being surface-level catering to the prompt.
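One way to set up the cue comparison described above is to generate prompt variants that differ only in the nationality cue, then compare the stories each variant produces. The sketch below is illustrative only: the template wording, the choice of "Canadian" as the non-American cue, and the function name are assumptions, not the prompts used in the study.

```python
# Hypothetical template; {cue} is the only part that changes between variants.
TEMPLATE = "Write a story about two {cue} coworkers, one of whom mentors the other."


def build_variants(template, cues):
    """Return a baseline prompt with no cue plus one prompt per nationality cue."""
    def fill(cue):
        # Collapse the double space left behind when the cue is empty.
        return " ".join(template.format(cue=cue).split())

    variants = {"baseline": fill("")}
    for cue in cues:
        variants[cue] = fill(cue)
    return variants


variants = build_variants(TEMPLATE, ["American", "Canadian"])
```

Each variant would be sent to the model many times, and the resulting stories scored with the same role and frequency measures, so that any difference between variants can be attributed to the cue itself.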

Section 07

Warnings for High-Risk AI Applications

These findings are a serious warning for the use of LLMs in high-stakes settings: when models are used to simulate asylum seeker interviews, generate corporate training materials, or assist government decision-making, systemic representational harm may entrench misunderstanding of and discrimination against specific groups. The researchers call for a more cautious and critical stance toward US-developed LLMs being used in ways that classify, surveil, and misrepresent the Global Majority.

Section 08

Mitigation Directions and Future Outlook

Mitigation recommendations:
1. Increase the share of content written from Global Majority perspectives in training data, to counter American-centric data bias.
2. Develop dedicated evaluation metrics and test benchmarks for representational harm.
3. Establish feedback mechanisms centered on affected communities.
4. Require human review and multi-level safety checks in high-stakes settings.

Conclusion: Technological progress must be combined with social responsibility. As narrative tools, LLMs both reflect and shape reality. Identifying and mitigating representational harm is a dual challenge of technology and ethics. Future AI development needs to be more inclusive of diverse perspectives to ensure it serves all humanity rather than reinforcing existing power structures.