
Nationality Bias in Large Language Models: Representational Harm to the Global Majority in AI Narratives

Recent research reveals systemic nationality bias in mainstream large language models (LLMs) when they generate narratives. It finds that the national identities of non-Western countries are heavily stereotyped and marginalized in AI-generated stories, with negative portrayals more than 50 times as likely as positive ones.

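LLM bias, nationality stereotypes, AI ethics, representational harm, Global Majority, cultural diversity, large language models, algorithmic fairness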

Published 2026-04-25 01:49 | Recent activity 2026-04-27 09:48 | Estimated read 14 min

Section 01

[Introduction] Nationality Bias in Large Language Models: Representational Harm to the Global Majority

Recent research reveals systemic nationality bias in mainstream large language models (LLMs) when they generate narratives. The national identities of non-Western countries are heavily stereotyped and marginalized, with negative portrayals more than 50 times as likely as positive ones. The study examines cultural bias in LLMs; the sections that follow cover its background, methodology, findings, and proposed responses, with the aim of showing the representational harm the Global Majority faces in AI narratives.

Section 02

Research Background: Hidden Risks of Cultural Bias in AI Narratives

With the widespread deployment of large language models (LLMs) in daily life, business applications, and even government decision-making, an increasingly serious question has emerged: are these models inadvertently encoding and spreading harmful biases against the Global Majority, that is, non-Western, non-white groups?

From simulating refugee interviews to creative writing, LLM-generated texts are shaping people's perceptions of different cultural groups around the world. However, these models are mainly trained on English corpora, and their cultural perspective is often U.S.-centric. When asked to generate narratives involving characters of specific nationalities, do these models present these groups fairly and diversely, or do they fall into the trap of stereotypes?

Section 03

Research Design and Methodology

The study uses open-ended narrative generation tasks to systematically evaluate how multiple mainstream LLMs (including the GPT series) handle different national identities.

Key design features include:

Nationality Diversity Coverage: The test covers national identities from multiple regions, including countries in Africa, Asia, Latin America, and the Middle East, with Western countries such as the U.S. serving as controls.
Neutral Narrative Scenarios: Prompts avoid presupposing any positive or negative context, leaving the models free to reveal their inherent "default" narrative tendencies.
Control Experiments: Nationality identifiers in prompts are swapped (e.g., replacing "American" with other nationalities) to test whether the biases stem from simple sycophancy.
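The design features above can be sketched as a small prompt generator. This is a minimal illustration, not the study's actual protocol: the article does not reproduce the prompts, so the templates, the nationality list, and the `build_prompts` helper are all hypothetical. The point is only that each neutral template is reused unchanged while the nationality identifier is swapped.

```python
from itertools import product

# Hypothetical neutral templates (assumption: the study's real prompts are not
# given in this article). Neither template implies a positive or negative context.
TEMPLATES = [
    "Write a short story about someone who is {nationality}.",
    "Write a short story about two coworkers. One of them is {nationality}.",
]

# Illustrative identifiers only; "American" plays the role of the control.
NATIONALITIES = ["American", "Nigerian", "Vietnamese", "Bolivian", "Jordanian"]


def build_prompts(templates, nationalities):
    """Return one prompt per (template, nationality) pair, tagged with its origin."""
    return [
        {
            "template_id": t_idx,
            "nationality": nat,
            "prompt": tpl.format(nationality=nat),
        }
        for (t_idx, tpl), nat in product(enumerate(templates), nationalities)
    ]


prompts = build_prompts(TEMPLATES, NATIONALITIES)
for p in prompts[:3]:
    print(p["template_id"], p["nationality"], "->", p["prompt"])
```

Because prompts that share a `template_id` differ only in the nationality identifier, any systematic difference in the stories generated from them can be attributed to that identifier rather than to the wording of the scenario.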

Section 04

Key Findings: Systemic Representational Harm

1. Absence of Power-Neutral Narratives

The study found that identities from marginalized countries are largely absent from "power-neutral" stories. When models are asked to generate everyday scenarios without power relations, characters of non-Western nationalities appear significantly less often than characters of Western nationalities such as Americans. This "representational erasure" means the Global Majority is almost invisible in AI narratives.
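One way to quantify this kind of absence is to measure, for each nationality, the share of generated stories in which it appears at all. The sketch below assumes each story has already been annotated with the nationalities of its characters (by human annotators or some classifier); the `appearance_rates` helper and the toy annotations are hypothetical and do not reproduce the study's data or its metric.

```python
from collections import Counter


def appearance_rates(story_nationalities):
    """Share of stories in which each nationality appears at least once.

    `story_nationalities` is a list with one entry per story; each entry is
    the list of character nationalities annotated for that story.
    """
    total = len(story_nationalities)
    if total == 0:
        return {}
    counts = Counter()
    for nats in story_nationalities:
        counts.update(set(nats))  # count each nationality once per story
    return {nat: n / total for nat, n in counts.items()}


# Toy annotations, made up purely for illustration (not results from the study).
toy_stories = [["American"], ["American", "Nigerian"], ["American"], []]
rates = appearance_rates(toy_stories)
print(rates)
```

A large gap between the rate for a control nationality and the rates for other nationalities, measured on the same power-neutral prompts, would be one signal of the erasure described above.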

2. Overrepresentation of Negative Stereotypes

Even more worrying, when characters of non-Western nationalities do appear, they are very likely to be placed in subordinate, vulnerable, or negative roles. According to the study, negative portrayals of identities from marginalized countries occur more than 50 times as often as positive or dominant portrayals. This bias is not randomly distributed but follows clear patterns:

Characters from developing countries are more likely to be described as poor, trapped, or in need of assistance
Nationalities from specific regions are more likely to be associated with conflict, violence, or social unrest
Characters' occupations, educational backgrounds, and social statuses are often stereotypically simplified
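A ratio like the one reported above can be computed from role labels assigned to each portrayal. The following is a rough sketch under stated assumptions: the label names (`"subordinate"`, `"dominant"`, `"neutral"`), the `portrayal_ratio` helper, and the toy labels are all hypothetical, and the article does not say how the study itself labeled roles or computed its figure.

```python
from collections import Counter


def portrayal_ratio(labels, negative="subordinate", positive="dominant"):
    """Ratio of negative to positive portrayals for one nationality.

    Returns None when there are no positive portrayals, because the ratio
    is undefined in that case.
    """
    counts = Counter(labels)
    if counts[positive] == 0:
        return None
    return counts[negative] / counts[positive]


# Toy labels for a single nationality, made up for illustration only.
toy_labels = ["subordinate"] * 6 + ["dominant"] * 2 + ["neutral"] * 2
ratio = portrayal_ratio(toy_labels)
print(ratio)  # 6 subordinate / 2 dominant
```

Returning `None` rather than infinity for the zero-positive case is a design choice that forces the caller to handle it explicitly; with small samples, a nationality that never appears in a dominant role is a finding in itself, not a number to average.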

3. Amplification Effect of American Centrism

The study also found that when prompts include U.S. nationality identifiers like "American", the model's bias against other nationalities is further amplified. This indicates that there is an implicit cultural hierarchy within the model, placing the U.S. at the center of narratives while marginalizing other countries.

4. Bias Does Not Stem from Sycophancy

A key question is whether these biases merely reflect sycophancy, that is, the model "pleasing" users by producing U.S.-centric content because U.S.-related material dominates its training data. The study ruled out this explanation through control experiments: even when researchers replaced the U.S. nationality in prompts with other nationalities, the model's bias pattern remained consistent. The problem, then, is not that the model is catering to users but that its internal representations carry systemic cultural biases.

Section 05

Underlying Mechanisms: Interplay Between Training Data and Algorithmic Bias

These representational harms are not accidental but reflect structural problems in the current LLM development paradigm:

Data Bias: The training data of mainstream LLMs comes mainly from the English-language internet, where U.S. perspectives and Western narratives dominate. The culture, history, and social reality of the Global Majority are marginalized or presented one-sidedly in this data.

Annotation Bias: Even manual annotation or RLHF (Reinforcement Learning from Human Feedback) processes may introduce biases due to the cultural background of annotators. When annotation teams lack global diversity, they may turn a blind eye to certain cultural stereotypes.

Algorithmic Amplification: The self-attention mechanism of the Transformer architecture may amplify statistical patterns in the data during training, making originally implicit biases more prominent and solidified.

Section 06

Practical Impacts and Risks

These seemingly "abstract" bias issues actually have serious real-world consequences:

Refugee and Immigrant Assessment: When LLMs are used to simulate refugee interviews or evaluate immigration applications, negative biases against specific nationalities may directly affect decision fairness.

Content Moderation and Monitoring: If LLMs are used for content moderation, they may be more "sensitive" to user-generated content from certain countries, leading to unfair censorship or flagging.

Education and Cultural Communication: When students or researchers use LLMs to generate narratives about different cultures, these biases will be further spread and reinforced, forming a vicious cycle.

Business Decisions and Market Bias: In a globalized business environment, LLM-based market analysis or customer profiling tools may hold biases against consumers in certain regions, affecting business fairness.

Section 07

Response Strategies and Future Directions

In the face of these systemic biases, researchers have proposed several key directions:

Global Majority-Centered Methodology: Traditional AI ethics research often starts from a Western perspective, but effective solutions need to place the experiences and needs of the Global Majority at the core. This means incorporating diverse voices in all aspects, including problem definition, data collection, and model evaluation.

Culturally Sensitive Data Curation: It is necessary to actively collect and organize high-quality, diverse corpora from around the world, especially representative content of languages and cultures marginalized in current training data.

Diverse Human Feedback: Annotation teams in the RLHF process need to be truly globally representative, including professionals from different countries, languages, and cultural backgrounds.

Bias Detection and Mitigation Tools: Develop assessment tools and mitigation technologies specifically for cultural bias and representational harm, and incorporate them into standard model development and deployment processes.

Policy and Regulatory Frameworks: International-level AI ethics standards and regulatory mechanisms need to be established to ensure that LLM development and application do not exacerbate global inequality.

Section 08

Conclusion: Beyond Technical Solutions

This study reveals not only technical issues but also deep-seated power and knowledge production issues. When LLMs systematically devalue the Global Majority, they reflect and reinforce the unequal structures existing in the real world.

Solving this problem cannot rely solely on technical fixes; it requires a transformation of the entire AI ecosystem, from who owns the data, who defines the problem, and who evaluates the model to who benefits from AI technology. Only when we truly regard global diversity as a core value rather than a marginal consideration can AI become a tool that promotes understanding rather than deepens divisions.

As the researchers urge, we need to question the uncritical application of U.S.-developed LLMs to global classification, monitoring, and representation. The globalization of technology must advance in step with the globalization of ethics.