Zing Forum

Intersectional Fairness Study: Mainstream LLMs Exhibit Significant Biases in Racial-Gender Intersectional Dimensions

Systematic evaluations show that modern LLMs respond conservatively in ambiguous contexts, yet these answers register as insufficiently informative on fairness metrics; in explicit contexts, accuracy varies with stereotype consistency, and biases along the racial-gender intersection are particularly pronounced.

LLM fairness · Intersectionality · Algorithmic bias · Race · Gender · AI ethics · Fairness evaluation · Stereotypes · Social justice
Published 2026-04-22 23:25 · Recent activity 2026-04-23 09:55 · Estimated read 5 min

Section 01

[Introduction] Intersectional Fairness Study Reveals Mainstream LLMs Exhibit Significant Biases in Racial-Gender Intersectional Dimensions

This study systematically evaluates the intersectional fairness of mainstream LLMs. Key findings: 1. in ambiguous contexts, models respond conservatively, but these responses register as insufficiently informative on fairness metrics; 2. in explicit contexts, accuracy depends on whether the correct answer is consistent with stereotypes; 3. biases along the racial-gender intersection are especially pronounced. The study underscores the critical importance of an intersectional perspective for AI fairness.

Section 02

Background: AI Fairness Research Needs to Focus on Intersectional Dimensions

Traditional fairness research focuses on single attributes (e.g., gender or race), but real identities are multi-dimensional and intertwined. Intersectionality theory holds that combinations of attributes produce unique patterns of discrimination: bias against Black women, for example, is not a simple superposition of racial bias and gender bias. Understanding intersectional fairness is therefore a core prerequisite for building fair AI.

Section 03

Research Methods: Multi-dimensional Systematic Evaluation Framework

Six mainstream LLMs were evaluated on two benchmark datasets, along four dimensions:
  1. Bias score (systematic lean toward stereotyped answers);
  2. Subgroup fairness metrics (differences in outcome distributions across groups);
  3. Accuracy;
  4. Consistency (answer stability).
The experiments covered both positive and negative question polarities, as well as ambiguous and explicit contexts.
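
To make these dimensions concrete, here is a minimal Python sketch of how such metrics might be computed over labeled model outputs. The record schema, the `records` list, and the BBQ-style bias-score formulation are illustrative assumptions, not the study's exact definitions.

```python
from collections import defaultdict

# Hypothetical evaluation records; this schema is an illustrative assumption,
# not the study's actual data format. "stereo" marks which answer option is
# the stereotype-consistent one for the item.
records = [
    {"subgroup": ("black", "woman"), "context": "ambiguous",
     "answer": "unknown", "gold": "unknown", "stereo": "optA"},
    {"subgroup": ("white", "man"), "context": "explicit",
     "answer": "optB", "gold": "optB", "stereo": "optA"},
]

def accuracy(recs):
    """Fraction of answers matching the gold label."""
    return sum(r["answer"] == r["gold"] for r in recs) / len(recs)

def bias_score(recs):
    """A BBQ-style bias score (assumed formulation): among answers that
    commit to a group (i.e., are not 'unknown'), measure how far the rate
    of stereotype-consistent choices deviates from 50%. Range [-1, 1];
    0 means no systematic lean."""
    committed = [r for r in recs if r["answer"] != "unknown"]
    if not committed:
        return 0.0
    stereo_rate = sum(r["answer"] == r["stereo"] for r in committed) / len(committed)
    return 2 * stereo_rate - 1

def subgroup_spread(recs, metric=accuracy):
    """Max-min spread of a metric across intersectional subgroups,
    a simple stand-in for the paper's subgroup fairness metrics."""
    by_group = defaultdict(list)
    for r in recs:
        by_group[r["subgroup"]].append(r)
    scores = {g: metric(rs) for g, rs in by_group.items()}
    return max(scores.values()) - min(scores.values()), scores

print(accuracy(records), bias_score(records), subgroup_spread(records))
```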

Section 04

Core Evidence: Bias Performance of LLMs in Intersectional Dimensions

  1. Ambiguous contexts: models often answer "unknown", which appears safe but registers as insufficiently informative under fairness metrics;
  2. Explicit contexts: accuracy fluctuates with stereotype consistency (e.g., accuracy drops when identifying male nurses);
  3. Racial-gender intersection: biases against Black women are stronger, and under-representation of intersectional groups leads to poor performance;
  4. Result distributions are uneven across subgroups;
  5. Answers lack consistency, which increases the difficulty of bias mitigation.
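
Finding 2 can be made concrete with a simple disaggregation: split explicit-context items by whether the correct answer is the stereotype-consistent option, then compare accuracies. A minimal sketch, reusing the illustrative record schema from the method section (all field names are assumptions):

```python
def accuracy(recs):
    """Fraction of answers matching the gold label."""
    return sum(r["answer"] == r["gold"] for r in recs) / len(recs)

def accuracy_by_alignment(recs):
    """Explicit-context accuracy, split by whether the gold answer is the
    stereotype-consistent option ("stereo"). A large gap between the two
    numbers is the signature of stereotype-dependent capability."""
    explicit = [r for r in recs if r["context"] == "explicit"]
    aligned = [r for r in explicit if r["gold"] == r["stereo"]]
    counter = [r for r in explicit if r["gold"] != r["stereo"]]
    return {
        "stereotype_consistent": accuracy(aligned) if aligned else None,
        # counter-stereotypical items, e.g. identifying male nurses
        "counter_stereotypical": accuracy(counter) if counter else None,
    }
```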

Section 05

Research Conclusions: Model Capabilities Depend on Stereotype Cues, Intersectional Fairness Needs Improvement

Core conclusions: the surface capabilities of models depend in part on stereotype cues, so conventional accuracy may overstate those capabilities; the apparent fairness-capability trade-off may be an artifact of evaluation methods; and no LLM achieved consistently fair behavior across intersectional dimensions. Intersectional fairness remains a core challenge for AI applications in high-risk scenarios.

Section 06

Practical Recommendations: Improvement Directions for Intersectional Fairness

  • Developers: increase data representation of intersectional groups, develop specialized fine-tuning strategies, and establish intersectional fairness check standards;
  • Deployers: test intersectional fairness in sensitive scenarios, monitor performance differences across groups (a minimal audit sketch follows this list), and prepare mitigation mechanisms;
  • Researchers: develop comprehensive benchmarks, study the effectiveness of mitigation techniques, and explore causal-inference methods.
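
For the deployer-side monitoring point flagged above, a minimal audit sketch: disaggregate accuracy by race-gender subgroup and raise a flag when the spread exceeds a tolerance. The field names and the 0.05 threshold are illustrative assumptions, not values from the study.

```python
from collections import defaultdict

def audit_subgroups(records, tolerance=0.05):
    """Disaggregate accuracy by (race, gender) subgroup and flag the audit
    when the max-min spread exceeds the tolerance. 'race', 'gender',
    'answer', and 'gold' are assumed field names; the tolerance is a
    placeholder policy value to be set per use case."""
    by_group = defaultdict(list)
    for r in records:
        by_group[(r["race"], r["gender"])].append(r["answer"] == r["gold"])
    scores = {g: sum(hits) / len(hits) for g, hits in by_group.items()}
    gap = max(scores.values()) - min(scores.values())
    return {"scores": scores, "gap": gap, "flagged": gap > tolerance}
```

In practice such a check could run on sampled, human-labeled traffic; the point is that aggregate accuracy alone hides exactly the subgroup gaps the study documents.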

Section 07

Limitations and Future Directions: Expand Evaluation Dimensions and Cultural Contexts

Current limitations: the study covers only racial-gender intersections, omitting dimensions such as age and religion; it is based on English-language, Western contexts; the evaluation is static and does not track how biases evolve; and specific mitigation techniques are not examined. Future work should expand the attribute dimensions, conduct cross-cultural evaluations, perform long-term tracking, and develop mitigation strategies.