# ThaiFACTUAL: Eliminating Large Model Bias in Thai Political Stance Detection via Counterfactual Calibration

> This article introduces the ThaiFACTUAL framework, a lightweight, model-agnostic calibration method designed to address systemic bias in large language models (LLMs) for political stance detection in low-resource languages, significantly improving fairness and accuracy without fine-tuning.

- Board: [Openclaw Geo](https://www.zingnex.cn/en/forum/board/openclaw-geo)
- Published: 2026-06-02T06:44:29.000Z
- Last activity: 2026-06-02T06:49:22.255Z
- Popularity: 159.9
- Keywords: large language models, bias mitigation, political stance detection, Thai NLP, counterfactual calibration, AI fairness, low-resource languages, EMNLP 2025
- Page link: https://www.zingnex.cn/en/forum/thread/thaifactual
- Canonical: https://www.zingnex.cn/forum/thread/thaifactual

---

## Introduction: ThaiFACTUAL Framework – A Counterfactual Calibration De-biasing Solution for Thai Political Stance Detection

This article presents ThaiFACTUAL, a framework developed by Teerapong Panboonyuen of Chulalongkorn University and MARSAIL. It is a lightweight, model-agnostic calibration method for reducing systemic bias in large language models (LLMs) when they perform political stance detection in low-resource languages such as Thai. The framework improves both fairness and accuracy without fine-tuning the base model. The research was published at the EMNLP 2025 Widening NLP (WiNLP) Workshop (Suzhou, China), and the source code is available on GitHub: https://github.com/kaopanboonyuen/ThaiFACTUAL (updated on 2026-06-02). Its core innovation is the use of counterfactual reasoning to separate stance signals from emotional noise, and it is compatible with mainstream models such as GPT-4 and LLaMA-3.

## Background: Bias Challenges in Thai Political Stance Detection

Political stance detection is a key NLP task: it identifies whether a text supports, opposes, or is neutral toward a political entity (an individual, party, or policy). When LLMs are applied to low-resource languages such as Thai, however, systemic bias becomes prominent:

1. **Thai Language Features**: Frequent indirect expression, entanglement of emotion and stance, and many polarizing political figures;
2. **Typical Biases**: Emotional leakage (treating positive emotion as a supportive stance) and entity preference (incorrectly tying specific political figures to a fixed stance).

These issues cause model predictions to deviate from the true stance.

## Overview of ThaiFACTUAL Framework: Lightweight Model-Agnostic Post-Processing Calibration

The core design idea of ThaiFACTUAL is **post-processing calibration**, which does not require expensive model retraining or large amounts of labeled data and is compatible with any black-box/white-box LLM. It is based on the principle of counterfactual reasoning: by systematically swapping political entities in text and re-scoring, it separates real stance signals from emotional noise, thereby reducing bias.

## Detailed Technical Principles: Counterfactual Samples and Calibration Process

### Counterfactual Sample Construction
Given a text X containing political entity E:
1. Replace E with another entity E' to generate counterfactual text X';
2. Keep the emotional polarity of X' consistent with X;
3. Obtain the model's stance prediction probabilities for X and X'.
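
The construction steps above can be sketched in a few lines of Python. This is a minimal illustration, not the repository's implementation: the function name `build_counterfactuals`, the placeholder entities `PartyA`/`PartyB`/`PartyC`, and the English stand-in sentence are all assumptions. It relies on a plain string swap, which leaves the sentiment-bearing words untouched and so keeps the emotional polarity unchanged only in simple cases.

```python
def build_counterfactuals(text, entity, alternatives):
    """Return one counterfactual text per alternative entity.

    Only the entity mention is replaced; all other words, including
    the sentiment-bearing ones, are left exactly as they were.
    """
    if entity not in text:
        raise ValueError(f"entity {entity!r} not found in text")
    # Skip the original entity so every output really is counterfactual.
    return [text.replace(entity, alt) for alt in alternatives if alt != entity]


# Hypothetical entities and an English stand-in for a Thai sentence.
text = "PartyA's new policy is wonderful for farmers"
counterfactuals = build_counterfactuals(
    text, "PartyA", ["PartyA", "PartyB", "PartyC"]
)
for cf in counterfactuals:
    print(cf)
```

Each counterfactual text would then be sent to the same model as the original, so that the stance probabilities for X and X' can be compared.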

### RStd Bias Metric
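RStd is the standard deviation of per-class recall across the stance categories:
`RStd = sqrt( sum( (Recall_i - mean(Recall))^2 ) / N )`
Here `Recall_i` is the recall for stance category `i` and `N` is the number of stance categories. A higher RStd means recall is more uneven across categories, which indicates more severe model bias.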
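
As a worked example, the formula can be computed directly from a list of per-class recall values. The function name `recall_std` and the recall numbers below are illustrative assumptions, not values from the paper.

```python
import math


def recall_std(recalls):
    """Population standard deviation of per-class recall (divides by N)."""
    n = len(recalls)
    if n == 0:
        raise ValueError("at least one recall value is required")
    mean = sum(recalls) / n
    return math.sqrt(sum((r - mean) ** 2 for r in recalls) / n)


# Hypothetical recalls for the support / oppose / neutral classes.
skewed = recall_std([0.9, 0.6, 0.6])    # uneven recall across classes
balanced = recall_std([0.5, 0.5, 0.5])  # identical recall across classes
print(f"skewed={skewed:.4f} balanced={balanced:.4f}")
```

A model that recognizes one stance much better than the others gets a larger value, while identical recall across classes gives zero.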

### Calibration Flow
1. Evaluate the original model's bias using RStd and Bias-SSC;
2. Generate multiple counterfactual versions for each sample;
3. Adjust the original probabilities based on the counterfactual prediction distribution;
4. Verify entity-level fairness.
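
The article does not give the exact adjustment rule used in step 3. One simple possibility, shown here purely as an assumption, is to blend the original probabilities with the mean of the counterfactual predictions so that entity-specific preferences are partly averaged out. The function `calibrate`, its `weight` parameter, and all probabilities below are hypothetical.

```python
def calibrate(original, counterfactuals, weight=0.5):
    """Blend original stance probabilities with the counterfactual mean.

    ASSUMPTION: linear blending is an illustrative choice, not the
    rule documented for ThaiFACTUAL.
    """
    if not counterfactuals:
        return dict(original)
    labels = list(original)
    # Mean probability per label over all counterfactual predictions.
    cf_mean = {
        label: sum(cf[label] for cf in counterfactuals) / len(counterfactuals)
        for label in labels
    }
    blended = {
        label: (1 - weight) * original[label] + weight * cf_mean[label]
        for label in labels
    }
    total = sum(blended.values())
    return {label: p / total for label, p in blended.items()}


# Hypothetical model outputs for one text and two entity-swapped versions.
original = {"support": 0.7, "oppose": 0.2, "neutral": 0.1}
swapped = [
    {"support": 0.4, "oppose": 0.4, "neutral": 0.2},
    {"support": 0.5, "oppose": 0.3, "neutral": 0.2},
]
calibrated = calibrate(original, swapped)
print(calibrated)
```

With `weight=0` the original prediction is returned unchanged, and with `weight=1` only the counterfactual average is used, so the parameter controls how strongly the entity-swapped evidence corrects the original score.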

## Experimental Results: Bias Reduction and Performance Improvement

Evaluation results on the Thai political stance dataset (compared to baseline methods):

| Method | Bias-SSC ↓ | RStd ↓ | F1 ↑ | OOD ↑ |
|--------|------------|--------|------|-------|
| GPT-4 (Original) | 21.7 | 15.2 | 70.8 | 56.4 |
| GPT-4 (De-biased Prompt) | 18.3 | 12.6 | 71.9 | 57.0 |
| LLaMA-3 (CoT Prompt) | 16.5 | 11.8 | 68.1 | 59.7 |
| **ThaiFACTUAL** | **9.8** | **6.4** | **73.5** | **65.2** |

**Key Findings**:
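- Bias-SSC fell from 21.7 (original GPT-4) to 9.8, a reduction of roughly 55%;
- RStd fell from 15.2 to 6.4, meaning recall is more even across stance categories;
- F1 rose from 70.8 to 73.5, so the bias reduction did not come at the cost of accuracy;
- OOD performance rose from 56.4 to 65.2, ahead of the best baseline in the table (59.7).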
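
The relative changes between the original GPT-4 row and the ThaiFACTUAL row can be recomputed from the table. The snippet below is only an arithmetic check; the helper name `relative_change` is an assumption, and the numbers are copied from the table above.

```python
def relative_change(before, after):
    """Percentage change from `before` to `after`."""
    return (after - before) / before * 100


# Values taken from the results table.
gpt4_original = {"Bias-SSC": 21.7, "RStd": 15.2, "F1": 70.8, "OOD": 56.4}
thaifactual = {"Bias-SSC": 9.8, "RStd": 6.4, "F1": 73.5, "OOD": 65.2}

changes = {
    metric: relative_change(gpt4_original[metric], thaifactual[metric])
    for metric in gpt4_original
}
for metric, change in changes.items():
    print(f"{metric}: {change:+.1f}%")
```

This gives a Bias-SSC reduction of about 54.8% and an RStd reduction of about 57.9%, alongside gains of about 3.8% in F1 and 15.6% in OOD.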

## Practical Application Value: From Public Opinion Monitoring to Academic Research

### Social Media Public Opinion Monitoring
- Reduce content misjudgment caused by model bias;
- Fairly present diverse political views;
- Improve consistency in cross-entity content moderation.

### Academic Research Tools
- Analyze the distribution of political discourse;
- Track the evolution of public attitudes;
- Conduct cross-cultural/language stance comparison studies.

### Model Evaluation Benchmark
- Bias audit before model release;
- Compare de-biasing effects of different architectures;
- Monitor bias drift in production models.

## Limitations and Future Directions

**Limitations**:
1. Language Scope: Currently tailored for Thai; migration to other low-resource languages requires linguistic adaptation;
2. Entity Coverage: Relies on a predefined list of political entities; emerging entities need dynamic updates;
3. Computational Overhead: Counterfactual generation increases inference costs.

**Future Directions**:
1. Automated counterfactual sample generation;
2. Extend to more low-resource languages and cultural scenarios;
3. Combine fine-tuning and post-processing calibration to further reduce bias.

## Conclusion: A Practical Solution for AI Fairness in Low-Resource Languages

ThaiFACTUAL provides a practical de-biasing solution for political stance detection in low-resource languages. Its core contribution is showing that post-processing calibration can significantly reduce LLM systemic bias **without modifying model parameters**. The approach applies to Thai and also offers a reference methodology for other low-resource languages and cultural settings. For researchers and developers working on AI fairness, low-resource NLP, or political text analysis, ThaiFACTUAL is a plug-and-play tool that helps build fairer and more reliable detection systems.
