Zing Forum

Analyzing Generative AI Ethical Risk Discourse on Social Media Using Large Language Models: A Systematic Study

This study uses large language models (LLMs) such as GPT-4.1 to perform zero-shot classification on nearly 50,000 tweets, combined with BERTopic topic modeling, to reveal the main areas of public concern regarding generative AI ethical risks.

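Tags: Generative AI, Ethical Risk, Large Language Models, Zero-Shot Classification, BERTopic, Topic Modeling, Social Media Analysis, AI Governance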
Published 2026-05-25 21:45 | Recent activity 2026-05-25 21:53 | Estimated read 6 min

Section 01

Introduction: Analyzing Generative AI Ethical Risk Discourse on Social Media Using Large Language Models

This study uses large language models (LLMs) such as GPT-4.1 to perform zero-shot classification on nearly 50,000 tweets, then applies BERTopic topic modeling to reveal five key areas of public concern regarding generative AI ethical risks. The resulting two-stage method offers a feasible path for large-scale social media discourse analysis, with practical implications for policy formulation and enterprise risk management.

Section 02

Research Background: Why Focus on Generative AI Ethical Risk Discourse

The popularity of generative AI tools such as ChatGPT has sparked widespread public discussion, including concern about their potential risks. Understanding those concerns matters to policymakers and developers, but manual coding and small-scale surveys cannot keep up with the volume of social media data. This study combines LLM zero-shot classification with topic modeling to analyze nearly 50,000 ChatGPT-related tweets and identify the types of ethical risk they discuss.

Section 03

Research Methodology: Detailed Explanation of the Two-Stage Analysis Framework

First Stage: LLM Zero-Shot Classification

The corpus consists of 48,398 ChatGPT-related tweets posted from January to March 2023. Four LLMs, including GPT-4.1 and GPT-3.5-turbo, were used for zero-shot classification: each model decides whether a tweet counts as risk discourse under five high-level ethical risk categories (technical safety, privacy and data abuse, fairness and discrimination, malicious misuse, and social and democratic risks). Of the four models, GPT-4.1 performed best in validation.
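A minimal sketch of how such a zero-shot step might look is shown below. The prompt wording, the `RISK`/`NON_RISK` answer format, and the `fake_llm` stand-in are assumptions made for illustration. They are not the study's actual prompts or API calls. In practice `call_llm` would wrap a real model client.

```python
from typing import Callable

# The five high-level categories named in the study.
RISK_CATEGORIES = [
    "technical safety",
    "privacy and data abuse",
    "fairness and discrimination",
    "malicious misuse",
    "social and democratic risks",
]


def build_prompt(tweet: str) -> str:
    """Zero-shot prompt: task description and categories, no labeled examples."""
    categories = "\n".join(f"- {c}" for c in RISK_CATEGORIES)
    return (
        "Decide whether the tweet below expresses concern about an ethical risk "
        "of generative AI in any of these categories:\n"
        f"{categories}\n"
        "Answer with exactly one word: RISK or NON_RISK.\n\n"
        f"Tweet: {tweet}"
    )


def parse_label(response: str) -> str:
    """Normalize a free-text reply to 'risk', 'non_risk', or 'unparsed'."""
    text = response.strip().upper().replace("-", "_").replace(" ", "_")
    if text.startswith("NON_RISK"):
        return "non_risk"
    if text.startswith("RISK"):
        return "risk"
    return "unparsed"  # keep unparseable replies visible instead of guessing


def classify(tweet: str, call_llm: Callable[[str], str]) -> str:
    """Classify one tweet with any function that maps a prompt to a reply."""
    return parse_label(call_llm(build_prompt(tweet)))


def fake_llm(prompt: str) -> str:
    """Hypothetical stand-in for a model call so the sketch runs offline."""
    tweet = prompt.rsplit("Tweet: ", 1)[1].lower()
    return "RISK" if "privacy" in tweet else "NON_RISK"


if __name__ == "__main__":
    for t in ["ChatGPT might leak my privacy data", "ChatGPT wrote a great poem"]:
        print(t, "->", classify(t, fake_llm))
```

Passing the model call in as a function keeps the prompt and parsing logic identical across models, which is one way to make a comparison of several LLMs fair.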

Second Stage: BERTopic Topic Modeling

BERTopic was then applied to the tweets that GPT-4.1 classified as risk discourse, yielding 33 fine-grained subtopics. Two coders mapped these subtopics back to the five risk categories.
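The summary does not say how the two coders' mappings were compared or reconciled. As an illustration only, the sketch below assumes a common choice, Cohen's kappa, to measure agreement between two topic-to-category mappings. The topic IDs and labels are hypothetical, not the study's data.

```python
from collections import Counter
from typing import Hashable, Sequence


def cohens_kappa(labels_a: Sequence[Hashable], labels_b: Sequence[Hashable]) -> float:
    """Cohen's kappa: agreement between two coders, corrected for chance."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("both coders must label the same non-empty set of items")
    n = len(labels_a)
    # Share of items on which the coders chose the same category.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Agreement expected by chance from each coder's label frequencies.
    freq_a, freq_b = Counter(labels_a), Counter(labels_b)
    expected = sum(freq_a[c] * freq_b[c] for c in freq_a) / (n * n)
    if expected == 1.0:  # both coders used a single identical label throughout
        return 1.0
    return (observed - expected) / (1 - expected)


# Hypothetical mappings from BERTopic topic IDs to risk categories.
coder_a = {0: "technical safety", 1: "technical safety", 2: "privacy", 3: "privacy"}
coder_b = {0: "technical safety", 1: "privacy", 2: "privacy", 3: "privacy"}

topics = sorted(coder_a)
kappa = cohens_kappa([coder_a[t] for t in topics], [coder_b[t] for t in topics])
# Topics the coders would need to discuss before settling on a final mapping.
disagreements = [t for t in topics if coder_a[t] != coder_b[t]]

if __name__ == "__main__":
    print(f"kappa = {kappa:.2f}, topics to reconcile: {disagreements}")
```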

Section 04

Research Findings: Dimensions of Public Concern About AI Ethical Risks

The study shows that public concern about generative AI ethical risks is multi-dimensional, spanning five categories that range from the technical to the societal. It also separates "risk discourse" from "non-risk discourse" (for example, user-experience reports and praise for the technology), which improves the accuracy of the analysis. (Note: The paper is still under review; results are inferred from the public datasets and code.)

Section 05

Methodological Contributions: Breakthroughs in LLM-Assisted Social Science Research

  1. Feasibility of Zero-Shot Classification: LLMs can classify effectively without task-specific training data, lowering the barrier to large-scale text analysis;
  2. Multi-Model Comparison and Verification: Four LLMs are compared with traditional supervised learning methods, providing empirical evidence of how LLMs perform on social science tasks;
  3. Reproducible Process: The code, prompts, and annotated data are publicly available so that other researchers can reproduce the work;
  4. Ethical Compliance: Only tweet IDs and annotation results are made public, in line with platform terms and academic ethics.
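The multi-model comparison in point 2 can be sketched as scoring each model's labels against a hand-annotated reference set. The summary does not name the metric used, so precision, recall, and F1 on the "risk" class are assumed here. The model names and labels are hypothetical placeholders, not the study's results.

```python
from typing import Dict, List, Sequence, Tuple


def precision_recall_f1(
    gold: Sequence[str], predicted: Sequence[str], positive: str = "risk"
) -> Tuple[float, float, float]:
    """Precision, recall, and F1 for the positive class against gold labels."""
    pairs = list(zip(gold, predicted))
    tp = sum(g == positive and p == positive for g, p in pairs)
    fp = sum(g != positive and p == positive for g, p in pairs)
    fn = sum(g == positive and p != positive for g, p in pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1


# Hypothetical gold annotations and model outputs for six tweets.
gold: List[str] = ["risk", "risk", "non_risk", "non_risk", "risk", "non_risk"]
predictions: Dict[str, List[str]] = {
    "model_a": ["risk", "risk", "non_risk", "non_risk", "risk", "non_risk"],
    "model_b": ["risk", "non_risk", "non_risk", "risk", "risk", "non_risk"],
}

scores = {name: precision_recall_f1(gold, pred) for name, pred in predictions.items()}
best_model = max(scores, key=lambda name: scores[name][2])  # rank by F1

if __name__ == "__main__":
    for name, (p, r, f1) in scores.items():
        print(f"{name}: precision={p:.2f} recall={r:.2f} f1={f1:.2f}")
    print("best by F1:", best_model)
```

The same function can score a supervised baseline, since it only needs a list of predicted labels.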

Section 06

Practical Implications: Application Prospects Across Multiple Fields

  • Policy Formulation: Helping regulatory agencies understand public focus and formulate targeted policies;
  • Enterprise Risk Management: Assisting AI companies in identifying user concerns and optimizing product design and communication;
  • Academic Research: Providing a reusable LLM-assisted analysis framework for computational social science;
  • Public Opinion Monitoring: Extending the pipeline into a real-time system that tracks how public risk concerns evolve.

Section 07

Conclusion: Core Value and Insights of the Study

As generative AI develops, understanding how the public perceives its ethical risks becomes increasingly important. By combining LLM zero-shot classification with topic modeling, this study offers a workable path for large-scale social media discourse analysis and identifies five risk areas. It also provides data resources and a methodological reference for fields such as AI governance and technology ethics.