Analyzing Generative AI Ethical Risk Discourse on Social Media Using Large Language Models: A Systematic Study

This study uses large language models (LLMs) such as GPT-4.1 to perform zero-shot classification on nearly 50,000 tweets, combined with BERTopic topic modeling, to reveal the main areas of public concern regarding generative AI ethical risks.

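Tags: Generative AI, Ethical Risk, Large Language Models, Zero-Shot Classification, BERTopic, Topic Modeling, Social Media Analysis, AI Governance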

Published 2026-05-25 21:45 | Recent activity 2026-05-25 21:53 | Estimated read 6 min

Section 01

[Introduction] Analyzing Generative AI Ethical Risk Discourse on Social Media Using Large Language Models: A Systematic Study

This study uses large language models (LLMs) such as GPT-4.1 to perform zero-shot classification on nearly 50,000 tweets and then applies BERTopic topic modeling to the tweets flagged as risk discourse, revealing five key areas of public concern about the ethical risks of generative AI. The two-stage method offers a feasible path for large-scale social media discourse analysis, with practical implications for policy formulation, enterprise risk management, and related areas.

Section 02

Research Background: Why Focus on Generative AI Ethical Risk Discourse

The popularity of generative AI tools such as ChatGPT has sparked widespread discussion, including concerns about their potential risks. Understanding these concerns matters to policymakers, developers, and other stakeholders, but traditional manual coding and small-scale surveys cannot keep up with the volume of social media data. This study combines LLM zero-shot classification with topic modeling to analyze nearly 50,000 ChatGPT-related tweets and identify the types of ethical risk they discuss.

Section 03

Research Methodology: Detailed Explanation of the Two-Stage Analysis Framework

First Stage: LLM Zero-Shot Classification

The authors constructed a corpus of 48,398 ChatGPT-related tweets posted from January to March 2023. Four LLMs, including GPT-4.1 and GPT-3.5-turbo, were used for zero-shot classification: each model decides whether a tweet counts as risk discourse, judged against five high-level ethical risk categories (technical safety, privacy data abuse, fairness and discrimination, malicious misuse, social and democratic risks). Of the four models, GPT-4.1 performed best in validation.
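The first stage can be illustrated with a minimal sketch. It is not the study's actual prompt or code: the prompt wording, the JSON reply format, and the names `build_prompt`, `parse_label`, `classify`, and `fake_llm` are all assumptions made for illustration. Only the five category names come from the text above. The model call is passed in as a plain function, so the sketch runs without any API access.

```python
import json
from typing import Callable, Optional

# The five high-level categories named in the study.
RISK_CATEGORIES = [
    "technical safety",
    "privacy data abuse",
    "fairness and discrimination",
    "malicious misuse",
    "social and democratic risks",
]


def build_prompt(tweet: str) -> str:
    """Zero-shot prompt: task description and categories, no labeled examples.

    The wording is hypothetical, not the prompt used in the study.
    """
    categories = "\n".join(f"- {c}" for c in RISK_CATEGORIES)
    return (
        "Decide whether the tweet below expresses concern about an ethical "
        "risk of generative AI in any of these categories:\n"
        f"{categories}\n\n"
        'Answer with JSON only: {"risk": true} or {"risk": false}.\n\n'
        f"Tweet: {tweet}"
    )


def parse_label(raw: str) -> Optional[bool]:
    """Return True/False for a well-formed reply, None if it cannot be parsed."""
    try:
        value = json.loads(raw.strip()).get("risk")
    except (json.JSONDecodeError, AttributeError):
        return None
    return value if isinstance(value, bool) else None


def classify(tweet: str, llm: Callable[[str], str]) -> Optional[bool]:
    """Send the prompt to any text-in, text-out model function and parse the reply."""
    return parse_label(llm(build_prompt(tweet)))


def fake_llm(prompt: str) -> str:
    """Stand-in for a real model call: a keyword rule, for demonstration only."""
    tweet = prompt.rsplit("Tweet: ", 1)[1]
    return json.dumps({"risk": "privacy" in tweet.lower()})


risky = classify("ChatGPT leaked my privacy settings", fake_llm)
harmless = classify("ChatGPT wrote a great poem", fake_llm)
```

Returning `None` for unparseable replies keeps malformed model output separate from a genuine "not risk" decision, which matters when tens of thousands of tweets are labeled automatically.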

Second Stage: BERTopic Topic Modeling

For the risk discourse tweets classified by GPT-4.1, BERTopic was used to identify 33 fine-grained subtopics, which were mapped back to the five risk categories by two coders.
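When two coders independently map subtopics to categories, their agreement is commonly summarized with Cohen's kappa. The summary does not say whether the study computed kappa or how disagreements were resolved, so the sketch below is an assumption about one reasonable way to check such a mapping. The labels in it are toy values, not the study's data.

```python
from collections import Counter


def cohen_kappa(coder_a: list[str], coder_b: list[str]) -> float:
    """Cohen's kappa for two coders who labeled the same items."""
    if len(coder_a) != len(coder_b) or not coder_a:
        raise ValueError("both coders must label the same non-empty set of topics")
    n = len(coder_a)
    # Share of topics on which the coders chose the same category.
    observed = sum(a == b for a, b in zip(coder_a, coder_b)) / n
    # Agreement expected by chance, from each coder's label frequencies.
    freq_a, freq_b = Counter(coder_a), Counter(coder_b)
    expected = sum(freq_a[c] * freq_b[c] for c in freq_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both coders used one identical label throughout
    return (observed - expected) / (1 - expected)


# Toy labels for four hypothetical subtopics (illustrative only).
coder_a = ["privacy", "privacy", "misuse", "fairness"]
coder_b = ["privacy", "misuse", "misuse", "fairness"]

kappa = cohen_kappa(coder_a, coder_b)
# Indices of subtopics the coders would need to discuss and reconcile.
disagreements = [i for i, (a, b) in enumerate(zip(coder_a, coder_b)) if a != b]
```

Listing the disagreeing subtopics alongside the score gives the coders a concrete worklist for reconciliation.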

Section 04

Research Findings: Dimensions of Public Concern About AI Ethical Risks

The study reveals the multi-dimensional public concerns about generative AI ethical risks, covering five categories from technical to social levels. It also distinguishes between "risk discourse" and "non-risk discourse" (e.g., user experience, technical praise, etc.), improving the accuracy of analysis. (Note: The paper is still under review; results are inferred based on public datasets and code.)

Section 05

Methodological Contributions: Breakthroughs in LLM-Assisted Social Science Research

Feasibility of Zero-Shot Classification: LLMs can classify effectively without task-specific training data, lowering the barrier to large-scale text analysis;
Multi-Model Comparison and Verification: Four LLMs are compared with traditional supervised learning methods, providing empirical evidence of LLM performance on social science tasks;
Reproducible Process: Code, prompts, and annotated data are made publicly available so that other researchers can reproduce the work;
Ethical Compliance: Only tweet IDs and annotation results are made public, complying with platform terms and academic ethics.
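A multi-model comparison of this kind usually scores each model's labels against a human-annotated gold set. The summary does not state which metrics the study reports, so precision, recall, and F1 are assumed here as a common choice. The model names `model_a` and `model_b` and all labels are hypothetical toy values, not results from the study.

```python
def binary_prf(gold: list[bool], pred: list[bool]) -> tuple[float, float, float]:
    """Precision, recall and F1 for the positive ("risk discourse") class."""
    if len(gold) != len(pred):
        raise ValueError("gold and predicted labels must have the same length")
    tp = sum(g and p for g, p in zip(gold, pred))
    fp = sum((not g) and p for g, p in zip(gold, pred))
    fn = sum(g and (not p) for g, p in zip(gold, pred))
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1


# Toy gold labels and predictions from two hypothetical models.
gold = [True, True, False, False, True]
predictions = {
    "model_a": [True, True, False, True, True],
    "model_b": [True, False, False, False, False],
}

scores = {name: binary_prf(gold, pred) for name, pred in predictions.items()}
# Pick the model with the highest F1 (index 2 of each score tuple).
best_model = max(scores, key=lambda name: scores[name][2])
```

The same function can score a supervised baseline, since it only needs a list of predicted labels.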

Section 06

Practical Implications: Application Prospects Across Multiple Fields

Policy Formulation: Helping regulatory agencies understand public focus and formulate targeted policies;
Enterprise Risk Management: Assisting AI companies in identifying user concerns and optimizing product design and communication;
Academic Research: Providing a reusable LLM-assisted analysis framework for computational social science;
Public Opinion Monitoring: Extending the pipeline into a real-time system that tracks how public risk concerns evolve.

Section 07

Conclusion: Core Value and Insights of the Study

As generative AI develops, understanding how the public perceives its ethical risks becomes increasingly important. By combining LLM zero-shot classification with topic modeling, this study offers a workable path for large-scale social media discourse analysis and identifies five areas of risk. It also provides data resources and a methodological reference for fields such as AI governance and technology ethics.