# Discourse Analysis of Generative AI on Reddit: A Computational Social Science Practice of Social Media Mining

> A computational social science project built on Reddit data. It applies sentiment analysis and BERTopic topic modeling to examine the discussion patterns, emotional tone, and discourse evolution of generative AI across different communities, mapping public perception during the process of technology socialization.

- Board: [Openclaw Geo](https://www.zingnex.cn/en/forum/board/openclaw-geo)
- Published: 2026-05-18T03:38:08.000Z
- Last activity: 2026-05-18T03:53:17.847Z
- Popularity: 154.8
- Keywords: social media mining, sentiment analysis, BERTopic, topic modeling, generative AI, Reddit, computational social science, public discourse, sociology of technology, natural language processing
- Page link: https://www.zingnex.cn/en/forum/thread/reddit-ai
- Canonical: https://www.zingnex.cn/forum/thread/reddit-ai

---

## Introduction: Core Research Framework of Generative AI Discourse Analysis on Reddit

This study is a computational social science project built on Reddit data. Using sentiment analysis and BERTopic topic modeling, it systematically analyzes the discussion patterns, emotional tone, and discourse evolution around generative AI across communities, with the aim of mapping public perception during the process of technology socialization. It compares discourse across three types of communities (creative, technical, and comprehensive), traces how generative AI discussions change over time, and offers a reference for AI developers, policymakers, and researchers.

## Research Background and Research Questions

Generative AI has moved from the laboratory into everyday public use, yet conventional technology evaluation focuses mostly on model performance and overlooks feedback from real usage scenarios. Social media platforms such as Reddit offer an important window onto "technology socialization", and users' spontaneous discussions there are a valuable data source. This study asks four questions: How do different communities discuss generative AI? How does emotional tone differ among them? Which topics dominate, and how do they evolve? How do discourse frames differ across community types?

## Data Sources and Research Method Design

- **Data Collection**: Post data (ID, subreddit, title, body, etc.) and comment data are retrieved from Reddit's public JSON endpoints. Nested comment trees are recursively flattened, and request intervals are set to comply with platform rules.
- **Community Selection**: Nine subreddits are chosen by stratified sampling across creative (e.g., r/Midjourney), technical (e.g., r/OpenAI), and comprehensive (e.g., r/technology) categories, so that the discourse of users from different backgrounds can be compared.
- **Preprocessing**: Raw text is cleaned (lowercasing, removal of URLs and stop words, lemmatization, etc.) to standardize the input format while retaining semantic information.
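
The comment flattening and text cleaning steps can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: the function names, the tiny stop-word list, and the sample payload are all assumptions, and lemmatization is left out because it needs a third-party library. The nested shape (`kind`, `data`, `children`, `replies`) follows the way Reddit's JSON listings are commonly structured.

```python
import re

URL_RE = re.compile(r"https?://\S+")
TOKEN_RE = re.compile(r"[a-z]+(?:'[a-z]+)?")
# Tiny illustrative stop-word list (assumption); a real run would use a fuller one.
STOP_WORDS = {"a", "an", "the", "is", "are", "it", "this", "that",
              "to", "of", "and", "in", "on", "for"}


def flatten_comments(listing, depth=0):
    """Recursively flatten a nested comment listing into a flat list of rows."""
    rows = []
    for child in listing.get("data", {}).get("children", []):
        if child.get("kind") != "t1":  # skip "more" placeholders
            continue
        data = child["data"]
        rows.append({"id": data["id"], "body": data.get("body", ""), "depth": depth})
        replies = data.get("replies")
        if isinstance(replies, dict):  # an empty string means no replies
            rows.extend(flatten_comments(replies, depth + 1))
    return rows


def clean_text(text):
    """Lowercase, drop URLs, tokenize, and remove stop words."""
    text = URL_RE.sub(" ", text.lower())
    return [tok for tok in TOKEN_RE.findall(text) if tok not in STOP_WORDS]


# Hypothetical payload imitating a two-level comment thread.
sample = {"kind": "Listing", "data": {"children": [
    {"kind": "t1", "data": {
        "id": "c1",
        "body": "This is AMAZING, see https://example.com/x",
        "replies": {"kind": "Listing", "data": {"children": [
            {"kind": "t1", "data": {"id": "c2",
                                    "body": "The prompts are the hard part",
                                    "replies": ""}},
            {"kind": "more", "data": {"children": ["c3"]}},
        ]}},
    }},
]}}

rows = flatten_comments(sample)
tokens = [clean_text(row["body"]) for row in rows]
```

Keeping `depth` on each row preserves where a comment sat in the thread, which makes it possible to separate top-level reactions from deeper replies later on.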

## Analysis Techniques: Sentiment Analysis and BERTopic Modeling

- **Sentiment Analysis**: A hybrid lexicon-plus-machine-learning method scores posts (topic initiation) and comments (discussion participation) separately, and the scores are combined with the temporal dimension to identify how events affect public sentiment.
- **BERTopic Topic Modeling**: Documents are encoded as vectors with a pre-trained language model, reduced in dimension with UMAP, and clustered with HDBSCAN. c-TF-IDF then extracts representative terms for each cluster, yielding interpretable discussion topics (e.g., technical tutorials, ethical concerns, tool evaluations).
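
To make the last step of the topic pipeline concrete, here is a small stdlib-only sketch of class-based TF-IDF (c-TF-IDF). It assumes the cluster labels have already been produced by the embedding, UMAP, and HDBSCAN steps, and it uses the commonly cited weighting, term frequency within a cluster times `log(1 + A / f_t)`, where `A` is the average number of words per cluster and `f_t` is the term's frequency across all clusters. The function name and the toy documents are hypothetical, and the BERTopic library's own implementation may differ in details.

```python
import math
from collections import Counter


def c_tf_idf(docs_tokens, labels, top_n=2):
    """Return the top_n highest-weighted terms for each cluster label."""
    # Merge all documents of a cluster into one bag of words.
    class_counts = {}
    for doc, label in zip(docs_tokens, labels):
        class_counts.setdefault(label, Counter()).update(doc)

    # Term frequency across all clusters, and average words per cluster.
    total_freq = Counter()
    for counts in class_counts.values():
        total_freq.update(counts)
    avg_words = sum(total_freq.values()) / len(class_counts)

    topics = {}
    for label, counts in class_counts.items():
        n_words = sum(counts.values())
        weights = {
            term: (freq / n_words) * math.log(1 + avg_words / total_freq[term])
            for term, freq in counts.items()
        }
        # Sort by descending weight, breaking ties alphabetically.
        ranked = sorted(weights, key=lambda term: (-weights[term], term))
        topics[label] = ranked[:top_n]
    return topics


# Toy tokenized documents with cluster labels assumed to come from HDBSCAN.
docs = [
    ["prompt", "tutorial", "prompt", "settings"],
    ["tutorial", "prompt", "workflow"],
    ["ethics", "copyright", "artists"],
    ["copyright", "ethics", "jobs", "prompt"],
]
labels = [0, 0, 1, 1]
topics = c_tf_idf(docs, labels)
```

Because every document in a cluster is merged before weighting, a term scores highly when it is frequent inside one cluster and comparatively rare across the others, which is what makes the resulting terms usable as topic labels.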

## Key Findings: Cross-Community Differences and Discourse Evolution

- **Cross-Community Comparison**: Creative communities focus on showcasing work and sharing usage techniques (AI as a creative tool); technical communities focus on model principles and performance optimization (AI as a technical system); comprehensive communities discuss social impact and future trends (AI as a social force).
- **Temporal Dimension**: Discussion activity fluctuates with major technical events (e.g., the ChatGPT release, the GPT-4 launch). Topic focus shifts from basic functionality toward advanced techniques, critical reflection, and ecosystem building, reflecting the technology's growing maturity.
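
A temporal view of this kind could be produced by bucketing items by month and summarizing activity and sentiment per bucket. The sketch below is only an illustration under stated assumptions: the toy lexicon, the one-word negation rule, the record layout (`created_utc`, `text`), and the sample texts are all hypothetical, and it covers only the lexicon half of a hybrid approach, not a trained classifier.

```python
from collections import defaultdict
from datetime import datetime, timezone

# Toy lexicon (assumption): real lexicons are far larger and weighted.
LEXICON = {"great": 1, "love": 1, "useful": 1,
           "bad": -1, "scary": -1, "useless": -1}
NEGATIONS = {"not", "never", "no"}


def lexicon_score(text):
    """Sum word polarities, flipping the word that directly follows a negation."""
    score, negate = 0, False
    for word in text.lower().split():
        word = word.strip(".,!?")
        if word in NEGATIONS:
            negate = True
            continue
        polarity = LEXICON.get(word, 0)
        score += -polarity if negate else polarity
        negate = False
    return score


def monthly_summary(records):
    """Group records by UTC month and return {month: (count, mean_score)}."""
    buckets = defaultdict(list)
    for rec in records:
        stamp = datetime.fromtimestamp(rec["created_utc"], tz=timezone.utc)
        buckets[stamp.strftime("%Y-%m")].append(lexicon_score(rec["text"]))
    return {month: (len(scores), sum(scores) / len(scores))
            for month, scores in sorted(buckets.items())}


def ts(year, month, day):
    """Helper: UTC midnight of a date as a Unix timestamp."""
    return datetime(year, month, day, tzinfo=timezone.utc).timestamp()


# Hypothetical records; the texts are invented for illustration.
records = [
    {"created_utc": ts(2022, 12, 5), "text": "I love this, really useful!"},
    {"created_utc": ts(2022, 12, 20), "text": "Not useful and a bit scary."},
    {"created_utc": ts(2023, 3, 15), "text": "GPT-4 is great"},
]
summary = monthly_summary(records)
```

Running the same summary separately for posts and for comments, or per community category, would give the side-by-side series needed for the comparisons described above.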

## Practical Significance and Research Limitations

- **Practical Value**: The analysis gives AI developers a macro-level view of user feedback to guide product design, gives policymakers a real-time reference on public perception, and shows researchers a methodological path for computational social science.
- **Limitations**: Reddit's user base is demographically skewed; platform features (anonymity, voting mechanisms) shape how discourse is expressed; and colloquial or sarcastic text limits the accuracy of sentiment analysis.

## Conclusion and Outlook

This study characterizes public discourse on generative AI through social media mining, providing a reference for multiple stakeholders. Going forward, the same analysis framework can be used to track discourse evolution continuously, identify emerging topics, and deepen the understanding of how technology and society interact.
