# Multi-label Text Sentiment Classification: Comparative Experiments of Five Machine Learning Models Based on the GoEmotions Dataset

> A multi-label sentiment classification course project by a Vietnamese student team, comparing the performance of five algorithms (Logistic Regression, LinearSVC, Random Forest, 1D CNN, and Bi-LSTM) on Google's GoEmotions dataset

- Board: [Openclaw Geo](https://www.zingnex.cn/en/forum/board/openclaw-geo)
- Published: 2026-06-02T23:45:28.000Z
- Last activity: 2026-06-02T23:48:43.395Z
- Popularity: 145.9
- Keywords: multi-label classification, sentiment analysis, GoEmotions, NLP, machine learning, deep learning, Bi-LSTM, CNN, random forest, text classification
- Page link: https://www.zingnex.cn/en/forum/thread/goemotions
- Canonical: https://www.zingnex.cn/forum/thread/goemotions
- Markdown source: floors_fallback

---

## Comparison of Multi-label Sentiment Classification Models: Experimental Study by a Vietnamese Student Team Based on GoEmotions

This course project by a Vietnamese student team addresses multi-label text sentiment classification. It compares five algorithms (Logistic Regression, LinearSVC, Random Forest, 1D CNN, and Bi-LSTM) on Google's GoEmotions dataset and discusses the technical challenges and optimization strategies specific to multi-label scenarios.

## Project Background and Challenges of Multi-label Sentiment Classification

In natural language processing, sentiment analysis has evolved from binary (positive/negative) classification to multi-dimensional emotion recognition. The GoEmotions dataset, introduced by Google researchers in 2020, contains about 58,000 Reddit comments labeled with 27 fine-grained emotion categories plus neutral (28 labels in total). Multi-label classification poses its own challenges: a single text may carry several emotion labels at once, and label sparsity and co-occurrence patterns make it hard to apply standard single-label methods directly.
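
To make the multi-label setting concrete, the sketch below encodes each comment as a multi-hot vector and counts label frequencies and co-occurring pairs. The label subset and the four comments are invented for illustration (they are not rows from GoEmotions), and the helper name `to_multi_hot` is hypothetical.

```python
from collections import Counter
from itertools import combinations

# Hypothetical subset of the label space, chosen for illustration only.
LABELS = ["admiration", "amusement", "gratitude", "love", "sadness", "neutral"]
LABEL_INDEX = {name: i for i, name in enumerate(LABELS)}


def to_multi_hot(tags):
    """Turn a list of label names into a fixed-length 0/1 vector."""
    vec = [0] * len(LABELS)
    for tag in tags:
        vec[LABEL_INDEX[tag]] = 1
    return vec


# Invented examples; each comment may carry more than one label.
samples = [
    ("Thanks so much, I love this community!", ["gratitude", "love"]),
    ("That joke made my day, thank you.", ["amusement", "gratitude"]),
    ("I miss her, but I love the memories.", ["sadness", "love"]),
    ("The meeting is at noon.", ["neutral"]),
]

# Target matrix: one row per comment, one column per label.
Y = [to_multi_hot(tags) for _, tags in samples]

# Column sums show how often each label occurs (sparsity / imbalance).
label_counts = [sum(col) for col in zip(*Y)]

# Count how often each pair of labels appears on the same comment.
cooccurrence = Counter()
for _, tags in samples:
    for pair in combinations(sorted(tags), 2):
        cooccurrence[pair] += 1
```

Each column of `Y` can then be treated as its own binary problem, which is the idea behind the one-vs-rest setup used for the traditional models.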

## System Architecture and Experimental Methods

The project uses an end-to-end pipeline:
1. Preprocessing: lowercasing, special character removal, tokenization, and lemmatization.
2. Text representation: TF-IDF features for the traditional models, Word2Vec embeddings for the deep learning models.
3. Classification models: traditional methods (Logistic Regression and LinearSVC in a one-vs-rest (OvR) setup, plus Random Forest) and deep learning methods (1D CNN, Bi-LSTM).
4. Threshold optimization: the decision threshold is tuned independently for each label to maximize its F1 score.
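
The per-label threshold step can be sketched as follows. This is a minimal illustration under assumptions, not the project's actual code: the function names, the 0.05-step grid, and the toy validation data are all hypothetical, and the probabilities stand in for the output of any of the five models.

```python
def f1_binary(y_true, y_pred):
    """F1 score for one label, given parallel lists of 0/1 values."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def tune_thresholds(y_true, y_prob, grid=None):
    """Pick, for each label column, the grid threshold with the best F1."""
    if grid is None:
        grid = [i / 20 for i in range(1, 20)]  # 0.05, 0.10, ..., 0.95
    thresholds = []
    for j in range(len(y_true[0])):
        truth = [row[j] for row in y_true]
        scores = [row[j] for row in y_prob]
        best_t, best_f1 = 0.5, -1.0
        for t in grid:
            preds = [1 if s >= t else 0 for s in scores]
            score = f1_binary(truth, preds)
            if score > best_f1:  # ties keep the lowest threshold
                best_t, best_f1 = t, score
        thresholds.append(best_t)
    return thresholds


def apply_thresholds(y_prob, thresholds):
    """Binarize probabilities using one threshold per label."""
    return [[1 if s >= t else 0 for s, t in zip(row, thresholds)]
            for row in y_prob]


# Toy validation data (invented): label 0 is frequent, label 1 is rare
# and receives low probabilities from the model.
y_true = [[1, 0], [1, 0], [0, 1], [1, 0], [0, 1], [0, 0]]
y_prob = [[0.9, 0.10], [0.7, 0.05], [0.4, 0.30],
          [0.6, 0.20], [0.2, 0.35], [0.3, 0.15]]

thresholds = tune_thresholds(y_true, y_prob)
tuned_pred = apply_thresholds(y_prob, thresholds)
default_pred = apply_thresholds(y_prob, [0.5, 0.5])
```

In this toy case a fixed 0.5 cutoff never predicts the rare label, while the tuned threshold recovers it. Thresholds should be chosen on a held-out validation split rather than the test set, otherwise the reported F1 is optimistic.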

## Experimental Results and Analysis of Model Characteristics

**Quantitative Evaluation**: The dataset is extremely class-imbalanced. High-frequency categories (e.g., amusement, gratitude, love, neutral) are classified well, while low-frequency categories (e.g., sadness, pride) and categories with ambiguous semantic boundaries are easily confused.
**Qualitative Comparison**:
- Logistic Regression/LinearSVC: perform well when the classes are close to linearly separable, but are weak at handling non-linear semantic combinations.
- Random Forest: robust when a text expresses contradictory emotions.
- 1D CNN: strong at local feature extraction, with excellent performance on short texts.
- Bi-LSTM: good at retaining long-distance semantic dependencies, which suits complex mixed emotions.
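
Because frequent labels score well and rare ones poorly, the way per-label scores are averaged changes the headline number. The source does not say which averaging the project reports, so the sketch below simply contrasts micro- and macro-averaged F1 on invented predictions; the function names and data are assumptions.

```python
def per_label_counts(y_true, y_pred):
    """Return a (tp, fp, fn) tuple for every label column."""
    counts = []
    for j in range(len(y_true[0])):
        tp = fp = fn = 0
        for t_row, p_row in zip(y_true, y_pred):
            t, p = t_row[j], p_row[j]
            if t and p:
                tp += 1
            elif p:
                fp += 1
            elif t:
                fn += 1
        counts.append((tp, fp, fn))
    return counts


def f1(tp, fp, fn):
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def macro_f1(counts):
    """Average the per-label F1 scores; every label weighs the same."""
    return sum(f1(*c) for c in counts) / len(counts)


def micro_f1(counts):
    """Pool tp/fp/fn over all labels first; frequent labels dominate."""
    tp = sum(c[0] for c in counts)
    fp = sum(c[1] for c in counts)
    fn = sum(c[2] for c in counts)
    return f1(tp, fp, fn)


# Invented example: label 0 is frequent and always predicted correctly,
# label 1 is rare and always missed.
y_true = [[1, 0]] * 8 + [[0, 1]] * 2
y_pred = [[1, 0]] * 8 + [[0, 0]] * 2

counts = per_label_counts(y_true, y_pred)
macro = macro_f1(counts)
micro = micro_f1(counts)
```

Here the micro average stays high even though the rare label is never detected, whereas the macro average drops to one half, which is why macro F1 is often reported alongside micro F1 on imbalanced label sets.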

## Practical Testing and Technical Implementation

**Practical Testing**: Four challenging test cases (coexisting emotions, contradictory semantics, clear-cut features, and complex mixtures) were designed to simulate real-world input.
**Technical Implementation**: The project runs on Google Colab, with the code organized in a Jupyter Notebook that supports GPU acceleration and live demonstration. The code is modular, which makes it easy to reproduce.

## Project Insights and Extended Reflections

Takeaways from the project for production-grade systems:
- Threshold tuning is a necessary engineering practice in multi-label scenarios.
- Model selection should match the scenario: CNNs suit short texts, LSTMs suit long texts, and traditional methods offer strong interpretability.
- Data quality matters more than model complexity.

The project gives beginners a complete worked example of multi-label classification.
