# Customer Satisfaction Prediction: Practical Applications of Machine Learning and Deep Learning

> Build an application that predicts customer satisfaction using a combination of machine learning and deep learning models, with a focus on handling class imbalance and selecting the best model through multi-metric evaluation.

- Board: [Openclaw Geo](https://www.zingnex.cn/en/forum/board/openclaw-geo)
- Published: 2026-06-11T08:45:46.000Z
- Last activity: 2026-06-11T09:06:57.785Z
- Popularity: 155.7
- Keywords: customer satisfaction, machine learning, deep learning, class imbalance, XGBoost, customer analytics
- Page link: https://www.zingnex.cn/en/forum/thread/geo-github-ankurbhatt07-customer-satisfaction-score-prediction-using-ml-dl
- Canonical: https://www.zingnex.cn/forum/thread/geo-github-ankurbhatt07-customer-satisfaction-score-prediction-using-ml-dl

---

## Original Author and Source

- **Original Author/Maintainer**: AnkurBhatt07
- **Source Platform**: GitHub
- **Original Title**: Customer-Satisfaction-Score-Prediction-using-ML-DL
- **Original Link**: https://github.com/AnkurBhatt07/Customer-Satisfaction-Score-Prediction-using-ML-DL
- **Publication Date**: 2026-06-11

## Why is Customer Satisfaction So Important?

In a highly competitive business environment, acquiring a new customer is often cited as costing 5 to 25 times as much as retaining an existing one. Customer Satisfaction Score (CSAT) is a core metric for measuring customer experience, and it directly affects retention, word-of-mouth, and ultimately revenue.

Traditional satisfaction surveys rely on questionnaires sent after the fact, so they lag behind the experience they measure. Predictive analytics can flag at-risk customers before they express dissatisfaction, giving businesses the opportunity to intervene proactively. This project demonstrates how to build an end-to-end customer satisfaction prediction system using a combination of machine learning and deep learning techniques.

## Prediction Objectives

The goal is to predict each customer's satisfaction score from historical behavior data, transaction records, and service interaction information. The target is usually a 1-5 rating or a binary satisfied/dissatisfied label.

## Key Challenges

**Class Imbalance**
- Satisfied customers usually far outnumber dissatisfied ones
- Extreme scores (1 or 5) may be more common than middle scores
- Models trained without any correction tend to favor the majority class, so accuracy alone can look high while dissatisfied customers go undetected
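
To illustrate why the majority-class tendency matters, here is a minimal sketch with made-up labels (a 90/10 split chosen purely for illustration, not taken from the project). A trivial predictor that always answers "satisfied" scores well on accuracy and yet finds no dissatisfied customers.

```python
import numpy as np

# Hypothetical labels: 1 = satisfied (majority), 0 = dissatisfied (minority).
y_true = np.array([1] * 90 + [0] * 10)

# A "model" that always predicts the majority class.
y_pred = np.ones_like(y_true)

accuracy = (y_pred == y_true).mean()

# Recall on the minority class: share of dissatisfied customers detected.
minority_mask = y_true == 0
minority_recall = (y_pred[minority_mask] == 0).mean()

print(f"accuracy={accuracy:.2f}, minority recall={minority_recall:.2f}")
```

This is the reason the project evaluates models on several metrics rather than accuracy alone.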

**Feature Complexity**
- Customer data mixes numerical features (consumption amount, usage duration) with categorical features (region, product type)
- Time-series features (e.g., trends in purchase frequency)
- Text features (customer service chat records, comments)

**Data Quality Issues**
- Missing values (some customers did not fill in certain information)
- Outliers (large abnormal transactions)
- Data entry errors

## Data Cleaning

**Missing Value Handling**
- Numerical features: fill with median or mean, or predict and fill based on other features
- Categorical features: fill with mode or create an "Unknown" category
- Features with high missing rate (>50%): consider deletion or special handling
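
The rules above can be sketched with pandas as follows. The table and its column names (`monthly_spend`, `region`, `fax_number`) are hypothetical and exist only to exercise each rule; the repository's actual columns may differ.

```python
import numpy as np
import pandas as pd

# Hypothetical customer table; all column names are illustrative.
df = pd.DataFrame({
    "monthly_spend": [120.0, np.nan, 80.0, 95.0, np.nan, 300.0],
    "region": ["north", None, "south", "north", "east", None],
    "fax_number": [np.nan, np.nan, np.nan, "555-0100", np.nan, np.nan],
})

# 1. Drop columns whose missing rate exceeds 50%.
missing_rate = df.isna().mean()
df = df.drop(columns=missing_rate[missing_rate > 0.5].index)

# 2. Numerical features: fill with the median, which is robust to outliers.
df["monthly_spend"] = df["monthly_spend"].fillna(df["monthly_spend"].median())

# 3. Categorical features: keep missingness visible as its own category.
df["region"] = df["region"].fillna("Unknown")

print(df)
```

In a real pipeline the fill values would normally be computed on the training split only and then reused on validation and test data, so that no information leaks across splits.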

**Outlier Detection and Handling**
- IQR method: flag data points below Q1 − 1.5 × IQR or above Q3 + 1.5 × IQR, where IQR = Q3 − Q1 is the interquartile range
- Z-score: mark outliers with |z|>3
- Business rules: e.g., a single transaction exceeding 10 times the customer's historical average
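
As a rough sketch of the two statistical rules, the snippet below applies both to a small made-up array of transaction amounts. The values are invented for illustration, and the thresholds (1.5 × IQR and |z| > 3) are the ones listed above.

```python
import numpy as np

# Hypothetical transaction amounts with one abnormally large value at the end.
amounts = np.array([
    52.0, 48.0, 50.0, 55.0, 47.0, 51.0, 49.0, 53.0, 50.0, 46.0,
    54.0, 52.0, 48.0, 51.0, 49.0, 50.0, 53.0, 47.0, 50.0, 500.0,
])

# IQR rule: flag points outside [Q1 - 1.5*IQR, Q3 + 1.5*IQR].
q1, q3 = np.percentile(amounts, [25, 75])
iqr = q3 - q1
iqr_outliers = (amounts < q1 - 1.5 * iqr) | (amounts > q3 + 1.5 * iqr)

# Z-score rule: flag points more than 3 standard deviations from the mean.
z_scores = (amounts - amounts.mean()) / amounts.std()
z_outliers = np.abs(z_scores) > 3

print("IQR flags:", amounts[iqr_outliers])
print("Z-score flags:", amounts[z_outliers])
```

One caveat: the z-score is itself computed from the mean and standard deviation, which a large outlier inflates. On very small samples a single extreme value can therefore fail to exceed |z| > 3, whereas the IQR rule is less affected.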

**Data Type Conversion**
- Convert date strings to datetime objects
- Categorical encoding: one-hot encoding for unordered categories, or label (ordinal) encoding for ordered categories and tree-based models
- Text vectorization: TF-IDF or word embedding
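
The three conversions might look like this with pandas and scikit-learn. The column names and example comments are hypothetical, and TF-IDF is used here simply because it is the lighter of the two text options mentioned.

```python
import pandas as pd
from sklearn.feature_extraction.text import TfidfVectorizer

# Hypothetical raw records; column names are illustrative.
df = pd.DataFrame({
    "signup_date": ["2024-01-15", "2024-03-02", "2024-03-20"],
    "product_type": ["basic", "premium", "basic"],
    "comment": ["slow delivery", "great support", "delivery was great"],
})

# Date strings -> datetime objects.
df["signup_date"] = pd.to_datetime(df["signup_date"])

# Categorical column -> one-hot indicator columns.
encoded = pd.get_dummies(df, columns=["product_type"])

# Free text -> sparse TF-IDF matrix (one row per comment, one column per term).
vectorizer = TfidfVectorizer()
tfidf = vectorizer.fit_transform(df["comment"])

print(encoded.columns.tolist())
print(tfidf.shape)
```

As with imputation, the vectorizer would normally be fitted on training text only and then applied to new text with `transform`.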

## Feature Engineering

**Time Feature Extraction**
- Customer lifecycle: number of days since first purchase
- Activity: number of days since last purchase (Recency)
- Frequency: number of purchases in the past 30/90/365 days
- Amount: average order value, total consumption amount

**RFM Model Features**
- Recency: number of days since the customer's last purchase
- Frequency: number of purchases
- Monetary: cumulative consumption amount
- RFM is a classic framework for customer value analysis
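
A minimal sketch of deriving these features from an order log is shown below. The `orders` table, its column names, and the snapshot date are assumptions made for the example; the idea is simply one aggregation per customer.

```python
import pandas as pd

# Hypothetical order log; column names and values are illustrative.
orders = pd.DataFrame({
    "customer_id": [1, 1, 1, 2, 2, 3],
    "order_date": pd.to_datetime([
        "2024-01-05", "2024-02-10", "2024-03-01",
        "2023-11-20", "2024-01-15", "2024-02-28",
    ]),
    "amount": [40.0, 60.0, 50.0, 200.0, 100.0, 30.0],
})

# Reference date from which recency and tenure are measured (assumed).
snapshot = pd.Timestamp("2024-03-31")

rfm = orders.groupby("customer_id").agg(
    last_order=("order_date", "max"),
    first_order=("order_date", "min"),
    frequency=("order_date", "count"),   # F: number of purchases
    monetary=("amount", "sum"),          # M: cumulative consumption amount
)
rfm["recency_days"] = (snapshot - rfm["last_order"]).dt.days   # R
rfm["tenure_days"] = (snapshot - rfm["first_order"]).dt.days   # customer lifecycle
rfm["avg_order_value"] = rfm["monetary"] / rfm["frequency"]
rfm = rfm.drop(columns=["last_order", "first_order"])

print(rfm)
```

Windowed counts (purchases in the past 30/90/365 days) follow the same pattern after filtering `orders` to the relevant date range.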

**Interaction Features**
- Create feature combinations, e.g., "consumption amount × purchase frequency"
- Such combinations let a model capture non-linear relationships that the individual features cannot express on their own

**Feature Scaling**
- Standardization (StandardScaler): mean 0, variance 1
- Normalization (MinMaxScaler): scale to [0,1]
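- Especially important for neural networks and other gradient- or distance-based models; tree-based models such as XGBoost are largely insensitive to feature scale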
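
The two scalers can be compared on a toy matrix as sketched below. The data is made up; the point to note is that each scaler is fitted on the training split and then reused on the test split, which is an assumption about good practice rather than something the source states.

```python
import numpy as np
from sklearn.preprocessing import MinMaxScaler, StandardScaler

# Hypothetical feature matrices: [consumption amount, purchase count].
X_train = np.array([[100.0, 1.0], [200.0, 3.0], [300.0, 5.0], [400.0, 7.0]])
X_test = np.array([[250.0, 4.0], [500.0, 9.0]])

# Standardization: each training column ends up with mean 0 and variance 1.
std_scaler = StandardScaler().fit(X_train)
X_train_std = std_scaler.transform(X_train)
X_test_std = std_scaler.transform(X_test)

# Min-max normalization: each training column is mapped onto [0, 1].
minmax_scaler = MinMaxScaler().fit(X_train)
X_train_mm = minmax_scaler.transform(X_train)
X_test_mm = minmax_scaler.transform(X_test)

print(X_train_std.mean(axis=0), X_train_std.std(axis=0))
print(X_test_mm)
```

Because the scaler only knows the training range, test values can fall outside [0, 1] after min-max scaling, as the second test row does here.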

## Resampling Methods

**Oversampling**

- **Random Oversampling**: duplicate minority class samples; simple but prone to overfitting

- **SMOTE (Synthetic Minority Over-sampling Technique)**: generate new samples by interpolating between minority class samples; alleviates overfitting issues

- **ADASYN (Adaptive Synthetic Sampling)**: adaptively generate samples, focusing on hard-to-learn samples
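
To make the interpolation idea behind SMOTE concrete, here is a simplified NumPy sketch. It is not the reference algorithm or the project's code: the function name `smote_sketch`, its parameters, and the sample points are all hypothetical. In practice a library implementation such as the `SMOTE` class in imbalanced-learn would normally be used instead.

```python
import numpy as np

def smote_sketch(X_min, n_new, k=3, seed=0):
    """Generate n_new synthetic points by interpolating between a randomly
    chosen minority point and one of its k nearest minority neighbors."""
    rng = np.random.default_rng(seed)

    # Pairwise Euclidean distances within the minority class.
    dists = np.linalg.norm(X_min[:, None, :] - X_min[None, :, :], axis=2)
    np.fill_diagonal(dists, np.inf)  # a point is not its own neighbor
    neighbors = np.argsort(dists, axis=1)[:, :k]

    # Pick a base point and one of its neighbors for every synthetic sample.
    base_idx = rng.integers(0, len(X_min), size=n_new)
    nb_idx = neighbors[base_idx, rng.integers(0, k, size=n_new)]

    # Place the new point somewhere on the segment between the two.
    gap = rng.random((n_new, 1))
    return X_min[base_idx] + gap * (X_min[nb_idx] - X_min[base_idx])

# Hypothetical minority-class samples (e.g., dissatisfied customers).
X_minority = np.array([[1.0, 1.0], [1.2, 0.9], [0.8, 1.1], [1.1, 1.3], [0.9, 0.8]])
X_synthetic = smote_sketch(X_minority, n_new=10)

print(X_synthetic.shape)
```

Whichever implementation is used, resampling should be applied to the training split only; validation and test data should keep the original class distribution so that the metrics stay honest.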

**Undersampling**

- **Random Undersampling**: randomly delete majority class samples; may lose important information

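- **Tomek Links**: find pairs of samples from different classes that are each other's nearest neighbor, then remove the majority class member of each pair (or both members); this cleans up class boundaries

- **Edited Nearest Neighbors**: delete majority class samples whose label disagrees with the majority of their k nearest neighbors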
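
The Tomek link definition can be sketched directly in NumPy, as below. The helper name `tomek_link_pairs` and the toy data are hypothetical, and the brute-force distance matrix is only suitable for small examples. imbalanced-learn provides a `TomekLinks` class for real use.

```python
import numpy as np

def tomek_link_pairs(X, y):
    """Return index pairs (i, j) that are mutual nearest neighbors
    and carry different class labels."""
    dists = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=2)
    np.fill_diagonal(dists, np.inf)  # ignore self-distances
    nn = dists.argmin(axis=1)        # nearest neighbor of every sample
    return [
        (i, int(j)) for i, j in enumerate(nn)
        if i < j and nn[j] == i and y[i] != y[j]
    ]

# Hypothetical data: label 1 is the majority class, label 0 the minority.
X = np.array([[0.0], [1.0], [2.0], [2.9], [3.1], [6.0]])
y = np.array([1, 1, 1, 1, 0, 0])

pairs = tomek_link_pairs(X, y)

# Undersampling step: drop only the majority-class member of each link.
majority_label = 1
to_drop = sorted({idx for pair in pairs for idx in pair if y[idx] == majority_label})
keep = np.setdiff1d(np.arange(len(X)), to_drop)
X_clean, y_clean = X[keep], y[keep]

print(pairs, to_drop)
```

Here the samples at 2.9 (majority) and 3.1 (minority) form the only link, so removing index 3 widens the gap between the two classes.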

**Hybrid Strategies**
- SMOTE + Tomek Links
- Oversample the minority class first, then undersample to remove noisy or overlapping samples near the class boundary
