Predicting Bank Customer Churn Using Deep Neural Networks: A Complete Practice from Data Preprocessing to Model Optimization

This article introduces a bank customer churn prediction project based on feedforward neural networks. It covers data exploration, preprocessing, a comparison of six model configurations, and SMOTE oversampling to address class imbalance, ultimately reaching about 74% recall on churned customers.

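Tags: customer churn prediction, deep neural networks, class imbalance, SMOTE, TensorFlow, Keras, machine learning engineering, banking, recall optimization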
Published 2026-06-02 10:44 | Recent activity 2026-06-02 10:51 | Estimated read 6 min

Section 01

[Introduction] A Complete Practice of Predicting Bank Customer Churn Using Deep Neural Networks

This project predicts bank customer churn with feedforward neural networks. It covers data exploration, preprocessing, a comparison of six model configurations, and SMOTE oversampling to address class imbalance, and the final model reaches about 74% recall on churned customers. The project comes from a GitHub repository (maintained by thehotpath) and serves as an end-to-end machine learning engineering case study, a practical reference for handling class imbalance in business scenarios.

Section 02

Project Background and Business Value

Customer churn is a core challenge in the banking industry, because acquiring a new customer costs much more than retaining an existing one. This project builds a model on a dataset of 10,000 bank customers, of whom only about 20% have churned. With this class imbalance, a model could reach roughly 80% accuracy by never predicting churn, so accuracy alone is misleading. Recall on the churned class is therefore the primary optimization target: missing a churned customer (false negative) costs more than flagging a loyal customer as a churner (false positive).
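The arithmetic behind that choice can be sketched in a few lines. This is only an illustration: the count of 2,000 churners is an assumption that rounds "about 20%" of 10,000 to an exact figure, and the "model" is a hypothetical baseline that predicts "stays" for everyone.

```python
# Illustrative counts: 10,000 customers, about 20% churned.
n_customers = 10_000
n_churned = 2_000  # assumption: exactly 20%, for the sake of the arithmetic

# Hypothetical baseline that predicts "stays" for every customer.
true_positives = 0
false_negatives = n_churned
true_negatives = n_customers - n_churned
false_positives = 0

accuracy = (true_positives + true_negatives) / n_customers
recall = true_positives / (true_positives + false_negatives)

print(f"accuracy = {accuracy:.2f}")  # accuracy = 0.80
print(f"recall   = {recall:.2f}")    # recall   = 0.00
```

The baseline looks respectable on accuracy while catching no churners at all, which is why recall is the more informative target here.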

Section 03

Dataset and Feature Engineering

The dataset contains 14 fields for 10,000 customers: numerical features (credit score, age, tenure, account balance, estimated income), categorical features (geographic location, gender), product-related features (number of products, credit card ownership, active membership status), and the target variable Exited (1 = churned). Preprocessing consists of three steps: drop identifier columns such as RowNumber, one-hot encode the categorical variables, and standardize the numerical features. EDA shows no strong linear correlation between features, so no dimensionality reduction is applied.
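The three preprocessing steps can be sketched as follows. This is a minimal sketch, not the project's actual pipeline: the sample rows are made up, the column names other than RowNumber and Exited are assumed, and the helper `preprocess` is hypothetical.

```python
import pandas as pd

# Tiny made-up sample; column names other than RowNumber/Exited are assumed.
df = pd.DataFrame({
    "RowNumber": [1, 2, 3, 4],
    "CreditScore": [619, 608, 502, 699],
    "Geography": ["France", "Spain", "France", "Germany"],
    "Gender": ["Female", "Female", "Male", "Male"],
    "Age": [42, 41, 42, 39],
    "Balance": [0.0, 83807.86, 159660.80, 0.0],
    "Exited": [1, 0, 1, 0],
})

def preprocess(frame, id_cols, cat_cols, target):
    """Drop identifiers, one-hot encode categoricals, standardize numerics."""
    X = frame.drop(columns=id_cols + [target])
    y = frame[target]
    num_cols = [c for c in X.columns if c not in cat_cols]
    # One binary column per category value.
    X = pd.get_dummies(X, columns=cat_cols, dtype=float)
    # z-score: zero mean, unit variance (population standard deviation).
    X[num_cols] = (X[num_cols] - X[num_cols].mean()) / X[num_cols].std(ddof=0)
    return X, y

X, y = preprocess(df, ["RowNumber"], ["Geography", "Gender"], "Exited")
print(X.columns.tolist())
```

In a real train/test workflow the mean and standard deviation would be computed on the training split only and then reused on the test split, so that no test-set information leaks into training.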

Section 04

Model Architectures and Experimental Design

Six neural network configurations are compared:

  1. Basic SGD network;
  2. Adam-optimized network;
  3. Adam + Dropout network;
  4. SMOTE + SGD network;
  5. SMOTE + Adam network;
  6. SMOTE + Adam + Dropout network.

SMOTE generates synthetic minority-class samples by interpolating between an existing minority sample and one of its nearest minority-class neighbors, which reduces the bias toward the majority class. Because each configuration adds one component at a time, the contribution of each component to performance can be observed directly.
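The interpolation idea behind SMOTE can be sketched in plain NumPy. This is a simplified illustration under stated assumptions, not the project's code: the function name `smote_like_oversample`, its parameters, and the sample points are all hypothetical.

```python
import numpy as np

def smote_like_oversample(X_min, n_new, k=3, seed=0):
    """Create n_new synthetic minority rows by interpolating between a
    random minority sample and one of its k nearest minority neighbours."""
    rng = np.random.default_rng(seed)
    X_min = np.asarray(X_min, dtype=float)
    # Pairwise Euclidean distances within the minority class.
    diff = X_min[:, None, :] - X_min[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=2))
    np.fill_diagonal(dist, np.inf)  # a point is not its own neighbour
    neighbours = np.argsort(dist, axis=1)[:, :k]

    base = rng.integers(0, len(X_min), size=n_new)  # anchor samples
    pick = rng.integers(0, k, size=n_new)           # which neighbour to use
    nbr = neighbours[base, pick]
    gap = rng.random((n_new, 1))                    # position on the segment
    return X_min[base] + gap * (X_min[nbr] - X_min[base])

# Made-up minority-class points in two dimensions.
X_minority = np.array([[1.0, 1.0], [1.2, 0.9], [0.8, 1.1], [1.1, 1.3]])
synthetic = smote_like_oversample(X_minority, n_new=6)
print(synthetic.shape)  # (6, 2)
```

Each synthetic row lies on the line segment between two real minority samples. In practice a maintained implementation such as `SMOTE` from the imbalanced-learn package (via `fit_resample`) is the usual choice, and oversampling is applied to the training split only so that the test set keeps its original class ratio. Whether the project uses that package is not stated in the summary.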

Section 05

Key Finding: The Critical Role of SMOTE in Improving Recall Rate

SMOTE data balancing is the decisive factor in improving recall. The final SMOTE + Adam + Dropout model achieves approximately 74% recall on the test set, identifying 301 of the 407 churned customers. The models trained without SMOTE achieve lower recall and miss more churned customers. The SMOTE model does produce more false positives (476 cases), but this trade-off is acceptable in business terms, because a retention effort spent on a loyal customer costs less than losing a customer whose churn goes undetected. The experimental process is documented with 45 visual charts (EDA, training curves, confusion matrices).
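The reported figures can be checked with a few lines of arithmetic. The counts come from the text above; the precision value is derived here rather than reported by the project, and it assumes the 476 false positives were counted on the same test set.

```python
tp = 301       # churned customers correctly flagged
churned = 407  # churned customers in the test set
fp = 476       # loyal customers flagged as churners (assumed same test set)

fn = churned - tp  # churned customers the model missed
recall = tp / (tp + fn)
precision = tp / (tp + fp)

print(f"missed churners: {fn}")       # missed churners: 106
print(f"recall    = {recall:.3f}")    # recall    = 0.740
print(f"precision = {precision:.3f}") # precision = 0.387
```

Under that assumption, roughly 39% of flagged customers would actually be churners, which makes the trade-off concrete: high recall is bought with a sizeable number of retention offers sent to customers who would have stayed anyway.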

Section 06

Business Application Recommendations

Practical strategies based on model insights:

  1. Precision marketing: Push personalized offers and loyalty programs to high-risk customers;
  2. Lifecycle management: Provide financial planning for older customers and strengthen onboarding for new customers;
  3. Account activation: Incentivize zero-balance/inactive accounts to re-engage;
  4. Product cross-selling: Promote additional products to enhance customer stickiness;
  5. Regional strategy: Analyze the causes of high churn in regions and develop localized solutions.

Section 07

Technical Implementation and Reproducibility

The project is built with Python 3.10+ and TensorFlow/Keras, with a clear code structure (data pipeline, model definition, training scripts, evaluation metrics). Dependencies are listed in requirements.txt, the dataset is in the data/ directory, and results are reproducible. The complete analysis is in notebook/Bustos_INN_Learner_Notebook_Full_code.html, which can be viewed via a browser or nbviewer. The project uses the MIT license and can be freely used for learning and commercial purposes.