Zing Forum

Bank Marketing Machine Learning Project: Data Science Practice from Loss to Profit

A BYU-Idaho CSE 450 course team project uses SMOTE oversampling, class weights, and probability threshold tuning to address class imbalance, turning a telemarketing campaign from a loss into a profit.

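Tags: class imbalance, SMOTE, random forest, ensemble learning, bank marketing, machine learning, business value, precision optimization
Published 2026-05-17 10:45 | Recent activity 2026-05-17 10:57 | Estimated read 4 min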

Section 01

[Introduction] Bank Marketing Machine Learning Project: Core Practices from Loss to Profit

The BYU-Idaho CSE 450 course team project tackles the class imbalance problem in bank telemarketing, where only 11.4% of cases are positive. The team combined SMOTE oversampling, class weights, and probability threshold tuning, and compared several models (random forest, ensemble stacking, and others). The result turned the marketing campaign from a loss into a profit and produced interpretable customer insights.

Section 02

[Background] Dilemmas of Bank Telemarketing and Dataset Challenges

Bank telemarketing is inefficient and costly, and class imbalance makes modeling it harder: with only 10-15% positive cases, a conventional model can score high accuracy by predicting "no" for almost everyone, which has no business value. The project uses the UCI Bank Marketing Dataset (about 37,000 records, 11.4% positive cases). Its features cover basic customer information, financial status, contact history, and macroeconomic indicators.
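To make the accuracy problem concrete, here is a minimal sketch of a "model" that predicts no subscription for every customer. The sample size of 1,000 and the function name are illustrative assumptions; only the 11.4% positive rate comes from the article.

```python
def majority_baseline_metrics(n_total, positive_rate):
    """Metrics for a classifier that predicts 'no subscription' for everyone."""
    n_pos = round(n_total * positive_rate)
    n_neg = n_total - n_pos
    true_neg = n_neg    # every negative happens to be classified correctly
    false_neg = n_pos   # every real subscriber is missed
    accuracy = true_neg / n_total
    recall = 0.0        # no positive is ever predicted
    return {"accuracy": accuracy, "recall": recall, "missed_customers": false_neg}


# Illustrative sample of 1,000 customers at the article's 11.4% positive rate.
metrics = majority_baseline_metrics(n_total=1000, positive_rate=0.114)
print(metrics)
```

The baseline looks accurate on paper yet finds no customers at all, which is why accuracy alone is a poor guide for a campaign like this.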

Section 03

[Methods] Class Imbalance Handling and Multi-Model Solutions

The team built three models:

1. Random Forest with SMOTE oversampling and manual class weights
2. Balanced Random Forest (automatic class weights)
3. Ensemble Stacking (RF and KNN base learners with a Logistic Regression meta-learner, threshold tuned to 0.61)

Three techniques handle the class imbalance: SMOTE, which generates synthetic minority samples by interpolation; class weight adjustment; and probability threshold optimization.
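The two data-level ideas above can be sketched with the standard library alone. This is a simplified illustration, not the team's code: `smote_like_oversample` and `balanced_class_weights` are hypothetical helper names, the toy points are invented, and a real project would more likely rely on a library such as imbalanced-learn or scikit-learn. The weight formula shown, n / (n_classes * class_count), is the usual "balanced" heuristic.

```python
import math
import random


def smote_like_oversample(minority, n_new, k=3, seed=0):
    """Create n_new synthetic points, each lying on the segment between a
    minority sample and one of its k nearest minority neighbours."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # Rank the other minority samples by Euclidean distance to `base`.
        others = [p for p in minority if p is not base]
        others.sort(key=lambda p: math.dist(base, p))
        neighbour = rng.choice(others[:k])
        gap = rng.random()  # position along the segment, in [0, 1)
        synthetic.append(
            tuple(b + gap * (n - b) for b, n in zip(base, neighbour))
        )
    return synthetic


def balanced_class_weights(labels):
    """Weights inversely proportional to class frequency."""
    counts = {}
    for y in labels:
        counts[y] = counts.get(y, 0) + 1
    n, n_classes = len(labels), len(counts)
    return {c: n / (n_classes * cnt) for c, cnt in counts.items()}


# Toy 2-D minority samples (invented for illustration).
minority = [(1.0, 2.0), (1.5, 1.8), (2.0, 2.2), (1.2, 2.5)]
new_points = smote_like_oversample(minority, n_new=5)

# Labels mimicking an 11.4% positive rate in a sample of 1,000.
weights = balanced_class_weights([0] * 886 + [1] * 114)
print(new_points)
print(weights)
```

Oversampling changes the training data, while class weights change how much each error costs during fitting; the two can be combined, as in the first model listed above.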

Section 04

[Evidence] Business Value Improvement and Customer Group Insights

Without ML screening, the campaign lost $157 on the test set. With the best model it made a profit of $824, and scaling to 4,119 people is expected to yield a profit of $7,775. Precision rose from 11.5% to 47.2%.

Customer insights: high-conversion groups are previously converted customers, students, and retirees; low-conversion groups are customers contacted by fixed-line phone and blue-collar workers.
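The link between threshold, precision, and profit can be illustrated with a small sketch. The article does not state the revenue per sale or the cost per call, so `REVENUE` and `COST` below are hypothetical values, and the labels and scores are toy data chosen only to show the mechanism.

```python
def campaign_profit(y_true, y_prob, threshold, revenue_per_sale, cost_per_call):
    """Profit and precision when calling only customers scored >= threshold."""
    calls = sales = 0
    for label, prob in zip(y_true, y_prob):
        if prob >= threshold:
            calls += 1
            sales += label
    precision = sales / calls if calls else 0.0
    profit = sales * revenue_per_sale - calls * cost_per_call
    return profit, precision


def best_threshold(y_true, y_prob, revenue_per_sale, cost_per_call):
    """Pick the candidate threshold (0.05 to 0.95) with the highest profit."""
    candidates = [t / 100 for t in range(5, 100, 5)]
    return max(
        candidates,
        key=lambda t: campaign_profit(
            y_true, y_prob, t, revenue_per_sale, cost_per_call
        )[0],
    )


# Toy labels (1 = subscribed) and model scores, invented for illustration.
y_true = [1, 0, 0, 1, 0, 0, 0, 1, 0, 0]
y_prob = [0.90, 0.20, 0.65, 0.70, 0.10, 0.40, 0.30, 0.62, 0.55, 0.05]
REVENUE, COST = 9.0, 3.0  # hypothetical economics, not from the article

call_everyone = campaign_profit(y_true, y_prob, 0.0, REVENUE, COST)
chosen = best_threshold(y_true, y_prob, REVENUE, COST)
screened = campaign_profit(y_true, y_prob, chosen, REVENUE, COST)

print("call everyone:", call_everyone)
print("best threshold:", chosen, "->", screened)
```

On this toy data, calling everyone loses money while screening at the profit-maximizing threshold is profitable. In practice the threshold should be chosen on a validation split rather than on the test set used to report results.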

Section 05

[Conclusion] Project Insights and Best Practices

Key success factors: let the business problem drive the work (focus on the profit goal), choose business-relevant evaluation metrics rather than accuracy, emphasize interpretability to generate customer insights, and optimize iteratively, moving from simple models to complex ones.

Section 06

[Expansion Directions] Future Exploration in Model, Business, and Technology Aspects

Model: deep learning, time series, reinforcement learning.
Business: personalized recommendation, contact timing optimization, customer lifetime value prediction.
Technology: real-time inference, A/B testing, model monitoring.