Business Background
In the highly competitive financial services industry, predicting customer churn is crucial. Accurately identifying customers at high risk of churning helps banks intervene in time, develop personalized retention strategies, reduce the cost of acquiring customers to replace those who leave, and increase customer lifetime value.
Dataset Analysis
The project's dataset covers several dimensions of customer information: demographic features (age, gender, geographic location), account information (credit score, balance, number of products), behavioral features (active membership status, estimated income), and the target variable (whether the customer churned). Preprocessing consists of handling missing values, one-hot encoding the categorical variables, standardizing the features, and applying a strategy to address class imbalance, since churned customers make up only a small proportion of the data.
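The preprocessing steps described above can be sketched with scikit-learn as follows. This is only an illustration under stated assumptions: the column names, the toy data, the median and most-frequent imputation rules, the logistic regression model, and the use of class weighting for the imbalance are all hypothetical choices, since the project's actual schema, model, and imbalance strategy are not specified here.

```python
import numpy as np
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.impute import SimpleImputer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler

# Hypothetical column names; the real dataset schema is not given.
NUMERIC_COLS = ["age", "credit_score", "balance", "num_products",
                "is_active_member", "estimated_income"]
CATEGORICAL_COLS = ["gender", "geography"]
TARGET_COL = "churned"


def build_churn_pipeline() -> Pipeline:
    """Impute, encode, scale, then fit a class-weighted classifier."""
    numeric = Pipeline([
        ("impute", SimpleImputer(strategy="median")),   # assumed rule
        ("scale", StandardScaler()),                    # standardization
    ])
    categorical = Pipeline([
        ("impute", SimpleImputer(strategy="most_frequent")),  # assumed rule
        ("onehot", OneHotEncoder(handle_unknown="ignore")),   # one-hot encoding
    ])
    preprocess = ColumnTransformer([
        ("num", numeric, NUMERIC_COLS),
        ("cat", categorical, CATEGORICAL_COLS),
    ])
    # class_weight="balanced" is one possible imbalance strategy (assumed).
    model = LogisticRegression(class_weight="balanced", max_iter=1000)
    return Pipeline([("preprocess", preprocess), ("model", model)])


def make_toy_data(n: int = 200, seed: int = 0) -> pd.DataFrame:
    """Synthetic stand-in for the customer table, with 20% churners."""
    rng = np.random.default_rng(seed)
    balance = rng.uniform(0, 200_000, n)
    balance[::10] = np.nan  # simulate missing values
    return pd.DataFrame({
        "age": rng.integers(18, 80, n).astype(float),
        "credit_score": rng.integers(300, 850, n).astype(float),
        "balance": balance,
        "num_products": rng.integers(1, 5, n).astype(float),
        "is_active_member": rng.integers(0, 2, n).astype(float),
        "estimated_income": rng.uniform(10_000, 150_000, n),
        "gender": np.array(["F", "M"])[np.arange(n) % 2],
        "geography": np.array(["A", "B", "C"])[np.arange(n) % 3],
        "churned": (np.arange(n) % 5 == 0).astype(int),
    })


df = make_toy_data()
X, y = df.drop(columns=[TARGET_COL]), df[TARGET_COL]
# Stratify so the low churn rate is preserved in both splits.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)
pipeline = build_churn_pipeline()
pipeline.fit(X_train, y_train)
churn_proba = pipeline.predict_proba(X_test)[:, 1]  # estimated churn risk
```

Wrapping the steps in a single pipeline means the imputation values, category vocabulary, and scaling statistics are learned from the training split only, which avoids leaking information from the held-out customers. Resampling methods could be substituted for class weighting if they suit the data better.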