Customer churn prediction in imbalanced datasets with resampling methods: A comparative study

Seyed Jamal Haddadi, Aida Farshidvard, Fillipe Dos Santos Silva, Julio Cesar Dos Reis, Marcelo Da Silva Reis

doi:10.1016/j.eswa.2023.123086

Abstract

Customer churn presents a significant challenge for businesses in the era of subscription-based services because retaining customers plays a key role in sustained growth. Existing techniques for automatic churn prediction face a primary challenge inherent in the datasets: a significant disproportion between majority and minority classes, which may bias models toward the dominant class. This study presents a comprehensive analysis of Customer Churn Prediction (CCP) with a focus on three public, highly imbalanced datasets. The explored datasets span diverse business sectors, including telecommunications, online retail, and banking. We conduct a comparative analysis of fourteen distinct classification methods evaluated with popular resampling strategies, namely the Synthetic Minority Over-sampling Technique (SMOTE) and Adaptive Synthetic Sampling (ADASYN). In particular, we investigate a specific configuration that combines a novel two-phase resampling method, based on both clustering and ensemble techniques, with Long Short-Term Memory (LSTM) networks. Our findings show that this configuration is competitive, underscoring its potential to correct class imbalance and thereby improve prediction accuracy. The results suggest that, in almost all instances, the integrated approach outperforms the standalone methods across different scenarios in the three datasets, particularly in terms of the Area Under the Curve (AUC). This research represents a significant contribution to churn prediction by addressing class imbalance.
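To make the resampling idea concrete, the following is a minimal sketch of the interpolation step at the core of SMOTE, written with the Python standard library only. It is an illustration, not the authors' implementation, and it does not reproduce the two-phase clustering and ensemble method described in the abstract. The function name `smote`, its parameters, and the toy data are all hypothetical.

```python
import math
import random


def smote(minority, n_synthetic, k=5, seed=0):
    """Create n_synthetic points by interpolating between minority samples
    and their k nearest minority-class neighbours (the core SMOTE idea)."""
    if len(minority) < 2:
        raise ValueError("SMOTE needs at least two minority samples")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)
    synthetic = []
    for _ in range(n_synthetic):
        i = rng.randrange(len(minority))
        base = minority[i]
        # Rank the other minority samples by Euclidean distance to the base point.
        others = [p for j, p in enumerate(minority) if j != i]
        others.sort(key=lambda p: math.dist(base, p))
        neighbour = rng.choice(others[:k])
        gap = rng.random()  # position along the segment base -> neighbour
        synthetic.append(
            tuple(b + gap * (n - b) for b, n in zip(base, neighbour))
        )
    return synthetic


# Toy, made-up data: 8 retained customers (class 0) and 3 churners (class 1).
majority = [(float(x), float(x % 3)) for x in range(8)]
minority = [(10.0, 1.0), (11.0, 2.0), (12.0, 1.5)]

# Oversample the minority class until both classes have the same size.
new_points = smote(minority, n_synthetic=len(majority) - len(minority), k=2)
balanced_minority = minority + new_points
```

Each synthetic point lies on a line segment between two real minority samples, so the new points stay inside the region the minority class already occupies. ADASYN builds on the same interpolation but generates more synthetic points around minority samples that have many majority-class neighbours.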

Journal: Expert Systems with Applications
Publication Date: Jan 13, 2024
Citations: 2

Similar Papers

Sampling-based novel heterogeneous multi-layer stacking ensemble method for telecom customer churn prediction
Fatima E Usman-Hamza ... Modinat A Mabayoje
Scientific African | VOL. 24
03 May 2024

A Novel Telecom Customer Churn Analysis System Based on RFM Model and Feature Importance Ranking
Tianpei Xu ... Min Qu
Interdisciplinary Journal of Information, Knowledge, and Management | VOL. 18
01 Jan 2023

Bank Customer Churn Prediction Based on Support Vector Machine: Taking a Commercial Bank's VIP Customer Churn as the Example
Jing Zhao ... Xing-Hua Dang
01 Oct 2008

Comparative Multinomial Text Classification Analysis of Naïve Bayes and XGBoost with SMOTE on Imbalanced Dataset
Ashish Chaturvedi ... Santosh Yadav
05 Sep 2021
