Modelling Customer Attrition in the Retail Banking Sector
Conference
65th ISI World Statistics Congress 2025
Format: CPS Abstract - WSC 2025
Keywords: machine learning
Session: CPS 50 - Machine Learning in Banking and Finance
Tuesday 7 October 4 p.m. - 5 p.m. (Europe/Amsterdam)
Session: CPS 50 - Machine Learning in Banking and Finance
Tuesday 7 October 5:10 p.m. - 6:10 p.m. (Europe/Amsterdam)
Abstract
High customer attrition rates are detrimental to businesses because acquiring a customer costs 5-7 times more than retaining one. In banking, where customers are a bank's biggest asset, high attrition is even more damaging: studies show that it can cost banks 10-15% of their revenue annually and can even lead to a loss of market share when net customer growth fails to exceed that of the market. However, studies also show that customer attrition can be reversed with analytic and predictive tools. This study therefore explored the efficacy of Machine Learning (ML) approaches in predicting customer attrition for South African Bank X and in identifying the most important predictors. Logistic, Ridge, LASSO, Elastic-Net and GAM models were considered, in addition to Neural Network and Random Forest models. Model performance was evaluated using specificity, sensitivity, accuracy, and area under the ROC curve (AUC). Based on these evaluations, the Random Forest model was identified as the best-performing model. In order of importance, the top factors affecting customer attrition in this study were: average demand deposit account balance, instalment amount, average credit turnover, customer risk group (based on credit), personal loan indicator (whether or not a customer has taken a personal loan), and average number of transactions. In conclusion, the Random Forest model offers better predictive power for customer attrition than the other models considered and may help to improve decision-making in this regard. Furthermore, this study can give researchers a better picture of the alternative methods available for predicting a binary outcome variable.
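The four evaluation measures named in the abstract can be illustrated with a minimal sketch in plain Python. This is not the study's code: the function names, the 0.5 classification threshold, and the coding of attrition as 1 and retention as 0 are all assumptions made for illustration, and the sketch assumes both classes are present in the data.

```python
from typing import Dict, Sequence, Tuple


def confusion_counts(y_true: Sequence[int], y_pred: Sequence[int]) -> Tuple[int, int, int, int]:
    """Return (tp, tn, fp, fn), treating label 1 (attrition) as the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    return tp, tn, fp, fn


def roc_auc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """Area under the ROC curve, computed as the probability that a randomly
    chosen positive case scores higher than a randomly chosen negative case
    (ties count as one half)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def classification_metrics(
    y_true: Sequence[int], scores: Sequence[float], threshold: float = 0.5
) -> Dict[str, float]:
    """Sensitivity, specificity, accuracy and AUC for predicted probabilities.
    The threshold of 0.5 is an assumed default, not a value from the study."""
    y_pred = [1 if s >= threshold else 0 for s in scores]
    tp, tn, fp, fn = confusion_counts(y_true, y_pred)
    return {
        "sensitivity": tp / (tp + fn),  # share of churners correctly flagged
        "specificity": tn / (tn + fp),  # share of retained customers correctly cleared
        "accuracy": (tp + tn) / len(y_true),
        "auc": roc_auc(y_true, scores),
    }


if __name__ == "__main__":
    # Made-up labels and model scores, purely for illustration.
    labels = [1, 1, 1, 0, 0, 0, 0, 1]
    probs = [0.9, 0.8, 0.4, 0.3, 0.6, 0.1, 0.2, 0.7]
    print(classification_metrics(labels, probs))
```

Sensitivity, specificity and accuracy depend on the chosen threshold. AUC does not, which is one reason it is often reported alongside them when comparing models.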