Utilizing data sampling techniques on algorithmic fairness for customer churn prediction with data imbalance problems.
- Source :
- F1000Research [F1000Res] 2021 Sep 30; Vol. 10, pp. 988. Date of Electronic Publication: 2021 Sep 30 (Print Publication: 2021).
- Publication Year :
- 2021
Abstract
- Background: Customer churn prediction (CCP) refers to detecting which customers are likely to cancel the services provided by a service provider, for example, internet services. The class imbalance problem (CIP) in machine learning occurs when there is a large difference between the number of samples in the positive class and the negative class. It is one of the major obstacles in CCP, as it degrades performance in the classification process. Utilizing data sampling techniques (DSTs) helps to resolve the CIP to some extent.

  Methods: In this paper, we examine the effect of using DSTs on algorithmic fairness, i.e., we investigate whether the results discriminate between male and female groups, and we compare the results before and after applying DSTs. Three real-world datasets with unequal balancing rates were prepared, and four widely used DSTs were applied to them. Six popular classification techniques were used in the classification process. Both classifier performance and algorithmic fairness were evaluated with well-known metrics.

  Results: The results indicated that the Random Forest classifier outperforms the other classifiers on all three datasets, and that using the SMOTE and ADASYN techniques causes more discrimination against the female group. The rate of unintentional discrimination appears to be higher in the original data of extremely unbalanced datasets under the following classifiers: Logistic Regression, LightGBM, and XGBoost.

  Conclusions: Algorithmic fairness has become a broadly studied area in recent years, yet there is very little systematic study of the effect of using DSTs on algorithmic fairness. This study presents important findings to further the use of algorithmic fairness in CCP research.

  Competing Interests: No competing interests were disclosed.
  (Copyright: © 2022 Maw M et al.)
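To make the described methodology concrete, below is a minimal sketch of the kind of pipeline the abstract outlines: oversampling the minority (churn) class with SMOTE, training a Random Forest classifier, and checking a simple group-fairness measure across male and female customers. The synthetic data, feature layout, and choice of statistical parity difference as the fairness metric are illustrative assumptions; the paper's actual datasets, code, and metrics are not reproduced here.

```python
# Illustrative sketch only; assumes imbalanced-learn and scikit-learn.
# The dataset below is synthetic, and statistical parity difference is
# a stand-in for whichever fairness metrics the study actually used.
import numpy as np
from imblearn.over_sampling import SMOTE
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
n = 2000
X = rng.normal(size=(n, 5))                # synthetic customer features
sex = rng.integers(0, 2, size=n)           # 0 = male, 1 = female (protected attribute)
churn = (rng.random(n) < 0.1).astype(int)  # ~10% positives: an imbalanced class

# Keep the protected attribute as the last column so we can slice groups later.
X_full = np.column_stack([X, sex])
X_tr, X_te, y_tr, y_te = train_test_split(
    X_full, churn, stratify=churn, random_state=0
)

# Oversample the minority (churn) class on the training split only,
# so the test set still reflects the original imbalance.
X_res, y_res = SMOTE(random_state=0).fit_resample(X_tr, y_tr)

clf = RandomForestClassifier(random_state=0).fit(X_res, y_res)
pred = clf.predict(X_te)

# Statistical parity difference: the gap in predicted-churn rates
# between female and male customers. A value far from 0 suggests
# the sampling/classifier combination treats the groups differently.
female = X_te[:, -1] == 1
spd = pred[female].mean() - pred[~female].mean()
print(f"Statistical parity difference (female - male): {spd:+.3f}")
```

Comparing this quantity on the original training data versus the SMOTE-resampled data would reproduce, in miniature, the before/after comparison the abstract describes.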
- Subjects :
- Female
- Humans
- Logistic Models
- Male
- Machine Learning
Details
- Language :
- English
- ISSN :
- 2046-1402
- Volume :
- 10
- Database :
- MEDLINE
- Journal :
- F1000Research
- Publication Type :
- Academic Journal
- Accession Number :
- 36071889.2
- Full Text :
- https://doi.org/10.12688/f1000research.72929.2