
FRAUD PREDICTION IN FINANCIAL INDUSTRY USING MACHINE LEARNING: K-MEANS CLUSTERING AND RANDOM FOREST DECISION TREES.

Authors :
PIVAR, Jasmina
RADNIĆ, Lucija
Source :
FEB Zagreb International Odyssey Conference on Economics & Business; Jul2024, Vol. 6 Issue 1, p327-340, 14p
Publication Year :
2024

Abstract

Machine learning methods can be adopted in the financial industry to detect variables that explain fraud and to identify customers at risk of fraud. The purpose of this paper is two-fold. The first goal is to evaluate the potential of a combined approach, encompassing k-means cluster analysis and the Random Forest Trees binary classification algorithm, in predicting fraud in the financial industry. The second goal is to detect characteristics of customers at higher risk of fraudulent behavior and to predict fraud in banking. A dataset of customers from a banking institution is used. First, the data were prepared. Second, k-means cluster analysis was used to identify bank customer segments and the cluster with the highest fraud occurrence. Third, four tree-based binary classification algorithms (Random Forest Trees, Boosted Trees, CHAID trees and C&RT trees) were applied to the full dataset and to the cluster members' data to determine which performs best in predicting fraud. Since Random Forest performed best, it was used to develop a classification model identifying fraud determinants in the entire dataset and in the Cluster 1 dataset, which has the highest fraud occurrence. Finally, the Random Forest Trees developed for the entire dataset and for Cluster 1 were interpreted. The research shows that k-means and decision trees can be successfully deployed in customer segmentation and fraud prediction for bank customers. Random Forest performed best in predicting fraud both for the entire set of customers and for Cluster 1, which showed the highest fraud occurrence. Self-employment, marital status, loan amount and credit history are important predictors of fraud for the full set of customers and for members of the chosen customer segment. This research has theoretical implications in the form of a combined approach to fraud prediction. The results will serve as practical and managerial input for decision-making in the financial industry. [ABSTRACT FROM AUTHOR]
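
The segmentation step described in the abstract (cluster the customers, then pick the cluster with the highest fraud occurrence) can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the two scaled features, the eight toy records, the choice of k = 2, the farthest-point initialisation and all function names are assumptions made here for the example, and the record does not say which software or settings the paper used.

```python
from statistics import mean

def dist2(p, q):
    """Squared Euclidean distance between two feature tuples."""
    return sum((a - b) ** 2 for a, b in zip(p, q))

def kmeans(points, k, max_iter=100):
    """Lloyd's k-means with a deterministic farthest-point initialisation
    (an assumption for reproducibility, not taken from the paper)."""
    centroids = [points[0]]
    while len(centroids) < k:
        # Next seed: the point farthest from all centroids chosen so far.
        centroids.append(max(points, key=lambda p: min(dist2(p, c) for c in centroids)))
    labels = None
    for _ in range(max_iter):
        # Assignment step: each point joins its nearest centroid.
        new_labels = [min(range(k), key=lambda c: dist2(p, centroids[c])) for p in points]
        if new_labels == labels:
            break  # assignments stopped changing, so the algorithm has converged
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centroids[c] = tuple(mean(col) for col in zip(*members))
    return labels, centroids

def fraud_rate_by_cluster(labels, is_fraud):
    """Share of fraudulent customers in each cluster."""
    rates = {}
    for c in sorted(set(labels)):
        flags = [f for lab, f in zip(labels, is_fraud) if lab == c]
        rates[c] = sum(flags) / len(flags)
    return rates

# Hypothetical toy data: (scaled loan amount, scaled credit-history score), fraud flag.
customers = [
    ((0.10, 0.20), 0), ((0.20, 0.10), 0), ((0.15, 0.25), 0), ((0.05, 0.10), 1),
    ((0.90, 0.80), 1), ((0.80, 0.90), 1), ((0.85, 0.95), 0), ((0.95, 0.85), 1),
]
features = [x for x, _ in customers]
is_fraud = [y for _, y in customers]

labels, centroids = kmeans(features, k=2)
rates = fraud_rate_by_cluster(labels, is_fraud)
risky_cluster = max(rates, key=rates.get)

# Subset on which a classifier would be trained separately from the full data.
risky_subset = [row for row, lab in zip(customers, labels) if lab == risky_cluster]
print(rates, risky_cluster, len(risky_subset))
```

In a fuller pipeline, a tree-ensemble classifier (for instance scikit-learn's `RandomForestClassifier`, named here only as one possible tool) would then be fitted once on all of `customers` and once on `risky_subset`, mirroring the full-dataset versus Cluster 1 comparison the abstract reports.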

Details

Language :
English
ISSN :
2671-132X
Volume :
6
Issue :
1
Database :
Supplemental Index
Journal :
FEB Zagreb International Odyssey Conference on Economics & Business
Publication Type :
Conference
Accession number :
179708702