311 results on '"Credit Scoring"'
Search Results
2. Advancing Financial Inclusion and Data Ethics: The Role of Alternative Credit Scoring
- Author
Machikape, Keoitshepile, Oluwadele, Deborah, Ghosh, Ashish, Editorial Board Member, Zhou, Lizhu, Editorial Board Member, Hinkelmann, Knut, editor, and Smuts, Hanlie, editor
- Published
- 2025
- Full Text
- View/download PDF
3. NOTE: non-parametric oversampling technique for explainable credit scoring.
- Author
Han, Seongil, Jung, Haemin, Yoo, Paul D., Provetti, Alessandro, and Cali, Andrea
- Subjects
*MACHINE learning, *GENERATIVE adversarial networks, *FINANCIAL institutions, *DATA modeling, *ARTIFICIAL intelligence
- Abstract
Credit scoring models are critical for financial institutions to assess borrower risk and maintain profitability. Although machine learning models have improved credit scoring accuracy, imbalanced class distributions remain a major challenge. The widely used Synthetic Minority Oversampling TEchnique (SMOTE) struggles with high-dimensional, non-linear data and may introduce noise through class overlap. Generative Adversarial Networks (GANs) have emerged as an alternative, offering the ability to model complex data distributions. Conditional Wasserstein GANs (cWGANs) have shown promise in handling both numerical and categorical features in credit scoring datasets. However, research on extracting latent features from non-linear data and improving model explainability remains limited. To address these challenges, this paper introduces the Non-parametric Oversampling Technique for Explainable credit scoring (NOTE). The NOTE offers a unified approach that integrates a Non-parametric Stacked Autoencoder (NSA) for capturing non-linear latent features, cWGAN for oversampling the minority class, and a classification process designed to enhance explainability. The experimental results demonstrate that NOTE surpasses state-of-the-art oversampling techniques by improving classification accuracy and model stability, particularly in non-linear and imbalanced credit scoring datasets, while also enhancing the explainability of the results. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
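The SMOTE baseline that the abstract above compares against can be sketched in a few lines. This is a minimal illustration of plain SMOTE-style interpolation between minority-class neighbours, not the NOTE method itself (which combines a stacked autoencoder with a cWGAN). The function name, the parameters `k`, `n_new` and `seed`, and the sample rows are assumptions made for the example.

```python
import math
import random


def smote_sample(minority, k=3, n_new=5, seed=0):
    """Create n_new synthetic points by interpolating between minority samples.

    minority: list of equal-length numeric feature lists (minority class only).
    k, n_new and seed are illustrative parameters, not values from the paper.
    """
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # Rank the other minority points by Euclidean distance to the base point.
        others = [p for p in minority if p is not base]
        others.sort(key=lambda p: math.dist(base, p))
        neighbour = rng.choice(others[:k])
        # Place the new point at a random position on the segment base -> neighbour.
        gap = rng.random()
        synthetic.append([b + gap * (n - b) for b, n in zip(base, neighbour)])
    return synthetic


# Made-up minority-class rows with two numeric features.
minority_rows = [[1.0, 2.0], [1.5, 1.8], [2.0, 2.2], [1.2, 2.5]]
new_rows = smote_sample(minority_rows)
```

Because every synthetic row lies on a straight segment between two real rows, the approach can blur class boundaries in non-linear data, which is the weakness the abstract attributes to SMOTE.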
4. A hybrid metaheuristic optimised ensemble classifier with self organizing map clustering for credit scoring.
- Author
Singh, Indu, Kothari, D. P., Aditya, S., Rajora, Mihir, Agarwal, Charu, and Gautam, Vibhor
- Abstract
Credit scoring is a mathematical and statistical tool that aids financial institutions in deciding suitable candidates for the issuance of loans, based on the analysis of the borrower’s financial history. Distinct groups of borrowers have unique characteristics that must be identified and trained on to increase the accuracy of classification models for all credit borrowers that financial institutions serve. Numerous studies have shown that models based on diverse base-classifier models outperform other statistical and AI-based techniques for related classification problems. This paper proposes a novel multi-layer clustering and soft-voting-based ensemble classification model, aptly named Self Organizing Map Clustering with Metaheuristic Voting Ensembles (SCMVE) which uses a self-organizing map for clustering the data into distinct clusters with their unique characteristics and then trains a sailfish optimizer powered ensemble of SVM-KNN base classifiers for classification of each distinct identified cluster. We train and evaluate our model on the standard public credit scoring datasets—namely the German, Australian and Taiwan datasets and use multiple evaluation scores such as precision, F1 score, recall to compare the results of our model with other prominent works in the field. On evaluation, SCMVE shows outstanding results (95% accuracy on standard datasets) when compared with popular works in the field of credit scoring. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
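Soft voting, as named in the abstract above, averages the class probabilities of the base classifiers instead of counting hard votes. The sketch below shows only that combination step under stated assumptions: the probabilities and weights are made up, and the abstract does not say how the sailfish optimizer sets the ensemble parameters.

```python
def soft_vote(prob_lists, weights):
    """Combine per-classifier class probabilities with a weighted average.

    prob_lists: one probability vector per base classifier, e.g. [P(good), P(bad)].
    weights: one non-negative weight per classifier (hypothetical values here).
    """
    total = sum(weights)
    n_classes = len(prob_lists[0])
    combined = [
        sum(w * probs[c] for w, probs in zip(weights, prob_lists)) / total
        for c in range(n_classes)
    ]
    # The predicted class is the index with the highest averaged probability.
    predicted = max(range(n_classes), key=lambda c: combined[c])
    return combined, predicted


svm_probs = [0.60, 0.40]  # made-up output of an SVM base classifier
knn_probs = [0.30, 0.70]  # made-up output of a KNN base classifier
combined, label = soft_vote([svm_probs, knn_probs], weights=[1.0, 3.0])
```

With these illustrative weights the KNN output dominates, so the ensemble picks class index 1 even though the SVM alone would pick class index 0.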
5. Tausi: A Holistic Artificial Intelligence Approach to Credit Scoring Using Informal Data for a Sustainable Micro-lending African Economy
- Author
Kazimoto, Derick, Baadel, Said, David, Davis, Mutahaba, Rwebu, Rugumyamheto, Jerome, Kacprzyk, Janusz, Series Editor, Gomide, Fernando, Advisory Editor, Kaynak, Okyay, Advisory Editor, Liu, Derong, Advisory Editor, Pedrycz, Witold, Advisory Editor, Polycarpou, Marios M., Advisory Editor, Rudas, Imre J., Advisory Editor, Wang, Jun, Advisory Editor, and Arai, Kohei, editor
- Published
- 2024
- Full Text
- View/download PDF
6. Risk Scorecards Using Alternative Sources of Data for Credit Risk Applications
- Author
Dwivedi, Dwijendra, Batra, Saurabh, Pathak, Yogesh Kumar, Bansal, Jagdish Chand, Series Editor, Deep, Kusum, Series Editor, Nagar, Atulya K., Series Editor, Tripathi, Ashish Kumar, editor, and Anand, Darpan, editor
- Published
- 2024
- Full Text
- View/download PDF
7. New Paradigm in Financial Technology Using Machine Learning Techniques and Their Applications
- Author
Patnaik, Deepti, Patnaik, Srikanta, Kacprzyk, Janusz, Series Editor, Jain, Lakhmi C., Series Editor, Maglaras, Leandros A., editor, Das, Sonali, editor, Tripathy, Naliniprava, editor, and Patnaik, Srikanta, editor
- Published
- 2024
- Full Text
- View/download PDF
8. How Can Credit Scoring Benefit from Machine Learning? SWOT Analysis
- Author
Bentounsi, Oussama, Lahmini, Hajar Mouatassim, Kacprzyk, Janusz, Series Editor, Gomide, Fernando, Advisory Editor, Kaynak, Okyay, Advisory Editor, Liu, Derong, Advisory Editor, Pedrycz, Witold, Advisory Editor, Polycarpou, Marios M., Advisory Editor, Rudas, Imre J., Advisory Editor, Wang, Jun, Advisory Editor, Abraham, Ajith, editor, Bajaj, Anu, editor, Hanne, Thomas, editor, and Siarry, Patrick, editor
- Published
- 2024
- Full Text
- View/download PDF
9. Modeling Automobile Credit Scoring Using Machine Learning Models
- Author
Yiğit, Pakize, Kacprzyk, Janusz, Series Editor, Gomide, Fernando, Advisory Editor, Kaynak, Okyay, Advisory Editor, Liu, Derong, Advisory Editor, Pedrycz, Witold, Advisory Editor, Polycarpou, Marios M., Advisory Editor, Rudas, Imre J., Advisory Editor, Wang, Jun, Advisory Editor, García Márquez, Fausto Pedro, editor, Jamil, Akhtar, editor, Hameed, Alaa Ali, editor, and Segovia Ramírez, Isaac, editor
- Published
- 2024
- Full Text
- View/download PDF
10. Deep Learning and Machine Learning Techniques for Credit Scoring: A Review
- Author
Demma Wube, Hana, Zekarias Esubalew, Sintayehu, Fayiso Weldesellasie, Firesew, Girma Debelee, Taye, Filipe, Joaquim, Editorial Board Member, Ghosh, Ashish, Editorial Board Member, Prates, Raquel Oliveira, Editorial Board Member, Zhou, Lizhu, Editorial Board Member, Debelee, Taye Girma, editor, Ibenthal, Achim, editor, Schwenker, Friedhelm, editor, and Megersa Ayano, Yehualashet, editor
- Published
- 2024
- Full Text
- View/download PDF
11. A Synthesis on Machine Learning for Credit Scoring: A Technical Guide
- Author
Akil, Siham, Sekkate, Sara, Adib, Abdellah, Kacprzyk, Janusz, Series Editor, Gomide, Fernando, Advisory Editor, Kaynak, Okyay, Advisory Editor, Liu, Derong, Advisory Editor, Pedrycz, Witold, Advisory Editor, Polycarpou, Marios M., Advisory Editor, Rudas, Imre J., Advisory Editor, Wang, Jun, Advisory Editor, Ben Ahmed, Mohamed, editor, Boudhir, Anouar Abdelhakim, editor, El Meouche, Rani, editor, and Karaș, İsmail Rakıp, editor
- Published
- 2024
- Full Text
- View/download PDF
12. Machine Learning in Finance Case of Credit Scoring
- Author
El Maanaoui, Driss, Jeaab, Khalid, Najmi, Hajare, Saoudi, Youness, Falloul, Moulay El Mehdi, Kacprzyk, Janusz, Series Editor, Gomide, Fernando, Advisory Editor, Kaynak, Okyay, Advisory Editor, Liu, Derong, Advisory Editor, Pedrycz, Witold, Advisory Editor, Polycarpou, Marios M., Advisory Editor, Rudas, Imre J., Advisory Editor, Wang, Jun, Advisory Editor, Farhaoui, Yousef, editor, Hussain, Amir, editor, Saba, Tanzila, editor, Taherdoost, Hamed, editor, and Verma, Anshul, editor
- Published
- 2024
- Full Text
- View/download PDF
13. Credit Risk Management in Microfinance: Application of Non-repayment Prediction Models
- Author
Nejjar, Chaymae, Kaicer, Mohammed, Haimer, Sara El, Idhmad, Azzeddine, Essairh, Loubna, Kacprzyk, Janusz, Series Editor, Gomide, Fernando, Advisory Editor, Kaynak, Okyay, Advisory Editor, Liu, Derong, Advisory Editor, Pedrycz, Witold, Advisory Editor, Polycarpou, Marios M., Advisory Editor, Rudas, Imre J., Advisory Editor, Wang, Jun, Advisory Editor, Ezziyyani, Mostafa, editor, and Balas, Valentina Emilia, editor
- Published
- 2024
- Full Text
- View/download PDF
14. The Rise of AI and ML in Financial Technology: An In-depth Study of Trends and Challenges
- Author
Jain, Rahul, Vanzara, Rakesh, Sarvakar, Ketan, Angrisani, Leopoldo, Series Editor, Arteaga, Marco, Series Editor, Chakraborty, Samarjit, Series Editor, Chen, Jiming, Series Editor, Chen, Shanben, Series Editor, Chen, Tan Kay, Series Editor, Dillmann, Rüdiger, Series Editor, Duan, Haibin, Series Editor, Ferrari, Gianluigi, Series Editor, Ferre, Manuel, Series Editor, Jabbari, Faryar, Series Editor, Jia, Limin, Series Editor, Kacprzyk, Janusz, Series Editor, Khamis, Alaa, Series Editor, Kroeger, Torsten, Series Editor, Li, Yong, Series Editor, Liang, Qilian, Series Editor, Martín, Ferran, Series Editor, Ming, Tan Cher, Series Editor, Minker, Wolfgang, Series Editor, Misra, Pradeep, Series Editor, Mukhopadhyay, Subhas, Series Editor, Ning, Cun-Zheng, Series Editor, Nishida, Toyoaki, Series Editor, Oneto, Luca, Series Editor, Panigrahi, Bijaya Ketan, Series Editor, Pascucci, Federica, Series Editor, Qin, Yong, Series Editor, Seng, Gan Woon, Series Editor, Speidel, Joachim, Series Editor, Veiga, Germano, Series Editor, Wu, Haitao, Series Editor, Zamboni, Walter, Series Editor, Zhang, Junjie James, Series Editor, Tan, Kay Chen, Series Editor, Kumar, Amit, editor, and Mozar, Stefan, editor
- Published
- 2024
- Full Text
- View/download PDF
15. Incremental Machine Learning-Based Approach for Credit Scoring in the Age of Big Data
- Author
Museba, Tinofirei, Moloi, Tankiso, editor, and George, Babu, editor
- Published
- 2024
- Full Text
- View/download PDF
16. Credit scoring using machine learning and deep Learning-Based models
- Author
Sami Mestiri
- Subjects
credit scoring, machine learning, artificial intelligence, model comparison, personal loan, Finance, HG1-9999, Statistics, HA1-4737
- Abstract
Credit scoring is a useful tool for assessing customers' repayment capability. The purpose of this paper is to compare the predictive abilities of six credit scoring models: Linear Discriminant Analysis (LDA), Random Forests (RF), Logistic Regression (LR), Decision Trees (DT), Support Vector Machines (SVM) and Deep Neural Network (DNN). To compare these models, an empirical study was conducted using a sample of 688 observations and twelve variables. The performance of these models was analyzed using three measures: Accuracy rate, F1 score, and Area Under Curve (AUC). In summary, machine learning techniques exhibited greater accuracy in predicting loan defaults compared to other traditional statistical models.
- Published
- 2024
- Full Text
- View/download PDF
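The three measures named in the abstract above (accuracy, F1 score and AUC) can be computed directly from labels and scores. The sketch below is a plain-Python illustration with made-up data; the 0.5 decision threshold and the convention that label 1 marks a default are assumptions, not details from the paper.

```python
def accuracy(y_true, y_pred):
    """Share of predictions that match the true labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def f1_score(y_true, y_pred):
    """Harmonic mean of precision and recall for the positive class (label 1)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def auc(y_true, scores):
    """Probability that a random positive outscores a random negative (ties count half)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Made-up labels (1 = default) and model scores for six borrowers.
y_true = [1, 0, 1, 0, 1, 0]
scores = [0.9, 0.2, 0.6, 0.7, 0.4, 0.1]
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # assumed 0.5 threshold

acc = accuracy(y_true, y_pred)
f1 = f1_score(y_true, y_pred)
area = auc(y_true, scores)
```

Accuracy and F1 depend on the chosen threshold, while AUC is computed from the ranking of scores alone, which is one reason studies often report all three.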
17. Optimizing Ensemble Learning to Reduce Misclassification Costs in Credit Risk Scorecards.
- Author
Martin, John, Taheri, Sona, and Abdollahian, Mali
- Subjects
*CREDIT risk, *MACHINE learning, *FINANCIAL institutions, *K-nearest neighbor classification, *COST
- Abstract
Credit risk scorecard models are utilized by lending institutions to optimize decisions on credit approvals. In recent years, ensemble learning has often been deployed to reduce misclassification costs in credit risk scorecards. In this paper, we compared the risk estimation of 26 widely used machine learning algorithms based on commonly used statistical metrics. The best-performing algorithms were then used for model selection in ensemble learning. For the first time, we proposed financial criteria that assess the impact of losses associated with both false positive and false negative predictions to identify optimal ensemble learning. The German Credit Dataset (GCD) is augmented with simulated financial information according to a hypothetical mortgage portfolio observed in UK, European and Australian banks to enable the assessment of losses arising from misclassification costs. The experimental results using the simulated GCD show that the best predictive individual algorithm with the accuracy of 0.87, Gini of 0.88 and Area Under the Receiver Operating Curve of 0.94 was the Generalized Additive Model (GAM). The ensemble learning method with the lowest misclassification cost was the combination of Random Forest (RF) and K-Nearest Neighbors (KNN), totaling USD 417 million in costs (USD 230 for default costs and USD 187 for opportunity costs) compared to the costs of the GAM (USD 487, USD 287 and USD 200). Implementing the proposed financial criteria has led to a significant USD 70 million reduction in misclassification costs derived from a small sample. Thus, the lending institutions' profit would considerably rise as the number of submitted credit applications for approval increases. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
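The cost comparison in the abstract above splits misclassification cost into a default component and an opportunity component. The sketch below shows one way such a total could be computed and re-checks the totals quoted in the abstract. The function, the error counts and the unit costs are hypothetical; only the figures 230, 187, 287 and 200 come from the abstract, and they are assumed here to share the unit of the quoted totals.

```python
def misclassification_cost(n_missed_defaults, n_rejected_good,
                           loss_per_default, profit_per_good_loan):
    """Total cost of the two error types (all inputs are hypothetical).

    n_missed_defaults: defaulters the model approved.
    n_rejected_good: good borrowers the model rejected.
    """
    default_cost = n_missed_defaults * loss_per_default
    opportunity_cost = n_rejected_good * profit_per_good_loan
    return default_cost, opportunity_cost, default_cost + opportunity_cost


# Hypothetical error counts and unit costs for illustration only.
default_cost, opportunity_cost, total = misclassification_cost(
    n_missed_defaults=12, n_rejected_good=30,
    loss_per_default=50_000.0, profit_per_good_loan=4_000.0,
)

# Component figures quoted in the abstract: default cost plus opportunity cost.
rf_knn_total = 230 + 187
gam_total = 287 + 200
saving = gam_total - rf_knn_total
```

The quoted components add up to the totals of 417 and 487 stated in the abstract, and their difference matches the reported reduction of 70.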
18. Implementing and analyzing fairness in banking credit scoring.
- Author
Mariscal, Charlene, Yustiawan, Yoga, Rochim, Fauzy Caesar, and Tanuar, Evawaty
- Subjects
MACHINE learning, BANK loans, FAIRNESS, CONSCIOUSNESS raising
- Abstract
Decisions made by machine learning models are mostly based on the historical data used to train them. This raises awareness that discrimination in machine learning should be eliminated, because the training data may contain societal bias. The financial industry uses credit scoring as a reference to reflect the customer risk profile. To achieve fairness in the model, this paper tries to: (1) assess bias and (2) improve fairness in machine learning models with three bias mitigation methodologies. This study shows that there is a trade-off between improving fairness and preserving performance. Implementing post-processing methods, for example, Grid Search performs best. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
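Assessing bias, the first step listed in the abstract above, usually starts from a group-level comparison of outcomes. The abstract does not name the metric used, so the sketch below assumes one common choice, the statistical parity difference (approval rate of a protected group minus that of a reference group). The group labels and decisions are made up.

```python
def approval_rate(decisions):
    """Fraction of applicants approved (decision 1 = approved)."""
    return sum(decisions) / len(decisions)


def statistical_parity_difference(decisions, groups, protected, reference):
    """Approval rate of the protected group minus that of the reference group.

    A value of 0 means both groups are approved at the same rate; the metric
    choice is an assumption, not something stated in the abstract.
    """
    prot = [d for d, g in zip(decisions, groups) if g == protected]
    ref = [d for d, g in zip(decisions, groups) if g == reference]
    return approval_rate(prot) - approval_rate(ref)


# Made-up model decisions for eight applicants in two groups.
decisions = [1, 0, 1, 1, 0, 1, 0, 0]
groups = ["a", "a", "a", "a", "b", "b", "b", "b"]
spd = statistical_parity_difference(decisions, groups, protected="b", reference="a")
```

A mitigation method that pushes this difference toward zero will typically change some decisions, which is where the trade-off with predictive performance mentioned in the abstract arises.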
19. An interpretable decision tree ensemble model for imbalanced credit scoring datasets.
- Author
My, Bui T.T. and Ta, Bao Q.
- Subjects
*DECISION trees, *MACHINE learning, *RANDOM forest algorithms, *STATISTICAL learning
- Abstract
Credit scoring is a typical example of imbalanced classification, which poses a challenge to conventional machine learning algorithms and statistical classifiers when attempting to accurately predict outcomes for defaulting customers. In this paper, we propose a credit scoring classifier called Decision Tree Ensemble model (DTE). This model effectively addresses the challenge of imbalanced data and identifies significant features that influence the likelihood of credit status. An experiment demonstrates that DTE exhibits superior performance metrics in comparison to well-known tree-based ensemble classifiers such as Bagging, Random Forest, and AdaBoost, particularly when integrated with resampling techniques for handling imbalanced data. [ABSTRACT FROM AUTHOR]
- Published
- 2023
- Full Text
- View/download PDF
20. A recent review on optimisation methods applied to credit scoring models.
- Author
Shohei Kamimura, Elias, Faia Pinto, Anderson Rogério, and Seido Nagano, Marcelo
- Subjects
*CREDIT ratings, *PROCESS optimization, *MACHINE learning, *ARTIFICIAL neural networks, *LITERATURE reviews, *INSTALLMENT plan
- Abstract
Purpose – This paper aims to present a literature review of the most recent optimisation methods applied to Credit Scoring Models (CSMs). Design/methodology/approach – The research methodology employed technical procedures based on bibliographic and exploratory analyses. A traditional investigation was carried out using the Scopus, ScienceDirect and Web of Science databases. The papers selection and classification took place in three steps considering only studies in English language and published in electronic journals (from 2008 to 2022). The investigation led up to the selection of 46 publications (10 presenting literature reviews and 36 proposing CSMs). Findings – The findings showed that CSMs are usually formulated using Financial Analysis, Machine Learning, Statistical Techniques, Operational Research and Data Mining Algorithms. The main databases used by the researchers were banks and the University of California, Irvine. The analyses identified 48 methods used by CSMs, the main ones being: Logistic Regression (13%), Naive Bayes (10%) and Artificial Neural Networks (7%). The authors conclude that advances in credit score studies will require new hybrid approaches capable of integrating Big Data and Deep Learning algorithms into CSMs. These algorithms should consider practical issues for improving the level of adaptation and performance demanded for the CSMs. Practical implications – The results of this study might provide considerable practical implications for the application of CSMs. As it was aimed to demonstrate the application of optimisation methods, it is highly considerable that legal and ethical issues should be better adapted to CSMs. It is also suggested improvement of studies focused on micro and small companies for sales in instalment plans and commercial credit through the improvement or new CSMs. Originality/value – The economic reality surrounding credit granting has made risk management a complex decision-making issue increasingly supported by CSMs. Therefore, this paper satisfies an important gap in the literature to present an analysis of recent advances in optimisation methods applied to CSMs. The main contribution of this paper consists of presenting the evolution of the state of the art and future trends in studies aimed at proposing better CSMs. [ABSTRACT FROM AUTHOR]
- Published
- 2023
- Full Text
- View/download PDF
21. An Adaptive and Dynamic Heterogeneous Ensemble Model for Credit Scoring
- Author
Museba, Tinofirei, Filipe, Joaquim, Editorial Board Member, Ghosh, Ashish, Editorial Board Member, Prates, Raquel Oliveira, Editorial Board Member, Zhou, Lizhu, Editorial Board Member, Ndayizigamiye, Patrick, editor, Twinomurinzi, Hossana, editor, Kalema, Billy, editor, Bwalya, Kelvin, editor, and Bembe, Mncedisi, editor
- Published
- 2023
- Full Text
- View/download PDF
22. Mathematical Modeling and Analysis of Credit Scoring Using the LIME Explainer: A Comprehensive Approach.
- Author
Aljadani, Abdussalam, Alharthi, Bshair, Farsi, Mohammed A., Balaha, Hossam Magdy, Badawy, Mahmoud, and Elhosseini, Mostafa A.
- Subjects
*CREDIT ratings, *CREDIT analysis, *PARTICLE swarm optimization, *MATHEMATICAL analysis, *DECISION trees, *CLASSIFICATION algorithms, *MATHEMATICAL models, *MACHINE learning
- Abstract
Credit scoring models serve as pivotal instruments for lenders and financial institutions, facilitating the assessment of creditworthiness. Traditional models, while instrumental, grapple with challenges related to efficiency and subjectivity. The advent of machine learning heralds a transformative era, offering data-driven solutions that transcend these limitations. This research delves into a comprehensive analysis of various machine learning algorithms, emphasizing their mathematical underpinnings and their applicability in credit score classification. A comprehensive evaluation is conducted on a range of algorithms, including logistic regression, decision trees, support vector machines, and neural networks, using publicly available credit datasets. Within the research, a unified mathematical framework is introduced, which encompasses preprocessing techniques and critical algorithms such as Particle Swarm Optimization (PSO), the Light Gradient Boosting Model, and Extreme Gradient Boosting (XGB), among others. The focal point of the investigation is the LIME (Local Interpretable Model-agnostic Explanations) explainer. This study offers a comprehensive mathematical model using the LIME explainer, shedding light on its pivotal role in elucidating the intricacies of complex machine learning models. This study's empirical findings offer compelling evidence of the efficacy of these methodologies in credit scoring, with notable accuracies of 88.84%, 78.30%, and 77.80% for the Australian, German, and South German datasets, respectively. In summation, this research not only amplifies the significance of machine learning in credit scoring but also accentuates the importance of mathematical modeling and the LIME explainer, providing a roadmap for practitioners to navigate the evolving landscape of credit assessment. [ABSTRACT FROM AUTHOR]
- Published
- 2023
- Full Text
- View/download PDF
23. Swarm learning based credit scoring for P2P lending in block chain.
- Author
John, Antony Prince, Devaraj, Jagadhiswaran, Gandhimaruthian, Lathaselvi, and Liakath, Javid Ali
- Subjects
CREDIT ratings, PEER-to-peer lending, MACHINE learning, REPAYMENTS, WEIGHT training, COMPUTER network architectures, BLOCKCHAINS
- Abstract
Conventional loan avenues generally focus more on the formal sector than the unbanked sector. A peer-to-peer (p2p) lending platform built on blockchain can help bridge the gap between potential lenders and borrowers in need of money in a secure and decentralized environment. The Ethereum blockchain allows for the creation of smart contracts to perform actions in the network by setting logic rules and conditions thereby removing the need for middlemen and can be inclusive of the unbanked sector. The p2p platform introduces swarm learning for credit scoring, which is a novel methodology that utilizes smart contracts to train decentralized machine learning models. Each training round happens on the local device with the user data, which then exchanges the training parameters and weights to the machine learning model maintained in the smart contract. This allows for preserving the privacy of the user data by ensuring the data never leaves the device but only the inference does. Upon analyzing the user's behavior, a statistical credit score is assessed for validating the chances of the user to default his/her loan repayment. The performance of the proposed model that has been trained using the swarm learning technique is close to the model that had been trained in a centralized environment while overcoming the drawbacks of federated learning by incorporating blockchain and swarm methodology. [ABSTRACT FROM AUTHOR]
- Published
- 2023
- Full Text
- View/download PDF
24. Macroeconomic Adverse Selection in Machine Learning Models of Credit Risk.
- Author
Breeden, Joseph L. and Leonova, Yevgeniya
- Subjects
MACROECONOMICS, MACHINE learning, ADVERSE selection (Commerce), ARTIFICIAL neural networks, NONLINEAR analysis
- Abstract
Macroeconomic adverse selection is computed as a time series of forecast residuals via the vintage origination model for an industry dataset of auto loans. The adverse selection time series are computed separately as model residuals using logistic regression, neural networks, and stochastic gradient boosted trees to predict defaults in the first 24 months of a loan. Panel data versions of these models with lifecycle and environment inputs from a segmented Age-Period-Cohort analysis were also estimated. The estimates show that panel data methods make better use of available data to provide faster estimates of adverse selection risk in recent vintages and incorporate defaults at any age of the loan. The nonlinear modeling advantages of neural networks and stochastic gradient boosted trees did not significantly alter the estimates of adverse selection. Overall, all methods confirmed that macroeconomic adverse selection was dramatically higher in 2021 and 2022 for US auto loan originations. [ABSTRACT FROM AUTHOR]
- Published
- 2023
- Full Text
- View/download PDF
25. Marketing Analytics, AI and Machine Learning: MACHINE LEARNING MODELS FOR CREDIT SCORING AT HIGHER EDUCATION INSTITUTIONS.
- Author
Niskier Saadia, Giovanna, Brantes Ferreira, Jorge, Rodriguez Whately, Ricardo, Leão Ramos, Fernanda, and Ferreira da Silva, Jorge
- Subjects
ARTIFICIAL intelligence, MACHINE learning, CREDIT scoring systems, UNIVERSITIES & colleges, HIGHER education
- Abstract
Default is a major problem for private higher education institutions (HEI) and can result in school dropout and loss of revenue. This work aims to propose and test a credit scoring model using machine learning techniques in the private higher education sector, estimating the default risk of each student. [ABSTRACT FROM AUTHOR]
- Published
- 2023
26. Application of Supplemental Sampling and Interpretable AI in Credit Scoring for Canadian Fintechs: Methods and Case Studies
- Author
Shen, Yi, Goos, Gerhard, Founding Editor, Hartmanis, Juris, Founding Editor, Bertino, Elisa, Editorial Board Member, Gao, Wen, Editorial Board Member, Steffen, Bernhard, Editorial Board Member, Yung, Moti, Editorial Board Member, Chen, Weitong, editor, Yao, Lina, editor, Cai, Taotao, editor, Pan, Shirui, editor, Shen, Tao, editor, and Li, Xue, editor
- Published
- 2022
- Full Text
- View/download PDF
27. Artificial Intelligence in Credit, Lending, and Mortgage
- Author
Chan, Leong, Hogaboam, Liliya, Cao, Renzhi, Daim, Tugrul U., Series Editor, Dabić, Marina, Series Editor, Chan, Leong, Hogaboam, Liliya, and Cao, Renzhi
- Published
- 2022
- Full Text
- View/download PDF
28. Predicting Credit Scores with Boosted Decision Trees
- Author
João A. Bastos
- Subjects
forecasting, credit scoring, credit risk, boosted decision trees, machine learning, Science (General), Q1-390, Mathematics, QA1-939
- Abstract
Credit scoring models help lenders decide whether to grant or reject credit to applicants. This paper proposes a credit scoring model based on boosted decision trees, a powerful learning technique that aggregates several decision trees to form a classifier given by a weighted majority vote of classifications predicted by individual decision trees. The performance of boosted decision trees is evaluated using two publicly available credit card application datasets. The prediction accuracy of boosted decision trees is benchmarked against two alternative machine learning techniques: the multilayer perceptron and support vector machines. The results show that boosted decision trees are a competitive technique for implementing credit scoring models.
- Published
- 2022
- Full Text
- View/download PDF
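The "weighted majority vote" described in the abstract above can be illustrated with a short sketch. The abstract does not say which boosting variant is used, so the tree weights below follow the classic discrete AdaBoost formula as an assumption; the error rates and votes are made up.

```python
import math


def adaboost_weight(error_rate):
    """Weight of one tree under discrete AdaBoost (assumed variant):
    lower training error gives a larger say in the final vote."""
    return 0.5 * math.log((1.0 - error_rate) / error_rate)


def weighted_majority_vote(votes, alphas):
    """Combine +1 / -1 votes from individual trees using their weights."""
    score = sum(a * v for a, v in zip(alphas, votes))
    return 1 if score >= 0 else -1


# Made-up weighted error rates for three trees and their votes on one applicant
# (+1 = grant credit, -1 = reject).
errors = [0.10, 0.40, 0.45]
votes = [+1, -1, -1]
alphas = [adaboost_weight(e) for e in errors]
prediction = weighted_majority_vote(votes, alphas)
```

In this example the single accurate tree outweighs the two weak trees, so the ensemble grants credit even though a simple head count would reject.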
29. Enhancing decision making with machine learning: The case of aurora crowdlending platform.
- Author
Jutasompakorn, Pearpilai, Perdana, Arif, and Balachandran, Vivek
- Abstract
The crowdlending industry is a fast-growing financial technology (fintech) sector that brings together borrowers and lenders. As an alternative financial intermediary, the crowdlending industry plays an essential role in reducing the financial exclusion of small and medium-sized enterprises (SMEs) struggling to obtain funds from traditional financial intermediaries such as commercial banks. With the onset of Covid-19 and the deteriorating economies worldwide, Singapore crowdlending platforms have come under pressure due to the increasing default rate of their borrowers. This case study illuminates the challenges faced by Aurora (pseudonym), a crowdlending platform that operates in Singapore, Indonesia, and Malaysia. In response to high default rates during Covid-19, Aurora's management made improvements to its current machine learning-based credit scoring model in June 2021. This case study describes the challenges Aurora faced in identifying relevant features for the machine learning model, data preparation and cleansing, and selecting the appropriate credit model algorithms to replace its current approval process. [ABSTRACT FROM AUTHOR]
- Published
- 2023
- Full Text
- View/download PDF
30. A General Architecture for a Trustworthy Creditworthiness-Assessment Platform in the Financial Domain.
- Author
Cornacchia, Giandomenico, Anelli, Vito Walter, Narducci, Fedelucio, Ragone, Azzurra, and Di Sciascio, Eugenio
- Subjects
MACHINE learning, TRUST, ARTIFICIAL intelligence, RATE of return, CUSTOMER services, ABUSE of older people
- Abstract
The financial domain is making huge advancements thanks to the exploitation of artificial intelligence. As an example, the credit-worthiness-assessment task is now strongly based on Machine Learning algorithms that make decisions independently from humans. Several studies showed remarkable improvement in reliability, customer care, and return on investment. Nonetheless, many users remain sceptical since they perceive the whole as only partially transparent. The trust in the system decision, the guarantee of fairness in the decision-making process, the explanation of the reasons behind the decision are just some of the open challenges for this task. Moreover, from the financial institution's perspective, another compelling problem is credit-repayment monitoring. Even here, traditional models (e.g., credit scorecards) and machine learning models can help the financial institution in identifying, at an early stage, customers that will fall into default on payments. The monitoring task is critical for the debt-repayment success of identifying bad debtors or simply users who are momentarily in difficulty. The financial institution can thus prevent possible defaults and, if possible, meet the debtor's needs. In this work, the authors propose an architecture for a Creditworthiness-Assessment duty that can meet the transparency needs of the customers while monitoring the credit-repayment risk. This preliminary study carried out an experimental evaluation of the component devoted to the credit-score computation and monitoring credit repayments. The study shows that the authors' architecture can be an effective tool to improve current Credit-scoring systems. Combining a static and a subsequent dynamic approach can correct mistakes made in the first phase and foil possible false positives for good creditors. [ABSTRACT FROM AUTHOR]
- Published
- 2023
- Full Text
- View/download PDF
31. Feature contribution alignment with expert knowledge for artificial intelligence credit scoring.
- Author
El Qadi, Ayoub, Trocan, Maria, Díaz-Rodríguez, Natalia, and Frossard, Thomas
- Abstract
Credit assessment activities are essential for financial institutions and allow the global economy to grow. Building robust, solid and accurate models that estimate the probability of a default of a company is mandatory for credit insurance companies, especially when it comes to bridging the trade finance gap. The recent developments in Artificial Intelligence are offering new powerful opportunities. However, most AI techniques are labeled as black-box models due to their lack of explainability. For both users and regulators, in order to deploy such technologies at scale, being able to understand the model logic is a must to grant accurate and ethical decision making. In this study, we focus on company credit scoring and we benchmark different machine learning models. The aim is to build a model to predict whether a company will experience financial problems in a given time horizon. We address the black-box problem using eXplainable Artificial Techniques, in particular post hoc explanations using SHapley Additive exPlanations. We bring light by providing an expert-aligned feature relevance score highlighting the disagreement between a credit risk expert and a model feature attribution explanation in order to better quantify the convergence toward a better human-aligned decision making. [ABSTRACT FROM AUTHOR]
- Published
- 2023
- Full Text
- View/download PDF
32. Some Insights about the Applicability of Logistic Factorisation Machines in Banking.
- Author
Slabber, Erika, Verster, Tanja, and de Jongh, Riaan
- Subjects
FACTORIZATION, CREDIT card fraud, AUTOMATED teller machines, RANDOM forest algorithms, RECOMMENDER systems, FINANCIAL services industry, COMPUTERS in education
- Abstract
Logistic regression is a very popular binary classification technique in many industries, particularly in the financial service industry. It has been used to build credit scorecards, estimate the probability of default or churn, identify the next best product in marketing, and many more applications. The machine learning literature has recently introduced several alternative techniques, such as deep learning neural networks, random forests, and factorisation machines. While neural networks and random forests form part of the practitioner's model-building toolkit, factorisation machines are seldom used. In this paper, we investigate the applicability of factorisation machines to some binary classification problems in banking. To stimulate the practical application of factorisation machines, we implement the fitting routines, based on logit loss and maximum likelihood, on commercially available software that is widely used by banks and other large financial services companies. Logit loss is usually used by the machine learning fraternity while maximum likelihood is popular in statistics. Depending on the coding of the target variable, we will show that these methods yield identical parameter estimates. Often, banks are confronted with predicting events that occur with low probability. To deal with this phenomenon, we introduce weights in the above-mentioned loss functions. The accuracy of our fitting algorithms is then studied by means of a simulation study and compared with logistic regression. The separation and prediction performance of factorisation machines are then compared to logistic regression and random forests by means of three case studies covering a recommender system, credit card fraud, and a credit scoring application. 
We conclude that logistic factorisation machines are worthy competitors of logistic regression in most applications, but with clear advantages in recommender systems applications where the number of predictors typically outnumbers the number of observations. [ABSTRACT FROM AUTHOR]
- Published
- 2023
- Full Text
- View/download PDF
33. Network-aware credit scoring system for telecom subscribers using machine learning and network analysis
- Author
-
Gao, Hongming, Liu, Hongwei, Ma, Haiying, Ye, Cunjun, and Zhan, Mingjun
- Published
- 2022
- Full Text
- View/download PDF
34. Scaling up SMEs' credit scoring scope with LightGBM.
- Author
-
Lextrait, Bastien
- Subjects
CREDIT ratings, CREDIT scoring systems, SMALL business, CREDIT risk, CAPITAL requirements, COMMERCIAL courts, DEFAULT (Finance), DECISION trees, MACHINE learning
- Abstract
Small and Medium Size enterprises (SMEs) are critical actors in the fabric of the economy. Their growth is often limited by the difficulty in obtaining financing. Basel II accords enforced the obligation for banks to estimate the probability of default of their obligors. Currently used models are limited by the simplicity of their architecture and the available data. State of the art machine learning models are not widely used because they are often considered as black boxes that cannot be easily explained or interpreted. We propose a methodology to combine high predictive power and powerful explainability using various Gradient Boosting Decision Trees (GBDT) implementations and Shapley additive explanation (SHAP) values as post-prediction explanation model. This method is developed and tested using a nation-wide sample of French companies, and a history of past failures extracted from commercial court decisions. The performances of GBDT models are compared with traditional credit scoring algorithms. GBDT provides the best performances over the test sample, while being fast to train and economically sound. Results obtained from SHAP values analysis are consistent with previous socio-economic studies. Providing such a level of explainability to complex models may convince regulators to accept their use in automated credit scoring. [ABSTRACT FROM AUTHOR]
- Published
- 2023
- Full Text
- View/download PDF
35. How much do we see? On the explainability of partial dependence plots for credit risk scoring.
- Author
-
Szepannek, Gero and Lübke, Karsten
- Subjects
CREDIT scoring systems, MACHINE learning, ARTIFICIAL intelligence, CREDIT analysis
- Abstract
Risk prediction models in credit scoring have to fulfil regulatory requirements, one of which consists in the interpretability of the model. Unfortunately, many popular modern machine learning algorithms result in models that do not satisfy this business need, whereas the research activities in the field of explainable machine learning have strongly increased in recent years. Partial dependence plots denote one of the most popular methods for model-agnostic interpretation of a feature's effect on the model outcome, but in practice they are usually applied without answering the question of how much can actually be seen in such plots. For this purpose, in this paper a methodology is presented in order to analyse to what extent arbitrary machine learning models are explainable by partial dependence plots. The proposed framework provides both a visualisation, as well as a measure to quantify the explainability of a model on an understandable scale. A corrected version of the German credit data, one of the most popular data sets of this application domain, is used to demonstrate the proposed methodology. [ABSTRACT FROM AUTHOR]
- Published
- 2023
- Full Text
- View/download PDF
36. Predicting Credit Scores with Boosted Decision Trees.
- Author
-
Bastos, João A.
- Subjects
CREDIT ratings, DECISION trees, MACHINE learning, SUPPORT vector machines, CREDIT cards
- Abstract
Credit scoring models help lenders decide whether to grant or reject credit to applicants. This paper proposes a credit scoring model based on boosted decision trees, a powerful learning technique that aggregates several decision trees to form a classifier given by a weighted majority vote of classifications predicted by individual decision trees. The performance of boosted decision trees is evaluated using two publicly available credit card application datasets. The prediction accuracy of boosted decision trees is benchmarked against two alternative machine learning techniques: the multilayer perceptron and support vector machines. The results show that boosted decision trees are a competitive technique for implementing credit scoring models. [ABSTRACT FROM AUTHOR]
- Published
- 2022
- Full Text
- View/download PDF
37. A novel profit-driven framework for model evaluation in credit scoring.
- Author
-
Mohammadnejad-Daryani, Hossein, Taleizadeh, Ata Allah, and Pamucar, Dragan
- Subjects
MACHINE learning
- Published
- 2024
- Full Text
- View/download PDF
38. Improving Credit Scoring Technology and Stability of a Commercial Bank
- Author
-
Vishnever, V. Ya., Kacprzyk, Janusz, Series Editor, Gomide, Fernando, Advisory Editor, Kaynak, Okyay, Advisory Editor, Liu, Derong, Advisory Editor, Pedrycz, Witold, Advisory Editor, Polycarpou, Marios M., Advisory Editor, Rudas, Imre J., Advisory Editor, Wang, Jun, Advisory Editor, Ashmarina, Svetlana Igorevna, editor, Mantulenko, Valentina Vyacheslavovna, editor, and Vochozka, Marek, editor
- Published
- 2021
- Full Text
- View/download PDF
39. Privacy-Preserving Credit Scoring via Functional Encryption
- Author
-
Andolfo, Lorenzo, Coppolino, Luigi, D’Antonio, Salvatore, Mazzeo, Giovanni, Romano, Luigi, Ficke, Matthew, Hollum, Arne, Vaydia, Darshan, Goos, Gerhard, Founding Editor, Hartmanis, Juris, Founding Editor, Bertino, Elisa, Editorial Board Member, Gao, Wen, Editorial Board Member, Steffen, Bernhard, Editorial Board Member, Woeginger, Gerhard, Editorial Board Member, Yung, Moti, Editorial Board Member, Gervasi, Osvaldo, editor, Murgante, Beniamino, editor, Misra, Sanjay, editor, Garau, Chiara, editor, Blečić, Ivan, editor, Taniar, David, editor, Apduhan, Bernady O., editor, Rocha, Ana Maria A. C., editor, Tarantino, Eufemia, editor, and Torre, Carmelo Maria, editor
- Published
- 2021
- Full Text
- View/download PDF
40. Machine Learning-Based Empirical Investigation for Credit Scoring in Vietnam’s Banking
- Author
-
Tran, Khanh Quoc, Duong, Binh Van, Tran, Linh Quang, Tran, An Le-Hoai, Nguyen, An Trong, Nguyen, Kiet Van, Goos, Gerhard, Founding Editor, Hartmanis, Juris, Founding Editor, Bertino, Elisa, Editorial Board Member, Gao, Wen, Editorial Board Member, Steffen, Bernhard, Editorial Board Member, Woeginger, Gerhard, Editorial Board Member, Yung, Moti, Editorial Board Member, Fujita, Hamido, editor, Selamat, Ali, editor, Lin, Jerry Chun-Wei, editor, and Ali, Moonis, editor
- Published
- 2021
- Full Text
- View/download PDF
41. Model-Agnostic Counterfactual Explanations in Credit Scoring
- Author
-
Xolani Dastile, Turgay Celik, and Hans Vandierendonck
- Subjects
Credit scoring, machine learning, counterfactual explanation, explainable AI, genetic algorithm, Electrical engineering. Electronics. Nuclear engineering, TK1-9971
- Abstract
The past decade has shown a surge in the use and application of machine learning and deep learning models across various domains. One such domain is credit scoring, where applicants are scored to assess their creditworthiness for loan applications. It is essential to ensure that no biases or discriminations are incurred during the scoring process. Most machine learning and deep learning models are prone to unintended bias and discrimination in the datasets. Therefore, it is imperative to explain each prediction from the models during the scoring process to avoid the element of model bias and discrimination. Our study proposes a novel optimization formulation that generates sparse counterfactual explanations via a custom genetic algorithm to explain the black-box model’s predictions. We evaluated the efficacy of the proposed method on publicly available credit scoring datasets by comparing the counterfactual explanations generated by the proposed method with explanations from credit scoring experts. The proposed counterfactual explanation method does not only explain rejected loan applications but also can be used to explain approved loan applications.
- Published
- 2022
- Full Text
- View/download PDF
42. An Online Transfer Learning Framework With Extreme Learning Machine for Automated Credit Scoring
- Author
-
Rana Alasbahi and Xiaolin Zheng
- Subjects
Credit scoring, machine learning, extreme learning machine, probability of default, missing features, data irregularities, Electrical engineering. Electronics. Nuclear engineering, TK1-9971
- Abstract
Automated Credit Scoring (ACS) is the process of predicting user credit based on historical data. It involves analyzing and predicting the association between the data and particular credit values based on similar data. Recently, ACS has been handled as a machine learning problem, and numerous models were developed to address it. In this paper, we address ACS issues concerning credit scoring in a batch of machine learning problems, namely, feature irregularities due to empty features in many records, class imbalance due to non-uniform statistical distributions of the records between classes, and concept drift due to changing statistical characteristics concerning certain classes and features with time. Considering the limited credit scoring data volume, we propose to address the challenge using the Transfer Learning with Lag (TLL) algorithm based on embedded shallow neural networks that enable knowledge transfer when the number of active features changes. Knowledge transfer is based on lags having an adaptive length that is changed based on performance change feedback. Furthermore, the framework proposes classifier aggregation and the chunk balancing mechanism for handling class imbalance. An evaluation was conducted using the Lending club, German, Default, and PPDai datasets. The results show the superiority of the proposed algorithm over the benchmarks in terms of the majority of classification metrics concerning both time series and overall results.
- Published
- 2022
- Full Text
- View/download PDF
43. An construction method of scorecard using machine learning and logical regression.
- Author
-
Zhu, Zhengxiang, Sun, Junwen, and Li, Xingsen
- Subjects
CREDIT risk, MACHINE learning, CREDIT ratings, FINANCIAL institutions
- Abstract
Scorecard is the main method used by financial institutions to quantitatively assess customer credit risk. The traditional scorecard mainly uses the logical regression (LR) for modeling, although it is good in interpretation and stability, it is not suitable for processing large-scale samples, and its accuracy is low. Meanwhile, with the further study in recent years, machine learning has gradually begun to be applied to high-dimensional large-scale sample modeling in the financial field. However, machine learning also has problems such as poor interpretability and weak generalization ability. This paper proposes to build an integrated model of machine learning and logical regression, and makes full use of the advantages of the two algorithms to develop a new scorecard model. The practice shows that the new scorecard model has good differentiation ability. [ABSTRACT FROM AUTHOR]
- Published
- 2022
- Full Text
- View/download PDF
44. A Deep Learning Approach for Credit Scoring Using Feature Embedded Transformer.
- Author
-
Wang, Chongren and Xiao, Zhuoyi
- Subjects
DEEP learning, CREDIT ratings, RECEIVER operating characteristic curves
- Abstract
In this paper, we introduce a transformer into the field of credit scoring based on user online behavioral data and develop an end-to-end feature embedded transformer (FE-Transformer) credit scoring approach. The FE-Transformer neural network is composed of two parts: a wide part and a deep part. The deep part uses the transformer deep neural network. The output of the deep neural network and the feature data of the wide part are concentrated in a fusion layer. The experimental results show that the FE-Transformer deep learning model proposed in this paper outperforms the LR, XGBoost, LSTM, and AM-LSTM comparison methods in terms of area under the receiver operating characteristic curve (AUC) and the Kolmogorov–Smirnov (KS). This shows that the FE-Transformer deep learning model proposed in this paper can accurately predict user default risk. [ABSTRACT FROM AUTHOR]
- Published
- 2022
- Full Text
- View/download PDF
45. Credit scoring methods: Latest trends and points to consider.
- Author
-
Markov, Anton, Seleznyova, Zinaida, and Lapshin, Victor
- Subjects
CREDIT risk management, MACHINE learning, DATA mining, CREDIT scoring systems, PERFORMANCE evaluation
- Abstract
Credit risk is the most significant risk by impact for any bank and financial institution. Accurate credit risk assessment affects an organisation's balance sheet and income statement, since credit risk strategy determines pricing, and might even influence seemingly unrelated domains, e.g. marketing, and decision-making. This article aims at providing a systemic review of the most recent (2016–2021) articles, identifying trends in credit scoring using a fixed set of questions. The survey methodology and questionnaire align with previous similar research that analyses articles on credit scoring published in 1991–2015. We seek to compare our results with previous periods and highlight some of the recent best practices in the field that might be useful for future researchers. [ABSTRACT FROM AUTHOR]
- Published
- 2022
- Full Text
- View/download PDF
46. Emerging Trends in Deep Learning for Credit Scoring: A Review.
- Author
-
Hayashi, Yoichi
- Subjects
DEEP learning, CREDIT ratings, RECEIVER operating characteristic curves, CONVOLUTIONAL neural networks, MACHINE learning
- Abstract
This systematic review aims to provide deep insights on emerging trends in, and the potential of, advanced deep learning techniques, such as machine learning algorithms being partially replaced by deep learning (DL) algorithms for credit scoring owing to the higher accuracy of the latter. This review also seeks to explain the reasons that deep belief networks (DBNs) can achieve higher accuracy than shallower networks, discusses the potential classification capabilities of DL-based classifiers, and bridges DL and explainable credit scoring. The theoretical characteristics of DBNs are also presented along with the reasons for their higher accuracy compared to that of shallower networks. Studies published between 2019 and 2022 were analysed to review and compare the most recent DL techniques that have been found to achieve higher accuracies than ensemble classifiers, their hybrids, rule extraction methods, and rule-based classifiers. The models reviewed in this study were evaluated and compared according to their accuracy and area under the receiver operating characteristic curve for the Australian, German (categorical), German (numerical), Japanese, and Taiwanese datasets, which are commonly used in the credit scoring community. This review paper also explains how tabular datasets are converted into images for the application of a two-dimensional convolutional neural network (CNN) and how "black box" models using local and global rule extraction and rule-based methods are applied in credit scoring. Finally, a new insight on the design of DL-based classifiers for credit scoring datasets is provided, along with a discussion on promising future research directions. [ABSTRACT FROM AUTHOR]
- Published
- 2022
- Full Text
- View/download PDF
47. CONSIDERATION OF THE POSSIBILITIES OF APPLYING MACHINE LEARNING METHODS FOR DATA ANALYSIS WHEN PROMOTING SERVICES TO BANK’S CLIENTS.
- Author
-
Bulhakova, Olha, Ulianovska, Yuliia, Kostenko, Victoria, and Rudyanova, Tatyana
- Subjects
PYTHON programming language, MACHINE learning, RANDOM forest algorithms, DATA analysis, RECURRENT neural networks, BOOSTING algorithms, SERVICE learning, ELECTRONIC data processing
- Abstract
The object of the research is modern online services and machine learning libraries for predicting the probability of the bank client’s consent to the provision of the proposed services. One of the most problematic areas is the high unpredictability of the result in the field of banking marketing using the most common technique of introducing new services for clients – the so-called cold calling. Therefore, the question of assessing the probability and predicting the behavior of a potential client when promoting new banking services and services using cold calling is particularly relevant. In the course of the study, libraries of machine learning methods and data analysis of the Python programming language were used. A program was developed to build a model for predicting the behavior of bank customers using data processing methods using gradient boosting, regularization of gradient boosting, random forest algorithm and recurrent neural networks. Analogous models were built using cloud machine learning services Azure ML, BigML and the Auto-sklearn library. Data analysis and prediction models built using Python language libraries have a fairly high quality – an average of 94.5 %. Using the Azure ML cloud service, a predictive model with an accuracy of 88.6 % was built. The BigML machine learning service made it possible to build a model with an accuracy of 88.8 %. Machine learning methods from the Auto-sklearn library made it possible to obtain a model with a higher quality – 94.9 %. This is due to the fact that the proposed libraries of the Python programming language allow better customization of data processing methods and machine learning to obtain more accurate models than free cloud services that do not provide such capabilities. Thanks to this, it is possible to obtain a predictive model of the behavior of bank customers with a fairly high degree of accuracy. 
It is worth noting that in order to make a prediction (forecast), it is necessary to study the context of the task, process the data, build various machine learning algorithms, evaluate the quality of the models and choose the best of them. [ABSTRACT FROM AUTHOR]
- Published
- 2022
- Full Text
- View/download PDF
48. The Advantage of Case-Tailored Information Metrics for the Development of Predictive Models, Calculated Profit in Credit Scoring.
- Author
-
Chrościcki, Daniel and Chlebus, Marcin
- Subjects
CREDIT ratings, CREDIT risk, PREDICTION models, DISEASE risk factors, LOANS, RECEIVER operating characteristic curves
- Abstract
This paper compares model development strategies based on different performance metrics. The study was conducted in the area of credit risk modeling with the usage of diverse metrics, including general-purpose Area Under the ROC curve (AUC), problem-dedicated Expected Maximum Profit (EMP) and the novel case-tailored Calculated Profit (CP). The metrics were used to optimize competitive credit risk scoring models based on two predictive algorithms that are widely used in the financial industry: Logistic Regression and extreme gradient boosting machine (XGBoost). A dataset provided by the American Fannie Mae agency was utilized to conduct the study. In addition to the baseline study, the paper also includes a stability analysis. In each case examined, the proposed CP metric allowed us to achieve the most profitable loan portfolio. [ABSTRACT FROM AUTHOR]
- Published
- 2022
- Full Text
- View/download PDF
49. INCREMENTAL LEARNING METHOD FOR DATA WITH DELAYED LABELS.
- Author
-
Haoran GAO, Zhijun DING, and Meiqin PAN
- Subjects
CREDIT ratings, MACHINE learning, ALGORITHMS
- Abstract
Most research on machine learning tasks relies on the availability of true labels immediately after making a prediction. However, in many cases, the ground truth labels become available with a non-negligible delay. In general, delayed labels create two problems. First, labelled data is insufficient because the label for each data chunk will be obtained multiple times. Second, there remains a problem of concept drift due to the long period of data. In this work, we propose a novel incremental ensemble learning when delayed labels occur. First, we build a sliding time window to preserve the historical data. Then we train an adaptive classifier by labelled data in the sliding time window. It is worth noting that we improve the TrAdaBoost to expand the data of the latest moment when building an adaptive classifier. It can correctly distinguish the wrong types of source domain sample classification. Finally, we integrate the various classifiers to make predictions. We apply our algorithms to synthetic and real credit scoring datasets. The experiment results indicate our algorithms have superiority in delayed labelling setting. [ABSTRACT FROM AUTHOR]
- Published
- 2022
- Full Text
- View/download PDF
50. Machine Learning Models and Data-Balancing Techniques for Credit Scoring: What Is the Best Combination?
- Author
-
Hussin Adam Khatir, Ahmed Almustfa and Bee, Marco
- Subjects
MACHINE learning, CREDIT ratings, FEATURE selection, CREDIT analysis, RANDOM forest algorithms, STATISTICAL learning
- Abstract
Forecasting the creditworthiness of customers is a central issue of banking activity. This task requires the analysis of large datasets with many variables, for which machine learning algorithms and feature selection techniques are a crucial tool. Moreover, the percentages of "good" and "bad" customers are typically imbalanced such that over- and undersampling techniques should be employed. In the literature, most investigations tackle these three issues individually. Since there is little evidence about their joint performance, in this paper, we try to fill this gap. We use five machine learning classifiers, and each of them is combined with different feature selection techniques and various data-balancing approaches. According to the empirical analysis of a retail credit bank dataset, we find that the best combination is given by random forests, random forest recursive feature elimination and random oversampling. [ABSTRACT FROM AUTHOR]
- Published
- 2022
- Full Text
- View/download PDF