48 results for "Credit scoring"
Search Results
2. Multiple optimized ensemble learning for high-dimensional imbalanced credit scoring datasets.
- Author
-
Lenka, Sudhansu R., Bisoy, Sukant Kishoro, and Priyadarshini, Rojalina
- Subjects
CREDIT risk, RANDOM forest algorithms, FEATURE selection, SUBSET selection, RESEARCH personnel
- Abstract
Credit scoring models are crucial tools for lenders to assess credit risks. Researchers from academia and the financial industry have shown intense interest in these models. However, real credit datasets often have high dimensionality and class imbalance, making it challenging to develop accurate and effective credit scoring models. To address these challenges, a new approach called the Multiple-Optimized Ensemble Learning (MOEL) method has been proposed. In MOEL, a technique called Multiple Diverse Optimized Subsets (MDOS) generates multiple diverse optimized subsets from various weighted random forests. From each subset, more effective and relevant features are selected. Then, a new evaluation measure is applied to each subset to determine the more optimized subsets. These subsets are applied to a novel Mahalanobis-based oversampling (MOS) technique to provide balanced subsets for the base classifier, which lessens the detrimental effects of imbalanced datasets. Finally, a stacking-based ensemble method is applied to the balanced subsets for integration of the base models. The proposed model was evaluated against six high-dimensional imbalanced credit scoring datasets, and it outperformed state-of-the-art methods, exhibiting a mean rank of 1.5 and 1.333 in terms of F1_score and G-mean, respectively. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
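The abstract of result 2 reports performance as F1_score and G-mean, two measures commonly used for imbalanced data. As a minimal sketch (not the authors' code), both can be computed from confusion-matrix counts. The function name `f1_and_gmean` and the example counts are illustrative assumptions; the formulas are the standard definitions.

```python
import math


def f1_and_gmean(tp: int, fp: int, fn: int, tn: int) -> tuple[float, float]:
    """Return (F1, G-mean) from confusion-matrix counts.

    The positive class is assumed to be the minority class (e.g. defaulters).
    """
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0          # sensitivity
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    # F1 is the harmonic mean of precision and recall.
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    # G-mean is the geometric mean of sensitivity and specificity.
    gmean = math.sqrt(recall * specificity)
    return f1, gmean


# Hypothetical counts: 50 defaulters among 1,000 applicants.
f1, gmean = f1_and_gmean(tp=40, fp=10, fn=10, tn=940)

# A classifier that labels everyone "good" is 95% accurate here,
# yet both measures drop to zero.
f1_trivial, gmean_trivial = f1_and_gmean(tp=0, fp=0, fn=50, tn=950)
```

The second call shows why plain accuracy is a poor guide on imbalanced credit data: ignoring the minority class entirely still yields high accuracy but an F1 and G-mean of zero.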
3. Range control-based class imbalance and optimized granular elastic net regression feature selection for credit risk assessment.
- Author
-
Amarnadh, Vadipina and Moparthi, Nageswara Rao
- Subjects
CREDIT analysis, CREDIT risk, FEATURE selection, BORROWING capacity, RISK assessment
- Abstract
Credit risk, stemming from the failure of a contractual party, is a significant variable in financial institutions. Assessing credit risk involves evaluating the creditworthiness of individuals, businesses, or entities to predict the likelihood of defaulting on financial obligations. While financial institutions categorize consumers based on creditworthiness, there is no universally defined set of attributes or indices. This research proposes Range control-based class imbalance and Optimized Granular Elastic Net regression (ROGENet) for feature selection in credit risk assessment. The dataset exhibits severe class imbalance, addressed using Range-Controlled Synthetic Minority Oversampling TEchnique (RCSMOTE). The balanced data undergo Granular Elastic Net regression with hybrid Gazelle sand cat Swarm Optimization (GENGSO) for feature selection. Elastic net, ensuring sparsity and grouping for correlated features, proves beneficial for assessing credit risk. ROGENet provides a detailed perspective on credit risk evaluation, surpassing conventional methods. The oversampling feature selection enhances the accuracy of minority class by 99.4, 99, 98.6 and 97.3%, respectively. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
4. The good, the bad and the tenant: Rental platforms renewing racial capitalism in the post-apartheid housing market.
- Author
-
Migozzi, Julien
- Abstract
This article examines how racial capitalism intersects with platform capitalism through the rise of rental platforms and corporate landlords in the post-apartheid housing market. Combining 18 months of fieldwork in Cape Town with the spatial analysis of sales and longitudinal census data, I demonstrate how rental platforms enabled the consolidation of the private rental sector and the emergence of corporate landlords through the classification of tenants centered upon credit scoring. To automate tenant screening solutions, rental platforms leveraged and extended the information dragnet knitted by credit bureaus. This dragnet of unprecedented depth and volume is built upon the infrastructures and devices that enabled the for-profit, racial classification of people, housing and neighborhoods during colonialism and apartheid, notably ID numbers. In the context of racialized indebtedness and housing inequalities engineered by racial property regimes, the use of platforms to sort the "good" from the "bad" tenant and manage rental portfolios shifts mechanisms of segregation and reproduces racialized patterns of capital accumulation across the post-apartheid city. The article argues that rental platforms extend the extractive logic of racial capitalism through two joint rentier mechanisms: the transformation of rental housing into a new asset class; the extraction and assetization of rental data. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
5. Method for Classifying Economic Agents Based on Neural Networks and Fuzzy Logic
- Author
-
Neskorodieva, Tetiana, Fedorov, Eugene, Nechyporenko, Olga, Neskorodieva, Anastasiia, Kacprzyk, Janusz, Series Editor, Gomide, Fernando, Advisory Editor, Kaynak, Okyay, Advisory Editor, Liu, Derong, Advisory Editor, Pedrycz, Witold, Advisory Editor, Polycarpou, Marios M., Advisory Editor, Rudas, Imre J., Advisory Editor, Wang, Jun, Advisory Editor, Kazymyr, Volodymyr, editor, Morozov, Anatoliy, editor, Palagin, Alexander, editor, Shkarlet, Serhiy, editor, Stoianov, Nikolai, editor, Vinnikov, Dmitri, editor, and Zheleznyak, Mark, editor
- Published
- 2024
- Full Text
- View/download PDF
6. New Paradigm in Financial Technology Using Machine Learning Techniques and Their Applications
- Author
-
Patnaik, Deepti, Patnaik, Srikanta, Kacprzyk, Janusz, Series Editor, Jain, Lakhmi C., Series Editor, Maglaras, Leandros A., editor, Das, Sonali, editor, Tripathy, Naliniprava, editor, and Patnaik, Srikanta, editor
- Published
- 2024
- Full Text
- View/download PDF
7. How Can Credit Scoring Benefit from Machine Learning? SWOT Analysis
- Author
-
Bentounsi, Oussama, Lahmini, Hajar Mouatassim, Kacprzyk, Janusz, Series Editor, Gomide, Fernando, Advisory Editor, Kaynak, Okyay, Advisory Editor, Liu, Derong, Advisory Editor, Pedrycz, Witold, Advisory Editor, Polycarpou, Marios M., Advisory Editor, Rudas, Imre J., Advisory Editor, Wang, Jun, Advisory Editor, Abraham, Ajith, editor, Bajaj, Anu, editor, Hanne, Thomas, editor, and Siarry, Patrick, editor
- Published
- 2024
- Full Text
- View/download PDF
8. Applications of Predictive Models in FinTech
- Author
-
Arnone, Gioia
- Published
- 2024
- Full Text
- View/download PDF
9. Modeling Automobile Credit Scoring Using Machine Learning Models
- Author
-
Yiğit, Pakize, Kacprzyk, Janusz, Series Editor, Gomide, Fernando, Advisory Editor, Kaynak, Okyay, Advisory Editor, Liu, Derong, Advisory Editor, Pedrycz, Witold, Advisory Editor, Polycarpou, Marios M., Advisory Editor, Rudas, Imre J., Advisory Editor, Wang, Jun, Advisory Editor, García Márquez, Fausto Pedro, editor, Jamil, Akhtar, editor, Hameed, Alaa Ali, editor, and Segovia Ramírez, Isaac, editor
- Published
- 2024
- Full Text
- View/download PDF
10. Deep Learning and Machine Learning Techniques for Credit Scoring: A Review
- Author
-
Demma Wube, Hana, Zekarias Esubalew, Sintayehu, Fayiso Weldesellasie, Firesew, Girma Debelee, Taye, Filipe, Joaquim, Editorial Board Member, Ghosh, Ashish, Editorial Board Member, Prates, Raquel Oliveira, Editorial Board Member, Zhou, Lizhu, Editorial Board Member, Debelee, Taye Girma, editor, Ibenthal, Achim, editor, Schwenker, Friedhelm, editor, and Megersa Ayano, Yehualashet, editor
- Published
- 2024
- Full Text
- View/download PDF
11. A Synthesis on Machine Learning for Credit Scoring: A Technical Guide
- Author
-
Akil, Siham, Sekkate, Sara, Adib, Abdellah, Kacprzyk, Janusz, Series Editor, Gomide, Fernando, Advisory Editor, Kaynak, Okyay, Advisory Editor, Liu, Derong, Advisory Editor, Pedrycz, Witold, Advisory Editor, Polycarpou, Marios M., Advisory Editor, Rudas, Imre J., Advisory Editor, Wang, Jun, Advisory Editor, Ben Ahmed, Mohamed, editor, Boudhir, Anouar Abdelhakim, editor, El Meouche, Rani, editor, and Karaș, İsmail Rakıp, editor
- Published
- 2024
- Full Text
- View/download PDF
12. Machine Learning in Finance Case of Credit Scoring
- Author
-
El Maanaoui, Driss, Jeaab, Khalid, Najmi, Hajare, Saoudi, Youness, Falloul, Moulay El Mehdi, Kacprzyk, Janusz, Series Editor, Gomide, Fernando, Advisory Editor, Kaynak, Okyay, Advisory Editor, Liu, Derong, Advisory Editor, Pedrycz, Witold, Advisory Editor, Polycarpou, Marios M., Advisory Editor, Rudas, Imre J., Advisory Editor, Wang, Jun, Advisory Editor, Farhaoui, Yousef, editor, Hussain, Amir, editor, Saba, Tanzila, editor, Taherdoost, Hamed, editor, and Verma, Anshul, editor
- Published
- 2024
- Full Text
- View/download PDF
13. Credit Risk Management in Microfinance: Application of Non-repayment Prediction Models
- Author
-
Nejjar, Chaymae, Kaicer, Mohammed, Haimer, Sara El, Idhmad, Azzeddine, Essairh, Loubna, Kacprzyk, Janusz, Series Editor, Gomide, Fernando, Advisory Editor, Kaynak, Okyay, Advisory Editor, Liu, Derong, Advisory Editor, Pedrycz, Witold, Advisory Editor, Polycarpou, Marios M., Advisory Editor, Rudas, Imre J., Advisory Editor, Wang, Jun, Advisory Editor, Ezziyyani, Mostafa, editor, and Balas, Valentina Emilia, editor
- Published
- 2024
- Full Text
- View/download PDF
14. Credit Risk Scoring: A Stacking Generalization Approach
- Author
-
Raimundo, Bernardo, Bravo, Jorge M., Kacprzyk, Janusz, Series Editor, Gomide, Fernando, Advisory Editor, Kaynak, Okyay, Advisory Editor, Liu, Derong, Advisory Editor, Pedrycz, Witold, Advisory Editor, Polycarpou, Marios M., Advisory Editor, Rudas, Imre J., Advisory Editor, Wang, Jun, Advisory Editor, Rocha, Alvaro, editor, Adeli, Hojjat, editor, Dzemyda, Gintautas, editor, Moreira, Fernando, editor, and Colla, Valentina, editor
- Published
- 2024
- Full Text
- View/download PDF
15. The Rise of AI and ML in Financial Technology: An In-depth Study of Trends and Challenges
- Author
-
Jain, Rahul, Vanzara, Rakesh, Sarvakar, Ketan, Angrisani, Leopoldo, Series Editor, Arteaga, Marco, Series Editor, Chakraborty, Samarjit, Series Editor, Chen, Jiming, Series Editor, Chen, Shanben, Series Editor, Chen, Tan Kay, Series Editor, Dillmann, Rüdiger, Series Editor, Duan, Haibin, Series Editor, Ferrari, Gianluigi, Series Editor, Ferre, Manuel, Series Editor, Jabbari, Faryar, Series Editor, Jia, Limin, Series Editor, Kacprzyk, Janusz, Series Editor, Khamis, Alaa, Series Editor, Kroeger, Torsten, Series Editor, Li, Yong, Series Editor, Liang, Qilian, Series Editor, Martín, Ferran, Series Editor, Ming, Tan Cher, Series Editor, Minker, Wolfgang, Series Editor, Misra, Pradeep, Series Editor, Mukhopadhyay, Subhas, Series Editor, Ning, Cun-Zheng, Series Editor, Nishida, Toyoaki, Series Editor, Oneto, Luca, Series Editor, Panigrahi, Bijaya Ketan, Series Editor, Pascucci, Federica, Series Editor, Qin, Yong, Series Editor, Seng, Gan Woon, Series Editor, Speidel, Joachim, Series Editor, Veiga, Germano, Series Editor, Wu, Haitao, Series Editor, Zamboni, Walter, Series Editor, Zhang, Junjie James, Series Editor, Tan, Kay Chen, Series Editor, Kumar, Amit, editor, and Mozar, Stefan, editor
- Published
- 2024
- Full Text
- View/download PDF
16. Incremental Machine Learning-Based Approach for Credit Scoring in the Age of Big Data
- Author
-
Museba, Tinofirei, Moloi, Tankiso, editor, and George, Babu, editor
- Published
- 2024
- Full Text
- View/download PDF
17. Il credit scoring e la protezione dei dati personali: commento alle sentenze della Corte di giustizia dell’Unione europea del 7 dicembre 2023
- Author
-
Valeria Pietrella and Stefania Racioppi
- Subjects
Court of Justice of the European Union, credit scoring, personal data protection, judicial review, automated processing, Law, Cybernetics, Q300-390
- Abstract
The article analyses the judgments of the Court of Justice of the European Union of 7 December 2023 (Joined Cases C-26/22 and C-64/22, and Case C-634/21). The decisions offer an opportunity to reflect on two main themes: the scope of judicial review of a complaint decision adopted by a supervisory authority, and the lawfulness of the collection and processing, including automated processing, of personal data. The forms of protection attached to the processing of personal data, which operate at several levels and in different ways, together outline a system designed to guarantee the highest protection of data which, also from the standpoint of balancing interests, appears (at least in light of the judgments under review) to prevail over the commercial interests connected with the use of that data.
- Published
- 2024
- Full Text
- View/download PDF
18. Credit scoring using machine learning and deep Learning-Based models
- Author
-
Sami Mestiri
- Subjects
credit scoring, machine learning, artificial intelligence, model comparison, personal loan, Finance, HG1-9999, Statistics, HA1-4737
- Abstract
Credit scoring is a useful tool for assessing customers' repayment capability. The purpose of this paper is to compare the predictive abilities of six credit scoring models: Linear Discriminant Analysis (LDA), Random Forests (RF), Logistic Regression (LR), Decision Trees (DT), Support Vector Machines (SVM) and Deep Neural Network (DNN). To compare these models, an empirical study was conducted using a sample of 688 observations and twelve variables. The performance of these models was analyzed using three measures: Accuracy rate, F1 score, and Area Under Curve (AUC). In summary, machine learning techniques exhibited greater accuracy in predicting loan defaults than traditional statistical models.
- Published
- 2024
- Full Text
- View/download PDF
19. A Novel Modified Binning and Logistics Regression to Handle Shifting in Credit Scoring.
- Author
-
Anggodo, Yusuf Priyo and Girsang, Abba Suganda
- Subjects
CREDIT risk, VENTURE capital, FINANCIAL risk, FINANCIAL technology, EXECUTIVES, PERSONAL loans, EMERGING markets, COMMERCIAL loans
- Abstract
The development of financial technology (Fintech) in emerging economies such as Indonesia has been rapid in the last few years, opening a great potential for loan businesses, from venture capital to micro and personal loans. To survive in such competitive markets, new companies need a robust credit-scoring model. However, building a reliable model requires large stable data. The challenge is that datasets are often small, covering only a few months (short-period datasets). Therefore, this study proposes a modified binning method, namely changing a variable's values into two groups with the smallest distribution differences possible. Modified binning can maintain data trends to avoid future shifting. The simulation was conducted using a real dataset from Indonesian Fintech, comprising 44,917 borrower-level observations with 396 variables. To match the actual conditions, the first three months of data were allocated for modeling and the remaining for testing. Implementing modified binning and logistics regression to testing data results in a more stable score band than standard binning. Compared with other classifier methods, the proposed method obtained the best AUC results on the testing data (0.73). In addition, the proposed method is highly applicable as it can provide a straightforward explanation to upper management or regulators. It is practical to use in real-case financial technology with short-period problems. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
20. The shape of an ROC curve in the evaluation of credit scoring models.
- Author
-
Kochański, Błażej
- Subjects
RECEIVER operating characteristic curves, CREDIT scoring systems, PARAMETERS (Statistics), ECONOMIC activity, CREDIT analysis
- Abstract
The AUC, i.e. the area under the receiver operating characteristic (ROC) curve, or its scaled version, the Gini coefficient, are the standard measures of the discriminatory power of credit scoring. Using binormal ROC curve models, we show how the shape of the curves affects the economic benefits of using scoring models with the same AUC. Based on the results, we propose that the shape parameter of the fitted ROC curve is reported alongside its AUC/Gini whenever the quality of a scorecard is discussed. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
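The abstract of result 20 calls the Gini coefficient a scaled version of the AUC. The standard relationship is Gini = 2 × AUC − 1. The sketch below computes the AUC as the share of (good, bad) pairs in which the good account receives the higher score, then rescales it. The function names and the scores are illustrative assumptions, and the binormal shape parameter discussed in the paper is not modelled here.

```python
def auc_from_scores(scores_good, scores_bad):
    """AUC as the probability that a random good account outscores
    a random bad account; ties count as half a win."""
    wins = 0.0
    for g in scores_good:
        for b in scores_bad:
            if g > b:
                wins += 1.0
            elif g == b:
                wins += 0.5
    return wins / (len(scores_good) * len(scores_bad))


def gini_from_auc(auc):
    """Rescale AUC from [0.5, 1] to the Gini range [0, 1]."""
    return 2.0 * auc - 1.0


# Hypothetical scorecard outputs (higher score = lower risk).
goods = [700, 650, 620, 580]
bads = [600, 560, 540]

auc = auc_from_scores(goods, bads)   # 11 of 12 pairs are ordered correctly
gini = gini_from_auc(auc)
```

Two scorecards can share this single AUC value while separating goods from bads in different score ranges, which is the paper's motivation for reporting a shape parameter alongside AUC/Gini.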
21. An Age–Period–Cohort Framework for Profit and Profit Volatility Modeling.
- Author
-
Breeden, Joseph L.
- Subjects
CREDIT risk, DATA structures, DISEASE risk factors, PANEL analysis, PROFITABILITY
- Abstract
The greatest source of failure in portfolio analytics is not individual models that perform poorly, but rather an inability to integrate models quantitatively across management functions. The separable components of age–period–cohort models provide a framework for integrated credit risk modeling across an organization. Using a panel data structure, credit risk scores can be integrated with an APC framework using either logistic regression or machine learning. Such APC scores for default, payoff, and other key rates fit naturally into forward-looking cash flow estimates. Given an economic scenario, every applicant at the time of origination can be assigned profit and profit volatility estimates so that underwriting can truly be account-level. This process optimizes the most fallible part of underwriting, which is setting cutoff scores and assigning loan pricing and terms. This article provides a summary of applications of APC models across portfolio management roles, with a description of how to create the models to be directly integrated. As a consequence, cash flow calculations are available for each account, and cutoff scores can be set directly from portfolio financial targets. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
22. REVISITING DISTANCE METRICS IN k-NEAREST NEIGHBORS ALGORITHMS Implications for Sovereign Country Credit Rating Assessments.
- Author
-
CETIN, Ali Ihsan and BUYUKLU, Ali Hakan
- Subjects
RATINGS & rankings of public debts, K-nearest neighbor classification, CREDIT analysis, CLASSIFICATION algorithms, EUCLIDEAN metric, EUCLIDEAN distance, EUCLIDEAN algorithm
- Abstract
The k-nearest neighbors (k-NN) algorithm, a fundamental machine learning technique, typically employs the Euclidean distance metric for proximity-based data classification. This research focuses on the feature importance infused k-NN model, an advanced form of k-NN. Diverging from the uniformly weighted Euclidean distance of the traditional algorithm, feature importance infused k-NN introduces a specialized distance weighting system. This system emphasizes critical features while reducing the impact of lesser ones, thereby enhancing classification accuracy. Empirical studies indicate a 1.7% average accuracy improvement of the proposed model over the conventional model, attributed to its effective handling of feature importance in distance calculations. Notably, a significant positive correlation was observed between the disparity in feature importance levels and the model's accuracy, highlighting the proposed model's proficiency in handling variables with limited explanatory power. These findings suggest the proposed model's potential and open avenues for future research, particularly in refining its feature importance weighting mechanism, broadening dataset applicability, and examining its compatibility with different distance metrics. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
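The abstract of result 22 describes a k-NN variant whose distance emphasizes important features. The exact weighting scheme is not given in the abstract, so the following is only a sketch under an assumed form: a Euclidean distance in which each squared difference is multiplied by a per-feature weight. The names `weighted_distance` and `knn_predict`, the weights, and the toy data are all hypothetical.

```python
import math
from collections import Counter


def weighted_distance(x, y, weights):
    """Euclidean distance with one non-negative weight per feature.

    With all weights equal to 1 this is the ordinary Euclidean distance.
    """
    return math.sqrt(sum(w * (a - b) ** 2 for a, b, w in zip(x, y, weights)))


def knn_predict(train_X, train_y, query, weights, k=3):
    """Majority vote among the k training points nearest to `query`."""
    ranked = sorted(
        zip(train_X, train_y),
        key=lambda pair: weighted_distance(pair[0], query, weights),
    )
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


# Toy data: feature 0 separates the classes, feature 1 is noise.
train_X = [(1.0, 9.0), (1.2, 1.0), (0.9, 5.0),
           (3.0, 1.1), (3.2, 8.0), (2.9, 4.0)]
train_y = ["A", "A", "A", "B", "B", "B"]
query = (1.1, 1.0)

uniform = knn_predict(train_X, train_y, query, weights=(1.0, 1.0))
infused = knn_predict(train_X, train_y, query, weights=(1.0, 0.01))
```

In this toy example the uniform weighting lets the noisy feature pull in two class-B neighbours, while down-weighting it recovers class A. In practice the weights would come from some feature-importance estimate; how the paper derives them is not stated in the abstract.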
23. Personal factors as determinants of the risk rating for SME investment.
- Author
-
Jurado, Antonio, Sánchez-Oro Sánchez, Marcelo, Robina-Ramirez, Rafael, and Jimenez-Naranjo, Hector V.
- Subjects
INDUSTRIAL management, BUSINESSPEOPLE, CREDIT risk, SMALL business, CREDIT scoring systems, LOANS, STRUCTURAL equation modeling
- Abstract
What variables indicate whether a small or medium enterprise (SME) applying for financing from a development bank is a good loan candidate? Structural equation modeling applied to a set of variables can provide the necessary insights. This paper analyzes 407 SMEs that applied for development loans in Pichincha Province, Ecuador. Rather than specific economic/financial quantitative variables often used in credit scoring models, we employed qualitative variables to study creditworthiness. The structural equation methodology can verify whether a client is creditworthy based on company management, product characteristics, and contextual market aspects. Another key contribution is the finding that an entrepreneur's personal/professional traits are the primary determinant for granting this type of loan. The results have theoretical and practical implications that could enhance the limited empirical research in this field to date. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
24. Credit Rating of Listed Companies Based on Financial Reporting Information in Banks
- Author
-
Mohammad Jahangirian, Farzin Rezayi, and Reza Ehteshamrasi
- Subjects
credit scoring, banks, financial reporting information, Business, HF5001-6182, Accounting. Bookkeeping, HF5601-5689
- Abstract
Accurate and condition-based validation will increase the value of banks, and if this is not done well, banks will face the risk of bankruptcy. The purpose of this research is to evaluate the usefulness of financial reporting information in modeling the credit rating of bank customers in listed companies. The methods used to validate the model in listed companies were grounded theory, DEMATEL, and regression. In the first stage, four main axes for the credit rating of customers were identified by asking the opinion of 10 experts: (1) financial criteria, (2) non-financial criteria, (3) corporate governance criteria, and (4) market criteria. Three hypotheses were identified in the DEMATEL phase. First hypothesis: there is a significant positive relationship between financial factors and bank facilities received by companies admitted to the Tehran Stock Exchange. Second hypothesis: there is a significant positive relationship between the company's market information and the bank facilities received by the companies admitted to the Tehran Stock Exchange. Third hypothesis: there is a significant positive relationship between non-financial factors and bank facilities received by companies listed on the Tehran Stock Exchange. The findings of the regression analysis did not reject the above hypotheses in 64 companies during the 5 years from 1396 to 1400. Banks should pay attention to the characteristics of companies in the existing conditions when providing facilities.
- Published
- 2024
- Full Text
- View/download PDF
25. Counterfactual Explanations With Multiple Properties in Credit Scoring
- Author
-
Xolani Dastile and Turgay Celik
- Subjects
Counterfactual explanations, credit scoring, optimization, eXplainable Artificial Intelligence (XAI), Electrical engineering. Electronics. Nuclear engineering, TK1-9971
- Abstract
EXplainable Artificial Intelligence (XAI) aims to reveal the reasons behind predictions from non-transparent classifiers. Explanations of automated decisions are important in critical domains such as finance, legal, and health. As a result, researchers and practitioners in recent years have actively worked on developing techniques that explain decisions from machine learning algorithms. For instance, an explanation technique called counterfactual explanation has recently been gaining traction in XAI. The interest in counterfactual explanations stems from the ability of the explanations to reveal what could have been different to achieve a desired outcome, as opposed to only highlighting important features. For instance, if a customer’s loan application is denied by the bank, a counterfactual will indicate the changes required for the customer to qualify for the loan in the future. For a counterfactual to be considered effective, several counterfactual properties must hold. This paper proposes a novel optimization formulation designed to generate counterfactual explanations that possess multiple properties concurrently. The efficacy of the proposed method is assessed on a publicly available credit dataset. The results showed a trade-off between validity and sparsity, which are both parts of a suite of counterfactual properties. Furthermore, the results showed that our proposed approach compromises validity to some degree but strikes a good balance between validity and sparsity.
- Published
- 2024
- Full Text
- View/download PDF
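The abstract of result 25 refers to a trade-off between validity and sparsity of counterfactual explanations. As a rough illustration (not the paper's optimization formulation), validity can be read as "the counterfactual actually receives the desired decision" and sparsity as "few features were changed". The scoring rule `toy_model`, the feature layout, and the candidate counterfactuals below are hypothetical.

```python
def sparsity(original, counterfactual, tol=1e-9):
    """Number of features whose value differs from the original."""
    return sum(1 for o, c in zip(original, counterfactual) if abs(o - c) > tol)


def is_valid(model, counterfactual, desired=1):
    """True when the model gives the counterfactual the desired outcome."""
    return model(counterfactual) == desired


def toy_model(x):
    """Hypothetical approval rule: approve (1) when income - 0.5 * debt >= 50."""
    income, debt, _age = x
    return 1 if income - 0.5 * debt >= 50 else 0


applicant = (40.0, 20.0, 30.0)      # denied: 40 - 10 = 30
candidates = {
    "raise income": (60.0, 20.0, 30.0),           # 1 change, approved
    "income and debt": (55.0, 10.0, 30.0),        # 2 changes, approved
    "small income bump": (45.0, 20.0, 30.0),      # 1 change, still denied
}

report = {
    name: (is_valid(toy_model, cf), sparsity(applicant, cf))
    for name, cf in candidates.items()
}
```

Each entry of `report` pairs a validity flag with a count of changed features, so a candidate can be sparse yet invalid ("small income bump") or valid but less sparse ("income and debt").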
26. An Efficient and Scalable Byzantine Fault-Tolerant Consensus Mechanism Based on Credit Scoring and Aggregated Signatures
- Author
-
Shihua Tong, Jibing Li, and Wei Fu
- Subjects
Blockchain, consensus mechanisms, PBFT, credit scoring, aggregated signatures, Electrical engineering. Electronics. Nuclear engineering, TK1-9971
- Abstract
Practical Byzantine Fault Tolerance (PBFT), a classic consensus algorithm in blockchain technology, is extensively used in consortium blockchain networks. However, it is challenged by issues such as low consensus efficiency, poor scalability, inability to guarantee throughput with large-scale node access, and complex communication processes. To solve these problems, this paper proposes an improved PBFT consensus mechanism based on credit scoring and aggregated signatures, i.e., the CA-PBFT algorithm. First, the algorithm designs the node credit scoring mechanism, adds the coordination node in the original algorithm model, stipulates the node state and functional limitations, and realizes the dynamic joining and exiting of the nodes, to solve the low efficiency of the PBFT algorithm during the consensus process and the problem of not supporting the dynamic joining and exiting of the nodes; at the same time, the signature scheme based on the BLS aggregated signature is designed, which reduces the length of the signature and simplifies the signing process, to solve the problem of the node’s signature taking up too much space during the consensus process, which affects the efficiency of the signature validation as well as the efficiency of the signature construction. Experimental results show that this consensus mechanism enables an efficient, secure, and scalable consensus process with low resource and computational costs.
- Published
- 2024
- Full Text
- View/download PDF
27. Optimizing Ensemble Learning to Reduce Misclassification Costs in Credit Risk Scorecards.
- Author
-
Martin, John, Taheri, Sona, and Abdollahian, Mali
- Subjects
CREDIT risk, MACHINE learning, FINANCIAL institutions, K-nearest neighbor classification, COST
- Abstract
Credit risk scorecard models are utilized by lending institutions to optimize decisions on credit approvals. In recent years, ensemble learning has often been deployed to reduce misclassification costs in credit risk scorecards. In this paper, we compared the risk estimation of 26 widely used machine learning algorithms based on commonly used statistical metrics. The best-performing algorithms were then used for model selection in ensemble learning. For the first time, we proposed financial criteria that assess the impact of losses associated with both false positive and false negative predictions to identify optimal ensemble learning. The German Credit Dataset (GCD) is augmented with simulated financial information according to a hypothetical mortgage portfolio observed in UK, European and Australian banks to enable the assessment of losses arising from misclassification costs. The experimental results using the simulated GCD show that the best predictive individual algorithm with the accuracy of 0.87, Gini of 0.88 and Area Under the Receiver Operating Curve of 0.94 was the Generalized Additive Model (GAM). The ensemble learning method with the lowest misclassification cost was the combination of Random Forest (RF) and K-Nearest Neighbors (KNN), totaling USD 417 million in costs (USD 230 million for default costs and USD 187 million for opportunity costs) compared to the costs of the GAM (USD 487 million, USD 287 million and USD 200 million, respectively). Implementing the proposed financial criteria has led to a significant USD 70 million reduction in misclassification costs derived from a small sample. Thus, the lending institutions' profit would considerably rise as the number of submitted credit applications for approval increases. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
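The abstract of result 27 ranks models by the financial cost of their errors rather than by accuracy alone. A minimal sketch of that idea, assuming a flat cost per error type (the paper's simulated mortgage-level losses are not reproduced here): a missed defaulter incurs a default cost and a rejected good applicant incurs an opportunity cost. The function name, label convention, and per-error costs are illustrative assumptions; the totals in the final lines are the figures quoted in the abstract.

```python
def misclassification_cost(y_true, y_pred, default_cost, opportunity_cost):
    """Total cost of errors, with label 1 = default and 0 = good.

    False negative: an actual defaulter predicted as good -> default cost.
    False positive: an actual good predicted as default   -> opportunity cost.
    """
    false_neg = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    false_pos = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    return false_neg * default_cost + false_pos * opportunity_cost


# Hypothetical labels and flat per-error costs.
y_true = [1, 1, 0, 0, 0, 1]
y_pred = [1, 0, 0, 1, 0, 0]
cost = misclassification_cost(y_true, y_pred,
                              default_cost=100, opportunity_cost=20)

# Consistency check on the totals quoted in the abstract (USD million).
rf_knn_total = 230 + 187      # default + opportunity costs, RF + KNN ensemble
gam_total = 287 + 200         # default + opportunity costs, GAM
saving = gam_total - rf_knn_total
```

The last three lines confirm that the abstract's components add up to the stated totals of USD 417 million and USD 487 million, and that the difference is the reported USD 70 million.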
28. The hazards of delivering a public loan guarantee scheme: An analysis of borrower and lender characteristics.
- Author
-
Cowling, Marc, Wilson, Nick, Nightingale, Paul, and Kacer, Marek
- Subjects
MONEYLENDERS, FINANCIAL institutions, LOANS, BANKING industry, CORPORATIONS, SMALL business, SURETYSHIP & guaranty
- Abstract
Using data between 2009 and 2020, we provide a detailed description of the borrowers within the Enterprise Finance Guarantee (EFG) loan portfolio, analyse time to default and how it differs across lender types. For limited companies, we match additional financial and non-financial data from public and proprietary databases and profile the characteristics of EFG companies within the population of limited companies. Employing hazard models we find loans granted to unincorporated businesses by the medium-sized financial institutions are associated with a much lower hazard than those provided by smaller local lending institutions and not-for-profit agencies. Moreover, we find some evidence that loans to limited companies, issued by the big UK banking groups, have a significantly lower default than those from medium-sized financial institutions. Large banks screen out high-risk firms. We argue that smaller lenders are able to price the risks rejected by the larger banks, using a wider range of credit information. JEL codes: G01, G21, L52, D25 [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
29. Implementing and analyzing fairness in banking credit scoring.
- Author
-
Mariscal, Charlene, Yustiawan, Yoga, Rochim, Fauzy Caesar, and Tanuar, Evawaty
- Subjects
MACHINE learning, BANK loans, FAIRNESS, CONSCIOUSNESS raising
- Abstract
Decisions made by machine learning models are based mostly on the historical data used to train them. Because that data may contain societal bias, there is growing awareness that discrimination in machine learning should be eliminated. The financial industry uses credit scoring as a reference to reflect the customer risk profile. To achieve fairness in the model, this paper tries to: (1) assess bias and (2) improve fairness in machine learning models with three bias mitigation methodologies. This study shows that there is a trade-off between improving fairness and preserving performance. Implementing post-processing methods, for example, Grid Search performs best. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
30. A credit scoring ensemble model incorporating fuzzy clustering particle swarm optimization algorithm.
- Author
-
Qin, Xiwen, Ji, Xing, Zhang, Siqi, and Xu, Dingxin
- Subjects
PARTICLE swarm optimization, OUTLIER detection, CLUSTERING of particles, FUZZY clustering technique, CONSUMER behavior, CREDIT risk, CONSUMER lending, RANDOM forest algorithms, FUZZY algorithms
- Abstract
The emergence of credit has generated a wealth of data on consumer lending behavior. In recent years, financial institutions have also started to use such data to make informed lending decisions based on fine-grained customer data, but conventional risk assessment models are inadequate in meeting the risk control requirements of the financial industry. Therefore, this paper proposes a credit scoring ensemble model incorporating fuzzy clustering particle swarm optimization (PSO) algorithm to obtain better credit risk prediction capability. First, a weighted outlier detection method based on the Induced Ordered Weighted Average Operator is proposed to preprocess the data to reduce noisy data's misleading effect on model training. Then, an undersampling method combined with fuzzy clustering PSO is proposed to overcome the negative effect of category imbalance on model training by resampling the data. In addition, a hyperparameter optimization framework is introduced to adaptively adjust important parameters in the ensemble model considering the impact of parameter settings on the training performance of the model. Based on the evaluation metrics of F-score, AUC, and Kappa coefficient, an empirical analysis was conducted on five credit risk datasets. The results show that the proposed method outperforms the comparative model with an improvement of 10% to 50% in terms of F-score and AUC. The highest achieved F-score is 0.9488, and the maximum AUC is 0.9807, demonstrating the effectiveness of the proposed method. The kappa coefficient results indicate a high level of consistency in the predicted classification results of the model. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
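The abstract of record 30 reports F-score and Kappa coefficient as evaluation metrics. As a reading aid, the sketch below shows how both are commonly computed from a binary confusion matrix. It is not taken from the paper: the function names and the example counts are hypothetical, and returning 0.0 when chance agreement equals 1 is a convention chosen here.

```python
def f1_score(tp: int, fp: int, fn: int) -> float:
    """F1 score: harmonic mean of precision and recall."""
    denom = 2 * tp + fp + fn
    return 0.0 if denom == 0 else 2 * tp / denom


def cohen_kappa(tp: int, fp: int, fn: int, tn: int) -> float:
    """Cohen's kappa: agreement between predictions and labels beyond chance."""
    n = tp + fp + fn + tn
    if n == 0:
        raise ValueError("confusion matrix is empty")
    observed = (tp + tn) / n
    # Chance agreement: product of marginal rates, summed over both classes.
    p_pos = ((tp + fn) / n) * ((tp + fp) / n)
    p_neg = ((tn + fp) / n) * ((tn + fn) / n)
    expected = p_pos + p_neg
    if expected == 1.0:
        # Degenerate case (only one class present); kappa is undefined,
        # so 0.0 is returned by convention in this sketch.
        return 0.0
    return (observed - expected) / (1.0 - expected)


# Hypothetical counts: 40 true positives, 10 false positives,
# 10 false negatives, 40 true negatives.
example_f1 = f1_score(tp=40, fp=10, fn=10)
example_kappa = cohen_kappa(tp=40, fp=10, fn=10, tn=40)
```

With these counts, observed agreement is 0.8 and chance agreement is 0.5, so kappa is (0.8 - 0.5) / (1 - 0.5).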
31. Quantum Optimized Cost Based Feature Selection and Credit Scoring for Mobile Micro-financing.
- Author
-
Chen, Chi Ming, Tso, Geoffrey Kwok Fai, and He, Kaijian
- Subjects
CREDIT risk ,FEATURE selection ,CREDIT scoring systems ,WRAPPERS ,EVOLUTIONARY algorithms ,LOANS ,QUANTUM gates ,BUDGET - Abstract
Mobile e-commerce has grown rapidly in the last decade because of the development of mobile network services, computing capabilities and big data applications. Financial institutions have been undergoing a fundamental transformation in credit risk management, particularly with respect to traditional credit policy, which is now inadequate for accurately evaluating an individual's credit risk profile in a timely manner. A large-scale dataset is constructed that captures the detailed mobile usage of 450,722 anonymous mobile users, with a 28-month loan history and mobile behavior on both iOS and Android. This dataset can add value for credit scoring, in terms of better accuracy and lower feature acquisition cost, when combined with a cost-based quantum-inspired evolutionary algorithm (QIEA) feature selection method. The QIEA adopts a quantum-based individual representation and a quantum rotation gate operator to improve the feature exploration capability of a conventional genetic algorithm (GA). The expected feature yield fitness function introduced in the QIEA is able to identify cost-effective feature subsets. Experimental results show that the quantum-based method achieves good predictive performance even with only 70–80% of the number of features selected by GAs, and hence achieves lower feature acquisition costs under budget constraints. Additionally, computational time can be reduced by 30–60% compared with GAs, depending on the feature set size. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
32. A New Discrete Learning-Based Logistic Regression Classifier for Bankruptcy Prediction.
- Author
-
Khashei, Mehdi, Etemadi, Sepideh, and Bakhtiarvand, Negar
- Subjects
COST functions ,PROCESS capability ,BANKRUPTCY ,INVESTORS ,LOGISTIC regression analysis ,BOND market - Abstract
Credit scoring or predicting bankruptcy is among the most crucial techniques for identifying high-risk and low-risk credit situations. Accordingly, enhancing the accuracy of bankruptcy prediction methods decreases the risk of inappropriate financial decisions. Also, increasing the accuracy of credit scoring models brings significant benefits such as improved turnover, credit market growth, proper and efficient allocation of financial resources, and sustained improvement of the profits of banks, investors, funds, and governments. Various statistical classification methods have been developed in the literature with different features and characteristics for more accurate bankruptcy prediction. However, despite all appearance differences in statistical classification approaches, they all adhere to a common idea and concept in their training procedures. The basic operation logic in whole-developed statistical classification methods focuses on maximizing a continuous distance-based cost function to yield the highest performance. Despite it being a common and frequently used procedure for classification purposes, it is an unreasonable and inefficient manner to achieve maximum accuracy in a discrete classification field. In this paper, a new discrete direction-based Logistic Regression that is a common statistical classifier method for bankruptcy forecasting is proposed. In the proposed Logistic Regression, in contrast to all traditionally developed statistical classifiers, the compatibility of the cost function and the training procedure is considered. While it can be shown overall that the performance of the presented discrete direction-based classifier will not be inferior to its continuous counterpart, an evaluation of the suggested classifier is conducted to ascertain its superiority. For this purpose, three credit scoring datasets are considered to assess the classification rate of the presented classifier. 
Empirical outcomes demonstrate that, as pre-expected, in all cases, the model put forward can attain a superior performance compared to conventional alternatives. These findings clearly demonstrated the significant influence of the consistency between the cost function and the training process on the classification capability, a consideration absent in any of the traditional statistical classification procedures. Consequently, the presented Logistic Regression can be considered an efficient alternative for credit scoring purposes to achieve more accurate results. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
33. Credit scoring and risk management in islamic banking: the case of Al Etihad Credit Bureau.
- Author
-
Alhammadi, Mohamed Abdulraheem Ahmed, Ibañez-Fernandez, Alberto, and Vergara-Romero, Arnaldo
- Abstract
Copyright of Revista Venezolana de Gerencia (RVG) is the property of Revista de Filosofia-Universidad del Zulia and its content may not be copied or emailed to multiple sites or posted to a listserv without the copyright holder's express written permission. However, users may print, download, or email articles for individual use. This abstract may be abridged. No warranty is given about the accuracy of the copy. Users should refer to the original published version of the material for the full abstract. (Copyright applies to all Abstracts.)
- Published
- 2024
- Full Text
- View/download PDF
34. CREDIT SCORING COMO TRATAMIENTO DE DATOS PERSONALES A LA LUZ DEL RGPD. ANÁLISIS DE SU FINALIDAD E INFLUENCIA EN LOS POSIBLES USOS SECUNDARIOS DE LOS DATOS.
- Author
-
Campos Rivera, Gonzalo
- Subjects
GENERAL Data Protection Regulation, 2016 ,CONSUMER credit ,BANKING industry ,BORROWING capacity ,LOANS ,PERSONALLY identifiable information - Abstract
Copyright of Revista de Derecho UNED is the property of Editorial UNED and its content may not be copied or emailed to multiple sites or posted to a listserv without the copyright holder's express written permission. However, users may print, download, or email articles for individual use. This abstract may be abridged. No warranty is given about the accuracy of the copy. Users should refer to the original published version of the material for the full abstract. (Copyright applies to all Abstracts.)
- Published
- 2024
35. CJEU: The Rating of a Natural Person's Creditworthiness by a Credit Rating Agency Constitutes Profiling and Can Be an Automated Decision under Article 22 GDPR.
- Author
-
Horstmann, Jan
- Subjects
GENERAL Data Protection Regulation, 2016 ,PERSONALLY identifiable information ,ONLINE profiling ,DATA protection - Abstract
Case C-634/21 OQ v Land Hessen (Scoring), Judgment of the Court of Justice of the European Union (First Chamber) of 7 December 2023. Article 22(1) of Regulation (EU) 2016/679 (General Data Protection Regulation) must be interpreted as meaning that the automated establishment, by a credit information agency, of a probability value based on personal data relating to a person and concerning his or her ability to meet payment commitments in the future constitutes 'automated individual decision-making' within the meaning of that provision, where a third party, to which that probability value is transmitted, draws strongly on that probability value to establish, implement or terminate a contractual relationship with that person. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
36. Application and optimization of deep learning in the credit score of auto finance
- Author
-
Wang Zhan
- Subjects
brnn ,logistic regression ,extreme gradient boosting tree ,deep learning ,credit scoring ,68m11 ,Mathematics ,QA1-939 - Abstract
In this paper, a credit scoring integration model incorporating BRNN is used to study the credit scoring problem in automobile finance. Aiming at the problems of existing credit scoring models constructed with shallow architecture and the unidirectional limitation of RNN itself, this paper introduces a BRNN model that superimposes RNN models in two directions. The potential relationship between each credit feature is mined through logistic regression, extreme gradient boosting tree, and bidirectional recurrent neural network algorithms, and the final prediction output is linked to the customer’s overall credit to improve the prediction accuracy. In this paper, we study the application of a credit scoring model based on the improved BRNN model for an auto finance company. Data preprocessing techniques and feature screening methods are used to improve the BRNN model and construct the credit scoring model for auto finance at Company A. The BRNN model is the basis for Company A’s credit scoring model. Based on the comparison with other models, it is concluded that the automobile finance credit scoring IBRNN model constructed based on the improved BRNN model in this paper has an accuracy of 89.6% in classifying the user finance data of Company A on different datasets, which is a significant improvement compared with the other five models.
- Published
- 2024
- Full Text
- View/download PDF
37. An Age–Period–Cohort Framework for Profit and Profit Volatility Modeling
- Author
-
Joseph L. Breeden
- Subjects
credit scoring ,survival models ,age–period–cohort ,profitability ,Mathematics ,QA1-939 - Abstract
The greatest source of failure in portfolio analytics is not individual models that perform poorly, but rather an inability to integrate models quantitatively across management functions. The separable components of age–period–cohort models provide a framework for integrated credit risk modeling across an organization. Using a panel data structure, credit risk scores can be integrated with an APC framework using either logistic regression or machine learning. Such APC scores for default, payoff, and other key rates fit naturally into forward-looking cash flow estimates. Given an economic scenario, every applicant at the time of origination can be assigned profit and profit volatility estimates so that underwriting can truly be account-level. This process optimizes the most fallible part of underwriting, which is setting cutoff scores and assigning loan pricing and terms. This article provides a summary of applications of APC models across portfolio management roles, with a description of how to create the models to be directly integrated. As a consequence, cash flow calculations are available for each account, and cutoff scores can be set directly from portfolio financial targets.
- Published
- 2024
- Full Text
- View/download PDF
38. A novel deep learning approach to enhance creditworthiness evaluation and ethical lending practices in the economy
- Author
-
Qian, Xiaoyan, Cai, Helen Huifen, Innab, Nisreen, Wang, Danni, Ciano, Tiziana, and Ahmadian, Ali
- Published
- 2024
- Full Text
- View/download PDF
39. Optimizing Ensemble Learning to Reduce Misclassification Costs in Credit Risk Scorecards
- Author
-
John Martin, Sona Taheri, and Mali Abdollahian
- Subjects
credit scoring ,ensemble learning ,financial performance criteria ,statistical metrics ,Mathematics ,QA1-939 - Abstract
Credit risk scorecard models are utilized by lending institutions to optimize decisions on credit approvals. In recent years, ensemble learning has often been deployed to reduce misclassification costs in credit risk scorecards. In this paper, we compared the risk estimation of 26 widely used machine learning algorithms based on commonly used statistical metrics. The best-performing algorithms were then used for model selection in ensemble learning. For the first time, we proposed financial criteria that assess the impact of losses associated with both false positive and false negative predictions to identify optimal ensemble learning. The German Credit Dataset (GCD) is augmented with simulated financial information according to a hypothetical mortgage portfolio observed in UK, European and Australian banks to enable the assessment of losses arising from misclassification costs. The experimental results using the simulated GCD show that the best predictive individual algorithm, with an accuracy of 0.87, a Gini of 0.88 and an Area Under the Receiver Operating Curve of 0.94, was the Generalized Additive Model (GAM). The ensemble learning method with the lowest misclassification cost was the combination of Random Forest (RF) and K-Nearest Neighbors (KNN), totaling USD 417 million in costs (USD 230 million for default costs and USD 187 million for opportunity costs) compared to the costs of the GAM (USD 487 million in total, USD 287 million for default costs and USD 200 million for opportunity costs). Implementing the proposed financial criteria has led to a significant USD 70 million reduction in misclassification costs derived from a small sample. Thus, the lending institutions’ profit would considerably rise as the number of submitted credit applications for approval increases.
- Published
- 2024
- Full Text
- View/download PDF
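The cost comparison in the abstract of record 39 can be checked with simple arithmetic. The sketch below reproduces it, reading the per-component figures as USD millions (an assumption, consistent with the stated USD 417 million total). The class name and the per-loan helper, including its parameters `loss_given_default` and `forgone_profit`, are hypothetical illustrations and not the paper's financial criteria.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class CostBreakdown:
    """Misclassification costs in USD millions (unit is an assumption)."""
    default_cost: float      # loans approved that later default
    opportunity_cost: float  # good applicants that were rejected

    @property
    def total(self) -> float:
        return self.default_cost + self.opportunity_cost


# Figures quoted in the abstract.
rf_knn = CostBreakdown(default_cost=230, opportunity_cost=187)
gam = CostBreakdown(default_cost=287, opportunity_cost=200)
saving = gam.total - rf_knn.total  # reduction achieved by the RF + KNN ensemble


def misclassification_cost(
    defaulted: Sequence[bool],
    approved: Sequence[bool],
    loss_given_default: float,
    forgone_profit: float,
) -> CostBreakdown:
    """Hypothetical per-loan costing with one flat cost per error type."""
    bad_approvals = sum(a and d for a, d in zip(approved, defaulted))
    good_rejections = sum((not a) and (not d) for a, d in zip(approved, defaulted))
    return CostBreakdown(
        default_cost=bad_approvals * loss_given_default,
        opportunity_cost=good_rejections * forgone_profit,
    )
```

The totals come to 417 and 487, and their difference matches the USD 70 million reduction the abstract reports.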
40. Who gets the money? A qualitative analysis of fintech lending and credit scoring through the adoption of AI and alternative data.
- Author
-
Tigges, Maximilian, Mestwerdt, Sönke, Tschirner, Sebastian, and Mauer, René
- Subjects
CREDIT scoring systems ,ARTIFICIAL intelligence ,FINANCIAL technology ,FINANCIAL markets ,DATA analysis - Abstract
Credit scoring plays an important role in determining the accessibility of credit in the financial sector. This in turn has a significant impact on how economic opportunities are distributed. Our study examines the use of AI and alternative data in fintech lending through the lens of Information Asymmetry Theory. By employing a qualitative research design using the Gioia method, we extract, analyze, and synthesize insights from a diverse group of 26 experts in fintech lending, artificial intelligence, machine learning, data science, and academia. Our results reveal several important findings: the enhancement of predictive proficiency and risk management, the decrease in default rates, the extension of credit access by including previously 'unbanked populations', the introduction of real-time creditworthiness assessment and new business models for entrepreneurs, the enhancement of credit market efficiencies and positive effects on the stability of financial markets. In addition, our study highlights the necessity for rigorous and critical ethical considerations of important challenges such as the question of consent, algorithmic transparency, data quality, data misuse, representativeness, traceability, responsibility, bias and discrimination. The reasonable goal of a more fair, resilient, sustainable and accessible credit system will require a joint effort to balance leveraging technological innovations with respecting peoples' right to privacy. • AI and alternative data enhance risk management during credit scoring processes. • AI and alternative data are key enablers for economic opportunities. • AI and alternative data help to give 'unbanked populations' access to credit. • Ethical concerns include bias, discrimination and data misuse. • Regulators must ensure explainability and traceability of AI decision-making. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
41. How do machine learning and non-traditional data affect credit scoring? New evidence from a Chinese fintech firm.
- Author
-
Gambacorta, Leonardo, Huang, Yiping, Qiu, Han, and Wang, Jingyi
- Abstract
This paper compares the predictive power of credit scoring models based on machine learning techniques with that of traditional loss and default models. Using proprietary transaction-level data from a leading fintech company in China, we test the performance of different models to predict losses and defaults both in normal times and when the economy is subject to a shock. In particular, we analyse the case of an (exogenous) change in regulation policy on shadow banking in China that caused credit conditions to deteriorate. We find that the model based on machine learning and non-traditional data is better able to predict losses and defaults than traditional models in the presence of a negative shock to the aggregate credit supply. This result reflects a higher capacity of non-traditional data to capture relevant borrower characteristics and of machine learning techniques to better mine the non-linear relationship between variables in a period of stress. • We compare the predictive power of machine learning and traditional credit models. • We analyse data from a Chinese fintech firm during normal and stress periods. • Machine learning models outperform other models, especially during negative shocks. • This shows their ability to detect non-linear patterns in stressful times. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
42. The impact of fintech lending on credit access for U.S. small businesses.
- Author
-
Cornelli, Giulio, Frost, Jon, Gambacorta, Leonardo, and Jagtiani, Julapa
- Abstract
Small business lending (SBL) plays an important role in funding productive investment and fostering local economic growth. Recently, nonbank lenders have gained market share in the SBL market in the United States, especially relative to community banks. Among nonbanks, fintech lenders have become particularly active, leveraging alternative data and complex modeling for their own internal credit scoring. We use proprietary loan-level data from two fintech SBL platforms (Funding Circle and LendingClub) to explore the characteristics of loans originated pre-pandemic (2016–2019). Our results show that these fintech SBL platforms lent relatively more in zip codes with higher unemployment rates and higher business bankruptcy filings. Moreover, fintech platforms' internal credit scores were able to predict future loan performance more accurately than traditional credit scores, particularly in areas with high unemployment. Using Y-14M loan-level bank data, we compare fintech SBL with traditional bank business cards in terms of credit access and interest rates. Overall, while not all fintech firms follow the same approach, we find that fintech lenders could help close the credit gap, allowing small businesses that were less likely to receive credit through traditional lenders to access credit and potentially at lower cost. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
43. An explainable data-driven decision support framework for strategic customer development.
- Author
-
Onari, Mohsen Abbaspour, Rezaee, Mustafa Jahangoshai, Saberi, Morteza, and Nobile, Marco S.
- Abstract
Financial institutions benefit from the advanced predictive performance of machine learning algorithms in automatic decision-making for credit scoring. However, two main challenges hamper the applicability of machine learning algorithms in practice: the complex, black-box nature of the algorithms, which hinders their understandability, and their inability to guide rejected customers toward a successful application. Because customer relationship management is one of the main responsibilities of financial institutions, they must clarify the decision-making process to guide these customers. However, financial institutions are not willing to disclose their decision-making procedure, in order to prevent potential risks from customers or competitors. Hence, in this study, a decision support framework is proposed that simultaneously clarifies the decision-making process and models strategic decision-making to guide rejected customers. To do so, after classifying customers into their corresponding groups, the SHapley Additive exPlanations (SHAP) method is used to extract the features with the greatest impact on the prediction outcome, both globally and locally. Then, based on the benchmarking approach, the equivalent approved peer is found for the rejected customer for target setting to modify the application. To find the optimal modified values for a counterfactual prediction, a multi-objective game-based counterfactual explanation model is developed using the prisoner's dilemma game as the constraint to simulate strategic decision-making. After optimization, the decision is reported to the customers concerning the credential background. A public data set is used to elaborate on the proposed framework. This framework can generate counterfactual predictions successfully by modifying perspective features. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
44. A Fast Survival Support Vector Regression Approach to Large Scale Credit Scoring via Safe Screening.
- Author
-
Wang H and Hong L
- Abstract
Survival models have found increasingly wide application in credit scoring recently due to their ability to estimate the dynamics of risk over time. In this research, we propose a Buckley-James safe sample screening support vector regression (BJS4VR) algorithm to model large-scale survival data by combining the Buckley-James transformation and support vector regression. Different from previous support vector regression survival models, censored samples here are imputed using a censoring unbiased Buckley-James estimator. Safe sample screening is then applied to discard, from the original data, samples that are guaranteed to be non-active at the final optimal solution, which improves efficiency. Experimental results on large-scale real Lending Club loan data show that the proposed BJS4VR model outperforms existing popular survival models such as RSFM, CoxRidge and CoxBoost in terms of both prediction accuracy and time efficiency. Important variables highly correlated with credit risk are also identified with the proposed method.
- Published
- 2024
- Full Text
- View/download PDF
45. Variable selection of Kolmogorov-Smirnov maximization with a penalized surrogate loss.
- Author
-
Lin, Xiefang and Fang, Fang
- Subjects
ASYMPTOTIC distribution ,COMMERCIAL statistics ,BAYES' estimation ,EMPIRICAL research - Abstract
The Kolmogorov-Smirnov (KS) statistic is quite popular in many areas as the major performance evaluation criterion for binary classification due to its explicit business intension. Fang and Chen (2019) proposed a novel DMKS method that directly maximizes the KS statistic and compares favorably with the popular existing methods. However, DMKS did not consider the critical problem of variable selection, since the special form of KS makes it very challenging to establish the asymptotic distribution of the DMKS estimator, which is most likely nonstandard. This intractable issue is handled by introducing a surrogate loss function, which leads to a √n-consistent estimator for the true parameter up to a multiplicative scalar. A nonconcave penalty function is then added to achieve variable selection consistency and asymptotic normality with the oracle property. Results of empirical studies confirm the theoretical results and show advantages of the proposed SKS (Surrogated Kolmogorov-Smirnov) method compared to the original DMKS method without variable selection. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
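Record 45 concerns the KS statistic as a performance criterion for binary classification. For orientation, the sketch below computes the standard two-sample KS statistic between the score distributions of the two classes: the largest gap between their empirical cumulative distribution functions. This is a plain evaluation routine, not the DMKS or SKS estimator from the paper. The function name and example data are hypothetical.

```python
from bisect import bisect_right
from typing import Sequence


def ks_statistic(scores: Sequence[float], labels: Sequence[int]) -> float:
    """Maximum gap between the empirical CDFs of scores for class 1 and class 0."""
    pos = sorted(s for s, y in zip(scores, labels) if y == 1)
    neg = sorted(s for s, y in zip(scores, labels) if y == 0)
    if not pos or not neg:
        raise ValueError("both classes must be present")
    best = 0.0
    # The gap can only change at observed score values, so checking them suffices.
    for t in sorted(set(scores)):
        cdf_pos = bisect_right(pos, t) / len(pos)  # share of class 1 with score <= t
        cdf_neg = bisect_right(neg, t) / len(neg)  # share of class 0 with score <= t
        best = max(best, abs(cdf_pos - cdf_neg))
    return best


# Hypothetical scores: the two classes overlap, so KS is below 1.
example_ks = ks_statistic([0.1, 0.4, 0.35, 0.8], [0, 0, 1, 1])
```

A KS of 1 means some threshold separates the classes perfectly. A KS near 0 means the two score distributions are nearly indistinguishable.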
46. Credit scoring: Does XGboost outperform logistic regression? A test on Italian SMEs.
- Author
-
Zedda, Stefano
- Abstract
The old-fashioned logistic regression is still the most used method for credit scoring. Recent developments have evolved new instruments coming from the machine learning approach, including random forests. In this paper, we tested the efficiency of logistic regression and XGBoost methods for default forecasting on a sample of 35,535 cases from 7 different business sectors of Italian SMEs, on a set of 28 banking variables and 55 balance sheet ratios, to verify which approach better supports lending decisions. With this aim, we developed an efficiency index for measuring each model's capability to correctly select good borrowers, balancing the different effects of refusing the loan to a good customer and lending to a defaulter. Also, we computed the balancing spread to quantify the different models' efficiency in terms of credit costs for the borrower firms. Results show that different sectors report different results. However, generally speaking, the two methods report similar capabilities, while the cutoff setting can make a substantial difference in the actual use of those models for lending decisions. • The performance of logistic regression and XGBoost methods for default forecasting is compared on a sample of 35,535 cases from 7 different business sectors of Italian SMEs. • The analysis is performed on a set of 28 banking variables and 55 balance sheet ratios to verify which approach is best able to support lending decisions. • The expected gross income of the lending activity is computed based on each model outcome. • An efficiency measure is proposed, obtained as the relative position of the considered estimator from 0 % of the always wrong model to 100 % of the perfect forecasting model. • Different cutoff values are considered for both models, evidencing the subsequent different results. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
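The abstract of record 46 describes its efficiency measure as the relative position of a model between an always-wrong model (0 %) and a perfect forecasting model (100 %). One plausible reading of that description is a linear rescaling, sketched below. How the paper computes the underlying values (for example, expected gross income) is not stated in the abstract, so the inputs and the function name here are hypothetical.

```python
def efficiency_index(
    model_value: float,
    always_wrong_value: float,
    perfect_value: float,
) -> float:
    """Position of a model between the worst (0 %) and best (100 %) reference models."""
    span = perfect_value - always_wrong_value
    if span <= 0:
        raise ValueError("the perfect model must outperform the always-wrong model")
    return 100.0 * (model_value - always_wrong_value) / span


# Hypothetical values: worst case -100, best case 100, model achieves 50.
example_efficiency = efficiency_index(50.0, -100.0, 100.0)
```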
47. A new hybrid credit scoring ensemble model with feature enhancement and soft voting weight optimization.
- Author
-
Yang, Dongqi, Xiao, Binqing, Cao, Mengya, and Shen, Huaqi
- Subjects
NEGATIVE binomial distribution ,VOTING ,OUTLIER detection ,ARTIFICIAL intelligence ,FINANCIAL services industry ,DEMAND function ,BINOMIAL distribution - Abstract
The explosive development of artificial intelligence (AI) has reshaped all aspects of life, including credit scoring. At the same time, the rapid expansion of the consumer finance industry has led to a huge demand. In this study, a new hybrid ensemble model with feature enhancement and soft voting weight optimization is proposed to achieve superior predictive power for credit scoring. For mining and characterizing the implicit information of the features, a new voting-based feature enhancement method is proposed to adaptively integrate the outlier detection and clustering capabilities through the weighted voting mechanism to form a feature-enhanced training set. To balance the feature-enhanced training set precisely and effectively, a new bagging-based undersampling method is proposed to obtain a balanced training set by undersampling from the negative binomial distribution through the bagging strategy. To maximize the performance of the model, a new weight-optimized soft voting method is proposed to optimize the soft voting weights of the base classifiers in the classifier ensemble using the COBYLA algorithm and then constructing the stacking-based ensemble model. Five datasets and five evaluation indicators were used for evaluation. The experimental results demonstrate the superior performance of the proposed model and prove its robustness and effectiveness. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
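Record 47 optimizes the soft voting weights of its base classifiers. The sketch below illustrates the general idea of weighted soft voting: each classifier's predicted probability is averaged using non-negative weights, and the weights are chosen to maximize a validation metric. The abstract names the COBYLA algorithm for the optimization. This sketch swaps in a coarse grid search so that it runs on the standard library alone, and uses accuracy as the objective. All names and data are hypothetical.

```python
from itertools import product
from typing import List, Sequence, Tuple


def soft_vote(probas: Sequence[Sequence[float]], weights: Sequence[float]) -> List[float]:
    """Weighted average of per-classifier probabilities for the positive class."""
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    n_samples = len(probas[0])
    return [
        sum(w * p[i] for w, p in zip(weights, probas)) / total
        for i in range(n_samples)
    ]


def accuracy(probs: Sequence[float], labels: Sequence[int], threshold: float = 0.5) -> float:
    """Share of samples whose thresholded probability matches the label."""
    hits = sum((p >= threshold) == bool(y) for p, y in zip(probs, labels))
    return hits / len(labels)


def grid_search_weights(
    probas: Sequence[Sequence[float]], labels: Sequence[int], steps: int = 4
) -> Tuple[Tuple[int, ...], float]:
    """Try every integer weight vector in [0, steps]; keep the first best one."""
    best_w: Tuple[int, ...] = ()
    best_acc = -1.0
    for w in product(range(steps + 1), repeat=len(probas)):
        if sum(w) == 0:
            continue  # all-zero weights are not a valid vote
        acc = accuracy(soft_vote(probas, w), labels)
        if acc > best_acc:
            best_w, best_acc = w, acc
    return best_w, best_acc


# Hypothetical validation data: classifier A is reliable, classifier B is not.
probas = [[0.9, 0.2, 0.6, 0.4], [0.4, 0.6, 0.3, 0.7]]
labels = [1, 0, 1, 0]
best_weights, best_accuracy = grid_search_weights(probas, labels)
```

On this toy data the search puts all weight on the reliable classifier. A continuous optimizer such as COBYLA plays the same role over real-valued weights.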
48. Flexible loss functions for binary classification in gradient-boosted decision trees: An application to credit scoring.
- Author
-
Mushava, Jonah and Murray, Michael
- Subjects
DECISION trees ,MACHINE learning ,EXTREME value theory ,SKEWNESS (Probability theory) ,DATABASES - Abstract
This paper introduces new flexible loss functions for binary classification in Gradient-Boosted Decision Trees (GBDT) that combine Dice-based and cross-entropy-based losses and offer link functions from either a generalized extreme value (GEV) or exponentiated exponential logistic (EEL) distribution. Testing 27 different GBDT models using XGBoost on a Freddie Mac mortgage loan database showed that the choice of the loss function is useful. Specifically, when the class imbalance ratio (IR) is less than 99, using a skewed GEV distribution-based link function in XGBoost enhances discriminatory power and classification accuracy while retaining a simple model structure, which is particularly important in credit scoring applications. In cases where class imbalances are severe, typically between IRs of 99 and 200, we found that an advanced loss function, which is composed of a symmetric hybrid loss function and a link derived from a positively skewed EEL distribution, outperforms other XGBoost variants. Based on our findings, the accuracy improvements of these proposed extensions result in lower misclassification costs, which are especially evident when IR is below 99, which results in higher profitability for the business. Furthermore, the study highlights the transparency associated with GBDT, which is also an integral component of financial applications. Researchers and practitioners can use these insights to create more accurate and discriminative machine learning models, with possible extensions to other GBDT implementations and machine learning techniques that take into account loss functions. The source code for the proposed approach is publicly available at https://github.com/jm-ml/flexible-losses-for-binary-classification-with-GBDT. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF