6,597 results for "xgboost"
Search Results
2. A Systematic Review of Machine Learning Based Models for Early Diabetes Prediction
- Author
-
Malchi, Sunil Kumar, Davanam, Ganesh, Neelima, P., Sravanthi, T. Lakshmi, Angrisani, Leopoldo, Series Editor, Arteaga, Marco, Series Editor, Chakraborty, Samarjit, Series Editor, Chen, Shanben, Series Editor, Chen, Tan Kay, Series Editor, Dillmann, Rüdiger, Series Editor, Duan, Haibin, Series Editor, Ferrari, Gianluigi, Series Editor, Ferre, Manuel, Series Editor, Jabbari, Faryar, Series Editor, Jia, Limin, Series Editor, Kacprzyk, Janusz, Series Editor, Khamis, Alaa, Series Editor, Kroeger, Torsten, Series Editor, Li, Yong, Series Editor, Liang, Qilian, Series Editor, Martín, Ferran, Series Editor, Ming, Tan Cher, Series Editor, Minker, Wolfgang, Series Editor, Misra, Pradeep, Series Editor, Mukhopadhyay, Subhas, Series Editor, Ning, Cun-Zheng, Series Editor, Nishida, Toyoaki, Series Editor, Oneto, Luca, Series Editor, Panigrahi, Bijaya Ketan, Series Editor, Pascucci, Federica, Series Editor, Qin, Yong, Series Editor, Seng, Gan Woon, Series Editor, Speidel, Joachim, Series Editor, Veiga, Germano, Series Editor, Wu, Haitao, Series Editor, Zamboni, Walter, Series Editor, Tan, Kay Chen, Series Editor, Kumar, Amit, editor, Gunjan, Vinit Kumar, editor, Senatore, Sabrina, editor, and Hu, Yu-Chen, editor
- Published
- 2025
3. Advancing Multilingual Sentiment Understanding with XGBoost, SVM, and XLM-RoBERTa
- Author
-
Gaikwad, Arya, Belhekar, Pranav, Kottawar, Vinayak, Angrisani, Leopoldo, Series Editor, Arteaga, Marco, Series Editor, Chakraborty, Samarjit, Series Editor, Chen, Shanben, Series Editor, Chen, Tan Kay, Series Editor, Dillmann, Rüdiger, Series Editor, Duan, Haibin, Series Editor, Ferrari, Gianluigi, Series Editor, Ferre, Manuel, Series Editor, Jabbari, Faryar, Series Editor, Jia, Limin, Series Editor, Kacprzyk, Janusz, Series Editor, Khamis, Alaa, Series Editor, Kroeger, Torsten, Series Editor, Li, Yong, Series Editor, Liang, Qilian, Series Editor, Martín, Ferran, Series Editor, Ming, Tan Cher, Series Editor, Minker, Wolfgang, Series Editor, Misra, Pradeep, Series Editor, Mukhopadhyay, Subhas, Series Editor, Ning, Cun-Zheng, Series Editor, Nishida, Toyoaki, Series Editor, Oneto, Luca, Series Editor, Panigrahi, Bijaya Ketan, Series Editor, Pascucci, Federica, Series Editor, Qin, Yong, Series Editor, Seng, Gan Woon, Series Editor, Speidel, Joachim, Series Editor, Veiga, Germano, Series Editor, Wu, Haitao, Series Editor, Zamboni, Walter, Series Editor, Tan, Kay Chen, Series Editor, Kumar, Amit, editor, Gunjan, Vinit Kumar, editor, Senatore, Sabrina, editor, and Hu, Yu-Chen, editor
- Published
- 2025
4. Deep Learning-Based Maize Crop Disease Detection and Remedial Recommendation System
- Author
-
Chawla, Priyanka, Nagaraju, M., Pasikanti, Meghana, Kumar, Vinay, Dasari, Suma, Kacprzyk, Janusz, Series Editor, Gomide, Fernando, Advisory Editor, Kaynak, Okyay, Advisory Editor, Liu, Derong, Advisory Editor, Pedrycz, Witold, Advisory Editor, Polycarpou, Marios M., Advisory Editor, Rudas, Imre J., Advisory Editor, Wang, Jun, Advisory Editor, Rawat, Sanyog, editor, Kumar, Arvind, editor, Raman, Ashish, editor, Kumar, Sandeep, editor, and Pathak, Parul, editor
- Published
- 2025
5. Extracting Daily Aggregate Load Profiles from Monthly Consumption
- Author
-
Saraf, Anmol, Kowli, Anupama, Goos, Gerhard, Series Editor, Hartmanis, Juris, Founding Editor, Bertino, Elisa, Editorial Board Member, Gao, Wen, Editorial Board Member, Steffen, Bernhard, Editorial Board Member, Yung, Moti, Editorial Board Member, Jørgensen, Bo Nørregaard, editor, Ma, Zheng Grace, editor, Wijaya, Fransisco Danang, editor, Irnawan, Roni, editor, and Sarjiya, Sarjiya, editor
- Published
- 2025
6. An Optimal Feature Selection-Based Approach to Predict Cervical Cancer Using Machine Learning
- Author
-
Al Mamun, Abdullah, Uddin, Khandaker Mohammad Mohi, Chakrabarti, Anamika, Nur-A-Alam, Md., Mahbubur Rahman, Md., Kacprzyk, Janusz, Series Editor, Gomide, Fernando, Advisory Editor, Kaynak, Okyay, Advisory Editor, Liu, Derong, Advisory Editor, Pedrycz, Witold, Advisory Editor, Polycarpou, Marios M., Advisory Editor, Rudas, Imre J., Advisory Editor, Wang, Jun, Advisory Editor, Mahmud, Mufti, editor, Kaiser, M. Shamim, editor, Bandyopadhyay, Anirban, editor, Ray, Kanad, editor, and Al Mamun, Shamim, editor
- Published
- 2025
7. Research on Simulation and Prediction of Photovoltaic Power Generation Based on Radiation Models and Machine Learning Method
- Author
-
Gao, Jie, Wang, Xu, Gu, Jianwei, Tang, Siwei, Zhu, Fangliang, Li, Jingyi, Zhu, Yiming, di Prisco, Marco, Series Editor, Chen, Sheng-Hong, Series Editor, Vayas, Ioannis, Series Editor, Kumar Shukla, Sanjay, Series Editor, Sharma, Anuj, Series Editor, Kumar, Nagesh, Series Editor, Wang, Chien Ming, Series Editor, Cui, Zhen-Dong, Series Editor, Lu, Xinzheng, Series Editor, Zheng, Sheng’an, editor, Taylor, Richard M., editor, Wu, Wenhao, editor, Nilsen, Bjorn, editor, and Zhao, Gensheng, editor
- Published
- 2025
8. Customer Churn Rate Prediction Using Machine Learning Techniques for E-Commerce Sector
- Author
-
Saxena, Muskan, Aggarwal, Nikita, Gupta, Rekha, Kacprzyk, Janusz, Series Editor, Gomide, Fernando, Advisory Editor, Kaynak, Okyay, Advisory Editor, Liu, Derong, Advisory Editor, Pedrycz, Witold, Advisory Editor, Polycarpou, Marios M., Advisory Editor, Rudas, Imre J., Advisory Editor, Wang, Jun, Advisory Editor, Hassanien, Aboul Ella, editor, Anand, Sameer, editor, Jaiswal, Ajay, editor, and Kumar, Prabhat, editor
- Published
- 2025
9. Missing Meteorological Data Imputation for Mini Eolic Electrical Power Prediction
- Author
-
García-Ordás, María Teresa, Díaz-Longueira, Antonio, Michelena, Álvaro, Jove, Esteban, Bayón-Gutiérrez, Martín, Alaiz-Moretón, Héctor, Goos, Gerhard, Series Editor, Hartmanis, Juris, Founding Editor, Bertino, Elisa, Editorial Board Member, Gao, Wen, Editorial Board Member, Steffen, Bernhard, Editorial Board Member, Yung, Moti, Editorial Board Member, Quintián, Héctor, editor, Corchado, Emilio, editor, Troncoso Lora, Alicia, editor, Pérez García, Hilde, editor, Jove Pérez, Esteban, editor, Calvo Rolle, José Luis, editor, Martínez de Pisón, Francisco Javier, editor, García Bringas, Pablo, editor, Martínez Álvarez, Francisco, editor, Herrero, Álvaro, editor, and Fosci, Paolo, editor
- Published
- 2025
10. OurRealtySpace - A Machine-Learning Based Investment Recommendation System
- Author
-
Divya, N., Sindhuja, Soma, Vineela, Sripada, Shreeya, Thota V. N. Reva, Abhinaya, Mandela, Angrisani, Leopoldo, Series Editor, Arteaga, Marco, Series Editor, Chakraborty, Samarjit, Series Editor, Chen, Shanben, Series Editor, Chen, Tan Kay, Series Editor, Dillmann, Rüdiger, Series Editor, Duan, Haibin, Series Editor, Ferrari, Gianluigi, Series Editor, Ferre, Manuel, Series Editor, Jabbari, Faryar, Series Editor, Jia, Limin, Series Editor, Kacprzyk, Janusz, Series Editor, Khamis, Alaa, Series Editor, Kroeger, Torsten, Series Editor, Li, Yong, Series Editor, Liang, Qilian, Series Editor, Martín, Ferran, Series Editor, Ming, Tan Cher, Series Editor, Minker, Wolfgang, Series Editor, Misra, Pradeep, Series Editor, Mukhopadhyay, Subhas, Series Editor, Ning, Cun-Zheng, Series Editor, Nishida, Toyoaki, Series Editor, Oneto, Luca, Series Editor, Panigrahi, Bijaya Ketan, Series Editor, Pascucci, Federica, Series Editor, Qin, Yong, Series Editor, Seng, Gan Woon, Series Editor, Speidel, Joachim, Series Editor, Veiga, Germano, Series Editor, Wu, Haitao, Series Editor, Zamboni, Walter, Series Editor, Tan, Kay Chen, Series Editor, Kumar, Amit, editor, Gunjan, Vinit Kumar, editor, Senatore, Sabrina, editor, and Hu, Yu-Chen, editor
- Published
- 2025
11. Accident Severity Detection Using Machine Learning Algorithms
- Author
-
Kumar, B. Naveen, Kumar, N. Sunil, Kumar, U. Naresh, Angrisani, Leopoldo, Series Editor, Arteaga, Marco, Series Editor, Chakraborty, Samarjit, Series Editor, Chen, Shanben, Series Editor, Chen, Tan Kay, Series Editor, Dillmann, Rüdiger, Series Editor, Duan, Haibin, Series Editor, Ferrari, Gianluigi, Series Editor, Ferre, Manuel, Series Editor, Jabbari, Faryar, Series Editor, Jia, Limin, Series Editor, Kacprzyk, Janusz, Series Editor, Khamis, Alaa, Series Editor, Kroeger, Torsten, Series Editor, Li, Yong, Series Editor, Liang, Qilian, Series Editor, Martín, Ferran, Series Editor, Ming, Tan Cher, Series Editor, Minker, Wolfgang, Series Editor, Misra, Pradeep, Series Editor, Mukhopadhyay, Subhas, Series Editor, Ning, Cun-Zheng, Series Editor, Nishida, Toyoaki, Series Editor, Oneto, Luca, Series Editor, Panigrahi, Bijaya Ketan, Series Editor, Pascucci, Federica, Series Editor, Qin, Yong, Series Editor, Seng, Gan Woon, Series Editor, Speidel, Joachim, Series Editor, Veiga, Germano, Series Editor, Wu, Haitao, Series Editor, Zamboni, Walter, Series Editor, Tan, Kay Chen, Series Editor, Kumar, Amit, editor, Gunjan, Vinit Kumar, editor, Senatore, Sabrina, editor, and Hu, Yu-Chen, editor
- Published
- 2025
12. Hybrid Deep Learning Model for Pancreatic Cancer Image Segmentation
- Author
-
Bakasa, Wilson, Kwenda, Clopas, Viriri, Serestina, Goos, Gerhard, Series Editor, Hartmanis, Juris, Founding Editor, Bertino, Elisa, Editorial Board Member, Gao, Wen, Editorial Board Member, Steffen, Bernhard, Editorial Board Member, Yung, Moti, Editorial Board Member, Proietto Salanitri, Federica, editor, Viriri, Serestina, editor, Bağcı, Ulaş, editor, Tiwari, Pallavi, editor, Gong, Boqing, editor, Spampinato, Concetto, editor, Palazzo, Simone, editor, Bellitto, Giovanni, editor, Zlatintsi, Nancy, editor, Filntisis, Panagiotis, editor, Lee, Cecilia S., editor, and Lee, Aaron Y., editor
- Published
- 2025
13. Anomaly Detection of Residential Electricity Consumption Based on Ensemble Model of PSO-AE-XGBOOST
- Author
-
Liu, Hao, Shi, Jiachuan, Fu, Rao, Zhang, Yanling, Ghosh, Ashish, Editorial Board Member, Zhou, Lizhu, Editorial Board Member, Zhang, Haijun, editor, Li, Xianxian, editor, Hao, Tianyong, editor, Meng, Weizhi, editor, Wu, Zhou, editor, and He, Qian, editor
- Published
- 2025
14. Application of Machine Learning in Enterprise Financial Risk Assessment: A Study About China’s A-Share Listed Manufacturing Companies
- Author
-
Yu, Kexin, Yu, Zengyi, Ma, Shuomin, Xu, Pan, Ghosh, Ashish, Editorial Board Member, Zhou, Lizhu, Editorial Board Member, Zhang, Haijun, editor, Li, Xianxian, editor, Hao, Tianyong, editor, Meng, Weizhi, editor, Wu, Zhou, editor, and He, Qian, editor
- Published
- 2025
15. Application of intellectual capital in SME bankruptcy.
- Author
-
Papíková, Lenka and Papík, Mário
- Subjects
FINANCIAL ratios, FINANCIAL performance, FINANCIAL statements, DATA mining, FINANCIAL institutions, INTELLECTUAL capital
- Abstract
Previous studies indicate that applying solely financial ratios (FR) provides limited SME bankruptcy prediction performance. On the other hand, the application of non-financial features is cost-ineffective for SMEs. Intellectual capital (IC) features provide a meaningful alternative to analyse SMEs' financial performance since companies with higher IC regularly achieve consistently higher sales growth. This paper aims to examine the possibilities of applying intellectual capital features in predicting SME bankruptcy. 14 IC features and 27 FR of 54,003 SMEs from 2016 to 2021 were collected from financial statements. Three groups of XGBoost and CatBoost models were developed – with only IC features, with only FR and combining FR and IC features. The results show that IC features are a practical addition to FR, with the best AUC equal to 89%, and their combination outperforms models using only FR by an average of 2.3%. Moreover, IC features such as capital employed efficiency and structural capital reduced the likelihood of SMEs' bankruptcy. The implication of this study is that SME bankruptcy models perform better using IC features without significantly increasing financial cost or processing time, which can be helpful for financial institutions. Additionally, this can contribute to developing other methods of measuring IC. [ABSTRACT FROM AUTHOR]
- Published
- 2024
16. A novel IoT-integrated ensemble learning approach for indoor air quality enhancement.
- Author
-
Kareem Abed Alzabali, Saja, Bastam, Mostafa, and Ataie, Ehsan
- Subjects
*MACHINE learning, *INDOOR air quality, *AIR quality monitoring, *STANDARD deviations, *PARTICULATE matter, *ATMOSPHERIC carbon dioxide, *LIQUEFIED petroleum gas
- Abstract
In indoor environments, air quality significantly impacts human health and well-being, with carbon monoxide (CO) posing a particular hazard due to its colorless and odorless nature and potential to cause severe health issues. Integrating the Internet of Things and remote sensing technologies has revolutionized data monitoring, collection, and evaluation, especially within the context of 'smart' homes. This study leverages these technologies to enhance indoor air quality monitoring. By collecting data on key indoor atmospheric quality indicators—carbon dioxide (CO2), methane (CH4), alcohol, liquefied petroleum gas (LPG), particulate matter (PM1 and PM2.5), humidity, and temperature—the study aims to predict indoor carbon monoxide levels. A custom dataset was compiled from August to October, consisting of 61,710 observations recorded at one-minute intervals. The methodology employs a stacking ensemble approach, integrating multiple machine learning models to boost prediction accuracy and reliability. In the stacking ensemble, six distinct models are employed: Random Forest, Multi-Layer Perceptron, Lasso, Elastic Net, XGBoost, and Support Vector Regression. Each model is individually trained and fine-tuned using the Grid Search method to optimize parameter combinations. These optimized models are then combined in the stacking ensemble, which achieves a Mean Squared Error (MSE) of 0.0140, a Root Mean Squared Error (RMSE) of 0.1185, and a Mean Absolute Error (MAE) of 0.0291. The results demonstrate that the proposed system significantly enhances the precision of CO prediction, underscoring its critical role in air quality surveillance within smart environments. [ABSTRACT FROM AUTHOR]
- Published
- 2024
17. XGBoost machine learning assisted prediction of the mechanical and fracture properties of unvulcanized and dynamically vulcanized PP/EPDM reinforced with clay and halloysite nanoparticles.
- Author
-
Rajaee, Pouya, Rabiee, Amir Hossein, Ashenai Ghasemi, Faramarz, Fasihi, Mohammad, Mahabadifar, Mahdi, and Nedaei Shekarab, Mahmoud
- Abstract
Polymer nanocomposites have found wide industrial applications, necessitating optimal mechanical and fracture properties evaluation, traditionally done through costly experimental methods. This study employs machine learning, particularly XGBoost, to predict properties like tensile and fracture properties swiftly, aiding material innovation across industries. The research investigates unvulcanized and vulcanized polypropylene (PP)/ethylene propylene diene monomer (EPDM) reinforced with clay and halloysite nanoparticles (HNT), analyzing fracture properties via essential work of fracture (EWF). Experimental design selects tests, and an XGBoost model predicts tensile strength and modulus, strain at break, EWF, and non‐EWF based on EPDM and nanoparticle percentages, composite and nanoparticle types. The model accurately predicts tensile strength and modulus but less so for strain at break, EWF, and non‐EWF. Mean Absolute Percentage Error values for training/test are 0.49/1.21, 1.05/1.55, 34.21/42.76, 3.02/14.35, and 2.89/3.78, with determination coefficients of 0.99/0.98, 0.99/0.97, 0.97/0.91, 0.97/0.79, and 0.92/0.73. Nanoparticles mainly affect outputs, with EPDM secondarily impactful, while composite and nanoparticle types exhibit similar significance. The best‐performing polymer nanocomposite is a dynamically vulcanized one containing 10 wt% EPDM and 3 wt% clay, achieving tensile strength of 25.070 MPa, tensile modulus of 261.170 MPa, EWF of 75.300 N/mm, and non‐EWF of 10.150 N/mm². Highlights: The effects of ethylene propylene diene monomer (EPDM), clay and halloysite nanoparticles on the mechanics of polypropylene‐based nanocomposites. Essential work of fracture (EWF) was used to study the fracture properties. Machine learning was employed to predict all mechanical characteristics. The vulcanization process improved all mechanical characteristics. The best compound: vulcanized one containing 10 wt% EPDM and 3 wt% clay. [ABSTRACT FROM AUTHOR]
- Published
- 2024
18. Predicting cerebral edema in patients with spontaneous intracerebral hemorrhage using machine learning.
- Author
-
Xu, Jiangbao, Yuan, Cuijie, Yu, Guofeng, Li, Hao, Dong, Qiutong, Mao, Dandan, Zhan, Chengpeng, and Yan, Xinjiang
- Abstract
Background: The early prediction of cerebral edema changes in patients with spontaneous intracerebral hemorrhage (SICH) may facilitate earlier interventions and result in improved outcomes. This study aimed to develop and validate machine learning models to predict cerebral edema changes within 72 h, using readily available clinical parameters, and to identify relevant influencing factors. Methods: An observational study was conducted between April 2021 and October 2023 at the Quzhou Affiliated Hospital of Wenzhou Medical University. After preprocessing the data, the study population was randomly divided into training and internal validation cohorts in a 7:3 ratio (training: N = 150; validation: N = 65). The most relevant variables were selected using Support Vector Machine Recursive Feature Elimination (SVM-RFE) and Least Absolute Shrinkage and Selection Operator (LASSO) algorithms. The predictive performance of random forest (RF), GDBT, linear regression (LR), and XGBoost models was evaluated using the area under the receiver operating characteristic curve (AUROC), precision–recall curve (AUPRC), accuracy, F1-score, precision, recall, sensitivity, and specificity. Feature importance was calculated, and the SHapley Additive exPlanations (SHAP) and Local Interpretable Model-Agnostic Explanations (LIME) methods were employed to explain the top-performing model. Results: A total of 84 (39.1%) patients developed cerebral edema changes. In the validation cohort, GDBT outperformed LR and RF, achieving an AUC of 0.654 (95% CI: 0.611–0.699) compared to LR of 0.578 (95% CI, 0.535–0.623, DeLong: p = 0.197) and RF of 0.624 (95% CI, 0.588–0.687, DeLong: p = 0.236). XGBoost also demonstrated similar performance with an AUC of 0.660 (95% CI, 0.611–0.711, DeLong: p = 0.963). However, in the training set, GDBT still outperformed XGBoost, with an AUC of 0.603 ± 0.100 compared to XGBoost of 0.575 ± 0.096. 
SHAP analysis revealed that serum sodium, HDL, subarachnoid hemorrhage volume, sex, and left basal ganglia hemorrhage volume were the top five most important features for predicting cerebral edema changes in the GDBT model. Conclusion: The GDBT model demonstrated the best performance in predicting 72-h changes in cerebral edema. It has the potential to assist clinicians in identifying high-risk patients and guiding clinical decision-making. [ABSTRACT FROM AUTHOR]
- Published
- 2024
19. Interpretable prediction of 30-day mortality in patients with acute pancreatitis based on machine learning and SHAP.
- Author
-
Li, Xiaojing, Tian, Yueqin, Li, Shuangmei, Wu, Haidong, and Wang, Tong
- Abstract
Background: Severe acute pancreatitis (SAP) can be fatal if left unrecognized and untreated. The purpose was to develop a machine learning (ML) model for predicting the 30-day all-cause mortality risk in SAP patients and to explain the most important predictors. Methods: This research utilized six ML methods, including logistic regression (LR), k-nearest neighbors(KNN), support vector machines (SVM), naive Bayes (NB), random forests(RF), and extreme gradient boosting(XGBoost), to construct six predictive models for SAP. An extensive evaluation was conducted to determine the most effective model and then the Shapley Additive exPlanations (SHAP) method was applied to visualize key variables. Utilizing the optimized model, stratified predictions were made for patients with SAP. Further, the study employed multivariable Cox regression analysis and Kaplan-Meier survival curves, along with subgroup analysis, to explore the relationship between the machine learning-based score and 30-day mortality. Results: Through LASSO regression and recursive feature elimination (RFE), 25 optimal feature variables are selected. The XGBoost model performed best, with an area under the curve (AUC) of 0.881, a sensitivity of 0.5714, a specificity of 0.9651 and an F1 score of 0.64. The first six most important feature variables were the use of vasopressor, high Charlson comorbidity index, low blood oxygen saturation, history of malignant tumor, hyperglycemia and high APSIII score. Based on the optimal threshold of 0.62, patients were divided into high and low-risk groups, and the 30-day survival rate in the high-risk group decreased significantly. COX regression analysis further confirmed the positive correlation between high-risk scores and 30-day mortality. 
In the subgroup analysis, the model showed good risk stratification ability in patients with different gender, renal replacement therapy and with or without a history of malignant tumor, but it was not effective in predicting peripheral vascular disease. Conclusions: the XGBoost model effectively predicts the severity of SAP, serving as a valuable tool for clinicians to identify SAP early. [ABSTRACT FROM AUTHOR]
- Published
- 2024
20. Association between triglyceride-glucose index and fractional exhaled nitric oxide in adults with asthma from NHANES 2007–2012.
- Author
-
Pan, Yao, Wu, Lizhen, Yao, Shiyi, Xia, Jing, Giri, Mohan, Wen, Jun, and Zhuang, Sanmei
- Abstract
Background: Several studies have shown a potential relationship between triglyceride-glucose index (TGI) and asthma. However, limited research has been conducted on the relationship between TGI and fractional exhaled nitric oxide (FeNO). Methods: A total of 1,910 asthmatic individuals from the National Health and Nutrition Examination Survey (NHANES) database were included in this study. Linear regression analyses were used to investigate the relationship between TGI and FeNO in patients with asthma. Subsequently, a trend test was applied to verify whether there was a linear relationship between the TGI and FeNO. Finally, a subgroup analysis was performed to confirm the relationship among the different subgroup populations. Results: Multivariable linear regression analyses showed that TGI was linearly related to FeNO in the asthmatic population. The trend test additionally validated the positive linear relationship between TGI and FeNO. The result of XGBoost revealed the five most influential factors on FeNO in a ranking of contrasted importance: eosinophil (EOS), body mass index (BMI), poverty-to-income ratio (PIR), TGI, and white blood cell count (WBC). Conclusions: This investigation revealed a positive linear relationship between TGI and FeNO in patients with asthma. This finding suggests a potential relationship between TGI and airway inflammation in patients with asthma, thereby facilitating the prompt identification of irregularities and providing a basis for clinical decision making. This study provides a novel perspective on asthma management. [ABSTRACT FROM AUTHOR]
- Published
- 2024
21. A novel spatio-temporal attention-based bidirectional LSTM model for moisture content prediction in drying process.
- Author
-
Zhang, Lei, Ren, Guofeng, Du, Jinsong, Li, Shanlian, Li, Yinhua, and Xu, Dayong
- Abstract
Accurate prediction of moisture content is significantly crucial for ensuring process stability and product quality in the cylinder drying process. However, the drying process exhibits complex spatio-temporal characteristics and strong interference, which make accurate prediction challenging for the deep learning approach. To address this issue, this article proposes a new spatio-temporal attention-based bidirectional long-short temporal memory network (STA-BiLSTM) model for accurate moisture content prediction. First, Maximum Relevance Minimum Redundancy (mRMR) is adopted to identify optimal features highly related to moisture content. Secondly, bidirectional long-short temporal memory (Bi-LSTM) network is utilized to extract temporal dependencies from the sequential data. Subsequently, spatio-temporal attention mechanisms are designed to adaptively focus on the most relevant features and timesteps, enhancing the model's generalization ability. Finally, due to the harsh industrial environment, eXtreme Gradient Boosting (XGBoost) is adapted to improve generalizability and robustness. Extensive experiments on a real industrial dataset of the drying process demonstrate that the proposed STA-BiLSTM approach significantly outperforms alternative approaches for predicting moisture content, validating its effectiveness and superiority. [ABSTRACT FROM AUTHOR]
- Published
- 2024
22. Assessing the impact of climate change and reservoir operation on the thermal and ice regime of mountain rivers using the XGBoost model and wavelet analysis.
- Author
-
Fukś, Maksymilian, Kędra, Mariola, and Wiejaczka, Łukasz
- Abstract
This study presents an analysis of the influence of climatic conditions and the operation of a dam reservoir on the occurrence of ice cover and water temperature in two rivers (natural and transformed by reservoir operations) located in the Carpathian Mountains (central Europe). The analyses are based on data obtained from four hydrological and two climatological stations. The Extreme Gradient Boosting (XGBoost) machine learning model was used to quantitatively separate the effects of climate change from the effects arising from the operation of the dam reservoir. An analysis of the effects of reservoir operation on the phase synchronization between air and river water temperatures based on a continuous wavelet transform was also conducted. The analyses showed that there has been an increase in the average air temperature of the study area in November by 1.2 °C per decade (over the period 1984–2016), accompanied by an increase in winter water temperature of 0.3 °C per decade over the same period. As water and air temperatures associated with the river not influenced by the reservoir increased, there was a simultaneous reduction in the duration of ice cover, reaching nine days per decade. The river influenced by the dam reservoir showed a 1.05 °C increase in winter water temperature from the period 1994–2007 to the period 1981–1994, for which the operation of the reservoir was 65% responsible and climatic conditions were 35% responsible. As a result of the reservoir operation, the synchronization of air and water temperatures was disrupted. Increasing water temperatures resulted in a reduction in the average annual number of days with ice cover (by 27.3 days), for which the operation of the dam reservoir was 77.5% responsible, while climatic conditions were 22.5% responsible. [ABSTRACT FROM AUTHOR]
- Published
- 2024
23. State-of-the-art XGBoost, RF and DNN based soft-computing models for PGPN piles.
- Author
-
Kumar, Manish, Samui, Pijush, Kumar, Divesh Ranjan, and Asteris, Panagiotis G.
- Subjects
*ARTIFICIAL neural networks, *DEAD loads (Mechanics), *MACHINE learning, *RANDOM forest algorithms, *DESCRIPTIVE statistics
- Abstract
Machine learning (ML) has made significant advancements in predictive modelling across many engineering sectors. However, predicting the bearing capacity of pre-bored grouted planted nodular (PGPN) piles remains a relatively unexplored area due to the complexity of the load-bearing mechanism, pile-soil interactions, and multiple variables involved. The study utilises state-of-the-art ML techniques such as extreme gradient boosting (XGBoost), random forest (RF), gradient boosting machines (GBMs), and deep learning-based simulation models. The dataset fed into the model comprises 81 case histories of static pile load tests conducted in various regions of Vietnam. The data was validated using descriptive statistics, sensitivity analysis, correlation matrix displays, SHAP plot analysis, and regression curves, with predictive performance validated through k-fold cross-validation. Among all the models tested, XGBoost (R² = 0.91, RMSE = 0.09) and RF (R² = 0.82, RMSE = 0.09) performed the best, while the deep neural network also yielded satisfactory results. However, GBM was found not to be a robust model for this analysis. The performance of the models was visually analysed using Violin plot comparisons and Taylor diagrams. The outcome of this study facilitates the safe and economical designs of the eco-friendly pile. [ABSTRACT FROM AUTHOR]
- Published
- 2024
24. Criticality of Nursing Care for Patients With Alzheimer's Disease in the ICU: Insights From MIMIC III Dataset.
- Author
-
Yan, Zhou, Quan, Guo, and Jia-Hui, Xue
- Subjects
*ALZHEIMER'S disease treatment, *ALZHEIMER'S disease risk factors, *RISK assessment, *RANDOM forest algorithms, *ALZHEIMER'S disease, *INTENSIVE care nursing, *PREDICTION models, *RECEIVER operating characteristic curves, *T-test (Statistics), *CRITICALLY ill, *PATIENTS, *LOGISTIC regression analysis, *NURSING, *EVALUATION of medical care, *HOSPITAL mortality, *DESCRIPTIVE statistics, *CHI-squared test, *LONGITUDINAL method, *ROUTINE diagnostic tests, *INTENSIVE care units, *ELECTRONIC health records, *ARTIFICIAL neural networks, *NURSING practice, *HOSPITAL care of older people, *MACHINE learning, *QUALITY assurance, *COMPARATIVE studies, *CONFIDENCE intervals, *SENSITIVITY & specificity (Statistics), *EVALUATION
- Abstract
Alzheimer's disease (AD) patients admitted to intensive care units (ICUs) exhibit varying survival outcomes due to the unique challenges in managing AD patients. Stratifying patient mortality risk and understanding the criticality of nursing care are important to improve the clinical outcomes of AD patients. This study aimed to leverage machine learning (ML) and electronic health records (EHRs) only consisting of demographics, disease history, and routine lab tests, with a focus on nursing care, to facilitate the optimization of nursing practices for AD patients. We utilized Medical Information Mart for Intensive Care III, an open-source EHR dataset, and AD patients were identified based on the International Classification of Diseases, Ninth Revision codes. From a cohort of 453 patients, a total of 60 features, encompassing demographics, laboratory tests, disease history, and number of nursing events, were extracted. ML models, including XGBoost, random forest, logistic regression, and multi-layer perceptron, were trained to predict the 30-day mortality risk. In addition, the influence of nursing care was analyzed in terms of feature importance using values calculated from both the inherent XGBoost module and the SHapley Additive exPlanations (SHAP) library. XGBoost emerged as the lead model with a high accuracy of 0.730, area under the curve (AUC) of 0.750, sensitivity of 0.688, and specificity of 0.740. Feature importance analyses using inherent XGBoost module or SHAP both indicated the number of nursing care within 14 days post-admission as an important denominator for 30-day mortality risk. When nursing care events were excluded as a feature, stratifying patient mortality risk was also possible but the model's AUC of receiver operating characteristic curve was reduced to 0.68. Nursing care plays a pivotal role in the survival outcomes of AD patients in ICUs. 
ML models can be effectively employed to predict mortality risks and underscore the importance of specific features, including nursing care, in patient outcomes. Early identification of high-risk AD patients can aid in prioritizing intensive nursing care, potentially improving survival rates. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
25. Variations in cuticular hydrocarbons of Calliphora vicina (Diptera: Calliphoridae) empty puparia: Insights for estimating late postmortem intervals.
- Author
-
Sharif, Swaima, Wunder, Cora, Amendt, Jens, and Qamar, Ayesha
- Subjects
- *
FORENSIC entomology , *ENVIRONMENTAL forensics , *BLOWFLIES , *ENVIRONMENTAL monitoring , *MACHINE learning - Abstract
Necrophagous flies, particularly blowflies, serve as vital indicators in forensic entomology and ecological studies, contributing to minimum postmortem interval estimations and environmental monitoring. The study investigates variations in the predominant cuticular hydrocarbons (CHCs) viz. n-C25, n-C27, n-C28, and n-C29 of empty puparia of Calliphora vicina Robineau-Desvoidy, 1830, (Diptera: Calliphoridae) across diverse environmental conditions, including burial, above-ground and indoor settings, over 90 days. Notable trends include a significant decrease in n-C25 concentrations in buried and above-ground conditions over time, while n-C27 concentrations decline in buried and above-ground conditions but remain stable indoors. Burial conditions show significant declines in n-C27 and n-C29 concentrations over time, indicating environmental influences. Conversely, above-ground conditions exhibit uniform declines in all hydrocarbons. Indoor conditions remain relatively stable, with weak correlations between weathering time and CHC concentrations. Additionally, machine learning techniques, specifically Extreme Gradient Boosting (XGBoost), are employed for age estimation of empty puparia, yielding accurate predictions across different outdoor and indoor conditions. These findings highlight the subtle responses of CHC profiles to environmental stimuli, underscoring the importance of considering environmental factors in forensic entomology and ecological research. The study advances the understanding of insect remnant degradation processes and their forensic implications. Furthermore, integrating machine learning with entomological expertise offers standardized methodologies for age determination, enhancing the reliability of entomological evidence in legal contexts and paving the way for future research and development. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
26. Automatic text classification of prostate cancer malignancy scores in radiology reports using NLP models.
- Author
-
Collado-Montañez, Jaime, López-Úbeda, Pilar, Chizhikova, Mariia, Díaz-Galiano, M. Carlos, Ureña-López, L. Alfonso, Martín-Noguerol, Teodoro, Luna, Antonio, and Martín-Valdivia, M. Teresa
- Abstract
This paper presents the implementation of two automated text classification systems for prostate cancer findings based on the PI-RADS criteria. Specifically, a traditional machine learning model using XGBoost and a language model-based approach using RoBERTa were employed. The study focused on Spanish-language radiological MRI prostate reports, which have not been explored before. The results demonstrate that the RoBERTa model outperforms the XGBoost model, although both achieve promising results. Furthermore, the best-performing system was integrated into the radiological company's information systems as an API, operating in a real-world environment. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
27. Forecasting operation of a chiller plant facility using data-driven models.
- Author
-
Salimian Rizi, Behzad, Faramarzi, Afshin, Pertzborn, Amanda, and Heidarinejad, Mohammad
- Subjects
- *
MOVING average process , *INTELLIGENT agents , *CONSUMPTION (Economics) , *PREDICTION models , *ENERGY consumption - Abstract
• Predicted power consumption and COP of a chiller plant using XGBoost.
• Quantified impacts of data intervals and data processing on the accuracy of models.
• Improved the predictions compared to the baseline using data smoothing methods.
• Demonstrated a guide to develop data-driven chiller power and COP models.
In recent years, data-driven models have enabled accurate prediction of chiller power consumption and chiller coefficient of performance (COP). This study evaluates the usage of time series Extreme Gradient Boosting (XGBoost) models to predict chiller power consumption and chiller COP of a water-cooled chiller plant. The 10-second measured data used in this study are from the Intelligent Building Agents Laboratory (IBAL), which includes two water-cooled chillers. Preprocessing, data selection, noise analysis, and data smoothing methods influence the accuracy of these data-driven predictions. The data intervals were changed to 30 s, 60 s, and 180 s using down-sampling and averaging strategies to investigate the impact of data preprocessing methods and data resolutions on the accuracy of chiller COP and power consumption models. To overcome the effect of noise on the accuracy of the models of chiller power consumption and COP, two data smoothing methods, the moving average window strategy and the Savitzky-Golay (SG) filter, are applied. The results show that both methods improve the predictions compared to the baseline, with the SG filter slightly outperforming the moving average. In particular, the mean absolute percentage error of the chiller COP and power consumption models improved from 4.8 and 4.9 for the baseline to 1.9 and 2.3 with the SG filter, respectively. Overall, this study provides a practical guide to developing XGBoost data-driven chiller power consumption and COP prediction models. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
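Illustrative sketch for entry 27 (not taken from the paper): the abstract describes coarsening 10-second data by averaging and smoothing it with a moving-average window, then comparing models by mean absolute percentage error. Assuming plain Python lists of readings, those steps might look like the following. The function names and sample values are hypothetical, and the Savitzky-Golay filter is not shown.

```python
from statistics import fmean


def downsample_mean(samples, factor):
    """Average consecutive, non-overlapping groups of `factor` samples.

    A trailing partial group is dropped (an assumption of this sketch).
    """
    usable = len(samples) - len(samples) % factor
    return [fmean(samples[i:i + factor]) for i in range(0, usable, factor)]


def moving_average(samples, window):
    """Centered moving average; the window shrinks near the edges."""
    half = window // 2
    smoothed = []
    for i in range(len(samples)):
        lo = max(0, i - half)
        hi = min(len(samples), i + half + 1)
        smoothed.append(fmean(samples[lo:hi]))
    return smoothed


def mape(actual, predicted):
    """Mean absolute percentage error, in percent."""
    return 100.0 * fmean(abs(a - p) / abs(a) for a, p in zip(actual, predicted))


# Hypothetical 10-second power readings (made-up values).
power_10s = [100.0, 102.0, 98.0, 110.0, 90.0, 100.0]
power_30s = downsample_mean(power_10s, 3)   # 10 s -> 30 s by averaging
power_smooth = moving_average(power_10s, 3)
```

Averaging three 10-second readings gives one 30-second value; the same helper with factors 6 and 18 would give the 60 s and 180 s intervals mentioned in the abstract.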
28. Analyzing dissemination, quality, and reliability of Chinese brain tumor-related short videos on TikTok and Bilibili: a cross-sectional study.
- Author
-
Ren Zhang, Zhiwei Zhang, Hui Jie, Yi Guo, Yi Liu, Yuan Yang, Chuan Li, and Chenglin Guo
- Abstract
Background: As the Internet becomes an increasingly vital source of medical information, the quality and reliability of brain tumor-related short videos on platforms such as TikTok and Bilibili have not been adequately evaluated. Therefore, this study aims to assess these aspects and explore the factors influencing the dissemination of such videos. Methods: A cross-sectional analysis was conducted on the top 100 brain tumor-related short videos from TikTok and Bilibili. The videos were evaluated using the Global Quality Score and the DISCERN reliability instrument. An eXtreme Gradient Boosting algorithm was utilized to predict dissemination outcomes. The videos were also categorized by content type and uploader. Results: TikTok videos scored relatively higher on both the Global Quality Score (median 2, interquartile range [2, 3] on TikTok vs. median 2, interquartile range [1, 2] on Bilibili, p=1.51E-04) and the DISCERN reliability instrument (median 15, interquartile range [13, 18.25] on TikTok vs. median 13.5, interquartile range [11, 16] on Bilibili, p=1.66E-04). Subgroup analysis revealed that videos uploaded by professional individuals and institutions had higher quality and reliability compared to those uploaded by non-professional entities. Videos focusing on disease knowledge exhibited the highest quality and reliability compared to other content types. The number of followers emerged as the most important variable in our dissemination prediction model. Conclusion: The overall quality and reliability of brain tumor-related short videos on TikTok and Bilibili were unsatisfactory and did not significantly influence video dissemination. Future research should expand the scope to better understand the factors driving the dissemination of medical-themed videos. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
29. Detecting Evolving Cyber Threats in IoT Environments Using Machine Learning.
- Author
-
Khan, Firoz and Billahalli Shivabasappa, Sunil Kumar
- Subjects
CYBERTERRORISM ,MACHINE learning ,INTERNET of things ,ALGORITHMS ,INTERNET security - Abstract
In recent years, the evolution of IoT devices and applications has been accompanied by a corresponding evolution in cyberattacks targeting these systems. Traditional approaches in cybersecurity often rely on models trained using historical data, which may accurately predict known attack patterns but struggle to detect emerging threats or evolving attack strategies. To address this challenge, this study introduces a novel model named Concept-Drift XGBoost (CD-XGB). Recognizing evolving attacks as instances of concept drift, this research proposes an additional algorithm termed Improved Concept-Drift Identification (ICDI) to identify and adapt to changing attack patterns. The performance of the CD-XGB model is evaluated using three diverse datasets: UNSW-NB15 2015, HIKARI 2021, and CIC IoV 2024. The results demonstrate impressive accuracy rates of 99.96%, 99.99%, and 99.96% for the respective datasets, underscoring the effectiveness of CD-XGB in addressing the challenge of evolving cyber threats in IoT environments. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
30. Enhancing stress detection in wearable IoT devices using federated learning and LSTM based hybrid model.
- Author
-
Mouhni, Naoual, Amalou, Ibtissam, Chakri, Sana, Tourad, Mohamedou Cheikh, Chakraoui, Mohamed, and Abdali, Abdelmounaim
- Subjects
CONVOLUTIONAL neural networks ,FEDERATED learning ,BLENDED learning ,RANDOM forest algorithms ,DEEP learning - Abstract
In the domain of smart health devices, the accurate detection of physical indicator levels plays a crucial role in enhancing safety and well-being. This paper introduces a cross-device federated learning framework using a hybrid deep learning model. Specifically, the paper presents a comprehensive comparison of different combinations of long short-term memory (LSTM), gated recurrent unit (GRU), convolutional neural network (CNN), random forest (RF), and extreme gradient boosting (XGBoost), in order to forecast stress levels by utilizing time series information derived from wearable smart gadgets. The LSTM-RF model demonstrated the highest level of accuracy, achieving 93.53% for user 1, 99.40% for user 2, and 97.88% for user 3. Similarly, the LSTM-XGBoost model yielded favorable outcomes, with accuracy rates of 85.88%, 98.55%, and 92.02% for users 1, 2, and 3, respectively, out of 23 users studied. These findings highlight the efficacy of federated learning and the utilization of hybrid models in stress detection. Unlike traditional centralized learning paradigms, the presented federated approach ensures privacy preservation and reduces data transmission requirements by processing data locally on edge devices. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
31. Improving imbalanced class intrusion detection in IoT with ensemble learning and ADASYN-MLP approach.
- Author
-
Soni, Muhammad Akmal Remli, Kauthar Mohd Daud, and Januar Al Amien
- Subjects
INTERNET of things ,SAMPLING (Process) ,ACTIVITIES of daily living ,CLASSIFICATION - Abstract
The exponential growth of the Internet of Things (IoT) has revolutionized daily activities, but it also brings significant vulnerabilities. Intrusion detection systems (IDS) are pivotal in efficiently detecting and identifying suspicious activities within IoT networks, safeguarding them from potential threats. This research proposes an ensemble approach aimed at enhancing model performance in such scenarios. Recognizing the unique challenges posed by imbalanced class distribution, the research employs three sampling-based techniques: LightGBM adaptive synthetic sampling (ADASYN) with multilayer perceptron (MLP), XGBoost ADASYN with MLP, and LightGBM ADASYN with XGBoost, to address class imbalance effectively. Evaluation with confusion-matrix performance metrics underscores the efficacy of the ensemble models, particularly LightGBM ADASYN with MLP, XGBoost ADASYN with MLP, and LightGBM ADASYN with XGBoost, in mitigating imbalanced class issues. The LightGBM ADASYN with MLP model stands out with 99.997% accuracy, showcasing exceptional precision and recall and demonstrating its proficiency in intrusion detection with minimal false positives and false negatives. Despite computational demands, integrating XGBoost within ensemble frameworks yields robust intrusion detection results, highlighting a balanced trade-off between accuracy, precision, and recall. This research offers valuable insights into the strengths of different ensemble models, significantly contributing to the advancement of accurate and reliable IDS in the realm of IoT. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
32. Design & development of adulteration detection system by fumigation method & machine learning techniques.
- Author
-
Agrawal, Urvashi, Bawane, Narendra, Alsubaie, Najah, Alqahtani, Mohammed S., Abbas, Mohamed, and Soufiene, Ben Othman
- Subjects
- *
EDIBLE fats & oils , *VEGETABLE oils , *FOOD adulteration , *REFRACTIVE index , *RANDOM forest algorithms , *ADULTERATIONS - Abstract
A novel method for detecting adulteration in edible oil is proposed, based on the concept of refractive index and electronic sensors. The research work focuses on two distinct methodologies: employing datasets, and implementing a fumigation technique that integrates real-time hardware for testing edible oil impurities. In the first method, the dataset contains spectral data collected using advanced ATR-MIR spectroscopy for pure oil and various levels of adulteration with vegetable oil. Every edible oil has a characteristic refractive index. When such oils are contaminated by adding adulterants, their refractive indices also change. The refractive index therefore serves as a feature for testing the oil and helps in detecting adulteration. If the oil is adulterated with vegetable oils, the refractive index will be lower, and with animal fats it will be higher, than that of pure oil. In the fumigation method, a hardware module is developed in which adulterated and pure oil samples are heated at 40–50 °C for 4.66 min, and the volatiles generated at varying gas concentrations are forced through the MEMS Gas Sensor-MISC-2714 and a multichannel gas sensor. The conductance of the sensors changes according to the gases sensed, and this change contributes to feature extraction. The conductance value serves as a feature for the classifier to determine whether the sample is highly, moderately, or lowly contaminated. Thus, the proposed methods use different machine learning algorithms, such as KNN, Random Forest, CATBOOST, and XGBOOST, to reveal the adulteration accurately. Among all the applied algorithms, the Random Forest (RF) classifier and the XGBOOST algorithm perform best, giving 100% accuracy. The proposed work is used for identifying food adulteration in edible food products, which helps supply society with high-quality food. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
33. From data to clean water: XGBoost and Bayesian optimization for advanced wastewater treatment with ultrafiltration.
- Author
-
Al-Jamimi, Hamdi A., BinMakhashen, Galal M., and Saleh, Tawfik A.
- Subjects
- *
WASTEWATER treatment , *SUSTAINABLE agriculture , *WATER pollution , *DATA scrubbing , *MACHINE learning - Abstract
Water pollution remains a pressing global challenge, threatening human health and ecosystem stability. Ultrafiltration emerges as a vital technology in this context, offering a powerful tool for contaminant removal and safeguarding clean water resources. Thus, the optimization of ultrafiltration processes holds paramount significance for efficient contaminant removal. This study revolutionizes wastewater treatment by introducing a hybrid machine learning approach that optimizes ultrafiltration processes for superior contaminant removal. Utilizing the powerful synergy between eXtreme Gradient Boosting (XGBoost) and Bayesian optimization, we developed predictive models with remarkable accuracy (R² values exceeding 99%) for post-treatment concentrations of metal ions, organic pollutants, and salts. This translates to precise control over the ultrafiltration process, driven by 4 key input variables: metal ion concentration, organic pollutants, salts, and applied pressure. The findings not only demonstrate the effectiveness of this hybrid approach but also pave the way for significant advancements in wastewater treatment strategies, ultimately contributing to cleaner water. This research marks a significant leap in machine learning applications for environmental challenges, paving the way for further advancements in wastewater treatment technology. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
34. Exploring the correlation between DNA methylation and biological age using an interpretable machine learning framework.
- Author
-
Zhou, Sheng, Chen, Jing, Wei, Shanshan, Zhou, Chengxing, Wang, Die, Yan, Xiaofan, He, Xun, and Yan, Pengcheng
- Subjects
- *
DNA methylation , *FEATURE selection , *GENETIC transcription , *PREDICTION models , *METHYLATION - Abstract
DNA methylation plays a significant role in regulating transcription and exhibits a systematic change with age. These changes can be used to predict an individual's age. The aims of this study were, first, to identify methylation sites associated with biological age and, second, to construct a biological age prediction model and preliminarily explore the biological significance of methylation-associated genes using machine learning. A biological age prediction model was constructed using human methylation data through data preprocessing, feature selection procedures, statistical analysis, and machine learning techniques. Subsequently, 15 methylation data sets were subjected to in-depth analysis using SHAP, GO enrichment, and KEGG analysis. XGBoost, LightGBM, and CatBoost identified 15 groups of methylation sites associated with biological age. The cg23995914 locus was identified as the most significant contributor to predicting biological age by calculating SHAP values. Furthermore, GO enrichment and KEGG analyses were employed to initially explore the methylated loci's biological significance. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
35. Machine learning models of cerebral oxygenation (rcSO2) for brain injury detection in neonates with hypoxic‐ischaemic encephalopathy.
- Author
-
Ashoori, Minoo, O'Toole, John M., Garvey, Aisling A., O'Halloran, Ken D., Walsh, Brian, Moore, Michael, Pavel, Andreea M., Boylan, Geraldine B., Murray, Deirdre M., Dempsey, Eugene M., and McDonald, Fiona B.
- Subjects
- *
MACHINE learning , *CONVOLUTIONAL neural networks , *NEWBORN infants , *DEEP learning , *OXYGEN saturation - Abstract
The present study was designed to test the potential utility of regional cerebral oxygen saturation (rcSO2) in detecting term infants with brain injury. The study also examined whether quantitative rcSO2 features are associated with grade of hypoxic ischaemic encephalopathy (HIE). We analysed 58 term infants with HIE (>36 weeks of gestational age) enrolled in a prospective observational study. All newborn infants had a period of continuous rcSO2 monitoring and magnetic resonance imaging (MRI) assessment during the first week of life. rcSO2 signals were pre‐processed and quantitative features were extracted. Machine‐learning and deep‐learning models were developed to detect adverse outcome (brain injury on MRI or death in the first week) using the leave‐one‐out cross‐validation approach and to assess the association between rcSO2 and HIE grade (modified Sarnat – at 1 h). The machine‐learning model (rcSO2 excluding prolonged relative desaturations) significantly detected infant MRI outcome or death in the first week of life [area under the curve (AUC) = 0.73, confidence interval (CI) = 0.59−0.86, Matthews correlation coefficient = 0.35]. In agreement, deep learning models detected adverse outcome with an AUC = 0.64, CI = 0.50−0.79. We also report a significant association between rcSO2 features and HIE grade using a machine learning approach (AUC = 0.81, CI = 0.73−0.90). We conclude that automated analysis of rcSO2 using machine learning methods in term infants with HIE was able to determine, with modest accuracy, infants with adverse outcome. De novo approaches to signal analysis of NIRS hold promise to aid clinical decision making in the future.
Key points: Hypoxic‐induced neonatal brain injury contributes to both short‐ and long‐term functional deficits. Non‐invasive continuous monitoring of brain oxygenation using near‐infrared spectroscopy offers a potential new insight to the development of serious injury. In this study, characteristics of the NIRS signal were summarised using either predefined features or data‐driven feature extraction; both were combined with a machine learning approach to predict short‐term brain injury. Using data from a cohort of term infants with hypoxic ischaemic encephalopathy, the present study illustrates that automated analysis of regional cerebral oxygen saturation (rcSO2), using either machine learning or deep learning methods, was able to determine infants with adverse outcome. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
36. Predicting cerebral edema in patients with spontaneous intracerebral hemorrhage using machine learning.
- Author
-
Jiangbao Xu, Cuijie Yuan, Guofeng Yu, Hao Li, Qiutong Dong, Dandan Mao, Chengpeng Zhan, and Xinjiang Yan
- Subjects
MACHINE learning ,CEREBRAL edema ,RECEIVER operating characteristic curves ,CEREBRAL hemorrhage ,SUPPORT vector machines - Abstract
Background: The early prediction of cerebral edema changes in patients with spontaneous intracerebral hemorrhage (SICH) may facilitate earlier interventions and result in improved outcomes. This study aimed to develop and validate machine learning models to predict cerebral edema changes within 72 h, using readily available clinical parameters, and to identify relevant influencing factors. Methods: An observational study was conducted between April 2021 and October 2023 at the Quzhou Affiliated Hospital of Wenzhou Medical University. After preprocessing the data, the study population was randomly divided into training and internal validation cohorts in a 7:3 ratio (training: N = 150; validation: N = 65). The most relevant variables were selected using Support Vector Machine Recursive Feature Elimination (SVM-RFE) and Least Absolute Shrinkage and Selection Operator (LASSO) algorithms. The predictive performance of random forest (RF), GDBT, linear regression (LR), and XGBoost models was evaluated using the area under the receiver operating characteristic curve (AUROC), precision–recall curve (AUPRC), accuracy, F1-score, precision, recall, sensitivity, and specificity. Feature importance was calculated, and the SHapley Additive exPlanations (SHAP) and Local Interpretable Model-Agnostic Explanations (LIME) methods were employed to explain the top-performing model. Results: A total of 84 (39.1%) patients developed cerebral edema changes. In the validation cohort, GDBT outperformed LR and RF, achieving an AUC of 0.654 (95% CI: 0.611-0.699) compared to LR of 0.578 (95% CI, 0.535-0.623, DeLong: p = 0.197) and RF of 0.624 (95% CI, 0.588-0.687, DeLong: p = 0.236). XGBoost also demonstrated similar performance with an AUC of 0.660 (95% CI, 0.611-0.711, DeLong: p = 0.963). However, in the training set, GDBT still outperformed XGBoost, with an AUC of 0.603 ± 0.100 compared to XGBoost of 0.575 ± 0.096.
SHAP analysis revealed that serum sodium, HDL, subarachnoid hemorrhage volume, sex, and left basal ganglia hemorrhage volume were the top five most important features for predicting cerebral edema changes in the GDBT model. Conclusion: The GDBT model demonstrated the best performance in predicting 72-h changes in cerebral edema. It has the potential to assist clinicians in identifying high-risk patients and guiding clinical decision-making. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
37. Combined Drought Index Using High-Resolution Hydrological Models and Explainable Artificial Intelligence Techniques in Türkiye.
- Author
-
Başakın, Eyyup Ensar, Stoy, Paul C., Demirel, Mehmet Cüneyd, Ozdogan, Mutlu, and Otkin, Jason A.
- Subjects
- *
REMOTE sensing , *CROP yields , *ARTIFICIAL intelligence , *SOIL moisture , *HYDROLOGIC models - Abstract
We developed a combined drought index to better monitor agricultural drought events. To develop the index, different combinations of the temperature condition index, precipitation condition index, vegetation condition index, soil moisture condition index, gross primary productivity, and normalized difference water index were used to obtain a single drought severity index. To obtain more effective results, a mesoscale hydrologic model was used to obtain soil moisture values. The SHapley Additive exPlanations (SHAP) algorithm was used to calculate the weights for the combined index. To provide input to the SHAP model, crop yield was predicted using a machine learning model, with the training set yielding a correlation coefficient (R) of 0.8, while the test set values were calculated to be 0.68. The representativeness of the new index in drought situations was compared with established indices, including the Standardized Precipitation-Evapotranspiration Index (SPEI) and the Self-Calibrated Palmer Drought Severity Index (scPDSI). The index showed the highest correlation with an R-value of 0.82, followed by the SPEI with 0.7 and scPDSI with 0.48. This study contributes a different perspective for effective detection of agricultural drought events. The integration of an increased volume of data from remote sensing systems with technological advances could facilitate the development of significantly more efficient agricultural drought monitoring systems. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
38. Hybrid machine learning approach for accurate prediction of the drilling rock index.
- Author
-
Shahani, Niaz Muhammad, Zheng, Xigui, Wei, Xin, and Hongwei, Jiang
- Subjects
- *
ROCK excavation , *STANDARD deviations , *OPTIMIZATION algorithms , *MINING engineering , *SUPPORT vector machines - Abstract
The drilling rate index (DRI) of rocks is important for optimizing drilling operations, as it informs the choice of appropriate methods and equipment, ultimately improving the efficiency of rock excavation projects. This study presents a hybrid machine learning approach to predict the DRI of rocks accurately. By integrating grey wolf optimization with support vector machine (GWO-SVM), random forest (GWO-RF), and extreme gradient boosting (GWO-XGBoost) models, the aim was to enhance predictive accuracy. Among these, the GWO-XGBoost model exhibited superior predictive performance, achieving a coefficient of determination (R²) of 0.999, mean absolute error (MAE) of 0.00043, root mean square error (RMSE) of 1.98017, and severity index (SI) of 0.0350 during training. Testing results confirmed its accuracy with R² of 0.999, MAE of 0.00038, RMSE of 1.80790, and SI of 0.0312. Furthermore, the GWO-XGBoost model outperformed the other models in terms of precision, recall, f1-score, and multi-class confusion matrix results for each DRI class. The GWO-RF model also demonstrated high accuracy, ranking second, while the GWO-SVM model showed comparatively lower performance. This research aims to advance rock excavation practices by providing a highly accurate and reliable tool for DRI prediction. The results highlight the significant potential of the GWO-XGBoost model in improving DRI predictions, offering valuable intuitions and practical applications in the field. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
39. An effective method for anomaly detection in industrial Internet of Things using XGBoost and LSTM.
- Author
-
Chen, Zhen, Li, ZhenWan, Huang, Jia, Liu, ShengZheng, and Long, HaiXia
- Subjects
- *
MACHINE learning , *RECEIVER operating characteristic curves , *ANOMALY detection (Computer security) , *FEATURE selection , *INTERNET of things , *DEEP learning - Abstract
In recent years, with the application of Internet of Things (IoT) and cloud technology in smart industrialization, Industrial Internet of Things (IIoT) has become an emerging hot topic. The increasing amount of data and device numbers in IIoT poses significant challenges to its security issues, making anomaly detection particularly important. Existing methods for anomaly detection in the IIoT often fall short when dealing with data imbalance, and the huge amount of IIoT data makes feature selection challenging and computationally intensive. In this paper, we propose an optimal deep learning model for anomaly detection in IIoT. Firstly, by setting different thresholds of eXtreme Gradient Boosting (XGBoost) for feature selection, features with importance above the given threshold are retained, while those below are ignored. Different thresholds yield different numbers of features. This approach not only secures effective features but also reduces the feature dimensionality, thereby decreasing the consumption of computational resources. Secondly, an optimized loss function is designed to study its impact on model performance in terms of handling imbalanced data, highly similar categories, and model training. We select the optimal threshold and loss function, which are part of our optimal model, by comparing metrics such as accuracy, precision, recall, False Alarm Rate (FAR), Area Under the Receiver Operating Characteristic Curve (AUC-ROC), and Area Under the Precision–Recall Curve (AUC-PR) values. Finally, combining the optimal threshold and loss function, we propose a model named MIX_LSTM for anomaly detection in IIoT. Experiments are conducted using the UNSW-NB15 and NSL-KDD datasets. The proposed MIX_LSTM model can achieve 0.084 FAR, 0.984 AUC-ROC, and 0.988 AUC-PR values in the binary anomaly detection experiment on the UNSW-NB15 dataset. In the NSL-KDD dataset, it can achieve 0.028 FAR, 0.967 AUC-ROC, and 0.962 AUC-PR values. 
By comparing the evaluation indicators, the model shows good performance in detecting abnormal attacks in the Industrial Internet of Things compared with traditional deep learning models, machine learning models and existing technologies. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
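Illustrative sketch for entry 39 (not the authors' code): the abstract describes keeping features whose XGBoost importance is above a given threshold, so that different thresholds yield different numbers of features. Assuming the importance scores are already available as a name-to-score mapping, the selection step could be sketched as below. The feature names, scores, and the helper `select_features` are hypothetical; in practice the scores might come from a fitted model's importance output.

```python
def select_features(importances, threshold):
    """Return the names of features whose importance is above `threshold`."""
    return [name for name, score in importances.items() if score > threshold]


# Hypothetical importance scores (made-up values, not from the paper).
importances = {
    "sbytes": 0.40,
    "dur": 0.25,
    "sttl": 0.20,
    "proto": 0.10,
    "service": 0.05,
}

# Sweep several candidate thresholds and record how many features survive each.
feature_counts = {
    threshold: len(select_features(importances, threshold))
    for threshold in (0.05, 0.15, 0.30)
}
```

Each threshold defines a candidate feature subset; the abstract then compares the resulting models on metrics such as accuracy, FAR, and AUC-ROC to pick the threshold.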
40. A SHAP-enhanced XGBoost model for interpretable prediction of coseismic landslides.
- Author
-
Wen, Haijia, Liu, Bo, Di, Mingrui, Li, Jiayi, and Zhou, Xinzhi
- Subjects
- *
MACHINE learning , *LANDSLIDE prediction , *FIELD research , *LANDSLIDES , *PREDICTION models , *ALGORITHMS , *LANDSLIDE hazard analysis - Abstract
• Thoroughly compare XGBoost, RF, LR, and SVM models optimized by TPE algorithm.
• Introduce SHAP algorithm to CLHA; global interpretation reveals factor importance.
• Combine local unit interpretations and field surveys to reveal correlations.
In the coseismic landslide hazard assessment (CLHA), advanced machine learning (ML) models have garnered significant attention due to their effectiveness in handling the complex relationships between landslides and various influencing factors. However, explaining the decision-making mechanisms of machine learning models that predict landslide spatial distribution based on these influencing factors remains challenging. This study compares the predictive performance of four models (XGBoost, RF, LR, and SVM) optimized using the TPE algorithm, and introduces the SHAP algorithm into the XGBoost model to achieve both global and local interpretations of CLHA. In various tests, the optimized XGBoost model demonstrated the best predictive performance, achieving an accuracy of 0.864 and an AUC value of 0.886. Global interpretations indicate that the occurrence of coseismic landslides is primarily influenced by triggering factors such as hypocentral distance and distance from the seismogenic fault. Terrain roughness and elevation, on the other hand, make significant contributions among the conditioning factors. Single-factor dependence plots indicate that the contribution of individual factors to landslides varies across different ranges of their feature values. Analysis of two-factor dependence plots reveals that interactions between factors are also crucial in influencing the occurrence of landslides. Combining field surveys with local interpretations confirms significant variations in the contributions of influencing factors within local ranges.
The main innovation of this study lies in the integration of the SHAP algorithm into the CLHA model, revealing the decision-making mechanisms of the model for spatial prediction of coseismic landslides. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
41. Identification of texture MRI brain abnormalities on Fibromyalgia syndrome using interpretable machine learning models.
- Author
-
Jiang, Hongyang, Liu, Aihui, and Ying, Zhenhua
- Subjects
-
*MACHINE learning, *RECEIVER operating characteristic curves, *FEATURE extraction, *MANN Whitney U Test, *BRAIN abnormalities
- Abstract
To provide objective diagnostic markers for fibromyalgia symptoms (FMS) diagnosis, we have created interpretable extreme gradient boosting (XGBoost) models using radiomics to aid in the diagnosis of chronic pain (CP) and to develop nomogram models for diagnosing subgroups of FMS. A group of 54 patients with CP and 71 healthy controls was randomly separated into training and validation groups, using a 7:3 ratio. Radiomics features were extracted from grey-matter and white-matter in the filtered mwp0* image. The Mann-Whitney U test, Spearman's rank correlation test, and least absolute shrinkage and selection operator (LASSO) were utilized to select features. An XGBoost model was created based on these features, and Shapley Additive exPlanations (SHAP) was used for personalization and visual interpretation. A nomogram was developed for the diagnosis of FMS subgroups, utilizing radiomics scores and clinical predictors. The efficacy of the nomogram was evaluated using the area under the receiver operating characteristic curve, while decision curve analysis was employed to evaluate its clinical efficacy. The XGBoost model displays stability in the training validation group, indicating lower overfitting of CP model. The nomogram model combined with the rad-score has a greater ability to distinguish between typical and sub-clinical than the clinical factor model alone. We developed and validated a CP diagnosis model by XGBoost and realized model visualization through SHAP. The rad-score obtained by machine learning was used to build a nomogram model that combines clinical scales to distinguish patients with typical and sub-clinical fibromyalgia. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
42. A Deep Learning Model for Predicting the Laminar Burning Velocity of NH3/H2/Air.
- Author
-
Yue, Wanying, Zhang, Bin, Zhang, Siqi, Wang, Boqiao, Xia, Yuanchen, and Liang, Zhuohui
- Subjects
-
BURNING velocity, FEATURE extraction, COMPOSITE structures, PREDICTION models, COMBUSTION, DEEP learning
- Abstract
Both NH3 and H2 are considered to be carbon-free fuels, and their mixed combustion has excellent performance. Considering the laminar burning velocity as a key characteristic of fuels, accurately predicting the laminar burning velocity of NH3/H2/Air is crucial for its combustion applications. The study made improvements to the XGBoost model and developed NH3/H2/Air Laminar Burning Velocity Net (NHLBVNet), which adopts a composite hierarchical structure to connect the functions of feature extraction, feature combination, and model prediction. The dataset consists of 487 sets of experimental data after the exclusion of outliers. The correlation coefficient (R² > 0.99) of NHLBVNet is higher than that of the XGBoost model (R² > 0.93). Robustness experiment results indicate that this model can obtain more accurate prediction results than other models even under small sample datasets. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
43. An Investigation into the Susceptibility to Landslides Using Integrated Learning and Bayesian Optimization: A Case Study of Xichang City.
- Author
-
Xing, Fucheng, Li, Ning, Zhao, Boju, Xiang, Han, and Chen, Yutao
- Abstract
In the middle southern section of the Freshwater River–Small River Fault system, Xichang City, Daliang Prefecture, Sichuan Province, is situated in the junction between the Anning River Fault and the Zemu River Fault. There has been a risk of increased activity in the fault zone in recent years, and landslide susceptibility evaluation for the area can effectively reduce the risk of disaster occurrence. Using integrated learning and Bayesian hyperparameter optimization, 265 landslides in Xichang City were used as samples in this study. Thirteen influencing factors were chosen to assess landslide susceptibility, and the BO-XGBoost, BO-LightGBM, and BO-RF models were evaluated using precision, recall, F1, accuracy, and AUC curves. The findings indicated that after removing the terrain relief evaluation factor, the four most significant factors associated with landslide susceptibility were NDVI, distance from faults, slope, and distance from rivers. The study demonstrates that the AUC value of the BO-XGBoost model in the study area is 0.8677, demonstrating a better generalization ability and higher prediction accuracy than the BO-LightGBM and BO-RF models. After Bayesian optimization of hyperparameters, the model offers a significant improvement in prediction accuracy. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
44. Iron Deficiency: Global Trends and Projections from 1990 to 2050.
- Author
-
Wang, Li, Liang, Dan, Huangfu, Hengqian, Shi, Xinfu, Liu, Shuang, Zhong, Panpan, Luo, Zhen, Ke, Changwen, and Lai, Yingsi
- Abstract
Background: Iron deficiency (ID) remains the leading cause of anemia, affects a vast number of persons globally, and continues to be a significant global health burden. Comprehending the patterns of ID burden is essential for developing targeted public health policies. Methods: Using data from the Global Burden of Disease (GBD) 2021 study for the years 1990–2021, the XGBoost model was constructed to predict prevalence and disability-adjusted life years (DALYs) for the period 2022–2050, based on key demographic variables. Shapley Additive exPlanations (SHAP) values were applied to interpret the contributions of each variable to the model's predictions. Additionally, the Age–Period–Cohort (APC) model was used to evaluate the effects of age, period, and birth cohort on both prevalence and DALYs. The relationship between the Socio-Demographic Index (SDI) and ID's age-standardized prevalence rate (ASPR) as well as the age-standardized DALYs rate (ASDR) was also analyzed to assess the influence of socioeconomic development on disease burden. Results: The global prevalent cases of ID grew from 984.61 million in 1990 to 1270.64 million in 2021 and are projected to reach 1439.99 million by 2050. Similarly, global DALYs from ID increased from 28.41 million in 1990 to 32.32 million in 2021, with a projected rise to 36.13 million by 2050. The ASPR declined from 18,204/100,000 in 1990 to 16,433/100,000 in 2021, with an estimated annual percentage change (EAPC) of −0.36% over this period. It is expected to decrease further to 15,922 by 2050, with an EAPC of −0.09% between 2021 and 2050. The ASDR was 518/100,000 in 1990 and 424/100,000 in 2021, with an EAPC of −0.68% from 1990 to 2021. It is expected to remain relatively stable at 419/100,000 by 2050, with an EAPC of −0.02% between 2021 and 2050. In 2021, the highest ASPRs were recorded in Senegal (34,421/100,000), Mali (34,233/100,000), and Pakistan (33,942/100,000). 
By 2050, Mali (35,070/100,000), Senegal (34,132/100,000), and Zambia (33,149/100,000) are projected to lead. For ASDR, Yemen (1405/100,000), Mozambique (1149/100,000), and Mali (1093/100,000) had the highest rates in 2021. By 2050, Yemen (1388/100,000), Mali (1181/100,000), and Mozambique (1177/100,000) are expected to remain the highest. SHAP values demonstrated that gender was the leading predictor of ID, with age and year showing negative contributions. Females aged 10 to 60 consistently showed higher prevalence and DALYs rates compared to males, with the under-5 age group having the highest rates for both. Additionally, men aged 80 and above exhibited a rapid increase in prevalence. Furthermore, the ASPR and ASDR were significantly higher in regions with a lower SDI, highlighting the greater burden of ID in less developed regions. Conclusions: ID remains a significant global health concern, with its burden projected to persist through 2050, particularly in lower-SDI regions. Despite declines in ASPR and ASDR, total cases and DALYs are expected to rise. SHAP analysis revealed that gender had the greatest influence on the model's predictions, while both age and year showed overall negative contributions to ID risk. Children under 5, women under 60, and elderly men aged 80+ were the most vulnerable groups. These findings underscore the need for targeted interventions, such as improved nutrition, early screening, and addressing socioeconomic drivers through iron supplementation programs in low-SDI regions. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
45. Development and validation of a machine learning model integrated with the clinical workflow for inpatient discharge date prediction.
- Author
-
Mahyoub, Mohammed A., Dougherty, Kacie, Yadav, Ravi R., Berio-Dorta, Raul, and Shukla, Ajit
- Subjects
-
MEDICAL care use, PREDICTION models, HUMAN services programs, INTERPROFESSIONAL relations, DISCHARGE planning, DESCRIPTIVE statistics, WORKFLOW, ELECTRONIC health records, MACHINE learning, QUALITY assurance, LENGTH of stay in hospitals, INTEGRATED health care delivery, SENSITIVITY & specificity (Statistics), ALGORITHMS
- Abstract
Background: Discharge date prediction plays a crucial role in healthcare management, enabling efficient resource allocation and patient care planning. Accurate estimation of the discharge date can optimize hospital operations and facilitate better patient outcomes. Materials and methods: In this study, we employed a systematic approach to develop a discharge date prediction model. We collaborated closely with clinical experts to identify relevant data elements that contribute to the prediction accuracy. Feature engineering was used to extract predictive features from both structured and unstructured data sources. XGBoost, a powerful machine learning algorithm, was employed for the prediction task. Furthermore, the developed model was seamlessly integrated into a widely used Electronic Medical Record (EMR) system, ensuring practical usability. Results: The model achieved a performance surpassing baseline estimates by up to 35.68% in the F1-score. Post-deployment, the model demonstrated operational value by aligning with MS GMLOS and contributing to an 18.96% reduction in excess hospital days. Conclusions: Our findings highlight the effectiveness and potential value of the developed discharge date prediction model in clinical practice. By improving the accuracy of discharge date estimations, the model has the potential to enhance healthcare resource management and patient care planning. Additional research endeavors should prioritize the evaluation of the model's long-term applicability across diverse scenarios and the comprehensive analysis of its influence on patient outcomes. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
46. A Machine Learning-Based Algorithm for the Prediction of Eigenfrequencies of Railway Bridges.
- Author
-
Grunert, Günther, Grunert, Damian, Behnke, Ronny, Schäfer, Sarah, Liu, Xiaohan, and Challagonda, Sandeep Reddy
- Subjects
- *
REGRESSION analysis , *STRUCTURAL health monitoring , *RAILROAD bridges , *RAILROAD safety measures , *EIGENFREQUENCIES - Abstract
As part of the development of advanced, data-driven methods for predictive maintenance of railway infrastructure, this paper analyzes and evaluates more realistic predictions of eigenfrequencies of railway bridges, also referred to as natural frequencies, based on a population of already assessed, measured existing bridges using regression techniques. For this purpose, Machine Learning (ML) techniques such as Polynomial Regression (PR), ANN and XGBoost are consistently evaluated and the application of the XGBoost algorithm is identified as the most suitable prediction model for these eigenfrequencies, usable for dynamic train-bridge interactions. The results of the post-processing are incorporated into the safety architecture for bridge verification (risk management). The presented data-based techniques are a steppingstone towards digitalization of structural health monitoring and offer safety and longevity of the railway bridges. Furthermore, the use of these methods can save costs that would be incurred by physical in-situ measurements. The types of bridges analyzed with ML are Filler Beam Bridges (FBE), which outnumber other construction types of bridges in Germany (DB InfraGO AG). This methodology is applicable to any bridge type as long as sufficient data are gathered for training, validation and testing. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
47. Corporate bond coupon prediction based on deep learning.
- Author
-
Liu, Tongyi, Jia, Lifen, and Chen, Wei
- Subjects
- *
INTEREST rates , *OPTIMIZATION algorithms , *CORPORATE bonds , *BOND market , *MATURITY (Finance) - Abstract
As the second largest bond market in the world, China's bond market has attracted extensive attention in recent years. Given its importance in facilitating financing arrangements and informing investment decisions, accurate bond coupon prediction is valuable. This paper proposes an ensemble model combining TabNet, DeepFM, and XGBoost for predicting the coupons of investment-grade corporate bonds. Specifically, to optimize the hyperparameters of the proposed model, an improved butterfly optimization algorithm incorporating the concepts of good point sets, refraction opposition-based learning, switching probability adjustment, and Solis & Wets search strategies is developed. Extensive experiments using data on China's investment-grade corporate bonds demonstrate the superior performance of the proposed model in the accuracy of bond coupon predictions. Additionally, the importance of various features has been discussed. The results show that the base interest rate for valuation and term to maturity are important to bond coupon predictions obtained by the proposed model. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
48. Next‐Gen Crop Monitoring: MTEG‐RTU Algorithm and UAV Synergy for Precise Disease Diagnosis.
- Author
-
S, Hemalatha and Jayachandran, Jai Jaganath Babu
- Subjects
- *
FOOD supply , *PLANT diseases , *MACHINE learning , *AGRICULTURE , *CLIMATE change - Abstract
The rapidly changing climatic scenarios are highly favorable for the rising diseases that lead to increasing threats to food production and supply. Various scholars and scientists make long steps to hasten the process of making innovations in farming for managing these issues. In this context, UAV is applied for the purpose of managing and monitoring plant health. The abiotic stresses available in plant diagnosis through traditional strategies are highly labor‐intensive and unfit for large‐scale deployment. Conversely, UAVs designed with mobile sensors, multispectral, radar, and so on make them flexible, affordable, and more effective. Thus, this study proposes a novel meta ensemble transfer extreme gradient‐based random tactical unit (MTEG‐RTU) algorithm for diagnosing crop illnesses precisely. The proposed MTEG‐RTU methodology entails three methods such as transfer learning, adaptive boost, and meta‐ensemble, and the hyper parameters are tuned using random tactical unit algorithm. Healthier and disordered crop images gained from the crop disease dataset comprise 8000 images and are preprocessed. The more optimal features from the preprocessed images are learned through the ResNet method, and these features enter into the classification phase. Random tactical unit algorithm enhanced the performance by optimizing the hyperparameters of MTEG classifier. The experimental results conducted based on the various assessment components and validation dataset indicate that the developed method outperformed the other chosen models, achieving precision, recall, and accuracy of 98.5%, 97.9%, and 98.6%, respectively. The other achievements made by the model are offering technical guidance for conducting the precise diagnosis and treatment of plant pathologies with less time of 9 s. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
49. Modeling motorcycle crash-injury severity utilizing explainable data-driven approaches.
- Author
-
Se, Chamroeun, Sunkpho, Jirapon, Wipulanusat, Warit, Tantisevi, Kevin, Champahom, Thanapong, and Ratanavaraha, Vatanavongs
- Subjects
- *
ARTIFICIAL neural networks , *RECURRENT neural networks , *TRAFFIC signs & signals , *ROAD users , *SUPPORT vector machines , *MOTORCYCLING accidents - Abstract
Motorcycle crashes remain a significant public safety concern, requiring diverse analytical approaches to inform countermeasures. This study uses machine learning to analyze injury severity in crashes in Thailand from 2018 to 2020. Traditional and advanced models, including random forest (RF), support vector machine (SVM), deep neural network (DNN), recurrent neural network (RNN), long short-term memory (LSTM), and eXtreme gradient boosting (XGBoost), were compared. Hyperparameter tuning via GridSearchCV optimized performance. XGBoost, with a tradeoff score of 105.65%, outperformed other models in predicting severe and fatal injuries. SHapley Additive exPlanations (SHAPs) identified significant risk factors including speeding, drunk driving, two-lane roads, unlit conditions, head-on and truck collisions, and nighttime crashes. Conversely, factors such as barrier medians, flashing traffic signals, sideswipes, rear-end crashes, and wet roads were associated with reduced severity. These findings suggest opportunities for integrated infrastructure improvements and expanded rider training and education programs to address behavioral risks. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
50. Enhancing Cardiovascular Risk Prediction: Development of an Advanced Xgboost Model with Hospital-Level Random Effects.
- Author
-
Dong, Tim, Oronti, Iyabosola Busola, Sinha, Shubhra, Freitas, Alberto, Zhai, Bing, Chan, Jeremy, Fudulu, Daniel P., Caputo, Massimo, and Angelini, Gianni D.
- Subjects
- *
MACHINE learning , *RANDOM effects model , *REGRESSION analysis , *SAMPLE size (Statistics) , *CARDIAC surgery , *LOGISTIC regression analysis - Abstract
Background: Ensemble tree-based models such as Xgboost are highly prognostic in cardiovascular medicine, as measured by the Clinical Effectiveness Metric (CEM). However, their ability to handle correlated data, such as hospital-level effects, is limited. Objectives: The aim of this work is to develop a binary-outcome mixed-effects Xgboost (BME) model that integrates random effects at the hospital level. To ascertain how well the model handles correlated data in cardiovascular outcomes, we aim to assess its performance and compare it to fixed-effects Xgboost and traditional logistic regression models. Methods: A total of 227,087 patients over 17 years of age, undergoing cardiac surgery from 42 UK hospitals between 1 January 2012 and 31 March 2019, were included. The dataset was split into two cohorts: training/validation (n = 157,196; 2012–2016) and holdout (n = 69,891; 2017–2019). The outcome variable was 30-day mortality with hospitals considered as the clustering variable. The logistic regression, mixed-effects logistic regression, Xgboost and binary-outcome mixed-effects Xgboost (BME) were fitted to both standardized and unstandardized datasets across a range of sample sizes and the estimated prediction power metrics were compared to identify the best approach. Results: The exploratory study found high variability in hospital-related mortality across datasets, which supported the adoption of the mixed-effects models. Unstandardized Xgboost BME demonstrated marked improvements in prediction power over the Xgboost model at small sample size ranges, but performance differences decreased as dataset sizes increased. Generalized linear models (glms) and generalized linear mixed-effects models (glmers) followed similar results, with the Xgboost models also excelling at greater sample sizes. Conclusions: These findings suggest that integrating mixed effects into machine learning models can enhance their performance on datasets where the sample size is small. 
[ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF