
Machine learning for identifying benign and malignant of thyroid tumors: A retrospective study of 2,423 patients

Authors :
Yuan-yuan Guo
Zhi-jie Li
Chao Du
Jun Gong
Pu Liao
Jia-xing Zhang
Cong Shao
Source :
Frontiers in Public Health, Vol 10 (2022)
Publication Year :
2022
Publisher :
Frontiers Media S.A., 2022.

Abstract

Thyroid tumors are among the common tumors of the endocrine system, yet the discrimination between benign and malignant thyroid tumors remains insufficient. The aim of this study was to construct a diagnostic model for distinguishing benign from malignant thyroid tumors, in order to provide an auxiliary diagnostic method for patients with thyroid tumors. Patients were selected from the Chongqing General Hospital (Chongqing, China) between July 2020 and September 2021. Peripheral blood, BRAFV600E gene, and demographic indicators were selected as candidate predictors: sex, age, BRAFV600E gene, lymphocyte count (Lymph#), neutrophil count (Neu#), neutrophil/lymphocyte ratio (NLR), platelet/lymphocyte ratio (PLR), red blood cell distribution width (RDW), platelet count (PLT), red blood cell distribution width coefficient of variation (RDW–CV), alkaline phosphatase (ALP), and parathyroid hormone (PTH). First, feature selection was performed by univariate analysis combined with least absolute shrinkage and selection operator (LASSO) analysis. We then used machine learning algorithms to establish three types of models: the first contains all predictors, the second contains the indicators retained after feature selection, and the third contains only the patients' peripheral blood indicators. Four machine learning algorithms were used to build the predictive models: extreme gradient boosting (XGBoost), random forest (RF), light gradient boosting machine (LightGBM), and adaptive boosting (AdaBoost). A grid search was used to find the optimal parameters for each algorithm. Model performance was assessed with a series of indicators, such as the area under the curve (AUC). A total of 2,042 patients met the criteria and were enrolled in this study, and 12 variables were included. Sex, age, Lymph#, PLR, RDW, and BRAFV600E were identified as statistically significant indicators by univariate and LASSO analysis.
Among the models we constructed, RF, XGBoost, LightGBM, and AdaBoost achieved AUCs of 0.874 (95% CI, 0.841–0.906), 0.868 (95% CI, 0.834–0.901), 0.861 (95% CI, 0.826–0.895), and 0.837 (95% CI, 0.802–0.873), respectively, in the first model; 0.853 (95% CI, 0.818–0.888), 0.853 (95% CI, 0.818–0.889), 0.837 (95% CI, 0.800–0.873), and 0.832 (95% CI, 0.797–0.867) in the second model; and 0.698 (95% CI, 0.651–0.745), 0.688 (95% CI, 0.639–0.736), 0.693 (95% CI, 0.645–0.741), and 0.666 (95% CI, 0.618–0.714) in the third model. Compared with existing models, our study proposes a model incorporating novel biomarkers, which could be a powerful and promising tool for distinguishing benign from malignant thyroid tumors.
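
Illustrative note: the abstract reports each AUC with a 95% confidence interval but does not say how the intervals were computed. The sketch below shows one common approach, a percentile bootstrap, using only the Python standard library. It is not the authors' code; the function names (auc_score, bootstrap_auc_ci), the toy labels and scores, and the choice of 1,000 resamples are all assumptions made for illustration.

```python
import random


def auc_score(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    labels: 1 = malignant (positive), 0 = benign (negative).
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def bootstrap_auc_ci(labels, scores, n_boot=1000, alpha=0.05, seed=0):
    """Percentile bootstrap interval for the AUC (hypothetical helper)."""
    rng = random.Random(seed)
    n = len(labels)
    stats = []
    while len(stats) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        ys = [labels[i] for i in idx]
        if len(set(ys)) < 2:
            continue  # resample contains a single class, so AUC is undefined
        stats.append(auc_score(ys, [scores[i] for i in idx]))
    stats.sort()
    lower = stats[int((alpha / 2) * n_boot)]
    upper = stats[min(n_boot - 1, int((1 - alpha / 2) * n_boot))]
    return lower, upper


# Toy data, invented for illustration only (not from the study).
toy_labels = [0, 0, 1, 1]
toy_scores = [0.1, 0.4, 0.35, 0.8]
toy_auc = auc_score(toy_labels, toy_scores)  # 3 of 4 pairs ranked correctly -> 0.75
toy_ci = bootstrap_auc_ci(toy_labels, toy_scores)
```

With real data, the labels would be the pathology result for each patient and the scores the predicted probability of malignancy from a fitted model on a held-out set. The pairwise loop is quadratic in the number of patients, which is acceptable for a sketch but slower than a rank-based implementation.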

Details

Language :
English
ISSN :
2296-2565
Volume :
10
Database :
Directory of Open Access Journals
Journal :
Frontiers in Public Health
Publication Type :
Academic Journal
Accession number :
edsdoj.bc059ec0de4a029e04bc967b8c980b
Document Type :
article
Full Text :
https://doi.org/10.3389/fpubh.2022.960740