Back to Search Start Over

Personalized three-year survival prediction and prognosis forecast by interpretable machine learning for pancreatic cancer patients: a population-based study and an external validation

Authors :
Buwei Teng
Xiaofeng Zhang
Mingshu Ge
Miao Miao
Wei Li
Jun Ma
Source :
Frontiers in Oncology, Vol 14 (2024)
Publication Year :
2024
Publisher :
Frontiers Media S.A., 2024.

Abstract

PurposeThe overall survival of patients with pancreatic cancer is extremely low. We aimed to establish machine learning (ML) based model to accurately predict three-year survival and prognosis of pancreatic cancer patients.MethodsWe analyzed pancreatic cancer patients from the Surveillance, Epidemiology, and End Results (SEER) database between 2000 and 2021. Univariate and multivariate logistic analysis were employed to select variables. Recursive Feature Elimination (RFE) method based on 6 ML algorithms was utilized in feature selection. To construct predictive model, 13 ML algorithms were evaluated by area under the curve (AUC), area under precision-recall curve (PRAUC), accuracy, sensitivity, specificity, precision, cross-entropy, Brier scores and Balanced Accuracy (bacc) and F Beta Score (fbeta). An optimal ML model was constructed to predict three-year survival, and the predictive results were explained by SHapley Additive exPlanations (SHAP) framework. Meanwhile, 101 ML algorithm combinations were developed to select the best model with highest C-index to predict prognosis of pancreatic cancer patients.ResultsA total of 20,064 pancreatic cancer patients from SEER database was consecutively enrolled. We utilized eight clinical variables to establish prediction model for three-year survival. CatBoost model was selected as the best prediction model, and AUC was 0.932 [0.924, 0.939], 0.899 [0.873, 0.934] and 0.826 [0.735, 0.919] in training, internal test and external test sets, with 0.839 [0.831, 0.847] accuracy, 0.872 [0.858, 0.887] sensitivity, 0.803 [0.784, 0.825] specificity and 0.832 [0.821, 0.853] precision. Surgery type had the greatest effects on three-year survival according to SHAP results. For prognosis prediction, “RSF+GBM” algorithm was the best prognostic model with C-index of 0.774, 0.722 and 0.674 in training, internal test and external test sets.ConclusionsOur ML models demonstrate excellent accuracy and reliability, offering more precise personalized prognostic prediction to pancreatic cancer patients.

Details

Language :
English
ISSN :
2234943X
Volume :
14
Database :
Directory of Open Access Journals
Journal :
Frontiers in Oncology
Publication Type :
Academic Journal
Accession number :
edsdoj.1d0ee0bddfb2481eaf69c381abdc6b0d
Document Type :
article
Full Text :
https://doi.org/10.3389/fonc.2024.1488118