
A web-based tool for cancer risk prediction for middle-aged and elderly adults using machine learning algorithms and self-reported questions.

Authors :
Xiao, Xingjian
Yi, Xiaohan
Soe, Nyi Nyi
Latt, Phyu Mon
Lin, Luotao
Chen, Xuefen
Song, Hualing
Sun, Bo
Zhao, Hailei
Xu, Xianglong
Source :
Annals of Epidemiology. Jan2025, Vol. 101, p27-35. 9p.
Publication Year :
2025

Abstract

China has among the higher cancer incidence and mortality rates worldwide. Our objective is to create an online cancer risk prediction tool for middle-aged and elderly Chinese adults by leveraging machine learning algorithms and self-reported data. Drawing on a cohort of 19,798 participants aged 45 and above from the China Health and Retirement Longitudinal Study (2011-2018), we employed nine machine learning algorithms commonly used for classification and regression tasks to construct predictive models for various cancers: Logistic Regression (LR), Adaptive Boosting (AdaBoost), Support Vector Machine (SVM), Random Forest (RF), Gaussian Naive Bayes (GNB), Gradient Boosting Machine (GBM), Light Gradient Boosting Machine (LGBM), eXtreme Gradient Boosting (XGBoost), and K-Nearest Neighbors (KNN). Using non-invasive self-reported predictors covering demographic, educational, marital, lifestyle, health history, and other factors, we focused on predicting "Cancer or Malignant Tumour" outcomes. The cancers that can be predicted mainly include lung, breast, cervical, colorectal, gastric, and esophageal cancer, as well as other rare cancers. In the developed tool, MyCancerRisk, the Random Forest algorithm achieved an area under the curve (AUC) of 0.75 and an accuracy (ACC) of 0.99 using self-reported variables. Key predictors identified include age, self-rated health, sleep patterns, household heating sources, childhood health status, living conditions, and smoking habits. MyCancerRisk aims to serve as a preventative screening tool, encouraging individuals to undergo testing and adopt healthier behaviours to mitigate the public health impact of cancer. Our study also sheds light on unconventional predictors, such as housing conditions, offering valuable insights for refining cancer prediction models. [ABSTRACT FROM AUTHOR]
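The abstract reports an AUC of 0.75 alongside an ACC of 0.99. As a minimal sketch of how the two metrics are computed and why they can diverge, the example below uses made-up labels and risk scores, not the study's data. The 1% prevalence, the score values, and the 0.5 decision threshold are all assumptions chosen for illustration. It does not reproduce the authors' pipeline.

```python
from typing import Sequence


def accuracy(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Fraction of labels predicted correctly."""
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return correct / len(y_true)


def roc_auc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the probability that a randomly chosen positive outscores a
    randomly chosen negative (ties count one half).
    Pairwise O(n_pos * n_neg) loop, which is fine for a small sketch."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical cohort: 1000 people, 10 cases (assumed 1% prevalence).
y_true = [1] * 10 + [0] * 990

# Hypothetical risk scores: half the cases rank above every non-case,
# the other half rank above only half of the non-cases.
scores = [0.40] * 5 + [0.20] * 5 + [0.10] * 495 + [0.30] * 495

# Assumed decision threshold of 0.5: no score reaches it,
# so everyone is predicted "no cancer".
y_pred = [1 if s >= 0.5 else 0 for s in scores]

acc = accuracy(y_true, y_pred)
auc = roc_auc(y_true, scores)
print(f"ACC = {acc:.2f}, AUC = {auc:.2f}")  # ACC = 0.99, AUC = 0.75
```

In this toy case the classifier never flags a single case, yet accuracy is 0.99 because 99% of the cohort is negative. The AUC of 0.75 reflects only how well the scores rank cases above non-cases. When an outcome is rare, accuracy alone therefore says little about how well cases are detected.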

Details

Language :
English
ISSN :
10472797
Volume :
101
Database :
Academic Search Index
Journal :
Annals of Epidemiology
Publication Type :
Academic Journal
Accession number :
182097638
Full Text :
https://doi.org/10.1016/j.annepidem.2024.12.003