Application of machine learning for high-throughput tumor marker screening.

Authors :: Fu, Xingxing
Ma, Wanting
Zuo, Qi
Qi, Yanfei
Zhang, Shubiao
Zhao, Yinan
Source :: Life Sciences. Jul2024, Vol. 348, pN.PAG-N.PAG. 1p.
Publication Year :: 2024
Abstract: High-throughput sequencing and multiomics technologies have allowed increasing numbers of biomarkers to be mined and used for disease diagnosis, risk stratification, efficacy assessment, and prognosis prediction. However, the large number and complexity of tumor markers make screening them a substantial challenge. Machine learning (ML) offers new and effective ways to solve the screening problem. ML goes beyond mere data processing and is instrumental in recognizing intricate patterns within data. ML also has a crucial role in modeling dynamic changes associated with diseases. Used together, ML techniques have been included in automatic pipelines for tumor marker screening, thereby enhancing the efficiency and accuracy of the screening process. In this review, we discuss the general processes and common ML algorithms, and highlight recent applications of ML in tumor marker screening of genomic, transcriptomic, proteomic, and metabolomic data of patients with various types of cancers. Finally, the challenges and future prospects of the application of ML in tumor therapy are discussed. Machine learning can be applied to tumor marker screening in genomics, transcriptomics, proteomics and metabolomics. Common tumor markers are derived from circulating tumor cells (CTC), cell-free DNA (cfDNA), circulating tumor DNA (ctDNA), microRNA (miRNA), messenger RNA (mRNA), circular RNA (circRNA), long noncoding RNA (lncRNA), extracellular vesicle (EV), circulating tumor proteins, and metabolites (e.g., glycogen, branched-chain amino acids (BCAA), fatty acids (FA)). 
Commonly used algorithms in tumor marker screening are least absolute shrinkage and selection operator (LASSO), logistic regression (LR), decision tree (DT), random forest (RF), support vector machine (SVM), gradient boosting (GBoost), extreme gradient boosting (XGBoost), neural network (NN), deep neural network (DNN), hierarchical cluster analysis (HCA), principal component analysis (PCA), and t-distributed stochastic neighbor embedding (t-SNE).
• Tumor markers help in disease diagnosis and prognosis assessment.
• Machine learning can analyze a large amount of tumor-related data.
• Machine learning greatly improves the screening efficiency and accuracy.
[ABSTRACT FROM AUTHOR]