1. Machine learning approach to predict blood-secretory proteins and potential biomarkers for liver cancer using omics data.
- Author
- Paul D, Sinnarasan VSP, Das R, Sheikh MMR, and Venkatesan A
- Subjects
- Humans; Blood Proteins analysis; Blood Proteins genetics; Blood Proteins metabolism; Glypicans blood; Glypicans genetics; Neoplasm Proteins blood; Neoplasm Proteins genetics; Gene Expression Profiling; Liver Neoplasms blood; Liver Neoplasms genetics; Liver Neoplasms diagnosis; Liver Neoplasms metabolism; Machine Learning; Biomarkers, Tumor blood; Biomarkers, Tumor genetics
- Abstract
Identifying non-invasive blood-based biomarkers is crucial for early detection and monitoring of liver cancer (LC), thereby improving patient outcomes. This study leveraged computational approaches to predict potential blood-based biomarkers for LC. Machine learning (ML) models were developed using selected features from blood-secretory proteins collected from curated databases. The logistic regression (LR) model performed best. Transcriptome analysis across 7 LC cohorts revealed 231 common differentially expressed genes (DEGs). The proteins encoded by these DEGs were compared with the ML dataset, revealing 29 proteins that overlap with the blood-secretory dataset. Among the proteins encoded by the remaining protein-coding genes, the LR model predicted a further 29 as blood-secretory. This yielded 58 potential blood-secretory proteins in total. Among the top 20 genes, 13 common hub genes were identified. Further, area under the receiver operating characteristic curve (ROC AUC) analysis was performed to assess the genes as potential diagnostic blood biomarkers. Six genes, ESM1, FCN2, MDK, GPC3, CTHRC1 and COL6A6, exhibited an AUC value higher than 0.85 and were predicted as blood-secretory. This study highlights the potential of an integrative computational approach for discovering non-invasive blood-based biomarkers in LC, facilitating further validation and clinical translation.
SIGNIFICANCE: Liver cancer is one of the leading causes of premature death worldwide, with its prevalence and mortality rates projected to increase. Although current diagnostic methods are highly sensitive, they are invasive and unsuitable for repeated testing. Blood biomarkers offer a promising non-invasive alternative, but the wide dynamic range of blood protein concentrations poses experimental challenges. Therefore, utilizing available omics data to develop a diagnostic model could provide a potential solution for accurate diagnosis. This study developed a computational method integrating machine learning and bioinformatics analysis to identify potential blood biomarkers. As a result, ESM1, FCN2, MDK, GPC3, CTHRC1 and COL6A6 were identified as candidate biomarkers that hold promise for improving the diagnosis and understanding of liver cancer. The integrated method can be applied to other cancers, offering a possible solution for early detection and improved patient outcomes.
Competing Interests: The authors declare they have no conflicts of interest.
Copyright © 2024. Published by Elsevier B.V.
- Published
- 2024
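The ROC AUC screening step described in the abstract can be illustrated with a minimal sketch. This is not the authors' code: the gene names, expression values and sample labels below are made-up toy data, the function names `roc_auc` and `screen_genes` are hypothetical, and scoring downregulated genes symmetrically via max(AUC, 1 - AUC) is an assumption. Only the 0.85 cutoff is taken from the abstract.

```python
from typing import Dict, List, Sequence


def roc_auc(scores: Sequence[float], labels: Sequence[int]) -> float:
    """ROC AUC as the probability that a random positive (tumor) sample
    scores higher than a random negative (normal) sample; ties count 0.5."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("both classes must be present")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def screen_genes(
    expression: Dict[str, List[float]],
    labels: Sequence[int],
    threshold: float = 0.85,  # cutoff reported in the abstract
) -> Dict[str, float]:
    """Keep genes whose single-gene AUC exceeds the threshold.

    Assumption: a gene that is lower in tumors is as informative as one
    that is higher, so the AUC is made direction-agnostic.
    """
    selected = {}
    for gene, values in expression.items():
        auc = roc_auc(values, labels)
        auc = max(auc, 1.0 - auc)
        if auc > threshold:
            selected[gene] = auc
    return selected


# Toy data (hypothetical): 4 tumor samples (1) followed by 4 normal samples (0).
labels = [1, 1, 1, 1, 0, 0, 0, 0]
expression = {
    "GENE_A": [9.1, 8.7, 9.5, 8.9, 5.2, 6.0, 5.8, 6.1],  # higher in tumor
    "GENE_B": [5.0, 6.2, 5.5, 6.0, 5.8, 5.1, 6.1, 5.4],  # uninformative
    "GENE_C": [2.0, 2.5, 1.8, 2.2, 4.0, 4.4, 3.9, 4.1],  # lower in tumor
}

candidates = screen_genes(expression, labels)
print(sorted(candidates))
```

In practice a library routine such as scikit-learn's `roc_auc_score` would usually replace the hand-written `roc_auc`, and the AUC would be estimated on cohorts held out from gene selection to avoid optimistic bias.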