Transformer-based representation learning and multiple-instance learning for cancer diagnosis exclusively from raw sequencing fragments of bisulfite-treated plasma cell-free DNA.
- Source :
- Molecular oncology [Mol Oncol] 2024 Nov; Vol. 18 (11), pp. 2755-2769. Date of Electronic Publication: 2024 Oct 08.
- Publication Year :
- 2024
Abstract
- Early cancer diagnosis from bisulfite-treated cell-free DNA (cfDNA) fragments requires tedious data-analysis procedures. Here, we present a deep-learning-based approach for early cancer interception and diagnosis (DECIDIA) that can achieve accurate cancer diagnosis exclusively from bisulfite-treated cfDNA sequencing fragments. DECIDIA relies on transformer-based representation learning of DNA fragments and weakly supervised multiple-instance learning for classification. We systematically evaluate the performance of DECIDIA for cancer diagnosis and cancer-type prediction on a curated dataset of 5389 samples comprising colorectal cancer (CRC; n = 1574), hepatocellular carcinoma (HCC; n = 1181), lung cancer (n = 654), and non-cancer controls (n = 1980). DECIDIA achieved an area under the receiver operating characteristic curve (AUROC) of 0.980 (95% CI, 0.976-0.984) in 10-fold cross-validation on the CRC dataset when differentiating cancer patients from cancer-free controls, outperforming benchmarked methods that are based on methylation intensities. Notably, DECIDIA achieved an AUROC of 0.910 (95% CI, 0.896-0.924) on the independent external HCC testing set in distinguishing HCC patients from cancer-free controls, although no HCC data were used in model development. For cancer-type classification, DECIDIA achieved a micro-average AUROC of 0.963 (95% CI, 0.960-0.966) and an overall accuracy of 82.8% (95% CI, 81.8-83.9%). In addition, we distilled four sequence signatures from the raw sequencing reads that exhibited differential patterns in cancer versus control and among different cancer types. Our approach represents a new paradigm towards eliminating the tedious data-analysis procedures for liquid biopsy that uses the bisulfite-treated cfDNA methylome. (© 2024 The Author(s). Molecular Oncology published by John Wiley & Sons Ltd on behalf of Federation of European Biochemical Societies.)
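The abstract names multiple-instance learning (MIL) over fragment embeddings but does not say how the fragments of one sample are aggregated. The following is a minimal sketch of one common choice, attention-based MIL pooling, and is not the DECIDIA implementation. It assumes each plasma sample is a "bag" of per-fragment embedding vectors (here random numbers standing in for transformer outputs) and that only a sample-level label is available. All names (`attention_mil_pool`, `predict_sample`, the weight matrices) are hypothetical, and the parameters are random rather than trained.

```python
import math
import random


def attention_mil_pool(embeddings, v, w):
    """Aggregate per-fragment embeddings into one sample-level vector.

    embeddings: list of fragment vectors, each a list of d floats
    v: hidden projection, a list of h rows with d floats each (assumed parameter)
    w: scoring vector of h floats (assumed parameter)
    Returns (pooled_vector, attention_weights).
    """
    # Score each fragment: w . tanh(V x)
    scores = []
    for x in embeddings:
        hidden = [math.tanh(sum(vi * xi for vi, xi in zip(row, x))) for row in v]
        scores.append(sum(wi * hi for wi, hi in zip(w, hidden)))

    # Numerically stable softmax over the fragments of the bag
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]

    # Attention-weighted average of the fragment embeddings
    dim = len(embeddings[0])
    pooled = [sum(a * x[j] for a, x in zip(weights, embeddings)) for j in range(dim)]
    return pooled, weights


def predict_sample(embeddings, v, w, clf_weights, clf_bias):
    """Return a cancer probability for one sample (bag) via a logistic head."""
    pooled, weights = attention_mil_pool(embeddings, v, w)
    logit = sum(c * p for c, p in zip(clf_weights, pooled)) + clf_bias
    return 1.0 / (1.0 + math.exp(-logit)), weights


# Toy bag: 50 fragments with 8-dimensional embeddings (placeholders, not real data)
random.seed(0)
D, H, N = 8, 4, 50
bag = [[random.gauss(0, 1) for _ in range(D)] for _ in range(N)]
V = [[random.gauss(0, 0.1) for _ in range(D)] for _ in range(H)]
W = [random.gauss(0, 0.1) for _ in range(H)]
CLF_W = [random.gauss(0, 0.1) for _ in range(D)]

probability, attention = predict_sample(bag, V, W, CLF_W, 0.0)
```

In a trained model of this kind, the attention weights indicate which fragments contributed most to the sample-level prediction, which is one way fragment-level signals could be inspected under weak supervision.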
Details
- Language :
- English
- ISSN :
- 1878-0261
- Volume :
- 18
- Issue :
- 11
- Database :
- MEDLINE
- Journal :
- Molecular oncology
- Publication Type :
- Academic Journal
- Accession number :
- 39380154
- Full Text :
- https://doi.org/10.1002/1878-0261.13745