Transformer‐based representation learning and multiple‐instance learning for cancer diagnosis exclusively from raw sequencing fragments of bisulfite‐treated plasma cell‐free DNA
- Source :
- Molecular Oncology, Vol 18, Iss 11, Pp 2755-2769 (2024)
- Publication Year :
- 2024
- Publisher :
- Wiley, 2024.
Abstract
- Early cancer diagnosis from bisulfite‐treated cell‐free DNA (cfDNA) fragments requires tedious data‐analysis procedures. Here, we present a deep‐learning‐based approach for early cancer interception and diagnosis (DECIDIA) that can achieve accurate cancer diagnosis exclusively from bisulfite‐treated cfDNA sequencing fragments. DECIDIA relies on transformer‐based representation learning of DNA fragments and weakly supervised multiple‐instance learning for classification. We systematically evaluate the performance of DECIDIA for cancer diagnosis and cancer‐type prediction on a curated dataset of 5389 samples comprising colorectal cancer (CRC; n = 1574), hepatocellular carcinoma (HCC; n = 1181), lung cancer (n = 654), and non‐cancer controls (n = 1980). In 10‐fold cross‐validation on the CRC dataset, DECIDIA achieved an area under the receiver operating characteristic curve (AUROC) of 0.980 (95% CI, 0.976–0.984) in differentiating cancer patients from cancer‐free controls, outperforming benchmarked methods that are based on methylation intensities. Notably, DECIDIA achieved an AUROC of 0.910 (95% CI, 0.896–0.924) on the external, independent HCC testing set in distinguishing HCC patients from cancer‐free controls, although no HCC data were used in model development. In cancer‐type classification, DECIDIA achieved a micro‐average AUROC of 0.963 (95% CI, 0.960–0.966) and an overall accuracy of 82.8% (95% CI, 81.8–83.9%). In addition, we distilled four sequence signatures from the raw sequencing reads that exhibited differential patterns in cancer versus control and among different cancer types. Our approach represents a new paradigm towards eliminating the tedious data‐analysis procedures for liquid biopsy that uses the bisulfite‐treated cfDNA methylome.
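The abstract names weakly supervised multiple‐instance learning but does not say how fragment‐level representations are combined into a sample‐level prediction. As a rough illustration only, the sketch below assumes attention‐based pooling, one common multiple‐instance learning aggregator, which is not confirmed as DECIDIA's design. Each plasma sample is treated as a bag of fragment embeddings (random numbers here, standing in for transformer outputs), and only the bag carries a label. All names, dimensions, and weights (`attention_weights`, `predict_sample`, `w_attn`, `v_attn`, `w_clf`) are hypothetical and untrained.

```python
import math
import random


def attention_weights(embeddings, w_attn, v_attn):
    """Return one softmax attention weight per fragment embedding."""
    scores = []
    for emb in embeddings:
        # Hidden units: tanh of a linear projection of the fragment embedding.
        hidden = [math.tanh(sum(e * w for e, w in zip(emb, col))) for col in w_attn]
        # Scalar relevance score for this fragment.
        scores.append(sum(h * v for h, v in zip(hidden, v_attn)))
    # Softmax over fragments, shifted by the maximum for numerical stability.
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [x / total for x in exps]


def predict_sample(embeddings, w_attn, v_attn, w_clf, b_clf):
    """Pool a bag of fragment embeddings and return (probability, weights)."""
    weights = attention_weights(embeddings, w_attn, v_attn)
    dim = len(embeddings[0])
    # Bag-level representation: attention-weighted average of the fragments.
    bag = [sum(w * emb[j] for w, emb in zip(weights, embeddings)) for j in range(dim)]
    # Logistic classifier on the pooled representation (sample-level label only).
    logit = sum(b * w for b, w in zip(bag, w_clf)) + b_clf
    return 1.0 / (1.0 + math.exp(-logit)), weights


# Assumption: random vectors stand in for per-fragment transformer embeddings,
# and all weights are random rather than trained.
rng = random.Random(0)
n_fragments, dim, hidden_dim = 200, 16, 8
embeddings = [[rng.gauss(0, 1) for _ in range(dim)] for _ in range(n_fragments)]
w_attn = [[rng.gauss(0, 1) for _ in range(dim)] for _ in range(hidden_dim)]
v_attn = [rng.gauss(0, 1) for _ in range(hidden_dim)]
w_clf = [rng.gauss(0, 1) for _ in range(dim)]

prob, weights = predict_sample(embeddings, w_attn, v_attn, w_clf, 0.0)
print(f"sample-level cancer probability: {prob:.3f}")
```

In a trained model of this kind, the attention weights indicate which fragments contributed most to the sample‐level call, which is one way fragment‐level patterns could be inspected without fragment‐level labels.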
Details
- Language :
- English
- ISSN :
- 1878-0261 and 1574-7891
- Volume :
- 18
- Issue :
- 11
- Database :
- Directory of Open Access Journals
- Journal :
- Molecular Oncology
- Publication Type :
- Academic Journal
- Accession number :
- edsdoj.67456d0b304a818b85b00963dc4443
- Document Type :
- article
- Full Text :
- https://doi.org/10.1002/1878-0261.13745