Transformer-based representation learning and multiple-instance learning for cancer diagnosis exclusively from raw sequencing fragments of bisulfite-treated plasma cell-free DNA.

Authors :
Liu J
Shen H
Yang Y
Yang M
Zhang Q
Chen K
Li X
Source :
Molecular oncology [Mol Oncol] 2024 Nov; Vol. 18 (11), pp. 2755-2769. Date of Electronic Publication: 2024 Oct 08.
Publication Year :
2024

Abstract

Early cancer diagnosis from bisulfite-treated cell-free DNA (cfDNA) fragments requires tedious data analytical procedures. Here, we present a deep-learning-based approach for early cancer interception and diagnosis (DECIDIA) that can achieve accurate cancer diagnosis exclusively from bisulfite-treated cfDNA sequencing fragments. DECIDIA relies on transformer-based representation learning of DNA fragments and weakly supervised multiple-instance learning for classification. We systematically evaluate the performance of DECIDIA for cancer diagnosis and cancer type prediction on a curated dataset of 5389 samples comprising colorectal cancer (CRC; n = 1574), hepatocellular carcinoma (HCC; n = 1181), lung cancer (n = 654), and non-cancer control (n = 1980). DECIDIA achieved an area under the receiver operating characteristic curve (AUROC) of 0.980 (95% CI, 0.976-0.984) for differentiating cancer patients from cancer-free controls in 10-fold cross-validation on the CRC dataset, outperforming benchmarked methods that are based on methylation intensities. Notably, DECIDIA achieved an AUROC of 0.910 (95% CI, 0.896-0.924) on the externally independent HCC testing set in distinguishing HCC patients from cancer-free controls, although no HCC data were used in model development. In cancer-type classification, DECIDIA achieved a micro-average AUROC of 0.963 (95% CI, 0.960-0.966) and an overall accuracy of 82.8% (95% CI, 81.8-83.9). In addition, we distilled four sequence signatures from the raw sequencing reads that exhibited differential patterns in cancer versus control and among different cancer types. Our approach represents a new paradigm towards eliminating the tedious data analytical procedures for liquid biopsy that uses the bisulfite-treated cfDNA methylome.
(© 2024 The Author(s). Molecular Oncology published by John Wiley & Sons Ltd on behalf of Federation of European Biochemical Societies.)
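
The abstract names multiple-instance learning (MIL) over fragment representations but does not say which MIL variant DECIDIA uses. The sketch below illustrates the general idea with one common choice, attention-based pooling: each sample is treated as a bag of fragment embeddings, a learned attention weight is assigned to every fragment, and the weighted average is passed to a sample-level classifier. Everything here is an assumption made for illustration only: the function name `attention_mil_score`, the parameter names, the dimensions, and the random arrays that stand in for transformer outputs and trained weights. It is not the authors' implementation.

```python
import numpy as np


def attention_mil_score(fragment_embeddings, V, w, clf_w, clf_b):
    """Score one sample (a bag of fragment embeddings) with attention pooling.

    Hypothetical helper: parameter names and shapes are illustrative only.
    """
    # One attention logit per fragment: w^T tanh(h_i V)
    logits = np.tanh(fragment_embeddings @ V) @ w      # shape (n_fragments,)
    # Numerically stable softmax over the fragments in the bag
    logits = logits - logits.max()
    attn = np.exp(logits) / np.exp(logits).sum()
    # Bag-level representation: attention-weighted mean of the embeddings
    bag = attn @ fragment_embeddings                   # shape (dim,)
    # Sample-level probability from a logistic classifier on the bag vector
    prob = 1.0 / (1.0 + np.exp(-(bag @ clf_w + clf_b)))
    return prob, attn


# Random stand-ins for per-fragment transformer embeddings and model weights.
rng = np.random.default_rng(0)
n_fragments, dim, hidden = 500, 32, 16
embeddings = rng.normal(size=(n_fragments, dim))
V = rng.normal(scale=0.1, size=(dim, hidden))
w = rng.normal(scale=0.1, size=hidden)
clf_w = rng.normal(scale=0.1, size=dim)
clf_b = 0.0

prob, attn = attention_mil_score(embeddings, V, w, clf_w, clf_b)
```

Only the sample-level label (for example, cancer versus control) would be needed to train such a model, which is what makes the supervision weak: no individual fragment carries a label, and the attention weights indicate which fragments contributed most to the sample's score.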

Details

Language :
English
ISSN :
1878-0261
Volume :
18
Issue :
11
Database :
MEDLINE
Journal :
Molecular oncology
Publication Type :
Academic Journal
Accession number :
39380154
Full Text :
https://doi.org/10.1002/1878-0261.13745