Development of a Machine Learning Model to Distinguish between Ulcerative Colitis and Crohn's Disease Using RNA Sequencing Data.

Authors :
Park SK
Kim S
Lee GY
Kim SY
Kim W
Lee CW
Park JL
Choi CH
Kang SB
Kim TO
Bang KB
Chun J
Cha JM
Im JP
Ahn KS
Kim SY
Park DI
Source :
Diagnostics (Basel, Switzerland) [Diagnostics (Basel)] 2021 Dec 15; Vol. 11 (12). Date of Electronic Publication: 2021 Dec 15.
Publication Year :
2021

Abstract

Crohn's disease (CD) and ulcerative colitis (UC) can be difficult to differentiate. As differential diagnosis is important in establishing a long-term treatment plan for patients, we aimed to develop a machine learning model for the differential diagnosis of the two diseases using RNA sequencing (RNA-seq) data from endoscopic biopsy tissue from patients with inflammatory bowel disease (n = 127; CD, 94; UC, 33). Biopsy samples were taken from inflammatory lesions or normal tissues. The RNA-seq dataset was processed by mapping reads to the human reference genome (GRCh38) and quantifying the corresponding gene models, which comprised 19,596 protein-coding genes. An unsupervised learning model showed distinct clusters of four classes: CD inflammatory, CD normal, UC inflammatory, and UC normal. A supervised learning model based on partial least squares discriminant analysis was able to distinguish inflammatory CD from inflammatory UC after pruning the strong classifiers of normal CD vs. normal UC. The error rate was minimal with only two components, which comprised 20 and 50 genes for the first and second components, respectively. The corresponding overall error rate was 0.147. RNA-seq analysis of tissue and the two components revealed in this study may be helpful for distinguishing CD from UC.

Details

Language :
English
ISSN :
2075-4418
Volume :
11
Issue :
12
Database :
MEDLINE
Journal :
Diagnostics (Basel, Switzerland)
Publication Type :
Academic Journal
Accession number :
34943601
Full Text :
https://doi.org/10.3390/diagnostics11122365