
A Machine Learning Approach to Identifying Causal Monogenic Variants in Inflammatory Bowel Disease.

Authors :
Mulder DJ
Khalouei S
Li M
Warner N
Gonzaga-Jauregui C
Benchimol EI
Church PC
Walters TD
Ramani AK
Griffiths AM
Ricciuto A
Muise AM
Source :
Gastro hep advances [Gastro Hep Adv] 2022 Feb 03; Vol. 1 (2), pp. 171-179. Date of Electronic Publication: 2022 Feb 03 (Print Publication: 2022).
Publication Year :
2022

Abstract

Background and Aims: Diagnosis of monogenic disease is increasingly important for patient care and personalizing therapy. However, the current process is nonstandardized, expensive, and time consuming. There is currently no accepted strategy to help identify disease-causing variants in monogenic inflammatory bowel disease (IBD). The aim of the study is to develop a prioritization strategy for monogenic IBD variant discovery through detailed analysis of a whole-exome sequencing (WES) data set.

Methods: All consenting pediatric patients with IBD presenting to our tertiary care hospital during the study period were enrolled and underwent WES (n = 1005). Available family members also underwent WES. Variants were analyzed en masse using the GEMINI framework and were further annotated using data from dbNSFP, Combined Annotation Dependent Depletion, and gnomAD. Known disease-causing variants (n = 36) were used as positive controls. Machine learning algorithms were optimized and then compared to assist with identifying monogenic IBD case characteristics.

Results: Initial gene-level analysis identified 11 genes not previously linked to IBD that could potentially harbor IBD-causing variants. Machine learning algorithms identified 4 primary variant characteristics (Combined Annotation Dependent Depletion score, dbNSFP score, relationship with a known immunodeficiency gene, and alternate allele frequency), and optimal threshold values for each were determined to assist with identifying monogenic IBD variants. Based on these characteristics, an automated variant prioritization pipeline was then created that filters and prioritizes variants from >100,000 variants per patient down to a mean of 15. This pipeline is available online for all to use.

Conclusion: Leveraging a large WES data set, we demonstrate a statistically rigorous strategy for prioritization of variants for monogenic IBD diagnosis.

(© 2022 The Authors.)
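
As a rough illustration of how a filter built on the four variant characteristics named in the Results might be structured, the Python sketch below keeps variants that pass three numeric cutoffs and ranks those in a known immunodeficiency gene first. It is not the authors' pipeline. The abstract does not report the threshold values, so the cutoffs here are placeholders chosen only for illustration. The `Variant` fields, the `prioritize` function, the treatment of the dbNSFP annotation as a single 0 to 1 score, and the use of immunodeficiency-gene membership for ranking rather than filtering are all assumptions.

```python
from dataclasses import dataclass
from typing import Iterable, List, Set


@dataclass(frozen=True)
class Variant:
    """Hypothetical per-variant record; field names are illustrative only."""
    gene: str
    cadd_phred: float       # CADD PHRED-scaled score
    dbnsfp_score: float     # assumed: one dbNSFP-derived score in [0, 1]
    alt_allele_freq: float  # alternate allele frequency (e.g. from gnomAD)


# Placeholder cutoffs. These are NOT the thresholds determined in the study.
CADD_MIN = 20.0
DBNSFP_MIN = 0.5
MAX_ALT_ALLELE_FREQ = 0.01


def prioritize(
    variants: Iterable[Variant],
    immunodeficiency_genes: Set[str],
    cadd_min: float = CADD_MIN,
    dbnsfp_min: float = DBNSFP_MIN,
    max_aaf: float = MAX_ALT_ALLELE_FREQ,
) -> List[Variant]:
    """Filter on three numeric cutoffs, then rank the survivors.

    Ranking (an assumption): variants in a known immunodeficiency gene
    come first, then higher CADD scores before lower ones.
    """
    kept = [
        v for v in variants
        if v.cadd_phred >= cadd_min
        and v.dbnsfp_score >= dbnsfp_min
        and v.alt_allele_freq <= max_aaf
    ]
    return sorted(
        kept,
        key=lambda v: (v.gene not in immunodeficiency_genes, -v.cadd_phred),
    )


if __name__ == "__main__":
    # Made-up variants with placeholder gene names, for demonstration only.
    demo = [
        Variant("GENE_B", 35.0, 0.8, 0.0005),
        Variant("GENE_A", 30.0, 0.9, 0.0001),
        Variant("GENE_C", 10.0, 0.9, 0.0001),  # fails the CADD cutoff
        Variant("GENE_D", 30.0, 0.9, 0.2),     # too common in the population
    ]
    for v in prioritize(demo, immunodeficiency_genes={"GENE_A"}):
        print(v.gene, v.cadd_phred)
```

Passing the cutoffs as parameters keeps the sketch usable with whatever threshold values the published pipeline actually specifies.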

Details

Language :
English
ISSN :
2772-5723
Volume :
1
Issue :
2
Database :
MEDLINE
Journal :
Gastro hep advances
Publication Type :
Academic Journal
Accession number :
39131125
Full Text :
https://doi.org/10.1016/j.gastha.2021.11.002