1. LeafCutterMD: an algorithm for outlier splicing detection in rare diseases
- Author
- Yang I. Li, Eric W. Klee, Garrett Jenkinson, Gavin R. Oliver, Shubham Basu, and Margot A. Cousin
- Subjects
Statistics and Probability, AcademicSubjects/SCI01060, Computer science, RNA Splicing, Gene Expression, Context (language use), Computational biology, Biochemistry, Genome, Manual curation, 03 medical and health sciences, symbols.namesake, 0302 clinical medicine, Rare Diseases, Humans, Aberrant splicing, Molecular Biology, Gene, Exome sequencing, 030304 developmental biology, 0303 health sciences, business.industry, Sequence Analysis, RNA, Spliced Genes, RNA, High-Throughput Nucleotide Sequencing, Original Papers, Computer Science Applications, Computational Mathematics, Computational Theory and Mathematics, RNA splicing, Outlier, Mendelian inheritance, symbols, Personalized medicine, business, 030217 neurology & neurosurgery, Algorithms, Software, Rare disease
- Abstract
Motivation: Next-generation sequencing is rapidly improving diagnostic rates in rare Mendelian diseases, but even with whole genome or whole exome sequencing, the majority of cases remain unsolved. Increasingly, RNA sequencing is being used to solve many cases that evade diagnosis through sequencing alone. Specifically, the detection of aberrant splicing in many rare disease patients suggests that identifying RNA splicing outliers is particularly useful for determining causal Mendelian disease genes. However, there is as yet a paucity of statistical methodologies to detect splicing outliers.
Results: We developed LeafCutterMD, a new statistical framework that significantly improves the previously published LeafCutter in the context of detecting outlier splicing events. Through simulations and analysis of real patient data, we demonstrate that LeafCutterMD has better power than the state-of-the-art methodology while controlling false-positive rates. When applied to a cohort of disease-affected probands from the Mayo Clinic Center for Individualized Medicine, LeafCutterMD recovered all aberrantly spliced genes that had previously been identified by manual curation efforts.
Availability and implementation: The source code for this method is available under the open-source Apache 2.0 license in the latest release of the LeafCutter software package, available online at http://davidaknowles.github.io/leafcutter.
Supplementary information: Supplementary data are available at Bioinformatics online.
- Published
- 2020