Bayesian nonparametric discovery of isoforms and individual specific quantification
- Source :
- Nature Communications, Vol 9, Iss 1, Pp 1-12 (2018)
- Publication Year :
- 2017
Abstract
- Most human protein-coding genes can be transcribed into multiple distinct mRNA isoforms. These alternative splicing patterns encourage molecular diversity, and dysregulation of isoform expression plays an important role in disease etiology. However, isoforms are difficult to characterize from short-read RNA-seq data because they share identical subsequences and occur in different frequencies across tissues and samples. Here, we develop biisq, a Bayesian nonparametric model for isoform discovery and individual specific quantification from short-read RNA-seq data. biisq does not require isoform reference sequences but instead estimates an isoform catalog shared across samples. We use stochastic variational inference for efficient posterior estimates and demonstrate superior precision and recall for simulations compared to state-of-the-art isoform reconstruction methods. biisq shows the most gains for low abundance isoforms, with 36% more isoforms correctly inferred at low coverage versus a multi-sample method and 170% more versus single-sample methods. We estimate isoforms in the GEUVADIS RNA-seq data and validate inferred isoforms by associating genetic variants with isoform ratios.
- Alternative splicing leads to transcript isoform diversity. Here, Aguiar et al. develop biisq, a Bayesian nonparametric approach to discover and quantify isoforms from RNA-seq data.
- Subjects :
- 0301 basic medicine
Gene isoform
Science
genetic processes
General Physics and Astronomy
Inference
Datasets as Topic
Computational biology
Biology
Quantitative Biology - Quantitative Methods
General Biochemistry, Genetics and Molecular Biology
Statistics, Nonparametric
Article
Bayesian nonparametrics
03 medical and health sciences
Exon
0302 clinical medicine
Humans
Protein Isoforms
natural sciences
Computer Simulation
RNA, Messenger
lcsh:Science
Gene
Quantitative Methods (q-bio.QM)
Multidisciplinary
Sequence Analysis, RNA
Gene Expression Profiling
Alternative splicing
Bayes Theorem
General Chemistry
Alternative Splicing
030104 developmental biology
FOS: Biological sciences
RNA splicing
MRNA Isoforms
lcsh:Q
Transcriptome
human activities
030217 neurology & neurosurgery
Software
Details
- ISSN :
- 20411723
- Volume :
- 9
- Issue :
- 1
- Database :
- OpenAIRE
- Journal :
- Nature communications
- Accession number :
- edsair.doi.dedup.....8e21a307b1d3498d6cfd271834105f26