BMix: probabilistic modeling of occurring substitutions in PAR-CLIP data.

Authors :
Golumbeanu, Monica
Mohammadi, Pejman
Beerenwinkel, Niko
Source :
Bioinformatics. 4/1/2016, Vol. 32 Issue 7, p976-983. 8p.
Publication Year :
2016

Abstract

Motivation: Photoactivatable ribonucleoside-enhanced cross-linking and immunoprecipitation (PAR-CLIP) is an experimental method based on next-generation sequencing for identifying the RNA interaction sites of a given protein. The method deliberately inserts T-to-C substitutions at the RNA-protein interaction sites, which provides a second layer of evidence compared with other CLIP methods. However, the experiment includes several sources of noise which cause both low-frequency errors and spurious high-frequency alterations. Therefore, rigorous statistical analysis is required in order to separate true T-to-C base changes, following cross-linking, from noise. So far, most of the existing PAR-CLIP data analysis methods focus on discarding the low-frequency errors and rely on high-frequency substitutions to report binding sites, not taking into account the possibility of high-frequency false positive substitutions.

Results: Here, we introduce BMix, a new probabilistic method which explicitly accounts for the sources of noise in PAR-CLIP data and distinguishes cross-link induced T-to-C substitutions from low- and high-frequency erroneous alterations. We demonstrate the superior speed and accuracy of our method compared with existing approaches on both simulated and real, publicly available human datasets.

Availability and implementation: The model is freely accessible within the BMix toolbox at www.cbg.bsse.ethz.ch/software/BMix, available for Matlab and R.

Supplementary information: Supplementary data are available at Bioinformatics online. [ABSTRACT FROM AUTHOR]

Details

Language :
English
ISSN :
1367-4803
Volume :
32
Issue :
7
Database :
Academic Search Index
Journal :
Bioinformatics
Publication Type :
Academic Journal
Accession number :
114040797
Full Text :
https://doi.org/10.1093/bioinformatics/btv520