Detecting DNA modifications from SMRT sequencing data by modeling sequence context dependence of polymerase kinetic
- Source :
- PLoS Computational Biology, Vol 9, Iss 3, p e1002935 (2013)
- Publication Year :
- 2013
- Publisher :
- Public Library of Science (PLoS), 2013.
Abstract
- DNA modifications such as methylation and DNA damage can play critical regulatory roles in biological systems. Single molecule, real time (SMRT) sequencing technology generates DNA sequences as well as DNA polymerase kinetic information that can be used for the direct detection of DNA modifications. We demonstrate that local sequence context has a strong impact on DNA polymerase kinetics in the neighborhood of the incorporation site during the DNA synthesis reaction. This makes it possible to estimate the expected kinetic rate of the enzyme at the incorporation site using kinetic rate information collected from existing SMRT sequencing data (historical data) covering the same local sequence contexts of interest. We develop an empirical Bayesian hierarchical model for incorporating historical data. Our results show that the model can greatly increase DNA modification detection accuracy and reduce the control data coverage required. For some DNA modifications that have a strong signal, a control sample is not even needed when historical data are used as an alternative to the control. Thus, sequencing costs can be greatly reduced by using the model. We implemented the model in an R package named seqPatch, which is available at https://github.com/zhixingfeng/seqPatch.
- Author Summary: DNA modifications have been found in a wide range of living organisms, from bacteria to humans. Many existing studies have shown that they play important roles in development, disease, bacterial virulence, etc. However, for many types of DNA modification, for example N6-methyladenine and 8-oxoG, there is no efficient and accurate detection method. Single molecule real time (SMRT) sequencing generates not only DNA sequences but also DNA polymerase kinetic information. The kinetic information is sensitive to DNA modifications in the sequenced DNA template and can therefore be used to detect a wide range of DNA modification types. The usual detection strategy is a case-control method, which compares kinetic information between a native sample and a control sample whose modifications have been removed. However, generating a control sample doubles the cost. We propose a hierarchical model that can incorporate existing SMRT sequencing data to increase detection accuracy and to reduce the coverage required of the control sample, or even avoid the need for a control sample in some cases. We tested our method on SMRT sequencing data from plasmids with known modified sites and from the E. coli K-12 strain to demonstrate that it can greatly increase detection accuracy and reduce sequencing cost.
- Subjects :
- Epigenomics
DNA, Bacterial
DNA polymerase
DNA damage
Context (language use)
Computational biology
DNA-Directed DNA Polymerase
Genome Complexity
DNA sequencing
Cellular and Molecular Neuroscience
chemistry.chemical_compound
Genetics
Escherichia coli
Genome Sequencing
Molecular Biology
Biology
lcsh:QH301-705.5
Ecology, Evolution, Behavior and Systematics
Polymerase
Ecology
biology
Models, Genetic
Computational Biology
Bayes Theorem
Genomics
Sequence Analysis, DNA
DNA Methylation
DNA extraction
Functional Genomics
Kinetics
Computational Theory and Mathematics
chemistry
lcsh:Biology (General)
Modeling and Simulation
biology.protein
Nucleic Acid Conformation
Sequence Analysis
DNA
Single molecule real time sequencing
Research Article
Details
- Language :
- English
- ISSN :
- 1553-7358
- Volume :
- 9
- Issue :
- 3
- Database :
- OpenAIRE
- Journal :
- PLoS Computational Biology
- Accession number :
- edsair.doi.dedup.....b42bf16ce0f95584479a3c858a0617c8