iROS-gPseKNC: Predicting replication origin sites in DNA by incorporating dinucleotide position-specific propensity into general pseudo nucleotide composition
- Source :
- Oncotarget
- Publication Year :
- 2016
- Publisher :
- Impact Journals, LLC, 2016.
- Abstract :
- Xuan Xiao 1, 2, 5, Han-Xiao Ye 1, Zi Liu 3, Jian-Hua Jia 1, Kuo-Chen Chou 4, 5
1 Computer Department, Jing-De-Zhen Ceramic Institute, Jing-De-Zhen, 333403, China
2 Information School, ZheJiang Textile and Fashion College, NingBo, 315211, China
3 School of Computer Science and Engineering, Nanjing University of Science and Technology, Nanjing, 210094, China
4 Center of Excellence in Genomic Medicine Research (CEGMR), King Abdulaziz University, Jeddah, 21589, Saudi Arabia
5 Gordon Life Science Institute, Boston, Massachusetts, 02478, USA
Correspondence to: Xuan Xiao, email: xxiao@gordonlifescience.org, jdzxiaoxuan@163.com
Keywords: origin of replication, position-specific dinucleotide propensity, general pseudo nucleotide composition, random forest, iROS-gPseKNC
Received: March 02, 2016
Accepted: April 09, 2016
Published: April 27, 2016
ABSTRACT
DNA replication, which occurs in all living organisms and is the basis for biological inheritance, is the process of producing two identical replicas from one original DNA molecule. To understand this important biological process in depth and to use it in developing new strategies against genetic diseases, knowledge of duplication origin sites in DNA is indispensable. With the explosive growth of DNA sequences emerging in the postgenomic age, it is highly desirable to develop high-throughput tools that identify these regions from sequence information alone. In this paper, by incorporating dinucleotide position-specific propensity information into the general pseudo nucleotide composition and using the random forest classifier, a new predictor called iROS-gPseKNC was proposed. Rigorous cross-validations have indicated that the proposed predictor is significantly better than the best existing method in sensitivity, specificity, overall accuracy, and stability.
Furthermore, a user-friendly web server for iROS-gPseKNC has been established at http://www.jci-bioinfo.cn/iROS-gPseKNC, by which users can easily get their desired results without needing to work through the complicated mathematics, which were presented only for the integrity of the methodology itself.
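The abstract names the feature idea but not its formula. As a rough illustration only, the sketch below uses one common formulation of a dinucleotide position-specific propensity: for each of the 16 dinucleotides and each sequence position, take the frequency in the positive (origin) set minus the frequency in the negative set, then encode a sequence by looking up the value for the dinucleotide it carries at each position. This formulation, the function names, and the toy sequences are all assumptions and are not taken from this record; the resulting vectors would then be passed to a random forest classifier, which is not shown here.

```python
from itertools import product

# The 16 possible dinucleotides over the DNA alphabet.
DINUCLEOTIDES = ["".join(p) for p in product("ACGT", repeat=2)]


def position_frequencies(seqs):
    """Fraction of sequences carrying each dinucleotide at each position.

    Assumes all sequences have the same length (hypothetical helper).
    """
    n_positions = len(seqs[0]) - 1
    freq = {d: [0.0] * n_positions for d in DINUCLEOTIDES}
    for s in seqs:
        for j in range(n_positions):
            d = s[j:j + 2]
            if d in freq:  # skip ambiguous bases such as N
                freq[d][j] += 1.0 / len(seqs)
    return freq


def propensity_table(positives, negatives):
    """Assumed propensity: positive-set frequency minus negative-set frequency."""
    f_pos = position_frequencies(positives)
    f_neg = position_frequencies(negatives)
    return {
        d: [a - b for a, b in zip(f_pos[d], f_neg[d])]
        for d in DINUCLEOTIDES
    }


def encode(seq, table):
    """Map a sequence to one propensity value per dinucleotide position."""
    features = []
    for j in range(len(seq) - 1):
        d = seq[j:j + 2]
        features.append(table[d][j] if d in table else 0.0)
    return features


# Toy data, invented for illustration only.
positives = ["ACGT", "ACGA"]
negatives = ["TTGT", "ACTT"]
table = propensity_table(positives, negatives)
vector = encode("ACGT", table)
```

In practice the table would be estimated from the training folds only, so that the held-out sequences in a cross-validation do not leak into the features.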
- Subjects :
- 0301 basic medicine
Center of excellence
position-specific dinucleotide propensity
Replication Origin
Computational biology
general pseudo nucleotide composition
Biology
Origin of replication
DNA sequencing
03 medical and health sciences
chemistry.chemical_compound
Replication (statistics)
Animals
Humans
Genetics
iROS-gPseKNC
DNA replication
Inheritance (genetic algorithm)
Computational Biology
High-Throughput Nucleotide Sequencing
Random forest
origin of replication
030104 developmental biology
Oncology
chemistry
Databases, Nucleic Acid
Software
random forest
DNA
Research Paper
Details
- ISSN :
- 19492553
- Volume :
- 7
- Database :
- OpenAIRE
- Journal :
- Oncotarget
- Accession number :
- edsair.doi.dedup.....1c390b9c01b25643ccdaee62675227ab