REPARATION: ribosome profiling assisted (re-)annotation of bacterial genomes
- Source :
- Nucleic Acids Research
- Publication Year :
- 2017
- Publisher :
- Oxford University Press (OUP), 2017.
Abstract
- Prokaryotic genome annotation is highly dependent on automated methods, as manual curation cannot keep up with the exponential growth of sequenced genomes. Current automated methods depend heavily on sequence composition and often underestimate the complexity of the proteome. We developed RibosomeE Profiling Assisted (re-)AnnotaTION (REPARATION), a de novo machine learning algorithm that takes advantage of experimental protein synthesis evidence from ribosome profiling (Ribo-seq) to delineate translated open reading frames (ORFs) in bacteria, independent of genome annotation (https://github.com/Biobix/REPARATION). REPARATION evaluates all possible ORFs in the genome and estimates minimum thresholds based on a growth curve model to screen for spurious ORFs. We applied REPARATION to three annotated bacterial species to obtain a more comprehensive mapping of their translation landscape in support of experimental data. In all cases, we identified hundreds of novel (small) ORFs including variants of previously annotated ORFs and >70% of all (variants of) annotated protein coding ORFs were predicted by REPARATION to be translated. Our predictions are supported by matching mass spectrometry proteomics data, sequence composition and conservation analysis. REPARATION is unique in that it makes use of experimental translation evidence to intrinsically perform a de novo ORF delineation in bacterial genomes irrespective of the sequence features linked to open reading frames.
- Subjects :
- Salmonella typhimurium
0301 basic medicine
PROTEINS
PSEUDOGENES
Pseudogene
Computational biology
Bacterial genome size
Biology
BINDING-SITES
Genome
Machine Learning
Open Reading Frames
03 medical and health sciences
Annotation
REVEALS
Genetics
Ribosome profiling
ORFS
MICROBIAL GENE IDENTIFICATION
IN-VIVO
COMPLEXITY
Escherichia coli K12
Chromosome Mapping
Biology and Life Sciences
Molecular Sequence Annotation
Genome project
NUCLEOTIDE RESOLUTION
SMALL ORFS
Open reading frame
030104 developmental biology
Proteome
Methods Online
TRANSLATION
Ribosomes
Algorithms
Genome, Bacterial
Bacillus subtilis
Details
- ISSN :
- 1362-4962 and 0305-1048
- Volume :
- 45
- Database :
- OpenAIRE
- Journal :
- Nucleic Acids Research
- Accession number :
- edsair.doi.dedup.....085b88cab635c9b13fb01860548553a6