Accurate annotation of protein-coding genes in mitochondrial genomes

Authors :
Al-Arab, Marwa
Höner zu Siederdissen, Christian
Tout, Kifah R.
Sahyoun, Abdullah H.
Stadler, Peter F.
Bernt, Matthias
Publication Year :
2017

Abstract

Mitochondrial genome sequences are available in large numbers, and new sequences are being published at an increasing pace. Fast, automatic, consistent, and high-quality annotations are a prerequisite for downstream analyses. We therefore present an automated pipeline for fast de novo annotation of mitochondrial protein-coding genes. The annotation is based on enhanced phylogeny-aware hidden Markov models (HMMs). Using an approximation of the phylogeny, the pipeline builds taxon-specific enhanced multiple sequence alignments (MSAs) from already annotated sequences, together with the corresponding HMMs. The MSAs are enhanced by fixing unannotated frameshifts, purging wrong sequences, and removing non-conserved columns from both ends. A comparison with reference annotations highlights the high quality of the results. The frameshift correction method predicts a large number of frameshifts, many of which were previously unknown. A detailed analysis of the frameshifts in nad3 of the Archosauria-Testudines group has been conducted.
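
One of the enhancement steps named in the abstract, removing non-conserved columns from both ends of an MSA, can be illustrated with a minimal Python sketch. This is not the authors' implementation: the conservation measure (fraction of rows sharing the most common non-gap residue), the threshold `min_conservation`, and the function names are all assumptions chosen for illustration.

```python
from collections import Counter


def column_conservation(column):
    """Fraction of rows sharing the most common non-gap residue.

    Assumed measure for illustration only; gaps ("-") count against
    conservation because the denominator is the full column height.
    """
    residues = [c for c in column if c != "-"]
    if not residues:
        return 0.0
    most_common_count = Counter(residues).most_common(1)[0][1]
    return most_common_count / len(column)


def trim_alignment_ends(msa, min_conservation=0.5):
    """Drop columns from both ends until a conserved column is reached.

    `msa` is a list of equal-length aligned strings. Interior columns are
    kept even if poorly conserved; only the two flanks are trimmed.
    `min_conservation` is a hypothetical threshold.
    """
    if not msa:
        return []
    length = len(msa[0])
    if any(len(row) != length for row in msa):
        raise ValueError("all alignment rows must have equal length")

    scores = [
        column_conservation([row[i] for row in msa]) for i in range(length)
    ]
    conserved = [i for i, s in enumerate(scores) if s >= min_conservation]
    if not conserved:
        # No column passes the threshold: nothing survives trimming.
        return ["" for _ in msa]

    start, end = conserved[0], conserved[-1] + 1
    return [row[start:end] for row in msa]


if __name__ == "__main__":
    toy_msa = ["-MKLA-", "AMKLAG", "-MKLV-", "-MRLA-"]
    for row in trim_alignment_ends(toy_msa):
        print(row)
```

In the toy alignment, the first and last columns contain a residue in only one of four rows, so they fall below the assumed threshold and are trimmed, while the less-than-perfect interior columns are retained.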

Details

Language :
English
Database :
OpenAIRE
Accession number :
edsair.od.......610..6210965315e391fa44801fb4aaf20bc4