
Tetranucleotide usage highlights genomic heterogeneity among mycobacteriophages [version 2; referees: 2 approved]

Authors :
Benjamin Siranosian
Sudheesha Perera
Edward Williams
Chen Ye
Christopher de Graffenried
Peter Shank
Author Affiliations :
1 Center for Computational Molecular Biology, Brown University, Providence, RI, 02912, USA
2 Division of Biology and Medicine, Brown University, Providence, RI, 02912, USA
3 Department of Molecular Microbiology and Immunology, Brown University, Providence, RI, 02912, USA
Source :
F1000Research. 4:36
Publication Year :
2015
Publisher :
London, UK: F1000 Research Limited, 2015.

Abstract

Background: The genomic sequences of mycobacteriophages, phages infecting mycobacterial hosts, are diverse and mosaic. Mycobacteriophages often share little nucleotide similarity, but most of them have been grouped into lettered clusters and further into subclusters. Traditionally, mycobacteriophage genomes are analyzed based on sequence alignment or knowledge of gene content. However, these approaches are computationally expensive and can be ineffective for significantly diverged sequences. As an alternative to alignment-based genome analysis, we evaluated tetranucleotide usage in mycobacteriophage genomes. These methods make it easier to characterize features of the mycobacteriophage population at many scales.

Description: We computed tetranucleotide usage deviation (TUD), the ratio of observed counts of 4-mers in a genome to the expected count under a null model. TUD values are comparable between members of a phage subcluster and distinct between subclusters. With few exceptions, neighbor joining phylogenetic trees and hierarchical clustering dendrograms constructed using TUD values place phages in a monophyletic clade with members of the same subcluster. Regions in a genome with exceptional TUD values can point to interesting features of genomic architecture. Finally, we found that subcluster B3 mycobacteriophages contain significantly overrepresented 4-mers and 6-mers that are atypical of phage genomes.

Conclusions: Statistics based on tetranucleotide usage support established clustering of mycobacteriophages and can uncover interesting relationships within and between sequenced phage genomes. These methods are efficient to compute and do not require sequence alignment or knowledge of gene content. The code to download mycobacteriophage genome sequences and reproduce our analysis is freely available at https://github.com/bsiranosian/tango_final.
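
Illustrative example (not part of the published record): the abstract defines TUD as the ratio of observed 4-mer counts to the count expected under a null model, but does not state the null model or the distance used for clustering. The Python sketch below assumes a zero-order null model (expected count = number of 4-mer windows times the product of the genome's single-nucleotide frequencies), counts on one strand only, and uses Euclidean distance between TUD vectors. The function names `tetranucleotide_usage_deviation` and `tud_distance` are hypothetical and are not taken from the authors' repository.

```python
from collections import Counter
from itertools import product
from math import sqrt

BASES = "ACGT"


def tetranucleotide_usage_deviation(seq, k=4):
    """Return {k-mer: observed / expected} for every k-mer over ACGT.

    ASSUMPTION: zero-order null model, i.e. the expected count is the
    number of k-length windows times the product of the per-base
    frequencies measured on this sequence. Single strand only.
    """
    seq = seq.upper()
    n_windows = len(seq) - k + 1
    if n_windows <= 0:
        raise ValueError("sequence is shorter than k")

    # Per-base frequencies, ignoring ambiguous symbols such as N.
    base_counts = Counter(b for b in seq if b in BASES)
    total = sum(base_counts.values())
    if total == 0:
        raise ValueError("sequence contains no A, C, G or T")
    freqs = {b: base_counts[b] / total for b in BASES}

    # Observed counts over all overlapping windows.
    observed = Counter(seq[i:i + k] for i in range(n_windows))

    tud = {}
    for kmer in map("".join, product(BASES, repeat=k)):
        expected = float(n_windows)
        for b in kmer:
            expected *= freqs[b]
        # A k-mer containing a base absent from the genome has expected 0.
        tud[kmer] = observed[kmer] / expected if expected > 0 else float("nan")
    return tud


def tud_distance(tud_a, tud_b):
    """Euclidean distance between two TUD vectors (ASSUMED metric)."""
    return sqrt(sum((tud_a[key] - tud_b[key]) ** 2 for key in tud_a))


# Toy input, not a real phage genome.
toy = "ACGT" * 50
toy_tud = tetranucleotide_usage_deviation(toy)
```

A matrix of such pairwise distances is the kind of input that neighbor-joining or hierarchical clustering routines accept; the choice of metric here is an assumption, so the authors' repository remains the reference for the published analysis.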

Details

ISSN :
20461402
Volume :
4
Database :
F1000Research
Journal :
F1000Research
Notes :
Revised. Amendments from Version 1: This version addresses the review by Dr. Bonham-Carter. Changes have been made to make the methods section more clear, and I have included an example figure to show the calculation of TUD on a small sequence. The results from the paper remain unchanged. [version 2; referees: 2 approved]
Publication Type :
Academic Journal
Accession number :
edsfor.10.12688.f1000research.6077.2
Document Type :
research-article
Full Text :
https://doi.org/10.12688/f1000research.6077.2