The M5nr: a novel non-redundant database containing protein sequences and annotations from multiple sources and associated tools.

Authors :
Wilke, Andreas
Harrison, Travis
Wilkening, Jared
Field, Dawn
Glass, Elizabeth M.
Kyrpides, Nikos
Mavrommatis, Konstantinos
Meyer, Folker
Source :
BMC Bioinformatics. 2012, Vol. 13 Issue 1, p141-145. 5p. 3 Diagrams.
Publication Year :
2012

Abstract

Background: Computing of sequence similarity results is becoming a limiting factor in metagenome analysis. Sequence similarity search results encoded in an open, exchangeable format have the potential to limit the needs for computational reanalysis of these data sets. A prerequisite for sharing of similarity results is a common reference.

Description: We introduce a mechanism for automatically maintaining a comprehensive, non-redundant protein database and for creating a quarterly release of this resource. In addition, we present tools for translating similarity searches into many annotation namespaces, e.g. KEGG or NCBI's GenBank.

Conclusions: The data and tools we present allow the creation of multiple result sets using a single computation, permitting computational results to be shared between groups for large sequence data sets. [ABSTRACT FROM AUTHOR]

Details

Language :
English
ISSN :
14712105
Volume :
13
Issue :
1
Database :
Academic Search Index
Journal :
BMC Bioinformatics
Publication Type :
Academic Journal
Accession number :
79827159
Full Text :
https://doi.org/10.1186/1471-2105-13-141