A catalog of reference genomes from the human microbiome.

Authors :
Nelson KE
Weinstock GM
Highlander SK
Worley KC
Creasy HH
Wortman JR
Rusch DB
Mitreva M
Sodergren E
Chinwalla AT
Feldgarden M
Gevers D
Haas BJ
Madupu R
Ward DV
Birren BW
Gibbs RA
Methe B
Petrosino JF
Strausberg RL
Sutton GG
White OR
Wilson RK
Durkin S
Giglio MG
Gujja S
Howarth C
Kodira CD
Kyrpides N
Mehta T
Muzny DM
Pearson M
Pepin K
Pati A
Qin X
Yandava C
Zeng Q
Zhang L
Berlin AM
Chen L
Hepburn TA
Johnson J
McCorrison J
Miller J
Minx P
Nusbaum C
Russ C
Sykes SM
Tomlinson CM
Young S
Warren WC
Badger J
Crabtree J
Markowitz VM
Orvis J
Cree A
Ferriera S
Fulton LL
Fulton RS
Gillis M
Hemphill LD
Joshi V
Kovar C
Torralba M
Wetterstrand KA
Abouelleil A
Wollam AM
Buhay CJ
Ding Y
Dugan S
FitzGerald MG
Holder M
Hostetler J
Clifton SW
Allen-Vercoe E
Earl AM
Farmer CN
Liolios K
Surette MG
Xu Q
Pohl C
Wilczek-Boney K
Zhu D
Source :
Science (New York, N.Y.) [Science] 2010 May 21; Vol. 328 (5981), pp. 994-9.
Publication Year :
2010

Abstract

The human microbiome refers to the community of microorganisms, including prokaryotes, viruses, and microbial eukaryotes, that populate the human body. The National Institutes of Health launched an initiative that focuses on describing the diversity of microbial species that are associated with health and disease. The first phase of this initiative includes the sequencing of hundreds of microbial reference genomes, coupled to metagenomic sequencing from multiple body sites. Here we present results from an initial reference genome sequencing of 178 microbial genomes. From 547,968 predicted polypeptides that correspond to the gene complement of these strains, previously unidentified ("novel") polypeptides that had both unmasked sequence length greater than 100 amino acids and no BLASTP match to any nonreference entry in the nonredundant subset were defined. This analysis resulted in a set of 30,867 polypeptides, of which 29,987 (approximately 97%) were unique. In addition, this set of microbial genomes allows for approximately 40% of random sequences from the microbiome of the gastrointestinal tract to be associated with organisms based on the match criteria used. Insights into pan-genome analysis suggest that we are still far from saturating microbial species genetic data sets. In addition, the associated metrics and standards used by our group for quality assurance are presented.
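The novelty criteria and the counts quoted in the abstract can be restated as a minimal Python sketch. It is an illustration only, not the authors' pipeline: the `Polypeptide` record, its field names, and the `is_novel` helper are hypothetical, and the BLASTP result is assumed to be precomputed and supplied as a boolean. Only the length threshold of 100 amino acids and the three counts are taken from the abstract.

```python
from dataclasses import dataclass


@dataclass
class Polypeptide:
    """Hypothetical record for one predicted polypeptide."""
    identifier: str
    unmasked_length: int    # amino acids left after sequence masking
    has_blastp_match: bool  # assumed precomputed: hit to a nonreference entry


# Threshold stated in the abstract: unmasked length greater than 100 amino acids.
MIN_UNMASKED_LENGTH = 100


def is_novel(p: Polypeptide) -> bool:
    """Apply both criteria from the abstract: long enough and no BLASTP match."""
    return p.unmasked_length > MIN_UNMASKED_LENGTH and not p.has_blastp_match


def percent(part: int, whole: int) -> float:
    """Return part as a percentage of whole."""
    return 100.0 * part / whole


# Counts reported in the abstract.
TOTAL_PREDICTED = 547_968  # predicted polypeptides across the 178 genomes
NOVEL = 30_867             # polypeptides meeting the novelty criteria
UNIQUE_NOVEL = 29_987      # novel polypeptides that were unique

unique_share = percent(UNIQUE_NOVEL, NOVEL)    # about 97.1, the "approximately 97%"
novel_share = percent(NOVEL, TOTAL_PREDICTED)  # about 5.6 of all predictions
```

Under these assumptions, a 100-residue polypeptide fails the filter because the threshold is strict, and the novel set amounts to roughly 5.6% of all predicted polypeptides, a ratio that follows from the reported counts but is not itself stated in the abstract.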

Details

Language :
English
ISSN :
1095-9203
Volume :
328
Issue :
5981
Database :
MEDLINE
Journal :
Science (New York, N.Y.)
Publication Type :
Academic Journal
Accession number :
20489017
Full Text :
https://doi.org/10.1126/science.1183605