DAIRYdb: a manually curated reference database for improved taxonomy annotation of 16S rRNA gene sequences from dairy products

Authors :
Shani, Noam
Delbes, Céline
Berthoud, Hélène
Chassard, Christophe
Meola, Marco
Rifa, Etienne
Source :
5. Colloque de Génomique Environnementale, La Rochelle, FRA, 2019-10-08-2019-10-10
Publication Year :
2019

Abstract

New sequencing technologies have enabled the development of methods such as metabarcoding. These methods use the 16S rRNA gene, which is ubiquitous in the bacterial domain, as a biomarker to study bacterial communities in samples from complex environments. The bioinformatic pipelines used to handle this type of data are well established, but taxonomic assignment remains a critical step. Misannotations are caused by short sequence lengths and high sequence identity between bacterial species. Indeed, current technologies allow sequencing of only two of the nine hypervariable regions of the 16S rRNA gene. The available generalist databases (Greengenes, SILVA) are not accurate enough to assign sequences at the species rank. A curated reference database dedicated to the studied environment is therefore needed. Here we introduce DAIRYdb, a manually curated database composed of full-length 16S rRNA gene sequences from samples of dairy products and related environments (cheese, milk, teat surface, starter, whey). DAIRYdb was constructed from 16S rRNA sequences deposited in EMBL and NCBI. After bioinformatic processing and automatic and manual curation steps, DAIRYdb comprises 10,290 complete 16S sequences. It shows higher assignment accuracy than the other databases at all taxonomic ranks and with every assignment tool tested. Depending on the variable region used, up to 90% of the tested sequences are reassigned to the species level with DAIRYdb. DAIRYdb significantly improves taxonomic assignment accuracy for dairy environmental microbiome studies. It is available in a public repository (https://forgemia.inra.fr/umrf/dairydb).

Details

Language :
English
Database :
OpenAIRE
Journal :
5. Colloque de Génomique Environnementale, La Rochelle, FRA, 2019-10-08-2019-10-10
Accession number :
edsair.od......1582..6f842c82bceea4f28046c8c11fa313ba