Whole metagenome analysis with metagWGS

Authors :
Fourquet, Joanna
Noirot, Céline
Klopp, Christophe
Pinton, Philippe
Combes, Sylvie
Hoede, Claire
Pascal, Géraldine
Institut National de Recherche pour l’Agriculture, l’Alimentation et l’Environnement (INRAE)
Source :
JOBIM2020, JOBIM2020, Jun 2020, Montpellier, France
Publication Year :
2020
Publisher :
HAL CCSD, 2020.

Abstract

Whole DNA shotgun sequencing of environmental samples makes it possible to study their taxonomic composition and functional profiles. However, the path from sample collection through sequencing to bioinformatics analysis remains difficult [1].

We are developing a complete, scalable, easy-to-use and reproducible workflow, MetagWGS, built with Nextflow [2] and Singularity [3], that processes short Illumina reads from shotgun metagenomics data. It delivers (i) contig assemblies, (ii) syntactic and functional annotations of genes, (iii) taxonomic affiliations of reads and contigs, (iv) a count table of reads per gene and (v) contig binning to obtain metagenome species.

The workflow begins with preprocessing steps that remove adapters, low-quality reads and host reads. We check read quality with FastQC [4]. Reads are classified taxonomically with Kaiju [5] to give a first overview of the reads. The assembly step uses metaSPAdes [6] or MEGAHIT [7] to generate contigs for each sample. These contigs are annotated with Prokka [8]. Then, with CD-HIT [9], we remove redundancy and build a gene catalog by clustering ORFs within each sample and then across all samples, using a 95% sequence identity cutoff. We map reads back to contigs and use featureCounts [10] to count the reads overlapping annotated genes. The raw count table gives the number of reads aligned to each gene in each sample. We use DIAMOND [11] against the nr database for the taxonomic affiliation of contigs. Contig binning relies on processes from the nf-core/mag pipeline. A single result report is generated with MultiQC [12].

MetagWGS is available at https://forgemia.inra.fr/genotoul-bioinfo/metagwgs. We will apply it to sequences from the ExpoMycoPig project, which studies the gut microbiota of pigs exposed to mycotoxins [13].
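
The raw count table mentioned in the abstract (one row per gene, one column per sample) can be illustrated with a minimal sketch. This is not the MetagWGS implementation: the function name `build_count_table`, the input layout and the sample and gene names are all hypothetical, and the sketch assumes per-sample read counts have already been parsed into dictionaries.

```python
def build_count_table(per_sample_counts):
    """Merge per-sample gene counts into one gene-by-sample table.

    per_sample_counts: {sample_name: {gene_id: read_count}} (hypothetical layout).
    Returns (samples, table) where table maps gene_id to a list of counts,
    ordered like `samples`. Genes absent from a sample get a count of 0.
    """
    samples = sorted(per_sample_counts)
    genes = sorted({g for counts in per_sample_counts.values() for g in counts})
    table = {
        gene: [per_sample_counts[s].get(gene, 0) for s in samples]
        for gene in genes
    }
    return samples, table


# Illustrative, made-up counts for two samples.
example = {
    "sampleA": {"gene_1": 12, "gene_2": 0, "gene_3": 7},
    "sampleB": {"gene_1": 3, "gene_3": 20, "gene_4": 5},
}

samples, table = build_count_table(example)

# Print a tab-separated table: header row, then one row per gene.
print("\t".join(["gene"] + samples))
for gene, row in table.items():
    print("\t".join([gene] + [str(c) for c in row]))
```

Filling missing genes with zero keeps every row the same length, which is what downstream tools that expect a rectangular matrix would typically need.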

Details

Language :
English
Database :
OpenAIRE
Journal :
JOBIM2020, JOBIM2020, Jun 2020, Montpellier, France
Accession number :
edsair.dedup.wf.001..d4f6fe8365c8c044874a02e7acaa277f