Gand, Mathieu; Navickaite, Indre; Bartsch, Lee-Julia; Grützke, Josephine; Overballe-Petersen, Søren; Rasmussen, Astrid; Otani, Saria; Michelacci, Valeria; Rodríguez Matamoros, Bosco; González-Zorn, Bruno; Brouwer, Michael S. M.; Di Marcantonio, Lisa; Bloemen, Bram; Vanneste, Kevin; Roosens, Nancy H. C. J.; AbuOun, Manal; and De Keersmaecker, Sigrid C. J.
Metagenomic sequencing is a promising method with the potential to revolutionize pathogen detection and antimicrobial resistance (AMR) surveillance in food-producing environments. However, analyzing the large volume of data obtained requires high-performing bioinformatics tools and databases whose output is intuitive and straightforward to interpret. In this study, based on long-read metagenomics data of chicken fecal samples with a spike-in mock community, we proposed confidence levels for taxonomic identification and AMR gene detection, with interpretation guidelines, to help with the analysis of the output data generated by KMA, a popular k-mer read alignment tool. Additionally, we demonstrated that the completeness and diversity of the genomes present in the reference databases are key parameters for accurate and easy interpretation of the sequencing data. Finally, we explored whether KMA, in a two-step procedure, can be used to link the detected AMR genes to their bacterial host chromosome when both are detected within the same long reads. The confidence levels were successfully tested on 28 metagenomics datasets obtained by sequencing real and spiked samples of fecal (chicken, pig, and buffalo) or food (minced beef and food enzyme products) origin. The methodology proposed in this study will facilitate the analysis of metagenomics sequencing datasets for KMA users. Ultimately, this will contribute to improvements in the rapid diagnosis and surveillance of pathogens and AMR genes in food-producing environments, as prioritized by the EU. [ABSTRACT FROM AUTHOR]