RFPlasmid: predicting plasmid sequences from short-read assembly data using machine learning.

Authors :
van der Graaf-van Bloois L
Wagenaar JA
Zomer AL
Source :
Microbial genomics [Microb Genom] 2021 Nov; Vol. 7 (11).
Publication Year :
2021

Abstract

Antimicrobial-resistance (AMR) genes in bacteria are often carried on plasmids and these plasmids can transfer AMR genes between bacteria. For molecular epidemiology purposes and risk assessment, it is important to know whether the genes are located on highly transferable plasmids or in the more stable chromosomes. However, draft whole-genome sequences are fragmented, making it difficult to discriminate plasmid and chromosomal contigs. Current methods that predict plasmid sequences from draft genome sequences rely on single features, like k-mer composition, circularity of the DNA molecule, copy number or sequence identity to plasmid replication genes, all of which have their drawbacks, especially when faced with large single-copy plasmids, which often carry resistance genes. With our newly developed prediction tool RFPlasmid, we use a combination of multiple features, including k-mer composition and databases with plasmid and chromosomal marker proteins, to predict whether the likely source of a contig is plasmid or chromosomal. The tool RFPlasmid supports models for 17 different bacterial taxa, including Campylobacter, Escherichia coli and Salmonella, and has a taxon-agnostic model for metagenomic assemblies or unsupported organisms. RFPlasmid is available both as a standalone tool and via a web interface.
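
As an illustration of the k-mer composition feature mentioned in the abstract, the following minimal Python sketch turns a contig into a fixed-length vector of relative k-mer frequencies. It is not taken from RFPlasmid: the function name `kmer_frequencies`, the choice of k, the handling of ambiguous bases and the toy contig are all assumptions made for this example.

```python
from collections import Counter
from itertools import product


def kmer_frequencies(sequence, k=4):
    """Return the relative frequency of every DNA k-mer in a sequence.

    Hypothetical helper for illustration only; not RFPlasmid's code.
    """
    sequence = sequence.upper()
    valid = set("ACGT")
    # Count overlapping windows, skipping any that contain ambiguous bases.
    counts = Counter(
        sequence[i:i + k]
        for i in range(len(sequence) - k + 1)
        if set(sequence[i:i + k]) <= valid
    )
    total = sum(counts.values())
    # Enumerate all 4**k k-mers in a fixed order so that every contig
    # yields a vector of the same length, whatever its content.
    all_kmers = ("".join(p) for p in product("ACGT", repeat=k))
    return {kmer: (counts[kmer] / total if total else 0.0) for kmer in all_kmers}


contig = "ATGCGATACGCTTGAGGCTA"  # hypothetical toy contig
features = kmer_frequencies(contig, k=2)
```

A vector like this could serve as one group of inputs to a classifier, alongside other evidence such as hits against marker-protein databases. How RFPlasmid actually builds and combines its features is described in the full text.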

Details

Language :
English
ISSN :
2057-5858
Volume :
7
Issue :
11
Database :
MEDLINE
Journal :
Microbial genomics
Publication Type :
Academic Journal
Accession number :
34846288
Full Text :
https://doi.org/10.1099/mgen.0.000683