1. Rapid and sensitive detection of genome contamination at scale with FCS-GX
- Author
- Alexander Astashyn, Eric S. Tvedte, Deacon Sweeney, Victor Sapojnikov, Nathan Bouk, Victor Joukov, Eyal Mozes, Pooja K. Strope, Pape M. Sylla, Lukas Wagner, Shelby L. Bidwell, Larissa C. Brown, Karen Clark, Emily W. Davis, Brian Smith-White, Wratko Hlavina, Kim D. Pruitt, Valerie A. Schneider, and Terence D. Murphy
- Subjects
- Genome contamination, Genome quality, Genome assembly, GenBank, RefSeq, Software, Biology (General), QH301-705.5, Genetics, QH426-470
- Abstract
- Assembled genome sequences are being generated at an exponential rate. Here we present FCS-GX, part of NCBI’s Foreign Contamination Screen (FCS) tool suite, optimized to identify and remove contaminant sequences in new genomes. FCS-GX screens most genomes in 0.1–10 min. Testing FCS-GX on artificially fragmented genomes demonstrates high sensitivity and specificity for diverse contaminant species. We used FCS-GX to screen 1.6 million GenBank assemblies and identified 36.8 Gbp of contamination, comprising 0.16% of total bases, with half from 161 assemblies. We updated assemblies in NCBI RefSeq to reduce detected contamination to 0.01% of bases. FCS-GX is available at https://github.com/ncbi/fcs/ or https://doi.org/10.5281/zenodo.10651084.
- Published
- 2024