1. CRISPRCasdb a successor of CRISPRdb containing CRISPR arrays and cas genes from complete genome sequences, and tools to download and query lists of repeats and spacers
- Author
Jean-Philippe Vernadet, Christine Pourcel, David Couvin, Marie Touchon, Gilles Vergnaud, Nicolas Villeriot, Claire Toffano-Nioche
- Affiliations
Institut de Biologie Intégrative de la Cellule (I2BC), Commissariat à l'énergie atomique et aux énergies alternatives (CEA)-Université Paris-Saclay-Centre National de la Recherche Scientifique (CNRS), Génomique évolutive des Microbes / Microbial Evolutionary Genomics, Institut Pasteur [Paris] (IP)-Centre National de la Recherche Scientifique (CNRS), Unité Transmission, Réservoir et Diversité des Pathogènes [Pasteur Guadeloupe, France] (TReD-Path), Institut Pasteur de la Guadeloupe, Réseau International des Instituts Pasteur (RIIP)-Réseau International des Instituts Pasteur (RIIP), Institut Français de Bioinformatique (IFB) [ANR-11-INSB-0013]. Funding for open access charge: CNRS., Institut Pasteur [Paris]-Centre National de la Recherche Scientifique (CNRS), Séquence, Structure et Fonction des ARN (SSFA), Département Biologie des Génomes (DBG), Commissariat à l'énergie atomique et aux énergies alternatives (CEA)-Université Paris-Saclay-Centre National de la Recherche Scientifique (CNRS)-Commissariat à l'énergie atomique et aux énergies alternatives (CEA)-Université Paris-Saclay-Centre National de la Recherche Scientifique (CNRS)-Institut de Biologie Intégrative de la Cellule (I2BC), and Commissariat à l'énergie atomique et aux énergies alternatives (CEA)-Université Paris-Saclay-Centre National de la Recherche Scientifique (CNRS)-Commissariat à l'énergie atomique et aux énergies alternatives (CEA)-Université Paris-Saclay-Centre National de la Recherche Scientifique (CNRS)
- Subjects
Transposable element; MESH: CRISPR-Cas Systems; MESH: Genome, Archaeal; [SDV]Life Sciences [q-bio]; CRISPR-Associated Proteins; Computational biology; MESH: Genome, Bacterial; Genome; 03 medical and health sciences; MESH: Software; Plasmid; Phylogenetics; Genome, Archaeal; Databases, Genetic; Genetics; CRISPR; Database Issue; Clustered Regularly Interspaced Short Palindromic Repeats; MESH: Phylogeny; Gene; MESH: Databases, Genetic; Phylogeny; 030304 developmental biology; 0303 health sciences; biology; Bacteria; 030306 microbiology; MESH: CRISPR-Associated Proteins; Palindrome; Prokaryote; biology.organism_classification; [SDV.BIBS]Life Sciences [q-bio]/Quantitative Methods [q-bio.QM]; Archaea; MESH: Bacteria; MESH: Archaea; MESH: Clustered Regularly Interspaced Short Palindromic Repeats; CRISPR-Cas Systems; Genome, Bacterial; Software
- Abstract
In Archaea and Bacteria, the arrays called CRISPRs, for ‘clustered regularly interspaced short palindromic repeats’, and the CRISPR-associated (cas) genes provide adaptive immunity against viruses, plasmids and transposable elements. Short sequences called spacers, corresponding to fragments of invading DNA, are stored between repeated sequences. The CRISPR–Cas systems target sequences homologous to spacers, leading to their degradation. To facilitate investigations of CRISPRs, we developed a website hosting CRISPRdb 12 years ago. We now propose CRISPRCasdb, a completely new version giving access to both CRISPRs and cas genes. To process public whole-genome assemblies, we used CRISPRCasFinder, a program that identifies CRISPR arrays and cas genes and determines the system's type and subtype. Strains are displayed either in an alphabetical list or in taxonomic order. The database is part of the CRISPR-Cas++ website, which also offers the possibility to analyse submitted sequences and to download programs. A BLAST search against lists of repeats and spacers extracted from the database is also available. To date, 16 990 complete prokaryote genomes (16 650 bacteria from 2973 species and 340 archaea from 300 species) are included. CRISPR–Cas systems were found in 36% of bacterial and 75% of archaeal strains. CRISPRCasdb is freely accessible at https://crisprcas.i2bc.paris-saclay.fr/.
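The repeat/spacer layout described in the abstract can be illustrated with a minimal sketch. It assumes a toy array whose repeats are exact copies of one sequence; real arrays often contain degenerate repeats, and this is not how CRISPRCasFinder detects arrays. The function name `extract_spacers` and all sequences are hypothetical, invented for illustration.

```python
from typing import List


def extract_spacers(array_seq: str, repeat: str) -> List[str]:
    """Return the sequences lying between consecutive exact copies of `repeat`.

    Assumption: repeats are identical; degenerate repeats are not handled.
    """
    parts = array_seq.split(repeat)
    # parts[0] and parts[-1] are the flanking sequences outside the array;
    # every piece in between is a spacer.
    return [p for p in parts[1:-1] if p]


# Toy data, invented for illustration (not taken from CRISPRCasdb).
REPEAT = "GTTTTAGAGCTA"
SPACERS = ["ACGTACGTAA", "TTGACCGGTA", "CCATGGATCC"]
ARRAY = "AAAA" + REPEAT + REPEAT.join(SPACERS) + REPEAT + "TTTT"

found = extract_spacers(ARRAY, REPEAT)
print(found)  # ['ACGTACGTAA', 'TTGACCGGTA', 'CCATGGATCC']
```

Spacer lists recovered this way are the kind of sequences that a similarity search such as BLAST would compare against candidate invading DNA.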
- Published
2019