Thierry Tonon, Patrick Wincker, Corinne Da Silva, Tristan Barbeyron, Tsinda Rukwavu, Jeremy Szymczak, Benjamin Noel, Ruibo Cai, Stephane Rombauts, Adriana Alberti, Erwan Corre, Catharina Alves-de-Souza, Ehsan Kayal, Laure Guillou, Florian Maumus, Karine Labadie, Estelle Bigeard, Pierre Rouzé, Benjamin Istace, Sarah Farhat, Jean-Marc Aury, Phuong Le, Yves Van de Peer, Isabelle Florent, Dominique Marie, Betina M. Porcel, Jonathan Mercier, Génomique métabolique (UMR 8030), Genoscope - Centre national de séquençage [Evry] (GENOSCOPE), Université Paris-Saclay-Direction de Recherche Fondamentale (CEA) (DRF (CEA)), Commissariat à l'énergie atomique et aux énergies alternatives (CEA)-Commissariat à l'énergie atomique et aux énergies alternatives (CEA)-Université Paris-Saclay-Direction de Recherche Fondamentale (CEA) (DRF (CEA)), Commissariat à l'énergie atomique et aux énergies alternatives (CEA)-Commissariat à l'énergie atomique et aux énergies alternatives (CEA)-Centre National de la Recherche Scientifique (CNRS)-Université d'Évry-Val-d'Essonne (UEVE), School of Marine and Atmospheric Sciences [Stony Brook] (SoMAS), Stony Brook University [SUNY] (SBU), State University of New York (SUNY)-State University of New York (SUNY), Center for Plant Systems Biology (PSB Center), Vlaams Instituut voor Biotechnologie [Ghent, Belgique] (VIB), ABiMS - Informatique et bioinformatique = Analysis and Bioinformatics for Marine Science (FR2424), Station biologique de Roscoff (SBR), Sorbonne Université (SU)-Centre National de la Recherche Scientifique (CNRS)-Sorbonne Université (SU)-Centre National de la Recherche Scientifique (CNRS), ECOlogy of MArine Plankton (ECOMAP), Adaptation et diversité en milieu marin (AD2M), Sorbonne Université (SU)-Centre National de la Recherche Scientifique (CNRS)-Sorbonne Université (SU)-Centre National de la Recherche Scientifique (CNRS)-Centre National de la Recherche Scientifique (CNRS)-Station biologique de Roscoff (SBR), Sorbonne Université (SU)-Centre 
National de la Recherche Scientifique (CNRS)-Sorbonne Université (SU)-Centre National de la Recherche Scientifique (CNRS)-Centre National de la Recherche Scientifique (CNRS), Unité de Recherche Génomique Info (URGI), Institut National de Recherche pour l’Agriculture, l’Alimentation et l’Environnement (INRAE), Molécules de Communication et Adaptation des Micro-organismes (MCAM), Muséum national d'Histoire naturelle (MNHN)-Centre National de la Recherche Scientifique (CNRS), Laboratoire de Biologie Intégrative des Modèles Marins (LBI2M), Sorbonne Université (SU)-Centre National de la Recherche Scientifique (CNRS)-Station biologique de Roscoff (SBR), Sorbonne Université (SU)-Centre National de la Recherche Scientifique (CNRS)-Centre National de la Recherche Scientifique (CNRS), University of York [York, UK], University of North Carolina [Wilmington] (UNC), University of North Carolina System (UNC), Department of Biochemistry, Genetics and Microbiology [Pretoria], University of Pretoria [South Africa], ANR (Agence Nationale de la Recherche) Grant ANR-14-CE02-0007 HAPAR, the CEA and the Région Bretagne (RC doctoral grant ARED PARASITE 9450 and EK postdoctoral grant SAD HAPAR 9229), and the CNRS (X-life SEAgOInG)., ANR-14-CE02-0007,HAPAR,Le paradoxe de la spécialisation chez un parasite de microalgues responsables de marées rouges(2014), Commissariat à l'énergie atomique et aux énergies alternatives (CEA)-Commissariat à l'énergie atomique et aux énergies alternatives (CEA)-Université d'Évry-Val-d'Essonne (UEVE)-Centre National de la Recherche Scientifique (CNRS), ABiMS - Informatique et bioinformatique = Analysis and Bioinformatics for Marine Science (ABIMS), Fédération de recherche de Roscoff (FR2424), Sorbonne Université (SU)-Centre National de la Recherche Scientifique (CNRS)-Sorbonne Université (SU)-Centre National de la Recherche Scientifique (CNRS)-Station biologique de Roscoff (SBR), Sorbonne Université (SU)-Centre National de la Recherche Scientifique (CNRS), 
Adaptation et diversité en milieu marin (ADMM), Institut national des sciences de l'Univers (INSU - CNRS)-Station biologique de Roscoff (SBR), and Sorbonne Université (SU)-Centre National de la Recherche Scientifique (CNRS)-Sorbonne Université (SU)-Centre National de la Recherche Scientifique (CNRS)-Institut national des sciences de l'Univers (INSU - CNRS)-Station biologique de Roscoff (SBR)
Background Dinoflagellates are aquatic protists widespread in oceans worldwide. Some are responsible for toxic blooms while others live in symbiotic relationships, either as mutualistic symbionts in corals or as parasites infecting other protists and animals. Dinoflagellates harbor atypically large genomes (~ 3 to 250 Gb), with gene organization and gene expression patterns very different from those of closely related apicomplexan parasites. Here we sequenced and analyzed the genomes of two early-diverging and co-occurring parasitic dinoflagellate Amoebophrya strains, to shed light on the emergence of such atypical genomic features, dinoflagellate evolution, and host specialization.

Results We sequenced, assembled, and annotated high-quality genomes for two Amoebophrya strains (A25 and A120), using a combination of Illumina paired-end short-read and Oxford Nanopore Technology (ONT) MinION long-read sequencing approaches. We found that a small number of transposable elements, short introns and intergenic regions, and a limited number of gene families together contribute to the compactness of the Amoebophrya genomes, a feature potentially linked with parasitism. While the majority of Amoebophrya proteins (63.7% of A25 and 59.3% of A120) had no functional assignment, we found many orthologs shared with Dinophyceae. Our analyses revealed a strong tendency for genes to be encoded in unidirectional clusters and high levels of synteny conservation between the two genomes despite low interspecific protein sequence similarity, suggesting rapid protein evolution. Most strikingly, we identified a large proportion of non-canonical introns, including repeated introns, displaying a broad variability of associated splicing motifs never before observed among eukaryotes. Those introner elements appear to have the capacity to spread over their respective genomes in a manner similar to transposable elements. Finally, we confirmed the reduction of organelles observed in Amoebophrya spp., i.e., loss of the plastid and potential loss of a mitochondrial genome and functions.

Conclusion These results expand the range of atypical genome features found in basal dinoflagellates and raise questions regarding speciation and the evolutionary mechanisms at play as parasitism was selected for in this particular unicellular lineage.