David Peris, Emily J. Ubbelohde, Meihua Christina Kuang, Jacek Kominek, Quinn K. Langdon, Marie Adams, Justin A. Koshalek, Amanda Beth Hulfachor, Dana A. Opulente, David J. Hall, Katie Hyma, Justin C. Fay, Jean-Baptiste Leducq, Guillaume Charron, Christian R. Landry, Diego Libkind, Carla Gonçalves, Paula Gonçalves, José Paulo Sampaio, Qi-Ming Wang, Feng-Yan Bai, Russel L. Wrobel, Chris Todd Hittinger, European Commission, Research Council of Norway, Generalitat Valenciana, National Science Foundation (US), Great Lakes Bioenergy Research Center (US), National Natural Science Foundation of China, UCIBIO - Applied Molecular Biosciences Unit, and DCV - Departamento de Ciências da Vida
Species is the fundamental unit to quantify biodiversity. In recent years, the model yeast Saccharomyces cerevisiae has seen an increased number of studies related to its geographical distribution, population structure, and phenotypic diversity. However, seven additional species from the same genus have been less thoroughly studied, which has limited our understanding of the macroevolutionary events leading to the diversification of this genus over the last 20 million years. Here, we show the geographies, hosts, substrates, and phylogenetic relationships for approximately 1,800 Saccharomyces strains, covering the complete genus with unprecedented breadth and depth. We generated and analyzed complete genome sequences of 163 strains and phenotyped 128 phylogenetically diverse strains. This dataset provides insights about genetic and phenotypic diversity within and between species and populations, quantifies reticulation and incomplete lineage sorting, and demonstrates how gene flow and selection have affected traits, such as galactose metabolism. These findings elevate the genus Saccharomyces as a model to understand biodiversity and evolution in microbial eukaryotes.

Some computations were performed on Tirant III of the Spanish Supercomputing Network (“Servei d’Informàtica de la Universitat de València”) under the project BCV-2021-1-0001 granted to D.P., while others were performed at the Wisconsin Energy Institute and the Center for High-Throughput Computing of the University of Wisconsin–Madison. During a portion of this project, D.P. was a researcher funded by the European Union’s Horizon 2020 research and innovation program Marie Sklodowska-Curie, grant agreement No. 747775; the Research Council of Norway (RCN) grant Nos. RCN 324253 and 274337; and the Generalitat Valenciana plan GenT grant No. CIDEGENT/2021/039. D.P. is a recipient of an Illumina Grant for Illumina sequencing of Saccharomyces strains in this study. Q.K.L. was supported by the National Science Foundation under Grant No. DGE-1256259 (Graduate Research Fellowship) and the Predoctoral Training Program in Genetics, funded by the National Institutes of Health (5T32GM007133). This material is based upon work supported in part by the Great Lakes Bioenergy Research Center, Office of Science, Office of Biological and Environmental Research under Award Numbers DE-SC0018409 and DE-FC02-07ER64494; the National Science Foundation under Grant Nos. DEB-1253634, DEB-1442148, and DEB-2110403; and the USDA National Institute of Food and Agriculture Hatch Project Number 1020204. C.T.H. is an H. I. Romnes Faculty Fellow, supported by the Office of the Vice Chancellor for Research and Graduate Education with funding from the Wisconsin Alumni Research Foundation. Q.M.W. was supported by the National Natural Science Foundation of China (NSFC) under Grant Nos. 31770018 and 31961133020. C.R.L. holds the Canada Research Chair in Cellular Systems and Synthetic Biology, and his research on wild yeast is supported by an NSERC Discovery Grant.