Authors: Inge Holm, Luisa Nardini, Adrien Pain, Emmanuel Bischoff, Cameron E. Anderson, Soumanaba Zongo, Wamdaogo M. Guelbeogo, N’Fale Sagnon, Daryl M. Gohl, Ronald J. Nowling, Kenneth D. Vernick, Michelle M. Riehle
Affiliations: Génétique et Génomique des Insectes Vecteurs - Genetics and Genomics of Insect Vectors, Institut Pasteur [Paris] (IP)-Centre National de la Recherche Scientifique (CNRS)-Université Paris Cité (UPCité); Hub Bioinformatique et Biostatistique - Bioinformatics and Biostatistics HUB, Institut Pasteur [Paris] (IP)-Université Paris Cité (UPCité); Medical College of Wisconsin [Milwaukee] (MCW); Ministère de la Santé [Burkina Faso]; University of Minnesota [Twin Cities] (UMN), University of Minnesota System; Milwaukee School of Engineering (MSOE)
Funding: This work received financial support to KV from the European Commission, Horizon 2020 Infrastructures #731060 Infravec2, European Research Council, Support for frontier research, Advanced Grant #323173 AnoPath, Agence Nationale de la Recherche, #ANR-19-CE35-0004 ArboVec, National Institutes of Health, NIAID #AI145999, and French Laboratoire d’Excellence 'Integrative Biology of Emerging Infectious Diseases' #ANR-10-LABX-62-IBEID; to RN from National Science Foundation, IIS#194727; and to MR from National Institutes of Health, NIAID #AI121587, and National Institutes of Health, NIAID #AI145999.
Grants: ANR-19-CE35-0004, ArboVEC, Barrières d'hôtes dans la spécificité des interactions entre moustiques vecteurs et arbovirus (2019); ANR-10-LABX-0062, IBEID, Integrative Biology of Emerging Infectious Diseases (2010); European Project: 731060, INFRAVEC2 (2017); European Project: 323173, EC:FP7:ERC, ERC-2012-ADG_20120314, ANOPATH (2013)
Almost all regulation of gene expression in eukaryotic genomes is mediated by the action of distant non-coding transcriptional enhancers upon proximal gene promoters. Enhancer locations cannot be accurately predicted bioinformatically because of the absence of a defined sequence code, and thus functional assays are required for their direct detection. Here we used a massively parallel reporter assay, Self-Transcribing Active Regulatory Region sequencing (STARR-seq), to generate the first comprehensive genome-wide map of enhancers in Anopheles coluzzii, a major African malaria vector in the Gambiae species complex. The screen was carried out by transfecting reporter libraries created from the genomic DNA of 60 wild A. coluzzii from Burkina Faso into A. coluzzii 4a3A cells, in order to functionally query enhancer activity of the natural population within the homologous cellular context. We report a catalog of 3,288 active genomic enhancers that were significant across three biological replicates, 74% of them located in intergenic and intronic regions. The STARR-seq enhancer screen is chromatin-free and thus detects the inherent activity of a comprehensive catalog of enhancers, including those that may be restricted in vivo to specific cell types or developmental stages. Testing of a validation panel of enhancer candidates using manual luciferase assays confirmed enhancer function in 26 of 28 candidates (93%), over a wide dynamic range of activity from 2-fold to at least 16-fold above baseline. The enhancers occupy only 0.7% of the genome and display distinct compositional features. The enhancer compartment is significantly enriched for 15 transcription factor binding site signatures, and displays divergence for specific dinucleotide repeats, as compared to matched non-enhancer genomic controls. The genome-wide catalog of A. coluzzii enhancers is publicly available in a simple searchable graphic format.
This enhancer catalog will be valuable in linking genetic and phenotypic variation, in identifying regulatory elements that could be employed in vector manipulation, and in better targeting of chromosome editing to minimize extraneous regulatory influences on the introduced sequences.

Importance: Understanding the role of the non-coding regulatory genome in complex disease phenotypes is essential, but even in well-characterized model organisms, identification of regulatory regions within the vast non-coding genome remains a challenge. We used a large-scale assay to generate a genome-wide map of transcriptional enhancers. Such a catalog for the major malaria vector Anopheles coluzzii will be a useful research tool as the role of non-coding regulatory variation in differential susceptibility to malaria infection is explored, and will serve as a public resource for research on this insect vector of disease.