1. Cross-species analysis of enhancer logic using deep learning
- Author
Authors: David Mauduit, Giorgia Egidy, Ibrahim Ihsan Taskiran, Edouard Cadieu, Ghanem Elias Ghanem, Panagiotis Karras, Linde Van Aerschot, Jasper Wouters, Aline Primot, Gert Hulselmans, Monika Seltenhammer, Leonard I. Zon, Ellen van Rooijen, Valerie Christiaens, Samira Makhzami, Jean-Christophe Marine, Maurizio Fazio, Stein Aerts, Liesbeth Minnoye
Affiliations: Catholic University of Leuven - Katholieke Universiteit Leuven (KU Leuven); Dana-Farber Cancer Institute [Boston]; Boston Children's Hospital; Medizinische Universität Wien = Medical University of Vienna; Institut de Génétique et Développement de Rennes (IGDR), Structure Fédérative de Recherche en Biologie et Santé de Rennes (Biosit: Biologie - Santé - Innovation Technologique)-Centre National de la Recherche Scientifique (CNRS)-Université de Rennes 1 (UR1), Université de Rennes (UNIV-RENNES); Université Paris-Saclay; Université libre de Bruxelles (ULB); Université de Rennes (UR)-Centre National de la Recherche Scientifique (CNRS)-Structure Fédérative de Recherche en Biologie et Santé de Rennes (Biosit: Biologie - Santé - Innovation Technologique)
Funding and grants: C14/18/092, KU Leuven, 2016-070, Fondation contre le Cancer, 1S03317N, Fonds Wetenschappelijk Onderzoek, Kom op tegen Kanker, Stand Up To Cancer, Flemish Cancer Society, Stichting tegen Kanker, Belgian Cancer Society, ANR-11-INBS-0003, CRB-Anim, Réseau de Centres de Ressources Biologiques pour les animaux domestiques (2011), European Project: 724226, cis-CONTROL
- Subjects
Swine, [SDV]Life Sciences [q-bio], Method, Sequence alignment, Computational biology, Biology, Mice, 03 medical and health sciences, Deep Learning, Dogs, 0302 clinical medicine, Genetics, Animals, Humans, Nucleosome, Horses, Enhancer, Melanoma, Transcription factor, Zebrafish, Genetics (clinical), 030304 developmental biology, 0303 health sciences, Computational Biology, Chromatin, Gene Expression Regulation, Neoplastic, DNA binding site, Enhancer Elements, Genetic, DECIPHER, 030217 neurology & neurosurgery, IRF4
- Abstract
Deciphering the genomic regulatory code of enhancers is a key challenge in biology because this code underlies cellular identity. A better understanding of how enhancers work will improve the interpretation of noncoding genome variation and empower the generation of cell type-specific drivers for gene therapy. Here, we explore the combination of deep learning and cross-species chromatin accessibility profiling to build explainable enhancer models. We apply this strategy to decipher the enhancer code in melanoma, a relevant case study owing to the presence of distinct melanoma cell states. We trained and validated a deep learning model, called DeepMEL, using chromatin accessibility data of 26 melanoma samples across six different species. We show the accuracy of DeepMEL predictions on the CAGI5 challenge, where it significantly outperforms existing models on the melanoma enhancer of IRF4. Next, we exploit DeepMEL to analyze enhancer architectures and identify accurate transcription factor binding sites for the core regulatory complexes in the two different melanoma states, with distinct roles for each transcription factor in terms of nucleosome displacement or enhancer activation. Finally, DeepMEL identifies orthologous enhancers across distantly related species, where sequence alignment fails, and the model highlights specific nucleotide substitutions that underlie enhancer turnover. DeepMEL can be used from the Kipoi database to predict and optimize candidate enhancers and to prioritize enhancer mutations. In addition, our computational strategy can be applied to other cancer or normal cell types.
Published in: Genome Research, vol. 30, issue 12. Location: United States. Status: published.
- Published
2020