1. Predicting stimulation-dependent enhancer-promoter interactions from ChIP-Seq time course data
- Author
- Filomena Matarese, Henk Stunnenberg, George Reid, Korbinian Grote, Antti Honkela, Iryna Charapitsa, Tomasz Dzida, Magnus Rattray, Mudassar Iqbal (affiliations: Department of Mathematics and Statistics; Helsinki Institute for Information Technology; Probabilistic Mechanistic Models for Genomics research group / Antti Honkela; Department of Computer Science; Department of Public Health)
- Subjects
0301 basic medicine; Bioinformatics; lcsh:Medicine; RNA polymerase II; Biology; Enhancer-promoter interaction; General Biochemistry, Genetics and Molecular Biology; ChIP-Seq; 03 medical and health sciences; chemistry.chemical_compound; GENE PROMOTERS; Transcription (biology); RNA polymerase; Bayesian classifier; Machine learning; BREAST-CANCER; Estrogen receptor; TRANSCRIPTION; CHROMOSOME CONFORMATION; Enhancer; 112 Statistics and probability; Molecular Biology; Gene; GeneralLiterature_REFERENCE (e.g., dictionaries, encyclopedias, glossaries); CHROMATIN INTERACTIONS; Genetics; General Neuroscience; lcsh:R; Computational Biology; Promoter; BETA-GLOBIN LOCUS; Genomics; General Medicine; 113 Computer and information sciences; GENOME; HUMAN CELL-TYPES; 030104 developmental biology; HI-C; chemistry; biology.protein; H3K4me3; General Agricultural and Biological Sciences; Estrogen receptor alpha; HISTONE MODIFICATIONS
- Abstract
We have developed a machine learning approach to predict stimulation-dependent enhancer-promoter interactions using evidence from changes in genomic protein occupancy over time. The occupancy of estrogen receptor alpha (ERα), RNA polymerase II (Pol II) and the histone marks H2AZ and H3K4me3 was measured over time using ChIP-Seq experiments in MCF7 cells stimulated with estrogen. A Bayesian classifier was developed that uses two kinds of features to predict interactions: the correlation of temporal binding patterns at enhancers and promoters, and genomic proximity. The method was trained on experimentally determined interactions from the same system and was shown to achieve much higher precision than predictions based on genomic proximity to the nearest ERα binding site. We use the method to identify a confident, genome-wide set of ERα target genes and their regulatory enhancers. Validation with publicly available GRO-Seq data demonstrates that our predicted targets are much more likely to show early nascent transcription than predictions based on genomic ERα binding proximity alone.
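The idea of combining temporal correlation with genomic proximity in a Bayesian classifier can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not specify the model form, so a Gaussian naive Bayes classifier is swapped in here as an assumption, the log10 distance transform and the variance floor are illustrative choices, and all function names and the toy time courses below are hypothetical.

```python
import math

def pearson(x, y):
    """Pearson correlation of two equal-length time courses."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0  # a flat profile carries no correlation information
    return sxy / math.sqrt(sxx * syy)

def features(enhancer, promoter, distance_bp):
    """Two features per candidate pair: temporal correlation and log distance."""
    return (pearson(enhancer, promoter), math.log10(distance_bp + 1))

def fit_gaussian(values, floor=1e-2):
    """Mean and variance; the floor (an assumption) avoids degenerate variances."""
    mean = sum(values) / len(values)
    var = sum((v - mean) ** 2 for v in values) / len(values) + floor
    return mean, var

def log_gauss(x, mean, var):
    return -0.5 * (math.log(2 * math.pi * var) + (x - mean) ** 2 / var)

def train(examples):
    """examples: list of (feature_tuple, label), label 1 = interacting pair."""
    model = {}
    for label in (0, 1):
        rows = [f for f, y in examples if y == label]
        prior = len(rows) / len(examples)
        params = [fit_gaussian([r[j] for r in rows]) for j in range(len(rows[0]))]
        model[label] = (prior, params)
    return model

def posterior_interaction(model, feat):
    """Posterior probability of an interaction via Bayes' rule."""
    logp = {
        label: math.log(prior)
        + sum(log_gauss(x, m, v) for x, (m, v) in zip(feat, params))
        for label, (prior, params) in model.items()
    }
    top = max(logp.values())
    z = sum(math.exp(v - top) for v in logp.values())
    return math.exp(logp[1] - top) / z

# Toy, made-up occupancy time courses (enhancer, promoter, distance in bp, label).
toy = [
    ([0, 1, 3, 5], [0, 2, 4, 6], 20_000, 1),
    ([1, 2, 4, 4], [0, 1, 3, 4], 50_000, 1),
    ([0, 2, 3, 6], [1, 2, 5, 7], 10_000, 1),
    ([0, 1, 3, 5], [5, 3, 2, 0], 800_000, 0),
    ([1, 2, 4, 4], [3, 3, 1, 2], 400_000, 0),
    ([0, 2, 3, 6], [4, 1, 2, 1], 900_000, 0),
]
model = train([(features(e, p, d), y) for e, p, d, y in toy])

# Score two hypothetical candidate pairs.
p_close = posterior_interaction(model, features([0, 1, 2, 4], [1, 2, 3, 6], 30_000))
p_far = posterior_interaction(model, features([0, 1, 2, 4], [4, 3, 1, 0], 700_000))
print(p_close, p_far)
```

In this sketch a nearby pair with co-varying occupancy scores high and a distant pair with opposing profiles scores low; a proximity-only baseline would use just the distance feature.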
- Published
- 2017