8 results on '"Jill E. Moore"'
Search Results
2. A curated benchmark of enhancer-gene interactions for evaluating enhancer-target gene prediction methods
- Author
-
Jill E. Moore, Henry E. Pratt, Michael J. Purcaro, and Zhiping Weng
- Subjects
Enhancer ,Transcriptional regulation ,Target gene ,Benchmark ,Machine learning ,Genomic interactions ,Biology (General) ,QH301-705.5 ,Genetics ,QH426-470 - Abstract
Abstract Background Many genome-wide collections of candidate cis-regulatory elements (cCREs) have been defined using genomic and epigenomic data, but it remains a major challenge to connect these elements to their target genes. Results To facilitate the development of computational methods for predicting target genes, we develop a Benchmark of candidate Enhancer-Gene Interactions (BENGI) by integrating the recently developed Registry of cCREs with experimentally derived genomic interactions. We use BENGI to test several published computational methods for linking enhancers with genes, including signal correlation and the TargetFinder and PEP supervised learning methods. We find that while TargetFinder is the best-performing method, it is only modestly better than a baseline distance method for most benchmark datasets when trained and tested with the same cell type and that TargetFinder often does not outperform the distance method when applied across cell types. Conclusions Our results suggest that current computational methods need to be improved and that BENGI presents a useful framework for method development and testing.
- Published
- 2020
- Full Text
- View/download PDF
3. Building integrative functional maps of gene regulation
- Author
-
Jinrui Xu, Henry E Pratt, Jill E Moore, Mark B Gerstein, and Zhiping Weng
- Subjects
Gene Expression Regulation ,Genome, Human ,Genetics ,Humans ,Chromosome Mapping ,DNA ,General Medicine ,Regulatory Sequences, Nucleic Acid ,Molecular Biology ,Genetics (clinical) - Abstract
Every cell in the human body inherits a copy of the same genetic information. The three billion base pairs of DNA in the human genome, and the roughly 50 000 coding and non-coding genes they contain, must thus encode all the complexity of human development and cell and tissue type diversity. Differences in gene regulation, or the modulation of gene expression, enable individual cells to interpret the genome differently to carry out their specific functions. Here we discuss recent and ongoing efforts to build gene regulatory maps, which aim to characterize the regulatory roles of all sequences in a genome. Many researchers and consortia have identified such regulatory elements using functional assays and evolutionary analyses; we discuss the results, strengths and shortcomings of their approaches. We also discuss new techniques the field can leverage and emerging challenges it will face while striving to build gene regulatory maps of ever-increasing resolution and comprehensiveness.
- Published
- 2022
- Full Text
- View/download PDF
4. The ENCODE4 long-read RNA-seq collection reveals distinct classes of transcript structure diversity
- Author
-
Fairlie Reese, Brian Williams, Gabriela Balderrama-Gutierrez, Dana Wyman, Muhammed Hasan Çelik, Elisabeth Rebboah, Narges Rezaie, Diane Trout, Milad Razavi-Mohseni, Yunzhe Jiang, Beatrice Borsari, Samuel Morabito, Heidi Yahan Liang, Cassandra J. McGill, Sorena Rahmanian, Jasmine Sakr, Shan Jiang, Weihua Zeng, Klebea Carvalho, Annika K. Weimer, Louise A. Dionne, Ariel McShane, Karan Bedi, Shaimae I. Elhajjajy, Sean Upchurch, Jennifer Jou, Ingrid Youngworth, Idan Gabdank, Paul Sud, Otto Jolanki, J. Seth Strattan, Meenakshi S. Kagda, Michael P. Snyder, Ben C. Hitz, Jill E. Moore, Zhiping Weng, David Bennett, Laura Reinholdt, Mats Ljungman, Michael A. Beer, Mark B. Gerstein, Lior Pachter, Roderic Guigó, Barbara J. Wold, and Ali Mortazavi
- Subjects
Article - Abstract
The majority of mammalian genes encode multiple transcript isoforms that result from differential promoter use, changes in exonic splicing, and alternative 3’ end choice. Detecting and quantifying transcript isoforms across tissues, cell types, and species has been extremely challenging because transcripts are much longer than the short reads normally used for RNA-seq. By contrast, long-read RNA-seq (LR-RNA-seq) gives the complete structure of most transcripts. We sequenced 264 LR-RNA-seq PacBio libraries totaling over 1 billion circular consensus reads (CCS) for 81 unique human and mouse samples. We detect at least one full-length transcript from 87.7% of annotated human protein coding genes and a total of 200,000 full-length transcripts, 40% of which have novel exon junction chains.To capture and compute on the three sources of transcript structure diversity, we introduce a gene and transcript annotation framework that uses triplets representing the transcript start site, exon junction chain, and transcript end site of each transcript. Using triplets in a simplex representation demonstrates how promoter selection, splice pattern, and 3’ processing are deployed across human tissues, with nearly half of multitranscript protein coding genes showing a clear bias toward one of the three diversity mechanisms. Evaluated across samples, the predominantly expressed transcript changes for 74% of protein coding genes. In evolution, the human and mouse transcriptomes are globally similar in types of transcript structure diversity, yet among individual orthologous gene pairs, more than half (57.8%) show substantial differences in mechanism of diversification in matching tissues. This initial large-scale survey of human and mouse long-read transcriptomes provides a foundation for further analyses of alternative transcript usage, and is complemented by short-read and microRNA data on the same samples and by epigenome data elsewhere in the ENCODE4 collection.
- Published
- 2023
5. The EN-TEx resource of multi-tissue personal epigenomes & variant-impact models
- Author
-
Joel Rozowsky, Jiahao Gao, Beatrice Borsari, Yucheng T. Yang, Timur Galeev, Gamze Gürsoy, Charles B. Epstein, Kun Xiong, Jinrui Xu, Tianxiao Li, Jason Liu, Keyang Yu, Ana Berthel, Zhanlin Chen, Fabio Navarro, Maxwell S. Sun, James Wright, Justin Chang, Christopher J.F. Cameron, Noam Shoresh, Elizabeth Gaskell, Jorg Drenkow, Jessika Adrian, Sergey Aganezov, François Aguet, Gabriela Balderrama-Gutierrez, Samridhi Banskota, Guillermo Barreto Corona, Sora Chee, Surya B. Chhetri, Gabriel Conte Cortez Martins, Cassidy Danyko, Carrie A. Davis, Daniel Farid, Nina P. Farrell, Idan Gabdank, Yoel Gofin, David U. Gorkin, Mengting Gu, Vivian Hecht, Benjamin C. Hitz, Robbyn Issner, Yunzhe Jiang, Melanie Kirsche, Xiangmeng Kong, Bonita R. Lam, Shantao Li, Bian Li, Xiqi Li, Khine Zin Lin, Ruibang Luo, Mark Mackiewicz, Ran Meng, Jill E. Moore, Jonathan Mudge, Nicholas Nelson, Chad Nusbaum, Ioann Popov, Henry E. Pratt, Yunjiang Qiu, Srividya Ramakrishnan, Joe Raymond, Leonidas Salichos, Alexandra Scavelli, Jacob M. Schreiber, Fritz J. Sedlazeck, Lei Hoon See, Rachel M. Sherman, Xu Shi, Minyi Shi, Cricket Alicia Sloan, J Seth Strattan, Zhen Tan, Forrest Y. Tanaka, Anna Vlasova, Jun Wang, Jonathan Werner, Brian Williams, Min Xu, Chengfei Yan, Lu Yu, Christopher Zaleski, Jing Zhang, Kristin Ardlie, J Michael Cherry, Eric M. Mendenhall, William S. Noble, Zhiping Weng, Morgan E. Levine, Alexander Dobin, Barbara Wold, Ali Mortazavi, Bing Ren, Jesse Gillis, Richard M. Myers, Michael P. Snyder, Jyoti Choudhary, Aleksandar Milosavljevic, Michael C. Schatz, Bradley E. Bernstein, Roderic Guigó, Thomas R. Gingeras, and Mark Gerstein
- Subjects
Allele-specific activity ,Predictive models ,Personal genome ,eQTLs ,Transformer model ,Functional genomics ,GTEx ,Genome annotations ,Structural variants ,General Biochemistry, Genetics and Molecular Biology ,Tissue specificity ,Functional epigenomes ,ENCODE - Abstract
Understanding how genetic variants impact molecular phenotypes is a key goal of functional genomics, currently hindered by reliance on a single haploid reference genome. Here, we present the EN-TEx resource of 1,635 open-access datasets from four donors (∼30 tissues × ∼15 assays). The datasets are mapped to matched, diploid genomes with long-read phasing and structural variants, instantiating a catalog of >1 million allele-specific loci. These loci exhibit coordinated activity along haplotypes and are less conserved than corresponding, non-allele-specific ones. Surprisingly, a deep-learning transformer model can predict the allele-specific activity based only on local nucleotide-sequence context, highlighting the importance of transcription-factor-binding motifs particularly sensitive to variants. Furthermore, combining EN-TEx with existing genome annotations reveals strong associations between allele-specific and GWAS loci. It also enables models for transferring known eQTLs to difficult-to-profile tissues (e.g., from skin to heart). Overall, EN-TEx provides rich data and generalizable models for more accurate personal functional genomics.
- Published
- 2023
6. Factorbook: an updated catalog of transcription factor motifs and candidate regulatory motif sites
- Author
-
Henry E Pratt, Gregory R Andrews, Nishigandha Phalke, Jack D Huey, Michael J Purcaro, Arjan van der Velde, Jill E Moore, and Zhiping Weng
- Subjects
Binding Sites ,AcademicSubjects/SCI00010 ,Computer science ,Computational biology ,Regulatory Sequences, Nucleic Acid ,Biology ,ENCODE ,Characteristic sequence ,Annotation ,Gene Expression Regulation ,Databases, Genetic ,Genetics ,Database Issue ,Humans ,TF binding ,Human genome ,Motif (music) ,Nucleotide Motifs ,Transcription factor ,Transcription Factors - Abstract
The human genome contains ∼2000 transcriptional regulatory proteins, including ∼1600 DNA-binding transcription factors (TFs) recognizing characteristic sequence motifs to exert regulatory effects on gene expression. The binding specificities of these factors have been profiled both in vitro, using techniques such as HT-SELEX, and in vivo, using techniques including ChIP-seq. We previously developed Factorbook, a TF-centric database of annotations, motifs, and integrative analyses based on ChIP-seq data from Phase II of the ENCODE Project. Here we present an update to Factorbook which significantly expands the breadth of cell type and TF coverage. The update includes an expanded motif catalog derived from thousands of ENCODE Phase II and III ChIP-seq experiments and HT-SELEX experiments; this motif catalog is integrated with the ENCODE registry of candidate cis-regulatory elements to annotate a comprehensive collection of genome-wide candidate TF binding sites. The database also offers novel tools for applying the motif models within machine learning frameworks and using these models for integrative analysis, including annotation of variants and disease and trait heritability. Factorbook is publicly available at www.factorbook.org; we will continue to expand the resource as ENCODE Phase IV data are released.
- Published
- 2021
- Full Text
- View/download PDF
7. Integration of high-resolution promoter profiling assays reveals novel, cell type-specific transcription start sites across 115 human cell and tissue types
- Author
-
Jill E. Moore, Xiao-Ou Zhang, Shaimae I. Elhajjajy, Kaili Fan, Henry E. Pratt, Fairlie Reese, Ali Mortazavi, and Zhiping Weng
- Subjects
Bioinformatics ,Human Genome ,Biological Sciences ,Medical and Health Sciences ,Promoter Regions ,Gene Expression Regulation ,Genetic ,Genetics ,Humans ,Generic health relevance ,Transcription Initiation Site ,Promoter Regions, Genetic ,Genetics (clinical) ,Genome-Wide Association Study ,Biotechnology - Abstract
Accurate transcription start site (TSS) annotations are essential for understanding transcriptional regulation and its role in human disease. Gene collections such as GENCODE contain annotations for tens of thousands of TSSs, but not all of these annotations are experimentally validated nor do they contain information on cell type–specific usage. Therefore, we sought to generate a collection of experimentally validated TSSs by integrating RNA Annotation and Mapping of Promoters for the Analysis of Gene Expression (RAMPAGE) data from 115 cell and tissue types, which resulted in a collection of approximately 50 thousand representative RAMPAGE peaks. These peaks are primarily proximal to GENCODE-annotated TSSs and are concordant with other transcription assays. Because RAMPAGE uses paired-end reads, we were then able to connect peaks to transcripts by analyzing the genomic positions of the 3′ ends of read mates. Using this paired-end information, we classified the vast majority (37 thousand) of our RAMPAGE peaks as verified TSSs, updating TSS annotations for 20% of GENCODE genes. We also found that these updated TSS annotations are supported by epigenomic and other transcriptomic data sets. To show the utility of this RAMPAGE rPeak collection, we intersected it with the NHGRI/EBI genome-wide association study (GWAS) catalog and identified new candidate GWAS genes. Overall, our work shows the importance of integrating experimental data to further refine TSS annotations and provides a valuable resource for the biological community.
- Published
- 2022
8. The EN-TEx resource of multi-tissue personal epigenomes & variant-impact models
- Author
-
Joel Rozowsky, Jorg Drenkow, Yucheng T Yang, Gamze Gursoy, Timur Galeev, Beatrice Borsari, Charles B Epstein, Kun Xiong, Jinrui Xu, Jiahao Gao, Keyang Yu, Ana Berthel, Zhanlin Chen, Fabio Navarro, Jason Liu, Maxwell S Sun, James Wright, Justin Chang, Christopher JF Cameron, Noam Shoresh, Elizabeth Gaskell, Jessika Adrian, Sergey Aganezov, François Aguet, Gabriela Balderrama-Gutierrez, Samridhi Banskota, Guillermo Barreto Corona, Sora Chee, Surya B Chhetri, Gabriel Conte Cortez Martins, Cassidy Danyko, Carrie A Davis, Daniel Farid, Nina P Farrell, Idan Gabdank, Yoel Gofin, David U Gorkin, Mengting Gu, Vivian Hecht, Benjamin C Hitz, Robbyn Issner, Melanie Kirsche, Xiangmeng Kong, Bonita R Lam, Shantao Li, Bian Li, Tianxiao Li, Xiqi Li, Khine Zin Lin, Ruibang Luo, Mark Mackiewicz, Jill E Moore, Jonathan Mudge, Nicholas Nelson, Chad Nusbaum, Ioann Popov, Henry E Pratt, Yunjiang Qiu, Srividya Ramakrishnan, Joe Raymond, Leonidas Salichos, Alexandra Scavelli, Jacob M Schreiber, Fritz J Sedlazeck, Lei Hoon See, Rachel M Sherman, Xu Shi, Minyi Shi, Cricket Alicia Sloan, J Seth Strattan, Zhen Tan, Forrest Y Tanaka, Anna Vlasova, Jun Wang, Jonathan Werner, Brian Williams, Min Xu, Chengfei Yan, Lu Yu, Christopher Zaleski, Jing Zhang, Kristin Ardlie, J Michael Cherry, Eric M Mendenhall, William S Noble, Zhiping Weng, Morgan E Levine, Alexander Dobin, Barbara Wold, Ali Mortazavi, Bing Ren, Jesse Gillis, Richard M Myers, Michael P Snyder, Jyoti Choudhary, Aleksandar Milosavljevic, Michael C Schatz, Roderic Guigó, Bradley E Bernstein, Thomas R Gingeras, and Mark Gerstein
- Subjects
Genetic variants ,Genomics ,Preprint ,Computational biology ,Biology ,Personal genomics - Abstract
Understanding how genetic variants impact molecular phenotypes is a key goal of functional genomics, currently hindered by reliance on a single haploid reference genome. Here, we present the EN-TEx resource of personal epigenomes, for ∼25 tissues and >10 assays in four donors (>1500 open-access functional genomic and proteomic datasets, in total). Each dataset is mapped to a matched, diploid personal genome, which has long-read phasing and structural variants. The mappings enable us to identify >1 million loci with allele-specific behavior. These loci exhibit coordinated epigenetic activity along haplotypes and less conservation than matched, non-allele-specific loci, in a fashion broadly paralleling tissue-specificity. Surprisingly, they can be accurately modelled just based on local nucleotide-sequence context. Combining EN-TEx with existing genome annotations reveals strong associations between allele-specific and GWAS loci and enables models for transferring known eQTLs to difficult-to-profile tissues. Overall, EN-TEx provides rich data and generalizable models for more accurate personal functional genomics.
- Published
- 2021
- Full Text
- View/download PDF
Catalog
Discovery Service for Jio Institute Digital Library
For full access to our library's resources, please sign in.