25 results on '"Benjamin C. Hitz"'
Search Results
2. RNAget: an API to securely retrieve RNA quantifications.
- Author
-
Sean Upchurch, Emilio Palumbo, Jeremy Adams, David Bujold, Guillaume Bourque, Jared Nedzel, Keenan Graham, Meenakshi S. Kagda, Pedro Assis, Benjamin C. Hitz, Emilio Righi, Roderic Guigó, Barbara J. Wold, Alvis Brazma, Julia Burchard, Joe Capka, Michael Cherry, Laura Clarke, Brian Craft, Manolis Dermitzakis, Mark Diekhans, John Dursi, Michael Sean Fitzsimons, Zac Flaming, Romina Garrido, Alfred Gil, Paul Godden, Matt Green, Mitch Guttman, Brian Haas, Max Haeussler, Bo Li, Sten Linnarsson, Adam Lipski, David Liu, Simonne Longerich, David Lougheed, Jonathan Manning, John C. Marioni, Christopher Meyer, Stephen B. Montgomery, Alyssa Morrow, Alfonso Muñoz-Pomer Fuentes, Jared L. Nedzel, David Nguyen, Kevin Osborn, Francis Ouellette, Irene Papatheodorou, Dmitri D. Pervouchine, Arun K. Ramani, Jordi Rambla, Bashir Sadjad, David Steinberg, Jeremiah Talkar, Timothy Tickle, Kathy Tzeng, Saman Vaisipour, Sean Watford, Barbara Wold, Zhenyu Zhang, and Jing Zhu
- Published
- 2023
- Full Text
- View/download PDF
3. New developments on the Encyclopedia of DNA Elements (ENCODE) data portal.
- Author
-
Yunhai Luo, Benjamin C. Hitz, Idan Gabdank, Jason A. Hilton, Meenakshi S. Kagda, Bonita Lam, Zachary Myers, Paul Sud, Jennifer Jou, Khine Lin, Ulugbek K. Baymuradov, Keenan Graham, Casey Litton, Stuart R. Miyasato, J. Seth Strattan, Otto Jolanki, Jin-Wook Lee, Forrest Tanaka, Philip Adenekan, Emma O'Neill, and J. Michael Cherry
- Published
- 2020
- Full Text
- View/download PDF
4. The Encyclopedia of DNA elements (ENCODE): data portal update.
- Author
-
Carrie A. Davis, Benjamin C. Hitz, Cricket A. Sloan, Esther T. Chan, Jean M. Davidson, Idan Gabdank, Jason A. Hilton, Kriti Jain, Ulugbek K. Baymuradov, Aditi K. Narayanan, Kathrina C. Onate, Keenan Graham, Stuart R. Miyasato, Timothy R. Dreszer, J. Seth Strattan, Otto Jolanki, Forrest Tanaka, and J. Michael Cherry
- Published
- 2018
- Full Text
- View/download PDF
5. Annotating and prioritizing human non-coding variants with RegulomeDB v.2
- Author
-
Shengcheng Dong, Nanxiang Zhao, Emma Spragins, Meenakshi S. Kagda, Mingjie Li, Pedro Assis, Otto Jolanki, Yunhai Luo, J. Michael Cherry, Alan P. Boyle, and Benjamin C. Hitz
- Subjects
Genetics
- Published
- 2023
- Full Text
- View/download PDF
6. The ENCODE Uniform Analysis Pipelines
- Author
-
Benjamin C. Hitz, Jin-Wook Lee, Otto Jolanki, Meenakshi S. Kagda, Keenan Graham, Paul Sud, Idan Gabdank, J. Seth Strattan, Cricket A. Sloan, Timothy Dreszer, Laurence D. Rowe, Nikhil R. Podduturi, Venkat S. Malladi, Esther T. Chan, Jean M. Davidson, Marcus Ho, Stuart Miyasato, Matt Simison, Forrest Tanaka, Yunhai Luo, Ian Whaling, Eurie L. Hong, Brian T. Lee, Richard Sandstrom, Eric Rynes, Jemma Nelson, Andrew Nishida, Alyssa Ingersoll, Michael Buckley, Mark Frerker, Daniel S Kim, Nathan Boley, Diane Trout, Alex Dobin, Sorena Rahmanian, Dana Wyman, Gabriela Balderrama-Gutierrez, Fairlie Reese, Neva C. Durand, Olga Dudchenko, David Weisz, Suhas S. P. Rao, Alyssa Blackburn, Dimos Gkountaroulis, Mahdi Sadr, Moshe Olshansky, Yossi Eliaz, Dat Nguyen, Ivan Bochkov, Muhammad Saad Shamim, Ragini Mahajan, Erez Aiden, Tom Gingeras, Simon Heath, Martin Hirst, W. James Kent, Anshul Kundaje, Ali Mortazavi, Barbara Wold, and J. Michael Cherry
- Abstract
The Encyclopedia of DNA elements (ENCODE) project is a collaborative effort to create a comprehensive catalog of functional elements in the human genome. The current database comprises more than 19000 functional genomics experiments across more than 1000 cell lines and tissues using a wide array of experimental techniques to study the chromatin structure, regulatory and transcriptional landscape of the Homo sapiens and Mus musculus genomes. All experimental data, metadata, and associated computational analyses created by the ENCODE consortium are submitted to the Data Coordination Center (DCC) for validation, tracking, storage, and distribution to community resources and the scientific community. The ENCODE project has engineered and distributed uniform processing pipelines in order to promote data provenance and reproducibility as well as allow interoperability between genomic resources and other consortia. All data files, reference genome versions, software versions, and parameters used by the pipelines are captured and available via the ENCODE Portal. The pipeline code, developed using Docker and Workflow Description Language (WDL; https://openwdl.org/) is publicly available in GitHub, with images available on Dockerhub (https://hub.docker.com), enabling access to a diverse range of biomedical researchers. ENCODE pipelines maintained and used by the DCC can be installed to run on personal computers, local HPC clusters, or in cloud computing environments via Cromwell. Access to the pipelines and data via the cloud allows small labs the ability to use the data or software without access to institutional compute clusters. Standardization of the computational methodologies for analysis and quality control leads to comparable results from different ENCODE collections - a prerequisite for successful integrative analyses. Database URL: https://www.encodeproject.org/
- Published
- 2023
- Full Text
- View/download PDF
7. ENCODE data at the ENCODE portal.
- Author
-
Cricket A. Sloan, Esther T. Chan, Jean M. Davidson, Venkat S. Malladi, J. Seth Strattan, Benjamin C. Hitz, Idan Gabdank, Aditi K. Narayanan, Marcus Ho, Brian T. Lee, Laurence D. Rowe, Timothy R. Dreszer, Greg Roe, Nikhil R. Podduturi, Forrest Tanaka, Eurie L. Hong, and J. Michael Cherry
- Published
- 2016
- Full Text
- View/download PDF
8. The Saccharomyces Genome Database Variant Viewer.
- Author
-
Travis K. Sheppard, Benjamin C. Hitz, Stacia R. Engel, Giltae Song, Rama Balakrishnan, Gail Binkley, Maria C. Costanzo, Kyla S. Dalusag, Janos Demeter, Sage T. Hellerstedt, Kalpana Karra, Robert S. Nash, Kelley M. Paskov, Marek S. Skrzypek, Shuai Weng, Edith D. Wong, and J. Michael Cherry
- Published
- 2016
- Full Text
- View/download PDF
9. The EN-TEx resource of multi-tissue personal epigenomes & variant-impact models
- Author
-
Joel Rozowsky, Jiahao Gao, Beatrice Borsari, Yucheng T. Yang, Timur Galeev, Gamze Gürsoy, Charles B. Epstein, Kun Xiong, Jinrui Xu, Tianxiao Li, Jason Liu, Keyang Yu, Ana Berthel, Zhanlin Chen, Fabio Navarro, Maxwell S. Sun, James Wright, Justin Chang, Christopher J.F. Cameron, Noam Shoresh, Elizabeth Gaskell, Jorg Drenkow, Jessika Adrian, Sergey Aganezov, François Aguet, Gabriela Balderrama-Gutierrez, Samridhi Banskota, Guillermo Barreto Corona, Sora Chee, Surya B. Chhetri, Gabriel Conte Cortez Martins, Cassidy Danyko, Carrie A. Davis, Daniel Farid, Nina P. Farrell, Idan Gabdank, Yoel Gofin, David U. Gorkin, Mengting Gu, Vivian Hecht, Benjamin C. Hitz, Robbyn Issner, Yunzhe Jiang, Melanie Kirsche, Xiangmeng Kong, Bonita R. Lam, Shantao Li, Bian Li, Xiqi Li, Khine Zin Lin, Ruibang Luo, Mark Mackiewicz, Ran Meng, Jill E. Moore, Jonathan Mudge, Nicholas Nelson, Chad Nusbaum, Ioann Popov, Henry E. Pratt, Yunjiang Qiu, Srividya Ramakrishnan, Joe Raymond, Leonidas Salichos, Alexandra Scavelli, Jacob M. Schreiber, Fritz J. Sedlazeck, Lei Hoon See, Rachel M. Sherman, Xu Shi, Minyi Shi, Cricket Alicia Sloan, J Seth Strattan, Zhen Tan, Forrest Y. Tanaka, Anna Vlasova, Jun Wang, Jonathan Werner, Brian Williams, Min Xu, Chengfei Yan, Lu Yu, Christopher Zaleski, Jing Zhang, Kristin Ardlie, J Michael Cherry, Eric M. Mendenhall, William S. Noble, Zhiping Weng, Morgan E. Levine, Alexander Dobin, Barbara Wold, Ali Mortazavi, Bing Ren, Jesse Gillis, Richard M. Myers, Michael P. Snyder, Jyoti Choudhary, Aleksandar Milosavljevic, Michael C. Schatz, Bradley E. Bernstein, Roderic Guigó, Thomas R. Gingeras, and Mark Gerstein
- Subjects
Allele-specific activity; Predictive models; Personal genome; eQTLs; Transformer model; Functional genomics; GTEx; Genome annotations; Structural variants; General Biochemistry, Genetics and Molecular Biology; Tissue specificity; Functional epigenomes; ENCODE
- Abstract
Understanding how genetic variants impact molecular phenotypes is a key goal of functional genomics, currently hindered by reliance on a single haploid reference genome. Here, we present the EN-TEx resource of 1,635 open-access datasets from four donors (∼30 tissues × ∼15 assays). The datasets are mapped to matched, diploid genomes with long-read phasing and structural variants, instantiating a catalog of >1 million allele-specific loci. These loci exhibit coordinated activity along haplotypes and are less conserved than corresponding, non-allele-specific ones. Surprisingly, a deep-learning transformer model can predict the allele-specific activity based only on local nucleotide-sequence context, highlighting the importance of transcription-factor-binding motifs particularly sensitive to variants. Furthermore, combining EN-TEx with existing genome annotations reveals strong associations between allele-specific and GWAS loci. It also enables models for transferring known eQTLs to difficult-to-profile tissues (e.g., from skin to heart). Overall, EN-TEx provides rich data and generalizable models for more accurate personal functional genomics.
- Published
- 2023
10. Annotating and prioritizing human non-coding variants with RegulomeDB
- Author
-
Shengcheng Dong, Nanxiang Zhao, Emma Spragins, Meenakshi S. Kagda, Mingjie Li, Pedro Assis, Otto Jolanki, Yunhai Luo, J Michael Cherry, Alan P Boyle, and Benjamin C Hitz
- Abstract
Nearly 90% of the disease risk-associated variants identified from genome-wide association studies (GWAS) are in non-coding regions of the genome. The annotations obtained from analyzing functional genomics assays can provide additional information to pinpoint causal variants, which are often not the lead variants identified from association studies. However, the lack of available annotation tools limits the use of such data. To address the challenge, we have previously built the RegulomeDB database for prioritizing and annotating variants in non-coding regions [1], which has been a highly utilized resource for the research community (Supplementary Fig. 1). RegulomeDB annotates a variant by intersecting its position with genomic intervals identified from functional genomic assays and computational approaches. It also incorporates those hits of a variant into a heuristic ranking score, representing its potential to be functional in regulatory elements. Here we present a newer version of the RegulomeDB web server, RegulomeDB v2.1 (http://regulomedb.org). We improve and boost annotation power by incorporating thousands of newly processed data from functional genomic assays in GRCh38 assembly, and now include probabilistic scores from the SURF algorithm that was the top performing non-coding variant predictor in CAGI 5 [2]. We also provide interactive charts and genome browser views to allow users an easy way to perform exploratory analyses in different tissue contexts.
- Published
- 2022
- Full Text
- View/download PDF
11. SnoVault and encodeD: A novel object-based storage system and applications to ENCODE metadata.
- Author
-
Benjamin C Hitz, Laurence D Rowe, Nikhil R Podduturi, David I Glick, Ulugbek K Baymuradov, Venkat S Malladi, Esther T Chan, Jean M Davidson, Idan Gabdank, Aditi K Narayana, Kathrina C Onate, Jason Hilton, Marcus C Ho, Brian T Lee, Stuart R Miyasato, Timothy R Dreszer, Cricket A Sloan, J Seth Strattan, Forrest Y Tanaka, Eurie L Hong, and J Michael Cherry
- Subjects
Medicine; Science
- Abstract
The Encyclopedia of DNA elements (ENCODE) project is an ongoing collaborative effort to create a comprehensive catalog of functional elements initiated shortly after the completion of the Human Genome Project. The current database exceeds 6500 experiments across more than 450 cell lines and tissues using a wide array of experimental techniques to study the chromatin structure, regulatory and transcriptional landscape of the H. sapiens and M. musculus genomes. All ENCODE experimental data, metadata, and associated computational analyses are submitted to the ENCODE Data Coordination Center (DCC) for validation, tracking, storage, unified processing, and distribution to community resources and the scientific community. As the volume of data increases, the identification and organization of experimental details becomes increasingly intricate and demands careful curation. The ENCODE DCC has created a general purpose software system, known as SnoVault, that supports metadata and file submission, a database used for metadata storage, web pages for displaying the metadata and a robust API for querying the metadata. The software is fully open-source, code and installation instructions can be found at: http://github.com/ENCODE-DCC/snovault/ (for the generic database) and http://github.com/ENCODE-DCC/encoded/ to store genomic data in the manner of ENCODE. The core database engine, SnoVault (which is completely independent of ENCODE, genomic data, or bioinformatic data) has been released as a separate Python package.
- Published
- 2017
- Full Text
- View/download PDF
12. Principles of metadata organization at the ENCODE data coordination center.
- Author
-
Eurie L. Hong, Cricket A. Sloan, Esther T. Chan, Jean M. Davidson, Venkat S. Malladi, J. Seth Strattan, Benjamin C. Hitz, Idan Gabdank, Aditi K. Narayanan, Marcus Ho, Brian T. Lee, Laurence D. Rowe, Timothy R. Dreszer, Greg R. Roe, Nikhil R. Podduturi, Forrest Tanaka, Jason A. Hilton, and J. Michael Cherry
- Published
- 2016
- Full Text
- View/download PDF
13. Integration of new alternative reference strain genome sequences into the Saccharomyces genome database.
- Author
-
Giltae Song, Rama Balakrishnan, Gail Binkley, Maria C. Costanzo, Kyla S. Dalusag, Janos Demeter, Stacia R. Engel, Sage T. Hellerstedt, Kalpana Karra, Benjamin C. Hitz, Robert S. Nash, Kelley M. Paskov, Travis K. Sheppard, Marek S. Skrzypek, Shuai Weng, Edith D. Wong, and J. Michael Cherry
- Published
- 2016
- Full Text
- View/download PDF
14. Prevention of data duplication for high throughput sequencing repositories.
- Author
-
Idan Gabdank, Esther T. Chan, Jean M. Davidson, Jason A. Hilton, Carrie A. Davis, Ulugbek K. Baymuradov, Aditi K. Narayanan, Kathrina C. Onate, Keenan Graham, Stuart R. Miyasato, Timothy R. Dreszer, J. Seth Strattan, Otto Jolanki, Forrest Tanaka, Benjamin C. Hitz, Cricket A. Sloan, and J. Michael Cherry
- Published
- 2018
- Full Text
- View/download PDF
15. Ontology application and use at the ENCODE DCC.
- Author
-
Venkat S. Malladi, Drew T. Erickson, Nikhil R. Podduturi, Laurence D. Rowe, Esther T. Chan, Jean M. Davidson, Benjamin C. Hitz, Marcus Ho, Brian T. Lee, Stuart R. Miyasato, Greg R. Roe, Matt Simison, Cricket A. Sloan, J. Seth Strattan, Forrest Tanaka, W. James Kent, J. Michael Cherry, and Eurie L. Hong
- Published
- 2015
- Full Text
- View/download PDF
16. New developments on the Encyclopedia of DNA Elements (ENCODE) data portal
- Author
-
Casey Litton, Zachary Myers, Ulugbek K. Baymuradov, Benjamin C. Hitz, Meenakshi S. Kagda, Otto Jolanki, Jin-Wook Lee, Stuart R. Miyasato, Keenan Graham, Idan Gabdank, Forrest Y. Tanaka, Bonita R. Lam, J. Seth Strattan, Jason A. Hilton, J. Michael Cherry, Yunhai Luo, Philip Adenekan, Paul Sud, Emma O'Neill, Jennifer Jou, and Khine Lin
- Subjects
Interoperability; Cloud computing; Data_CODINGANDINFORMATIONTHEORY; Biology; ENCODE; World Wide Web; Mice; 03 medical and health sciences; 0302 clinical medicine; Documentation; Software; Databases, Genetic; Genetics; Database Issue; Animals; Humans; 030304 developmental biology; 0303 health sciences; Genome, Human; business.industry; DNA; Genomics; Visualization; Open data; Encyclopedia; business; 030217 neurology & neurosurgery
- Abstract
The Encyclopedia of DNA Elements (ENCODE) is an ongoing collaborative research project aimed at identifying all the functional elements in the human and mouse genomes. Data generated by the ENCODE consortium are freely accessible at the ENCODE portal (https://www.encodeproject.org/), which is developed and maintained by the ENCODE Data Coordinating Center (DCC). Since the initial portal release in 2013, the ENCODE DCC has updated the portal to make ENCODE data more findable, accessible, interoperable and reusable. Here, we report on recent updates, including new ENCODE data and assays, ENCODE uniform data processing pipelines, new visualization tools, a dataset cart feature, unrestricted public access to ENCODE data on the cloud (Amazon Web Services open data registry, https://registry.opendata.aws/encode-project/) and more comprehensive tutorials and documentation.
- Published
- 2019
- Full Text
- View/download PDF
17. The EN-TEx resource of multi-tissue personal epigenomes & variant-impact models
- Author
-
Joel Rozowsky, Jorg Drenkow, Yucheng T Yang, Gamze Gursoy, Timur Galeev, Beatrice Borsari, Charles B Epstein, Kun Xiong, Jinrui Xu, Jiahao Gao, Keyang Yu, Ana Berthel, Zhanlin Chen, Fabio Navarro, Jason Liu, Maxwell S Sun, James Wright, Justin Chang, Christopher JF Cameron, Noam Shoresh, Elizabeth Gaskell, Jessika Adrian, Sergey Aganezov, François Aguet, Gabriela Balderrama-Gutierrez, Samridhi Banskota, Guillermo Barreto Corona, Sora Chee, Surya B Chhetri, Gabriel Conte Cortez Martins, Cassidy Danyko, Carrie A Davis, Daniel Farid, Nina P Farrell, Idan Gabdank, Yoel Gofin, David U Gorkin, Mengting Gu, Vivian Hecht, Benjamin C Hitz, Robbyn Issner, Melanie Kirsche, Xiangmeng Kong, Bonita R Lam, Shantao Li, Bian Li, Tianxiao Li, Xiqi Li, Khine Zin Lin, Ruibang Luo, Mark Mackiewicz, Jill E Moore, Jonathan Mudge, Nicholas Nelson, Chad Nusbaum, Ioann Popov, Henry E Pratt, Yunjiang Qiu, Srividya Ramakrishnan, Joe Raymond, Leonidas Salichos, Alexandra Scavelli, Jacob M Schreiber, Fritz J Sedlazeck, Lei Hoon See, Rachel M Sherman, Xu Shi, Minyi Shi, Cricket Alicia Sloan, J Seth Strattan, Zhen Tan, Forrest Y Tanaka, Anna Vlasova, Jun Wang, Jonathan Werner, Brian Williams, Min Xu, Chengfei Yan, Lu Yu, Christopher Zaleski, Jing Zhang, Kristin Ardlie, J Michael Cherry, Eric M Mendenhall, William S Noble, Zhiping Weng, Morgan E Levine, Alexander Dobin, Barbara Wold, Ali Mortazavi, Bing Ren, Jesse Gillis, Richard M Myers, Michael P Snyder, Jyoti Choudhary, Aleksandar Milosavljevic, Michael C Schatz, Roderic Guigó, Bradley E Bernstein, Thomas R Gingeras, and Mark Gerstein
- Subjects
Genetic variants; Genomics; Preprint; Computational biology; Biology; Personal genomics
- Abstract
Understanding how genetic variants impact molecular phenotypes is a key goal of functional genomics, currently hindered by reliance on a single haploid reference genome. Here, we present the EN-TEx resource of personal epigenomes, for ∼25 tissues and >10 assays in four donors (>1500 open-access functional genomic and proteomic datasets, in total). Each dataset is mapped to a matched, diploid personal genome, which has long-read phasing and structural variants. The mappings enable us to identify >1 million loci with allele-specific behavior. These loci exhibit coordinated epigenetic activity along haplotypes and less conservation than matched, non-allele-specific loci, in a fashion broadly paralleling tissue-specificity. Surprisingly, they can be accurately modelled just based on local nucleotide-sequence context. Combining EN-TEx with existing genome annotations reveals strong associations between allele-specific and GWAS loci and enables models for transferring known eQTLs to difficult-to-profile tissues. Overall, EN-TEx provides rich data and generalizable models for more accurate personal functional genomics.
- Published
- 2021
- Full Text
- View/download PDF
18. The Encyclopedia of DNA elements (ENCODE): data portal update
- Author
-
Aditi K. Narayanan, Benjamin C. Hitz, Timothy R. Dreszer, Kriti Jain, Otto Jolanki, Idan Gabdank, Keenan Graham, Kathrina C. Onate, Jason A. Hilton, Stuart R. Miyasato, J. Michael Cherry, Cricket A. Sloan, J. Seth Strattan, Carrie A. Davis, Esther T. Chan, Jean M. Davidson, Forrest Y. Tanaka, and Ulugbek K. Baymuradov
- Subjects
0301 basic medicine; Download; Interface (Java); Datasets as Topic; Genomics; Biology; Bioinformatics; ENCODE; World Wide Web; 03 medical and health sciences; Mice; User-Computer Interface; Databases, Genetic; Genetics; Database Issue; Animals; Humans; Caenorhabditis elegans; Metadata; Genome, Human; High-Throughput Nucleotide Sequencing; DNA; Visualization; 030104 developmental biology; Drosophila melanogaster; Gene Components; Encyclopedia; Data Display; Forecasting
- Abstract
The Encyclopedia of DNA Elements (ENCODE) Data Coordinating Center has developed the ENCODE Portal database and website as the source for the data and metadata generated by the ENCODE Consortium. Two principles have motivated the design. First, experimental protocols, analytical procedures and the data themselves should be made publicly accessible through a coherent, web-based search and download interface. Second, the same interface should serve carefully curated metadata that record the provenance of the data and justify its interpretation in biological terms. Since its initial release in 2013 and in response to recommendations from consortium members and the wider community of scientists who use the Portal to access ENCODE data, the Portal has been regularly updated to better reflect these design principles. Here we report on these updates, including results from new experiments, uniformly-processed data from other projects, new visualization tools and more comprehensive metadata to describe experiments and analyses. Additionally, the Portal is now home to meta(data) from related projects including Genomics of Gene Regulation, Roadmap Epigenome Project, Model organism ENCODE (modENCODE) and modERN. The Portal now makes available over 13000 datasets and their accompanying metadata and can be accessed at: https://www.encodeproject.org/.
- Published
- 2017
19. The ENCODE Portal as an Epigenomics Resource
- Author
-
J. Seth Strattan, Khine Lin, Keenan Graham, Casey Litton, Emma O'Neill, Philip Adenekan, Jason A. Hilton, Paul Sud, Benjamin C. Hitz, Idan Gabdank, J. Michael Cherry, Yunhai Luo, Forrest Y. Tanaka, Zachary Myers, Jennifer Jou, Stuart R. Miyasato, Ulugbek K. Baymuradov, Otto Jolanki, Meenakshi S. Kagda, Jin-Wook Lee, and Bonita R. Lam
- Subjects
Epigenomics; Computer science; Genomics; ENCODE; Article; 03 medical and health sciences; Mice; Data file; Databases, Genetic; Animals; Humans; Protocol (object-oriented programming); 030304 developmental biology; 0303 health sciences; Internet; Metadata; Information retrieval; Genome, Human; 030305 genetics & heredity; General Medicine; DNA; DNA Methylation; Metadata modeling; Chromatin; ComputingMethodologies_PATTERNRECOGNITION; Human genome; Software
- Abstract
The Encyclopedia of DNA Elements (ENCODE) web portal hosts genomic data generated by the ENCODE Consortium, Genomics of Gene Regulation, The NIH Roadmap Epigenomics Consortium, and the modENCODE and modERN projects. The goal of the ENCODE project is to build a comprehensive map of the functional elements of the human and mouse genomes. Currently, the portal database stores over 500 TB of raw and processed data from over 15,000 experiments spanning assays that measure gene expression, DNA accessibility, DNA and RNA binding, DNA methylation, and 3D chromatin structure across numerous cell lines, tissue types, and differentiation states with selected genetic and molecular perturbations. The ENCODE portal provides unrestricted access to the aforementioned data and relevant metadata as a service to the scientific community. The metadata model captures the details of the experiments, raw and processed data files, and processing pipelines in human and machine-readable form and enables the user to search for specific data either using a web browser or programmatically via REST API. Furthermore, ENCODE data can be freely visualized or downloaded for additional analyses. © 2019 The Authors. Basic Protocol: Query the portal; Support Protocol 1: Batch downloading; Support Protocol 2: Using the cart to download files; Support Protocol 3: Visualize data; Alternate Protocol: Query building and programmatic access.
- Published
- 2019
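The abstract above mentions programmatic searching via a REST API without giving details. As a rough sketch only, the Python below builds a query URL against the portal address quoted elsewhere in these results. The `/search/` path and the `type`, `format`, and `limit` parameter names are assumptions not stated in this listing, and `build_search_url` is a hypothetical helper, so check the portal's own documentation before relying on any of them.

```python
from urllib.parse import urlencode

# Portal address as quoted in the abstracts in this listing.
PORTAL = "https://www.encodeproject.org"


def build_search_url(object_type, limit=5, **filters):
    """Return a search URL requesting machine-readable JSON results.

    Hypothetical helper: the path and the parameter names ("type",
    "format", "limit") are assumptions and are not taken from the
    abstract itself.
    """
    params = {"type": object_type, "format": "json", "limit": limit}
    params.update(filters)  # extra metadata filters, passed through as-is
    return f"{PORTAL}/search/?{urlencode(params)}"


if __name__ == "__main__":
    # Only prints the URL; no network request is made here.
    print(build_search_url("Experiment", limit=2))
```

The sketch only constructs the URL, so it can be inspected or pasted into a browser before any client code is written around it.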
20. ENCODE data at the ENCODE portal
- Author
-
Forrest Y. Tanaka, Esther T. Chan, Marcus Ho, Cricket A. Sloan, Nikhil R. Podduturi, J. Seth Strattan, Eurie L. Hong, Jean M. Davidson, Benjamin C. Hitz, Brian T. Lee, Greg Roe, Timothy R. Dreszer, Laurence D. Rowe, Idan Gabdank, Aditi K. Narayanan, Venkat S. Malladi, and J. Michael Cherry
- Subjects
0301 basic medicine; Genomics; Computational biology; Biology; ENCODE; Genome; Mice; 03 medical and health sciences; Databases, Genetic; Genetics; Animals; Humans; Database Issue; Gene; Genome, Human; Proteins; DNA; Visualization; Metadata; ComputingMethodologies_PATTERNRECOGNITION; 030104 developmental biology; Genes; DNA methylation; RNA; Human genome
- Abstract
The Encyclopedia of DNA Elements (ENCODE) Project is in its third phase of creating a comprehensive catalog of functional elements in the human genome. This phase of the project includes an expansion of assays that measure diverse RNA populations, identify proteins that interact with RNA and DNA, probe regions of DNA hypersensitivity, and measure levels of DNA methylation in a wide range of cell and tissue types to identify putative regulatory elements. To date, results for almost 5000 experiments have been released for use by the scientific community. These data are available for searching, visualization and download at the new ENCODE Portal (www.encodeproject.org). The revamped ENCODE Portal provides new ways to browse and search the ENCODE data based on the metadata that describe the assays as well as summaries of the assays that focus on data provenance. In addition, it is a flexible platform that allows integration of genomic data from multiple projects. The portal experience was designed to improve access to ENCODE data by relying on metadata that allow reusability and reproducibility of the experiments.
- Published
- 2015
- Full Text
- View/download PDF
21. The Saccharomyces Genome Database Variant Viewer
- Author
-
Janos Demeter, Marek S. Skrzypek, Benjamin C. Hitz, Robert S. Nash, Kalpana Karra, Stacia R. Engel, Sage T. Hellerstedt, Gail Binkley, J. Michael Cherry, Maria C. Costanzo, Rama Balakrishnan, Travis K. Sheppard, Kelley Paskov, Giltae Song, Edith D. Wong, Shuai Weng, and Kyla S. Dalusag
- Subjects
0301 basic medicine; Sequence analysis; Saccharomyces cerevisiae; Sequence alignment; Computational biology; Genome; Saccharomyces; 03 medical and health sciences; Annotation; User-Computer Interface; Sequence Analysis, Protein; Databases, Genetic; Genetics; Database Issue; natural sciences; Sequence (medicine); biology; Genetic Variation; Molecular Sequence Annotation; Sequence Analysis, DNA; biology.organism_classification; 030104 developmental biology; Genome, Fungal; Sequence Alignment
- Abstract
The Saccharomyces Genome Database (SGD; http://www.yeastgenome.org) is the authoritative community resource for the Saccharomyces cerevisiae reference genome sequence and its annotation. In recent years, we have moved toward increased representation of sequence variation and allelic differences within S. cerevisiae. The publication of numerous additional genomes has motivated the creation of new tools for their annotation and analysis. Here we present the Variant Viewer: a dynamic open-source web application for the visualization of genomic and proteomic differences. Multiple sequence alignments have been constructed across high quality genome sequences from 11 different S. cerevisiae strains and stored in the SGD. The alignments and summaries are encoded in JSON and used to create a two-tiered dynamic view of the budding yeast pan-genome, available at http://www.yeastgenome.org/variant-viewer.
- Published
- 2015
22. Prevention of data duplication for high throughput sequencing repositories
- Author
-
J. Seth Strattan, Carrie A. Davis, Forrest Y. Tanaka, Benjamin C. Hitz, J. Michael Cherry, Keenan Graham, Jean M. Davidson, Jason A. Hilton, Idan Gabdank, Kathrina C. Onate, Stuart R. Miyasato, Otto Jolanki, Timothy R. Dreszer, Esther T. Chan, Aditi K. Narayanan, Ulugbek K. Baymuradov, and Cricket A. Sloan
- Subjects
0301 basic medicine; Computer science; business.industry; Extramural; MEDLINE; Computational biology; General Biochemistry, Genetics and Molecular Biology; DNA sequencing; 03 medical and health sciences; 030104 developmental biology; 0302 clinical medicine; Text mining; Data deduplication; Original Article; Databases, Nucleic Acid; General Agricultural and Biological Sciences; business; Data Curation; 030217 neurology & neurosurgery; Information Systems
- Abstract
Prevention of unintended duplication is one of the ongoing challenges many databases have to address. Working with high-throughput sequencing data, the complexity of that challenge increases with the complexity of the definition of a duplicate. In a computational data model, a data object represents a real entity like a reagent or a biosample. This representation is similar to how a card represents a book in a paper library catalog. Duplicated data objects not only waste storage, they can mislead users into assuming the model represents more than the single entity. Even if it is clear that two objects represent a single entity, data duplication opens the door to potential inconsistencies between the objects since the content of the duplicated objects can be updated independently, allowing divergence of the metadata associated with the objects. Analogously to a situation in which a catalog in a paper library would contain by mistake two cards for a single copy of a book. If these cards are listing simultaneously two different individuals as current book borrowers, it would be difficult to determine which borrower (out of the two listed) actually has the book. Unfortunately, in a large database with multiple submitters, unintended duplication is to be expected. In this article, we present three principal guidelines the Encyclopedia of DNA Elements (ENCODE) Portal follows in order to prevent unintended duplication of both actual files and data objects: definition of identifiable data objects (I), object uniqueness validation (II) and de-duplication mechanism (III). In addition to explaining our modus operandi, we elaborate on the methods used for identification of sequencing data files. Comparison of the approach taken by the ENCODE Portal vs other widely used biological data repositories is provided. Database URL: https://www.encodeproject.org/
- Published
- 2018
- Full Text
- View/download PDF
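The abstract above lists object uniqueness validation as one of three guidelines but does not describe the mechanism. The following is a minimal sketch of one common way to enforce uniqueness for files, by rejecting content whose digest is already registered. The `FileRegistry` class, the accession strings, and the choice of SHA-256 are all hypothetical and are not claimed to match the ENCODE Portal's actual implementation.

```python
import hashlib


class FileRegistry:
    """Toy registry that refuses to store the same file content twice.

    Hypothetical illustration only: the class name, the accessions and
    the SHA-256 digest are assumptions made for this sketch.
    """

    def __init__(self):
        self._by_digest = {}  # content digest -> accession that owns it

    def register(self, accession, content):
        """Register file bytes under an accession; raise on duplicates."""
        digest = hashlib.sha256(content).hexdigest()
        owner = self._by_digest.get(digest)
        if owner is not None and owner != accession:
            raise ValueError(
                f"content of {accession} duplicates existing file {owner}"
            )
        self._by_digest[digest] = accession
        return digest


if __name__ == "__main__":
    registry = FileRegistry()
    registry.register("FILE-A", b"ACGTACGT")
    try:
        registry.register("FILE-B", b"ACGTACGT")  # same bytes, new accession
    except ValueError as err:
        print(err)
```

Keying on a content digest catches byte-identical resubmissions; it would not catch the same reads stored in a different file format or compression, which needs a separate identification method.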
23. Principles of metadata organization at the ENCODE data coordination center
- Author
-
Benjamin C. Hitz, Aditi K. Narayanan, Jason A. Hilton, Idan Gabdank, Cricket A. Sloan, Venkat S. Malladi, J. Seth Strattan, J. Michael Cherry, Greg Roe, Jean M. Davidson, Forrest Y. Tanaka, Laurence D. Rowe, Eurie L. Hong, Timothy R. Dreszer, Nikhil R. Podduturi, Marcus Ho, Brian T. Lee, and Esther T. Chan
- Subjects
0301 basic medicine; Quality Control; Computer science; ENCODE; General Biochemistry, Genetics and Molecular Biology; World Wide Web; 03 medical and health sciences; Mice; 0302 clinical medicine; Nucleic Acids; Data file; Databases, Genetic; Animals; Humans; Caenorhabditis elegans; Data collection; Data element; Data Collection; Metadata standard; Computational Biology; High-Throughput Nucleotide Sequencing; Reproducibility of Results; DNA; Metadata repository; Metadata; 030104 developmental biology; Drosophila melanogaster; 030220 oncology & carcinogenesis; Encyclopedia; Original Article; General Agricultural and Biological Sciences; Sequence Alignment; Algorithms; Information Systems
- Abstract
The Encyclopedia of DNA Elements (ENCODE) Data Coordinating Center (DCC) is responsible for organizing, describing and providing access to the diverse data generated by the ENCODE project. The description of these data, known as metadata, includes the biological sample used as input, the protocols and assays performed on these samples, the data files generated from the results and the computational methods used to analyze the data. Here, we outline the principles and philosophy used to define the ENCODE metadata in order to create a metadata standard that can be applied to diverse assays and multiple genomic projects. In addition, we present how the data are validated and used by the ENCODE DCC in creating the ENCODE Portal (https://www.encodeproject.org/). Database URL: www.encodeproject.org.
- Published
- 2015
24. Ontology application and use at the ENCODE DCC
- Author
-
Marcus Ho, Stuart R. Miyasato, W. James Kent, J. Seth Strattan, Jean M. Davidson, Nikhil R. Podduturi, Cricket A. Sloan, Greg Roe, Eurie L. Hong, Laurence D. Rowe, Brian T. Lee, Esther T. Chan, J. Michael Cherry, Drew T. Erickson, Forrest Y. Tanaka, Benjamin C. Hitz, Venkat S. Malladi, and Matt Simison
- Subjects
Information retrieval, Transcription, Genetic, Standardization, Computer science, Experimental data, Molecular Sequence Annotation, Ontology (information science), ENCODE, General Biochemistry, Genetics and Molecular Biology, Set (abstract data type), World Wide Web, Metadata, Mice, Gene Ontology, Databases, Genetic, Encyclopedia, Animals, Humans, Original Article, Gene Regulatory Networks, General Agricultural and Biological Sciences, Data Curation, Information Systems
- Abstract
The Encyclopedia of DNA elements (ENCODE) project is an ongoing collaborative effort to create a catalog of genomic annotations. To date, the project has generated over 4000 experiments across more than 350 cell lines and tissues using a wide array of experimental techniques to study the chromatin structure, regulatory network and transcriptional landscape of the Homo sapiens and Mus musculus genomes. All ENCODE experimental data, metadata and associated computational analyses are submitted to the ENCODE Data Coordination Center (DCC) for validation, tracking, storage and distribution to community resources and the scientific community. As the volume of data increases, the organization of experimental details becomes increasingly complicated and demands careful curation to identify related experiments. Here, we describe the ENCODE DCC’s use of ontologies to standardize experimental metadata. We discuss how ontologies, when used to annotate metadata, provide improved searching capabilities and facilitate the ability to find connections within a set of experiments. Additionally, we provide examples of how ontologies are used to annotate ENCODE metadata and how the annotations can be identified via ontology-driven searches at the ENCODE portal. As genomic datasets grow larger and more interconnected, standardization of metadata becomes increasingly vital to allow for exploration and comparison of data between different scientific projects. Database URL: https://www.encodeproject.org/
- Published
- 2015
25. Integration of new alternative reference strain genome sequences into the Saccharomyces genome database
- Author
-
Rama Balakrishnan, Sage T. Hellerstedt, J. Michael Cherry, Janos Demeter, Edith D. Wong, Stacia R. Engel, Gail Binkley, Marek S. Skrzypek, Travis K. Sheppard, Maria C. Costanzo, Robert S. Nash, Kelley Paskov, Kalpana Karra, Shuai Weng, Giltae Song, Kyla S. Dalusag, and Benjamin C. Hitz
- Subjects
0301 basic medicine, Saccharomyces cerevisiae, Locus (genetics), Biology, ENCODE, Genome, General Biochemistry, Genetics and Molecular Biology, Saccharomyces, User-Computer Interface, 03 medical and health sciences, Protein sequencing, Databases, Genetic, natural sciences, Gene, Genetics, Reproducibility of Results, Molecular Sequence Annotation, Genomics, Genome project, biology.organism_classification, 030104 developmental biology, Database Update, Genome, Fungal, General Agricultural and Biological Sciences, Information Systems, Reference genome
- Abstract
The Saccharomyces Genome Database (SGD; http://www.yeastgenome.org/) is the authoritative community resource for the Saccharomyces cerevisiae reference genome sequence and its annotation. To provide a wider scope of genetic and phenotypic variation in yeast, the genome sequences and their corresponding annotations from 11 alternative S. cerevisiae reference strains have been integrated into SGD. Genomic and protein sequence information for genes from these strains are now available on the Sequence and Protein tab of the corresponding Locus Summary pages. We illustrate how these genome sequences can be utilized to aid our understanding of strain-specific functional and phenotypic differences. Database URL: www.yeastgenome.org
- Published
- 2016