1. Principles of metadata organization at the ENCODE data coordination center
- Author
Benjamin C. Hitz, Aditi K. Narayanan, Jason A. Hilton, Idan Gabdank, Cricket A. Sloan, Venkat S. Malladi, J. Seth Strattan, J. Michael Cherry, Greg Roe, Jean M. Davidson, Forrest Y. Tanaka, Laurence D. Rowe, Eurie L. Hong, Timothy R. Dreszer, Nikhil R. Podduturi, Marcus Ho, Brian T. Lee, and Esther T. Chan
- Subjects
0301 basic medicine; Quality Control; Computer science; ENCODE; General Biochemistry, Genetics and Molecular Biology; World Wide Web; 03 medical and health sciences; Mice; 0302 clinical medicine; Nucleic Acids; Data file; Databases, Genetic; Animals; Humans; Caenorhabditis elegans; Data collection; Data element; Data Collection; Metadata standard; Computational Biology; High-Throughput Nucleotide Sequencing; Reproducibility of Results; DNA; Metadata repository; Metadata; 030104 developmental biology; Drosophila melanogaster; 030220 oncology & carcinogenesis; Encyclopedia; Original Article; General Agricultural and Biological Sciences; Sequence Alignment; Algorithms; Information Systems
- Abstract
The Encyclopedia of DNA Elements (ENCODE) Data Coordinating Center (DCC) is responsible for organizing, describing and providing access to the diverse data generated by the ENCODE project. The description of these data, known as metadata, includes the biological sample used as input, the protocols and assays performed on these samples, the data files generated from the results and the computational methods used to analyze the data. Here, we outline the principles and philosophy used to define the ENCODE metadata in order to create a metadata standard that can be applied to diverse assays and multiple genomic projects. In addition, we present how the data are validated and used by the ENCODE DCC in creating the ENCODE Portal (https://www.encodeproject.org/). Database URL: www.encodeproject.org.
- Published
- 2015