Sustainable data and metadata management at the BD2K-LINCS Data Coordination and Integration Center
- Source :
- Scientific Data
- Publication Year :
- 2017
Abstract
- The NIH-funded LINCS Consortium is creating an extensive reference library of cell-based perturbation response signatures and sophisticated informatics tools incorporating a large number of perturbagens, model systems, and assays. To date, more than 350 datasets have been generated including transcriptomics, proteomics, epigenomics, cell phenotype and competitive binding profiling assays. The large volume and variety of data necessitate rigorous data standards and effective data management including modular data processing pipelines and end-user interfaces to facilitate accurate and reliable data exchange, curation, validation, standardization, aggregation, integration, and end user access. Deep metadata annotations and the use of qualified data standards enable integration with many external resources. Here we describe the end-to-end data processing and management at the DCIC to generate a high-quality and persistent product. Our data management and stewardship solutions enable a functioning Consortium and make LINCS a valuable scientific resource that aligns with big data initiatives such as the BD2K NIH Program and concords with emerging data science best practices including the findable, accessible, interoperable, and reusable (FAIR) principles.
- Subjects :
- 0301 basic medicine
Statistics and Probability
Standardization
Computer science
Data management
Big data
Datasets as Topic
Information Storage and Retrieval
Library and Information Sciences
Article
Education
03 medical and health sciences
0302 clinical medicine
Metadata management
Animals
Humans
Data Curation
Metadata
business.industry
Data science
United States
Computer Science Applications
Computational biology and bioinformatics
Research data
030104 developmental biology
National Institutes of Health (U.S.)
Data exchange
Statistics, Probability and Uncertainty
business
Systems biology
030217 neurology & neurosurgery
Information Systems
Details
- ISSN :
- 2052-4463
- Volume :
- 5
- Database :
- OpenAIRE
- Journal :
- Scientific Data
- Accession number :
- edsair.doi.dedup.....625705c771ad24b2f61a383433696351