Scientific Data Management in the Age of Big Data: An Approach Supporting a Resilience Index Development Effort.

Authors :
Harwell LC
Vivian DN
McLaughlin MD
Hafner SF
Source :
Frontiers in environmental science [Front Environ Sci] 2019 Jun 04; Vol. 7 (Article 72), pp. 1-13.
Publication Year :
2019

Abstract

The increased availability of publicly available data is, in many ways, changing our approach to conducting research. Not only are cloud-based information resources providing supplementary data to bolster traditional scientific activities (e.g., field studies, laboratory experiments), they also serve as the foundation for secondary data research projects such as indicator development. Indicators and indices are a convenient way to synthesize disparate information to address complex scientific questions that are difficult to measure directly (e.g., resilience, sustainability, well-being). In the current literature, there is no shortage of indicator or index examples derived from secondary data with a growing number that are scientifically focused. However, little information is provided describing the management approaches and best practices used to govern the data underpinnings supporting these efforts. From acquisition to storage and maintenance, secondary data research products rely on the availability of relevant, high-quality data, repeatable data handling methods and a multi-faceted data flow process to promote and sustain research transparency and integrity. The U.S. Environmental Protection Agency recently published a report describing the development of a climate resilience screening index which used over one million data points to calculate the final index. The pool of data was derived exclusively from secondary sources such as the U.S. Census Bureau, Bureau of Labor Statistics, Postal Service, Housing and Urban Development, Forestry Services and others. Available data were presented in various forms including portable document format (PDF), delimited ASCII and proprietary format (e.g., Microsoft Excel, ESRI ArcGIS). The strategy employed for managing these data in an indicator research and development effort represented a blend of business practices, information science, and the scientific method. 
This paper describes the approach, highlighting key points unique for managing the data assets of a smaller scale research project in an era of "big data."

Competing Interests: The authors declare that this research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Details

Language :
English
ISSN :
2296-665X
Volume :
7
Issue :
Article 72
Database :
MEDLINE
Journal :
Frontiers in environmental science
Publication Type :
Academic Journal
Accession number :
33123540
Full Text :
https://doi.org/10.3389/fenvs.2019.00072