A globally synthesised and flagged bee occurrence dataset and cleaning workflow
- Source :
- Scientific Data, Vol 10, Iss 1, Pp 1-17 (2023)
- Publication Year :
- 2023
- Publisher :
- Nature Portfolio, 2023.
Abstract
- Species occurrence data are foundational for research, conservation, and science communication, but the limited availability and accessibility of reliable data represent a major obstacle, particularly for insects, which face mounting pressures. We present BeeBDC, a new R package, and a global bee occurrence dataset to address this issue. We combined >18.3 million bee occurrence records from multiple public repositories (GBIF, SCAN, iDigBio, USGS, ALA) and smaller datasets, then standardised, flagged, deduplicated, and cleaned the data using the reproducible BeeBDC R-workflow. Specifically, we harmonised species names (following established global taxonomy), country names, and collection dates, and we added record-level flags for a series of potential quality issues. These data are provided in two formats, “cleaned” and “flagged-but-uncleaned”. The BeeBDC package, with its online documentation, lets end users modify filtering parameters to address their research questions. By publishing reproducible R workflows and globally cleaned datasets, we can increase the accessibility and reliability of downstream analyses. This workflow can be implemented for other taxa to support research and conservation.
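The flag-then-filter idea in the abstract (record-level flags, a "flagged-but-uncleaned" output, and a "cleaned" output whose filters the user can adjust) can be illustrated with a minimal sketch. This is not the BeeBDC R package or its API: the records, field names, flag names, and the functions `flag_records` and `clean_records` are all hypothetical, and the three checks shown are only examples of the kind of quality issue a flag might cover.

```python
from datetime import date

# Hypothetical occurrence records; the field names are illustrative only
# and do not follow the BeeBDC schema.
RECORDS = [
    {"id": "a1", "species": "Apis mellifera", "lat": 51.5, "lon": -0.1, "date": "2019-06-01"},
    # Same species, place, and date as a1: treated as a duplicate.
    {"id": "a2", "species": "Apis mellifera", "lat": 51.5, "lon": -0.1, "date": "2019-06-01"},
    # Coordinates of exactly 0,0 are treated as suspicious.
    {"id": "b1", "species": "Bombus terrestris", "lat": 0.0, "lon": 0.0, "date": "2020-05-12"},
    # Collection date that cannot be parsed.
    {"id": "c1", "species": "Osmia bicornis", "lat": 48.1, "lon": 11.6, "date": "not recorded"},
]


def is_iso_date(text):
    """Return True if text parses as an ISO 8601 calendar date."""
    try:
        date.fromisoformat(text)
        return True
    except ValueError:
        return False


def flag_records(records):
    """Return copies of the records with boolean flag columns added.

    Nothing is dropped here, so the result plays the role of the
    "flagged-but-uncleaned" output.
    """
    seen = set()
    flagged = []
    for rec in records:
        key = (rec["species"], rec["lat"], rec["lon"], rec["date"])
        out = dict(rec)
        in_range = -90 <= rec["lat"] <= 90 and -180 <= rec["lon"] <= 180
        at_origin = rec["lat"] == 0 and rec["lon"] == 0
        out["flag_coords_ok"] = in_range and not at_origin
        out["flag_date_ok"] = is_iso_date(rec["date"])
        out["flag_not_duplicate"] = key not in seen  # first occurrence passes
        seen.add(key)
        flagged.append(out)
    return flagged


def clean_records(flagged, required_flags):
    """Keep only records that pass every flag the user chooses to enforce."""
    return [r for r in flagged if all(r[f] for f in required_flags)]


flagged = flag_records(RECORDS)
# Strict cleaning: enforce every flag.
cleaned = clean_records(
    flagged, ["flag_coords_ok", "flag_date_ok", "flag_not_duplicate"]
)
# A study that does not need collection dates can relax that one filter.
lenient = clean_records(flagged, ["flag_coords_ok", "flag_not_duplicate"])
```

Keeping the flags as columns, rather than deleting records at flagging time, is what allows different analyses to apply different filters to the same flagged dataset.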
- Subjects :
- Science
Details
- Language :
- English
- ISSN :
- 2052-4463
- Volume :
- 10
- Issue :
- 1
- Database :
- Directory of Open Access Journals
- Journal :
- Scientific Data
- Publication Type :
- Academic Journal
- Accession number :
- edsdoj.5784f31ed3324a47afe7a017378508ec
- Document Type :
- article
- Full Text :
- https://doi.org/10.1038/s41597-023-02626-w