1. Badgers: generating data quality deficits with Python
- Author
Siebert, Julien; Seifert, Daniel; Kelbert, Patricia; Kläs, Michael; Trendowicz, Adam
- Subjects
FOS: Computer and information sciences, Computer Science - Machine Learning, D.m, Machine Learning (cs.LG)
- Abstract
Generating context-specific data quality deficits is necessary to experimentally assess the data quality of data-driven (artificial intelligence (AI) or machine learning (ML)) applications. In this paper, we present badgers, an extensible open-source Python library to generate data quality deficits (outliers, imbalanced data, drift, etc.) for different modalities (tabular data, time-series, text, etc.). The documentation is accessible at https://fraunhofer-iese.github.io/badgers/ and the source code at https://github.com/Fraunhofer-IESE/badgers.
Comment: 17 pages, 16 figures
- Published
2023