Back to Search
Start Over
A Permutation-Based Model for Crowd Labeling: Optimal Estimation and Robustness.
- Source :
-
IEEE Transactions on Information Theory . Jun2021, Vol. 67 Issue 6, p4162-4184. 23p. - Publication Year :
- 2021
-
Abstract
- The task of aggregating and denoising crowd-labeled data has gained increased significance with the advent of crowdsourcing platforms and massive datasets. We propose a permutation-based model for crowd labeled data that is a significant generalization of the classical Dawid-Skene model, and introduce a new error metric by which to compare different estimators. We derive global minimax rates for the permutation-based model that are sharp up to logarithmic factors, and match the minimax lower bounds derived under the simpler Dawid-Skene model. We then design two computationally-efficient estimators: the WAN estimator for the setting where the ordering of workers in terms of their abilities is approximately known, and the OBI- WAN estimator where that is not known. For each of these estimators, we provide non-asymptotic bounds on their performance. We conduct synthetic simulations and experiments on real-world crowdsourcing data, and the experimental results corroborate our theoretical findings. [ABSTRACT FROM AUTHOR]
Details
- Language :
- English
- ISSN :
- 00189448
- Volume :
- 67
- Issue :
- 6
- Database :
- Academic Search Index
- Journal :
- IEEE Transactions on Information Theory
- Publication Type :
- Academic Journal
- Accession number :
- 150448695
- Full Text :
- https://doi.org/10.1109/TIT.2020.3045613