Bounds on Performance for Recovery of Corrupted Labels in Supervised Learning: A Finite Query-Testing Approach.

Authors :
Seong, Jin-Taek
Source :
Mathematics (2227-7390). Sep2023, Vol. 11 Issue 17, p3636. 16p.
Publication Year :
2023

Abstract

Label corruption poses a significant challenge in supervised learning, particularly for deep neural networks. This paper considers recovering a small subset of data samples whose labels are corrupted, typically by non-expert sources such as automatic classifiers. Our aim is to recover the corrupted data samples by exploiting a finite query-testing system as an additional expert. The task involves identifying the corrupted data samples with a minimal number of expert queries and restoring them to their true label values. The proposed query-testing system randomly selects a subset of data samples and uses finite field operations to construct combined responses. We derive an information-theoretic lower bound on the minimum number of queries required to recover the corrupted labels. The lower bound can be expressed as a function of joint entropy with the imbalance rate of the data samples and the mislabeling probability. In addition, we find an upper bound on the error probability using maximum a posteriori decoding. [ABSTRACT FROM AUTHOR]
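
The query-testing idea in the abstract can be illustrated with a small sketch. It is not the paper's construction. It assumes labels are elements of a prime field GF(q), that each query includes every sample independently with a fixed probability, and that the expert answers with the sum of the true labels in the queried subset, modulo q. The decoder is a brute-force search for the label vector that changes the fewest noisy labels while matching all responses, which stands in for maximum a posteriori decoding only under the assumption that corruptions are rare and equally likely. All function and parameter names (make_queries, respond, recover, p_select, max_errors) are hypothetical.

```python
import itertools
import random


def make_queries(n, m, rng, p_select=0.5):
    """Build m random queries; each includes each of n samples with probability p_select."""
    return [[1 if rng.random() < p_select else 0 for _ in range(n)] for _ in range(m)]


def respond(queries, labels, q):
    """Assumed expert model: sum of the labels in each queried subset, modulo prime q."""
    return [sum(a * x for a, x in zip(row, labels)) % q for row in queries]


def recover(queries, responses, noisy, q, max_errors):
    """Return the vector with the fewest changes to `noisy` that matches all responses.

    Exhaustive search, practical only for small n; returns None if no candidate
    with at most max_errors changes is consistent.
    """
    n = len(noisy)
    for k in range(max_errors + 1):
        for positions in itertools.combinations(range(n), k):
            # Nonzero offsets in GF(q) guarantee each chosen position really changes.
            for offsets in itertools.product(range(1, q), repeat=k):
                candidate = list(noisy)
                for pos, off in zip(positions, offsets):
                    candidate[pos] = (candidate[pos] + off) % q
                if respond(queries, candidate, q) == responses:
                    return candidate
    return None


rng = random.Random(0)
q = 3
true_labels = [0, 1, 2, 1, 0, 2, 1, 0]
noisy = list(true_labels)
noisy[2] = 0  # corrupted label (true value 2)
noisy[5] = 1  # corrupted label (true value 2)

queries = make_queries(len(true_labels), 6, rng)
responses = respond(queries, true_labels, q)
estimate = recover(queries, responses, noisy, q, max_errors=2)
print(estimate, estimate == true_labels)
```

With too few queries, several sparse corrections can match the responses, so the estimate may differ from the true labels. This is the regime that a lower bound on the number of queries is meant to characterize.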

Details

Language :
English
ISSN :
22277390
Volume :
11
Issue :
17
Database :
Academic Search Index
Journal :
Mathematics (2227-7390)
Publication Type :
Academic Journal
Accession number :
171857825
Full Text :
https://doi.org/10.3390/math11173636