Profiling effects of filtering noise labels on learning performance.
- Author
- Wu, ChienHsing; Kao, Shu-Chen; Hong, Rui-Qian; and Chen, Li-Hui
- Abstract
Highlights:
• A prototype system is developed to profile the outcomes of 50 filtering acceptance levels (FALs).
• Experiments are conducted on 27 unsupervised and 29 supervised granulated datasets.
• An algorithm is developed to separate noise labels using the 50 FALs.
• Experimental results from five classifiers are presented.
• The effects of the 50 FALs on classification performance are profiled and summarized.

Deciding whether a label in a dataset is noisy depends on factors such as filtering levels, data patterns, relabeling policies, and pre-defined regulations. Focusing on filtering acceptance levels (FALs), this study proposes a decision support model (named DSM-ENL) and implements it on granulated real datasets. The model detects noise labels and profiles the effects of 50 FALs (51% to 100%) on noise rate (NoR), classification accuracy (CA), and area under the ROC curve (AUC). A case is presented to demonstrate DSM-ENL, followed by experiments on 27 unsupervised and 29 supervised granulated datasets using five classifiers with a 70%/30% training/testing split. The findings reveal that: (1) the classifier achieving the best CA varies with the FAL; (2) NoR, CA, and AUC oscillate as the FAL increases; (3) lower FALs (from 0.51 to 0.60) are more likely to lead to higher CAs; and (4) the correlation coefficients among dataset characteristics, FAL, NoR, CA, and AUC are low compared with those between CA and NoR and between CA and AUC. The research demonstrates the value of profiling the effects of FALs applied to noise-label datasets on learning performance. [ABSTRACT FROM AUTHOR]
- Published
- 2024