Small margin ensembles can be robust to class-label noise
- Source :
- Biblos-e Archivo. Repositorio Institucional de la UAM
- Publication Year :
- 2015
- Publisher :
- Elsevier BV, 2015.
Abstract
- This is the author's version of a work that was accepted for publication in Neurocomputing. Changes resulting from the publishing process, such as peer review, editing, corrections, structural formatting, and other quality control mechanisms, may not be reflected in this document. Changes may have been made to this work since it was submitted for publication. The definitive version was subsequently published in Neurocomputing, vol. 160 (2015), DOI 10.1016/j.neucom.2014.12.086.

Subsampling is used to generate bagging ensembles that are accurate and robust to class-label noise. The effect of using smaller bootstrap samples to train the base learners is to make the ensemble more diverse. As a result, the classification margins tend to decrease. In spite of having small margins, these ensembles can be robust to class-label noise. The validity of these observations is illustrated in a wide range of synthetic and real-world classification tasks. In the problems investigated, subsampling significantly outperforms standard bagging for different amounts of class-label noise. By contrast, the effectiveness of subsampling in random forests is problem dependent: in these ensembles, the best overall accuracy is obtained when the random trees are built on bootstrap samples of the same size as the original training data. Nevertheless, subsampling becomes more effective as the amount of class-label noise increases.

The authors acknowledge financial support from Spanish Plan Nacional I+D+i Grant TIN2013-42351-P and from Comunidad de Madrid Grant S2013/ICE-2845 CASI-CAM-CM.
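The core idea described in the abstract can be illustrated with a minimal scikit-learn sketch: standard bagging draws bootstrap samples as large as the training set, while subsampled bagging draws much smaller ones, and the two are compared under injected class-label noise. This is not the authors' experimental code; the dataset, the 20% subsampling rate, and the 20% noise level are illustrative assumptions rather than values taken from the paper.

```python
# Sketch only: contrasts standard bagging with subsampled bagging under
# class-label noise. Parameter values below are assumptions for illustration.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import BaggingClassifier
from sklearn.model_selection import train_test_split

rng = np.random.RandomState(0)

# Synthetic binary classification task.
X, y = make_classification(n_samples=2000, n_features=20, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.5, random_state=0
)

# Inject class-label noise: flip an assumed 20% of the training labels.
noise_rate = 0.2
flip = rng.rand(len(y_train)) < noise_rate
y_noisy = np.where(flip, 1 - y_train, y_train)

# Standard bagging: bootstrap samples the size of the training set.
standard = BaggingClassifier(n_estimators=100, max_samples=1.0, random_state=0)
# Subsampled bagging: much smaller bootstrap samples (more diverse base learners).
subsampled = BaggingClassifier(n_estimators=100, max_samples=0.2, random_state=0)

for name, model in [("standard bagging", standard), ("subsampled bagging", subsampled)]:
    model.fit(X_train, y_noisy)
    print(f"{name}: clean-test accuracy = {model.score(X_test, y_test):.3f}")
```

With `max_samples=0.2`, each base tree is trained on only about a fifth of the noisy training data, which is the smaller-bootstrap-sample setting the abstract credits with increasing ensemble diversity.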
- Subjects :
- Informatics
Small margin classifiers
Training set
Computer science
Cognitive Neuroscience
Pattern recognition
Computer Science Applications
Random forest
Noise
Label noise
Artificial Intelligence
Margin (machine learning)
Bagging
Bootstrapping (statistics)
Bootstrap sampling
Details
- ISSN :
- 0925-2312
- Volume :
- 160
- Database :
- OpenAIRE
- Journal :
- Neurocomputing
- Accession number :
- edsair.doi.dedup.....71c083e3cdf48efc2f8617149a5debca
- Full Text :
- https://doi.org/10.1016/j.neucom.2014.12.086