
Regularization oversampling for classification tasks: To exploit what you do not know.

Authors :
Van der Schraelen, Lennert
Stouthuysen, Kristof
Vanden Broucke, Seppe
Verdonck, Tim
Source :
Information Sciences. Jul 2023, Vol. 635, p169-194. 26p.
Publication Year :
2023

Abstract

In numerous binary classification tasks, the two groups of instances are not equally represented, which often implies that the training data lack sufficient information to model the minority class correctly. Furthermore, many traditional classification models make arbitrarily overconfident predictions outside the range of the training data. These issues severely impact the deployment and usefulness of such models in real-life applications. In this paper, we propose the boundary regularizing out-of-distribution (BROOD) sampler, which adds artificial data points on the edge of the training data. By exploiting these artificial samples, we regularize the decision surface of discriminative machine learning models and obtain more prudent predictions. Moreover, in many applications it is crucial to correctly classify as many positive instances as possible within a limited pool of instances that can be investigated with the available resources. By smartly assigning predetermined nonuniform class probabilities outside the training data, we can emphasize certain data regions and improve classifier performance on several important classification metrics. The good performance of the proposed methodology is illustrated in a case study comprising both balanced and imbalanced benchmark classification data sets. [ABSTRACT FROM AUTHOR]
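
The sketch below illustrates the general idea described in the abstract: generating artificial points on the edge of the training data, assigning them a predetermined label, and fitting a flexible classifier on the augmented set so that its decision surface no longer extrapolates overconfidently. It is not the authors' BROOD implementation; the edge-sampling heuristic, the choice of a single hard label instead of nonuniform class probabilities, and all function names and parameters are illustrative assumptions.

```python
# Illustrative sketch only: NOT the BROOD sampler from the paper.
# It mimics placing artificial samples on the edge of the training data
# and giving them a predetermined label before fitting a model.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.neural_network import MLPClassifier

rng = np.random.default_rng(0)

# Small imbalanced toy problem (90% negatives, 10% positives).
X, y = make_classification(n_samples=1000, n_features=2, n_informative=2,
                           n_redundant=0, weights=[0.9, 0.1], random_state=0)

def edge_samples(X, n_art=200, scale=1.5, rng=rng):
    """Hypothetical edge sampler: push randomly chosen training points
    away from the data centroid so they land just outside the data."""
    centre = X.mean(axis=0)
    idx = rng.integers(0, len(X), size=n_art)
    return centre + scale * (X[idx] - centre)

X_art = edge_samples(X)

# Predetermined label for the out-of-distribution points; here the majority
# (negative) class, a simplification of the paper's nonuniform probabilities.
y_art = np.zeros(len(X_art), dtype=int)

X_aug = np.vstack([X, X_art])
y_aug = np.concatenate([y, y_art])

# A flexible model whose decision surface outside the training data is now
# pulled toward the predetermined labels instead of extrapolating arbitrarily.
clf = MLPClassifier(hidden_layer_sizes=(32, 32), max_iter=1000,
                    random_state=0).fit(X_aug, y_aug)

# A far-away query point now receives a prudent, near-negative prediction.
print(clf.predict_proba(np.array([[10.0, 10.0]])))
```

In this toy setup, emphasizing certain regions (as the paper does with nonuniform class probabilities) would amount to varying the labels or label proportions assigned to different groups of artificial points rather than using a single hard label.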

Details

Language :
English
ISSN :
0020-0255
Volume :
635
Database :
Academic Search Index
Journal :
Information Sciences
Publication Type :
Periodical
Accession number :
163228002
Full Text :
https://doi.org/10.1016/j.ins.2023.03.146