Affine combination‐based over‐sampling for imbalanced regression.
- Source :
- Journal of Chemometrics. Mar2024, Vol. 38 Issue 3, p1-22. 22p.
- Publication Year :
- 2024
Abstract
- Prediction in imbalanced domains is currently an active research topic. Many real‐world data mining tasks require building predictive models from imbalanced data. Imbalanced classification has been studied extensively, whereas imbalanced regression has received little attention. In imbalanced regression, some values of the continuous target variable occur only rarely, yet accurately predicting the target for these rare instances is the main goal. One of the challenges is finding a suitable strategy to rebalance the original dataset so that the model predicts rare instances more accurately. In this study, two over‐sampling algorithms are proposed: sigma nearest over‐sampling based on convex combination for regression (SNOCCR) and affine combination‐based over‐sampling (ACOS). ACOS rebalances the original dataset by generating new instances through affine combinations of the original examples. The region in which new instances are generated can be adjusted to the distribution of the data, so that the generated cases better mimic the distribution of the original examples. ACOS, SNOCCR, and other preprocessing methods were compared on 15 datasets to assess how well models trained on the rebalanced datasets predict rare instances. The experimental results indicate that ACOS outperforms the other existing methods. [ABSTRACT FROM AUTHOR]
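The abstract does not spell out how ACOS selects examples or sizes the generation region, so the following is only a minimal Python sketch of the general idea of over-sampling by affine combination, not the authors' algorithm. The function name `affine_oversample`, the `rare_mask` argument, the random pairing of rare cases, and the `t_range` parameter are all assumptions introduced for illustration. Each synthetic case is `(1 - t) * a + t * b` for two rare cases `a` and `b`; the weights always sum to 1, and restricting `t` to [0, 1] gives the convex-combination special case.

```python
import numpy as np

def affine_oversample(X, y, rare_mask, n_new, t_range=(-0.25, 1.25), seed=0):
    """Generate n_new synthetic (x, y) pairs from the rare cases.

    Hypothetical sketch: pairs of rare cases are drawn at random and
    combined with weights (1 - t) and t, which always sum to 1.
    t_range controls how far outside the segment between the two
    cases a new point may fall (an assumed stand-in for the
    adjustable generation region mentioned in the abstract).
    """
    rng = np.random.default_rng(seed)
    # Stack features and target so both are interpolated with the same weights.
    Z = np.column_stack([X, y])[rare_mask]
    i = rng.integers(0, len(Z), size=n_new)   # index of first rare case
    j = rng.integers(0, len(Z), size=n_new)   # index of second rare case
    t = rng.uniform(t_range[0], t_range[1], size=(n_new, 1))
    Z_new = (1.0 - t) * Z[i] + t * Z[j]       # affine combination
    return Z_new[:, :-1], Z_new[:, -1]

# Toy data; the threshold defining "rare" is an arbitrary assumption.
X = np.array([[0.0, 0.0], [1.0, 2.0], [2.0, 4.0], [10.0, 20.0]])
y = np.array([0.0, 1.0, 2.0, 10.0])
rare = y >= 2.0
X_new, y_new = affine_oversample(X, y, rare, n_new=5)
```

Setting `t_range=(0.0, 1.0)` keeps every synthetic case on the segment between two existing rare cases, while a wider range lets new cases fall slightly beyond them.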
- Subjects :
- *PREDICTION models
- *DATA distribution
- *DATA mining
Details
- Language :
- English
- ISSN :
- 08869383
- Volume :
- 38
- Issue :
- 3
- Database :
- Academic Search Index
- Journal :
- Journal of Chemometrics
- Publication Type :
- Academic Journal
- Accession number :
- 175945835
- Full Text :
- https://doi.org/10.1002/cem.3537