An Iterative Missing-Value Handling Technique for Mitigating Performance Degradation of Machine Learning Models.
- Source :
- Journal of the Korea Institute of Information & Communication Engineering; Apr 2024, Vol. 28, Issue 4, pp. 387-394 (8 pages)
- Publication Year :
- 2024
Abstract
- Machine learning models are applied across diverse domains, and their performance depends heavily on the quality of the data used during training. However, real-world datasets often contain missing values because of limitations and errors in data collection methods, incomplete or inconsistent data-gathering processes, and human error during processing. Effective handling of missing values is therefore essential for good model performance. Missing data is commonly handled either by deleting the records that contain missing values or by imputing them. Deletion is straightforward but loses information. Imputation, on the other hand, can reduce the variability of the dataset and distort correlations between variables. The proposed scheme applies dimensionality reduction to the variables that have no missing values and uses the result to estimate the missing values. Experimental results indicate that, compared with existing methods, the proposed scheme mitigates the performance degradation of various machine learning models. [ABSTRACT FROM AUTHOR]
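The abstract does not specify which dimensionality-reduction method or estimator the authors use, nor how the iteration proceeds. As a rough illustration of the general idea only, the sketch below assumes PCA (computed with an SVD) on the fully observed columns and an ordinary least-squares fit from the resulting components to each incomplete column. The function name `impute_via_reduced_features` and the parameter `n_components` are hypothetical and are not taken from the paper.

```python
import numpy as np


def impute_via_reduced_features(X, n_components=2):
    """Fill NaNs using a low-dimensional projection of the complete columns.

    Assumption: PCA + linear regression stand in for the paper's
    unspecified reduction and estimation steps.
    """
    X = np.asarray(X, dtype=float)
    filled = X.copy()
    missing = np.isnan(X)

    # Columns with no missing entries are the only inputs to the reduction.
    complete_cols = ~missing.any(axis=0)
    if not complete_cols.any():
        raise ValueError("at least one column must be fully observed")

    # PCA via SVD on the centered complete columns.
    C = X[:, complete_cols]
    C_centered = C - C.mean(axis=0)
    _, _, vt = np.linalg.svd(C_centered, full_matrices=False)
    k = min(n_components, vt.shape[0])
    Z = C_centered @ vt[:k].T

    # Design matrix: intercept plus the k component scores.
    design = np.column_stack([np.ones(len(Z)), Z])

    for j in np.flatnonzero(~complete_cols):
        rows_missing = missing[:, j]
        rows_observed = ~rows_missing
        if not rows_observed.any():
            continue  # column is entirely missing; nothing to fit on
        # Fit on rows where column j is observed, predict where it is not.
        coef, *_ = np.linalg.lstsq(
            design[rows_observed], X[rows_observed, j], rcond=None
        )
        filled[rows_missing, j] = design[rows_missing] @ coef

    return filled


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    a, b = rng.normal(size=20), rng.normal(size=20)
    data = np.column_stack([a, b, 2 * a + 3 * b + 1])
    data[::4, 2] = np.nan  # knock out every fourth value in the last column
    print(impute_via_reduced_features(data))
```

Because the estimate is driven by the observed variables rather than a single column statistic, an approach of this kind may preserve more of the variability and inter-variable correlation than mean imputation, which is the concern the abstract raises. Whether it matches the authors' results depends on details the abstract does not give.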
Details
- Language :
- Korean
- ISSN :
- 2234-4772
- Volume :
- 28
- Issue :
- 4
- Database :
- Complementary Index
- Journal :
- Journal of the Korea Institute of Information & Communication Engineering
- Publication Type :
- Academic Journal
- Accession number :
- 177021027
- Full Text :
- https://doi.org/10.6109/jkiice.2024.28.4.387