1. ILA4: Overcoming missing values in machine learning datasets – An inductive learning approach
- Author
Firas Alghanim, Saleh M. Abu-Soud, Walid A. Salameh, and Ammar Elhassan
- Subjects
General Computer Science, Series (mathematics), Computer science, Networking & telecommunications, Engineering and technology, Missing data, Logistic regression, Machine learning, Random forest, Naive Bayes classifier, Electrical engineering, electronic engineering, information engineering, Common value auction, Artificial intelligence & image processing, Artificial intelligence, Completeness (statistics)
- Abstract
This article introduces ILA4, a new algorithm designed to handle datasets with missing values. ILA4 builds on the earlier series of ILA algorithms, with further enhancements for handling missing data. ILA4 is applied to datasets of varying completeness and compared with other known approaches for handling datasets with missing values. In the majority of cases, ILA4 performed on a par with many established approaches for treating missing values, including algorithms based on the Most Common Value (MCV), the Most Common Value Restricted to a Concept (MCVRC), and those that use the Delete strategy. ILA4 was also compared with three well-known algorithms, namely Logistic Regression, Naive Bayes, and Random Forest; the accuracy obtained by ILA4 is comparable to or better than the best results obtained from these three algorithms.
- Published
- 2022