Combining data discretization and missing value imputation for incomplete medical datasets.

Authors :: Huang, Min-Wei
Tsai, Chih-Fong
Tsui, Shu-Ching
Lin, Wei-Chao
Source :: PLoS ONE. 11/30/2023, Vol. 18 Issue 11, p1-15. 15p.
Publication Year :: 2023
Abstract: Data discretization aims to transform a set of continuous features into discrete features, thus simplifying the representation of information and making it easier to understand, use, and explain. In practice, users can take advantage of the discretization process to improve knowledge discovery and data analysis on medical domain problem datasets containing continuous features. However, certain feature values are frequently missing, and many data-mining algorithms cannot handle incomplete datasets. In this study, we considered the use of both discretization and missing-value imputation to process incomplete medical datasets, examining how the order in which discretization and missing-value imputation are combined influences performance. The experimental results were obtained using seven different medical domain problem datasets; two discretizers, namely the minimum description length principle (MDLP) and ChiMerge; three imputation methods, namely the mean/mode, classification and regression tree (CART), and k-nearest neighbor (KNN) methods; and two classifiers, namely support vector machines (SVM) and the C4.5 decision tree. The results show that better performance can be obtained by first performing discretization followed by imputation, rather than vice versa. Furthermore, the highest classification accuracy rate was achieved by combining ChiMerge and KNN with SVM. [ABSTRACT FROM AUTHOR]
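
The two processing orders compared in the abstract can be illustrated with a minimal sketch. It does not reproduce the paper's methods: equal-width binning is an assumed stand-in for MDLP or ChiMerge, imputation uses the simple mean/mode approach, and the function names and the toy feature column are hypothetical. It only shows that the two orders can yield different discretized values for the same missing entries; it says nothing about which order classifies better.

```python
from collections import Counter

def equal_width_bins(values, n_bins=3):
    """Map each observed value to a bin index; missing values (None) stay None.
    Equal-width binning is an assumed stand-in for MDLP/ChiMerge."""
    observed = [v for v in values if v is not None]
    lo, hi = min(observed), max(observed)
    width = (hi - lo) / n_bins or 1.0  # avoid zero width for constant columns
    return [None if v is None else min(int((v - lo) / width), n_bins - 1)
            for v in values]

def impute_mean(values):
    """Fill missing continuous values with the mean of the observed ones."""
    observed = [v for v in values if v is not None]
    mean = sum(observed) / len(observed)
    return [mean if v is None else v for v in values]

def impute_mode(values):
    """Fill missing discrete values with the most frequent observed value."""
    observed = [v for v in values if v is not None]
    mode = Counter(observed).most_common(1)[0][0]
    return [mode if v is None else v for v in values]

# Hypothetical continuous feature with two missing entries.
feature = [1.0, 1.5, None, 2.0, 9.0, 10.0, None]

# Order A: discretize first, then impute the discrete column (mode).
discretize_then_impute = impute_mode(equal_width_bins(feature))

# Order B: impute the continuous column (mean), then discretize.
impute_then_discretize = equal_width_bins(impute_mean(feature))

print(discretize_then_impute)  # [0, 0, 0, 0, 2, 2, 0]
print(impute_then_discretize)  # [0, 0, 1, 0, 2, 2, 1]
```

In this toy column the missing entries land in bin 0 under order A and in bin 1 under order B, because the mean of the skewed observed values falls in a bin that holds no observed data.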

Subjects :: *MISSING data (Statistics)
*MULTIPLE imputation (Statistics)
*DATA mining
*SUPPORT vector machines
*K-nearest neighbor classification
*REGRESSION trees
*DECISION trees

Details

Language :: English
ISSN :: 1932-6203
Volume :: 18
Issue :: 11
Database :: Academic Search Index
Journal :: PLoS ONE
Publication Type :: Academic Journal
Accession number :: 173949835
Full Text :: https://doi.org/10.1371/journal.pone.0295032
