Should supervised discretisation always be trusted unreservedly? On combining characteristics of supervised and unsupervised discretisation algorithms in two-step processing.

Authors :
Stańczyk, Urszula
Baron, Grzegorz
Source :
Procedia Computer Science; 2023, Vol. 225, p2136-2145, 10p
Publication Year :
2023

Abstract

The paper describes a research methodology for a two-step discretisation process applied to numeric input data. The process combines the characteristics of selected supervised and unsupervised algorithms, which leads to extended processing of some attributes in the train and test sets. The methodology is illustrated with investigations in the domain of stylometric analysis of texts, using two datasets prepared for the task of binary authorship attribution. The resulting variants of transformed input data were explored with two machine learning methods capable of inducing knowledge from both continuous and categorical data, namely the PART and J48 classifiers. The experimental results indicate that, as can be expected, supervised transformations of data work well enough; however, they do not always return the best outcome. The two-step processing of some attributes shows sufficient promise to warrant closer study, as opposed to unconditionally relying on supervised algorithms as outperforming all other approaches. [ABSTRACT FROM AUTHOR]
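
The abstract does not spell out how the two steps are combined, so the following Python sketch shows only one possible reading and should not be taken as the authors' procedure. It assumes that a supervised step is tried first for each attribute and that an unsupervised step handles the attributes for which the supervised step finds no useful cut point. The supervised criterion here is a simplified single-split information-gain test, swapped in for brevity; the function names, the `min_gain` threshold, the number of bins, and the toy data are all hypothetical.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())


def best_supervised_cut(values, labels, min_gain):
    """Return the single cut point with the highest information gain,
    or None when no candidate reaches min_gain (assumed threshold)."""
    pairs = sorted(zip(values, labels))
    n = len(pairs)
    base = entropy(labels)
    best_gain, best_cut = 0.0, None
    for i in range(1, n):
        if pairs[i - 1][0] == pairs[i][0]:
            continue  # no boundary between equal values
        left = [lab for _, lab in pairs[:i]]
        right = [lab for _, lab in pairs[i:]]
        gain = base - (len(left) / n) * entropy(left) - (len(right) / n) * entropy(right)
        if gain > best_gain:
            best_gain = gain
            best_cut = (pairs[i - 1][0] + pairs[i][0]) / 2
    return best_cut if best_gain >= min_gain else None


def equal_width_cuts(values, bins):
    """Unsupervised step: cut points of equal-width binning."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / bins
    return [lo + k * width for k in range(1, bins)]


def two_step_cuts(train_columns, labels, bins=3, min_gain=0.3):
    """Step 1: supervised cut per attribute. Step 2: attributes left
    without a cut are passed to the unsupervised procedure."""
    cuts = {}
    for name, values in train_columns.items():
        cut = best_supervised_cut(values, labels, min_gain)
        cuts[name] = [cut] if cut is not None else equal_width_cuts(values, bins)
    return cuts


def discretise(value, cut_points):
    """Map a numeric value to a bin index using the learned cut points."""
    return sum(value > c for c in cut_points)


# Toy data (hypothetical): attribute "a" separates the classes, "b" does not.
train = {"a": [1, 2, 3, 10, 11, 12], "b": [5, 1, 3, 2, 4, 6]}
labels = [0, 0, 0, 1, 1, 1]
cuts = two_step_cuts(train, labels)
```

In this sketch the cut points are learned on the train set only and then reused through `discretise` for test samples, so that both sets are transformed consistently.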

Details

Language :
English
ISSN :
18770509
Volume :
225
Database :
Supplemental Index
Journal :
Procedia Computer Science
Publication Type :
Academic Journal
Accession number :
174059256
Full Text :
https://doi.org/10.1016/j.procs.2023.10.204