Back to Search Start Over

Automatic Product Classification Using Supervised Machine Learning Algorithms in Price Statistics.

Authors :
Oancea, Bogdan
Source :
Mathematics (2227-7390). Apr2023, Vol. 11 Issue 7, p1588. 32p.
Publication Year :
2023

Abstract

Modern approaches to computing consumer price indices include the use of various data sources, such as web-scraped data or scanner data, which are very large in volume and need special processing techniques. In this paper, we address one of the main problems in the consumer price index calculation, namely the product classification, which cannot be performed manually when using large data sources. Therefore, we conducted an experiment on automatic product classification according to an international classification scheme. We combined 9 different word-embedding techniques with 13 classification methods with the aim of identifying the best combination in terms of the quality of the resultant classification. Because the dataset used in this experiment was significantly imbalanced, we compared these methods not only using the accuracy, F1-score, and AUC, but also using a weighted F1-score that better reflected the overall classification quality. Our experiment showed that logistic regression, support vector machines, and random forests, combined with the FastText skip-gram embedding technique provided the best classification results, with superior values in performance metrics, as compared to other similar studies. An execution time analysis showed that, among the three mentioned methods, logistic regression was the fastest while the random forest recorded a longer execution time. We also provided per-class performance metrics and formulated an error analysis that enabled us to identify methods that could be excluded from the range of choices because they provided less reliable classifications for our purposes. [ABSTRACT FROM AUTHOR]

Details

Language :
English
ISSN :
22277390
Volume :
11
Issue :
7
Database :
Academic Search Index
Journal :
Mathematics (2227-7390)
Publication Type :
Academic Journal
Accession number :
163042938
Full Text :
https://doi.org/10.3390/math11071588