1. Classification of Events in Selected Industrial Processes Using Weighted Key Words and K-Nearest Neighbors Algorithm
- Author
Mateusz Walczak, Aneta Poniszewska-Marańda, and Krzysztof Stepień
- Subjects
text classification, keyword extraction, text-feature extraction, K-NN algorithm, N-gram method, Technology, Engineering (General). Civil engineering (General), TA1-2040, Biology (General), QH301-705.5, Physics, QC1-999, Chemistry, QD1-999
- Abstract
Classifying events in industry involves large volumes of accumulated text data, including communication between a company and its clients, whose expectations regarding service quality are constantly growing. Current solutions for handling incoming requests have numerous disadvantages: they impose additional costs on the company and often lead to a high level of customer dissatisfaction. A partial solution may be to automate event classification, for example by means of an expert IT system. This work proposes a solution to the problem of classifying text events. It uses textual descriptions of events collected over many years by companies from many different industries; a large part of these events are problems of various types reported by company customers. As part of this work, a complex text-classification process was constructed using the K-Nearest Neighbors algorithm. The process uses two novel mechanisms: dynamic extension of the stop list and weighted keywords. Both aim to improve classification performance by solving typical problems that occur when using a fixed stop list and a classical keyword-extraction approach based on TF or TF-IDF. Finally, the Text Events Categorizer system, which implements the proposed classification process, is described.
- Published
- 2023
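The general shape of the pipeline named in the abstract (stop-list filtering, term weighting, K-NN voting) can be illustrated with a minimal sketch. It is not the authors' implementation: plain TF-IDF weights stand in for the paper's weighted-keyword mechanism, and the stop list is extended by a simple document-frequency threshold, which is an assumption rather than the paper's rule. All function names, the `max_df` parameter, and the sample tickets are hypothetical.

```python
import math
from collections import Counter


def tokenize(text):
    """Lowercase, split on whitespace, keep purely alphabetic tokens."""
    return [t for t in text.lower().split() if t.isalpha()]


def extend_stop_list(docs, base_stop, max_df=0.8):
    """Assumed rule: add terms found in more than max_df of documents."""
    n = len(docs)
    df = Counter(term for doc in docs for term in set(doc))
    return set(base_stop) | {t for t, c in df.items() if c / n > max_df}


def tfidf_vectors(docs, stop):
    """Return sparse TF-IDF vectors (dicts) and the IDF table."""
    n = len(docs)
    filtered = [[t for t in doc if t not in stop] for doc in docs]
    df = Counter(term for doc in filtered for term in set(doc))
    idf = {t: math.log((1 + n) / (1 + c)) + 1 for t, c in df.items()}
    vecs = []
    for doc in filtered:
        tf = Counter(doc)
        vecs.append({t: (c / len(doc)) * idf[t] for t, c in tf.items()})
    return vecs, idf


def cosine(a, b):
    """Cosine similarity between two sparse vectors."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    na = math.sqrt(sum(w * w for w in a.values()))
    nb = math.sqrt(sum(w * w for w in b.values()))
    return dot / (na * nb) if na and nb else 0.0


def knn_classify(query, train_texts, labels, k=3, base_stop=("the", "a", "is")):
    """Label the query by majority vote among its k most similar documents."""
    docs = [tokenize(t) for t in train_texts]
    stop = extend_stop_list(docs, base_stop)
    vecs, idf = tfidf_vectors(docs, stop)
    # Keep only query terms that survive the stop list and are known from training.
    q = [t for t in tokenize(query) if t not in stop and t in idf]
    qvec = {t: (c / len(q)) * idf[t] for t, c in Counter(q).items()}
    ranked = sorted(zip(vecs, labels), key=lambda p: cosine(qvec, p[0]), reverse=True)
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


# Hypothetical sample tickets; "problem" occurs in every one of them.
TRAIN = [
    "problem printer does not print",
    "problem printer paper jam error",
    "problem printer toner low",
    "problem invoice amount wrong",
    "problem invoice payment missing",
    "problem invoice duplicate charge",
]
LABELS = ["hardware"] * 3 + ["billing"] * 3
```

In this toy data the word "problem" appears in every ticket, so the threshold rule adds it to the stop list and it no longer influences the similarity scores; a fixed stop list would have missed it. Calling `knn_classify("problem printer jam", TRAIN, LABELS)` then votes among the three nearest tickets.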