1. Classification of Persian News Articles using Machine Learning Techniques
- Author
- Sareh Mostafavi, Bahareh Pahlevanzadeh, and Mohammad Reza Falahati Qadimi Fumani
- Subjects
- automatic Persian text classification, k-nearest neighbor, naïve Bayes, news text classification, text mining, Computer software, QA76.75-76.765
- Abstract
- Automatic text classification, the process of automatically assigning texts to predefined categories, has many everyday applications and has recently gained much attention because of the increased number of text documents available in electronic form. Classifying news articles is one such application. Automatic classification is a subset of machine learning techniques in which a classifier is built by learning from a set of pre-classified documents. Naïve Bayes and k-Nearest Neighbor are among the most common machine learning algorithms for text classification. In this paper, we suggest a way to improve the performance of a text classifier using the mutual information (MI) and chi-square feature selection algorithms. We have observed that the MI feature selection method can improve the accuracy of the Naïve Bayes classifier by up to 10%. Experimental results show that the proposed model achieves an average accuracy of 80% and an average F1-measure of 80%.
- Published
- 2021
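
The pipeline the abstract describes (mutual information feature selection followed by a Naïve Bayes classifier) can be illustrated with a minimal pure-Python sketch. This is not the authors' implementation: the function names (`mutual_information`, `train_nb`, `predict_nb`), the toy English tokens, the two class labels, and the choice of keeping the 12 highest-scoring terms are all assumptions made for illustration, and Persian-specific preprocessing is omitted.

```python
import math
from collections import Counter, defaultdict


def mutual_information(docs, labels):
    """Score each term by the MI between its presence in a document and the class."""
    n = len(docs)
    class_count = Counter(labels)
    term_count = Counter()   # number of documents containing the term
    joint = Counter()        # number of documents of class c containing the term
    for tokens, label in zip(docs, labels):
        for term in set(tokens):
            term_count[term] += 1
            joint[(term, label)] += 1
    scores = {}
    for term, n_t in term_count.items():
        score = 0.0
        for c, n_c in class_count.items():
            n_tc = joint[(term, c)]
            # Two cells per class: term present in class c, term absent in class c.
            for n_cell, n_row in ((n_tc, n_t), (n_c - n_tc, n - n_t)):
                if n_cell > 0:
                    score += (n_cell / n) * math.log2(n * n_cell / (n_row * n_c))
        scores[term] = score
    return scores


def train_nb(docs, labels, vocab):
    """Multinomial Naive Bayes with add-one smoothing, restricted to vocab."""
    class_docs = Counter(labels)
    term_counts = defaultdict(Counter)
    for tokens, label in zip(docs, labels):
        term_counts[label].update(t for t in tokens if t in vocab)
    model = {}
    for c, n_c in class_docs.items():
        total = sum(term_counts[c].values()) + len(vocab)
        log_prior = math.log(n_c / len(docs))
        log_like = {t: math.log((term_counts[c][t] + 1) / total) for t in vocab}
        model[c] = (log_prior, log_like)
    return model


def predict_nb(model, tokens):
    """Return the class with the highest posterior log-score; unknown terms are ignored."""
    def score(c):
        log_prior, log_like = model[c]
        return log_prior + sum(log_like[t] for t in tokens if t in log_like)
    return max(model, key=score)


# Hypothetical toy corpus (English stand-ins for tokenized news articles).
docs = [
    ["team", "wins", "match", "today"],
    ["player", "scores", "goal", "today"],
    ["market", "stocks", "fall", "today"],
    ["bank", "raises", "rates", "today"],
]
labels = ["sport", "sport", "economy", "economy"]

scores = mutual_information(docs, labels)
# Keep the 12 highest-scoring terms (an arbitrary cutoff for this toy example).
vocab = set(sorted(scores, key=scores.get, reverse=True)[:12])
model = train_nb(docs, labels, vocab)
print(predict_nb(model, ["goal", "match", "today"]))
```

In this toy corpus the term "today" occurs in every document, so it carries no information about the class and is the one term dropped by the selection step. Chi-square selection would slot into the same place by swapping the scoring function.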