Multi-Label Classification of Research Papers Using Multi-Label K-Nearest Neighbour Algorithm
- Source :
- Journal of Physics: Conference Series. 1994:012031
- Publication Year :
- 2021
- Publisher :
- IOP Publishing, 2021.
Abstract
- As interaction and cooperation between disciplines have become more frequent in recent years, the number of research papers associated with multiple subjects has increased. Some of the existing literature belongs to a single discipline, while other work may involve multiple subjects at once. In this setting, traditional single-label text classification makes it harder for readers to find comprehensive, cutting-edge research papers. It is therefore important to perform multi-label classification of research papers effectively. This paper tests performance on multi-label learning tasks using text data obtained from the Kaggle website. First, lemmatization and Term Frequency-Inverse Document Frequency (TF-IDF) are used for feature extraction in the pre-processing stage: the key information in the text is analysed statistically, and the text is converted into a high-dimensional numerical vector space. Because traditional single-label classification algorithms are not suited to this problem, the paper adopts the Multi-Label K-Nearest Neighbour (ML-KNN) algorithm framework for classification. The experimental results show that ML-KNN achieves better results on multi-label text classification than a traditional multi-label algorithm, which demonstrates the effectiveness of ML-KNN for predicting the subjects of text data with multiple labels. Finally, the work in this paper is analysed and summarized.
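The pipeline the abstract describes (TF-IDF features followed by ML-KNN) can be sketched in a few lines of standard-library Python. This is a minimal illustration, not the authors' implementation: the eight-document corpus, the two labels, the choice of cosine similarity, `k = 2`, the smoothing constant `s = 1.0`, and all function names are assumptions made for the example. Lemmatization is omitted, and the Kaggle data set used in the paper is not reproduced here.

```python
import math
from collections import Counter

# Toy corpus invented for illustration; labels are [computer_science, physics].
DOCS = [
    "neural network learning model",
    "deep learning neural classifier",
    "learning network classifier model",
    "quantum particle energy field",
    "particle energy quantum collider",
    "quantum field energy collider",
    "neural network quantum energy",
    "quantum learning particle network",
]
LABELS = [[1, 0], [1, 0], [1, 0], [0, 1], [0, 1], [0, 1], [1, 1], [1, 1]]

def fit_idf(docs):
    """Smoothed inverse document frequency for each term in the corpus."""
    n = len(docs)
    df = Counter(t for d in docs for t in set(d.lower().split()))
    return {t: math.log((1 + n) / (1 + c)) + 1.0 for t, c in df.items()}

def vectorise(doc, idf):
    """L2-normalised TF-IDF vector as a sparse dict; unseen terms are dropped."""
    tf = Counter(t for t in doc.lower().split() if t in idf)
    vec = {t: f * idf[t] for t, f in tf.items()}
    norm = math.sqrt(sum(w * w for w in vec.values())) or 1.0
    return {t: w / norm for t, w in vec.items()}

def cosine(a, b):
    """Dot product of two unit-length sparse vectors."""
    return sum(w * b.get(t, 0.0) for t, w in a.items())

def neighbours(x, X, k, skip=None):
    """Indices of the k training vectors most similar to x."""
    ranked = sorted((i for i in range(len(X)) if i != skip),
                    key=lambda i: -cosine(x, X[i]))
    return ranked[:k]

def mlknn_fit(X, Y, k=2, s=1.0):
    """Estimate per-label priors and neighbour-count frequencies."""
    n, n_labels = len(X), len(Y[0])
    prior = [(s + sum(y[l] for y in Y)) / (2 * s + n) for l in range(n_labels)]
    # c1[l][j]: training docs WITH label l whose k neighbours include j docs with l
    # c0[l][j]: the same count for training docs WITHOUT label l
    c1 = [[0] * (k + 1) for _ in range(n_labels)]
    c0 = [[0] * (k + 1) for _ in range(n_labels)]
    for i in range(n):
        nbrs = neighbours(X[i], X, k, skip=i)
        for l in range(n_labels):
            j = sum(Y[m][l] for m in nbrs)
            (c1 if Y[i][l] else c0)[l][j] += 1
    return {"X": X, "Y": Y, "k": k, "s": s, "prior": prior, "c1": c1, "c0": c0}

def mlknn_predict(model, x):
    """MAP decision per label from the neighbour counts of x."""
    k, s = model["k"], model["s"]
    nbrs = neighbours(x, model["X"], k)
    out = []
    for l, p1 in enumerate(model["prior"]):
        j = sum(model["Y"][m][l] for m in nbrs)
        like1 = (s + model["c1"][l][j]) / (s * (k + 1) + sum(model["c1"][l]))
        like0 = (s + model["c0"][l][j]) / (s * (k + 1) + sum(model["c0"][l]))
        out.append(1 if p1 * like1 > (1 - p1) * like0 else 0)
    return out

IDF = fit_idf(DOCS)
MODEL = mlknn_fit([vectorise(d, IDF) for d in DOCS], LABELS, k=2, s=1.0)

def classify(text):
    """Predict the [computer_science, physics] label vector for a new text."""
    return mlknn_predict(MODEL, vectorise(text, IDF))

print(classify("deep neural network model"))      # [1, 0]
print(classify("quantum collider energy field"))  # [0, 1]
print(classify("neural network quantum energy"))  # [1, 1]
```

Each label is decided independently, so a text whose nearest neighbours mix both subjects can receive both labels, which a single-label classifier cannot express. On real data, `k` and `s` would need tuning, and a library vectoriser and neighbour search would likely replace the hand-written helpers.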
Details
- ISSN :
- 1742-6596 and 1742-6588
- Volume :
- 1994
- Database :
- OpenAIRE
- Journal :
- Journal of Physics: Conference Series
- Accession number :
- edsair.doi...........af31b8e160da812cee138e68774a56fa