A Comprehensive Evaluation of Metadata-Based Features to Classify Research Paper’s Topics

Authors :
Muhammad Usman
Ghulam Mustafa
Muhammad Afzal
Anis Koubaa
Abdul Shahid
Source :
IEEE Access. 9:133500-133509
Publication Year :
2021
Publisher :
Institute of Electrical and Electronics Engineers (IEEE), 2021.

Abstract

The existing plethora of document classification techniques exploits different data sources, drawn either from the content or from the metadata of research articles. Various journal publishers such as Springer, Elsevier, and IEEE do not provide open access to the content of research articles, whereas their metadata is freely available. Metadata such as title, keywords, and abstract can serve as a better alternative to the content in various scenarios. In the current literature, researchers have assessed the role of some metadata fields individually. We believe that the collective contribution of metadata parameters can play a significant role in classifying research papers. This paper presents a comprehensive evaluation of the role of metadata, individually as well as in combination, for research paper classification. Moreover, we have classified the research articles into the ACM hierarchy root categories (e.g., general literature, hardware, software). In this comprehensive evaluation, we have assessed all possible combinations of metadata features against different classifiers such as Random Forest, K-Nearest Neighbor, and Decision Tree. The results reveal that the title and keywords combination outperforms other combinations, with an F-measure score of 0.88.
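
The evaluation design described in the abstract (every combination of metadata features, each scored by F-measure) can be illustrated with a minimal sketch. This is not the authors' code: the field set `FIELDS`, the helper names `field_combinations`, `combine_text`, and `f_measure`, and the plain space-joined concatenation of fields are all assumptions made for illustration, and the classifiers themselves are omitted.

```python
from itertools import combinations

# Metadata fields named in the abstract. Treating exactly these three
# as the full feature set is an assumption of this sketch.
FIELDS = ("title", "keywords", "abstract")


def field_combinations(fields):
    """Return every non-empty subset of fields, smallest subsets first."""
    return [
        combo
        for size in range(1, len(fields) + 1)
        for combo in combinations(fields, size)
    ]


def combine_text(record, combo):
    """Join the chosen metadata fields of one paper into a single string.

    Space-joined concatenation is a hypothetical choice; the record does
    not say how the paper merges fields.
    """
    return " ".join(record[field] for field in combo)


def f_measure(y_true, y_pred, label):
    """F-measure (harmonic mean of precision and recall) for one class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == label and p == label for t, p in pairs)
    fp = sum(t != label and p == label for t, p in pairs)
    fn = sum(t == label and p != label for t, p in pairs)
    if tp == 0:
        return 0.0  # no true positives: precision and recall are both zero
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)
```

With three fields, `field_combinations(FIELDS)` yields seven candidate feature sets. In a full experiment, each would be turned into text with `combine_text`, vectorized, passed to each classifier, and scored with a function like `f_measure` per category.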

Details

ISSN :
2169-3536
Volume :
9
Database :
OpenAIRE
Journal :
IEEE Access
Accession number :
edsair.doi...........4a9ef4f209697a132717b9cb7c842741