
A Comprehensive Evaluation of Metadata-Based Features to Classify Research Paper’s Topics

Authors :
Ghulam Mustafa
Muhammad Usman
Muhammad Tanvir Afzal
Abdul Shahid
Anis Koubaa
Source :
IEEE Access, Vol 9, Pp 133500-133509 (2021)
Publication Year :
2021
Publisher :
IEEE, 2021.

Abstract

Existing document classification techniques draw on different data sources, taken either from the content or from the metadata of research articles. Publishers such as Springer, Elsevier, and IEEE do not provide open access to the content of research articles, whereas the metadata is freely available. Metadata such as the title, keywords, and abstract can therefore serve as a better alternative to the content in various scenarios. In the current literature, researchers have assessed the role of some metadata fields individually. We believe that the collective contribution of metadata parameters can play a significant role in classifying research papers. This paper presents a comprehensive evaluation of the role of metadata, individually as well as in combination, for research paper classification. We classify the research articles into the root categories of the ACM hierarchy (e.g., general literature, hardware, software). In this evaluation, we assess all possible combinations of metadata features with different classifiers, such as Random Forest, K-Nearest Neighbor, and Decision Tree. The results reveal that the title and keywords combination outperforms the other combinations, with an F-measure of 0.88.
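
The phrase "all possible combinations of metadata features" can be made concrete with a small sketch. It assumes only the three fields the abstract names as examples (title, keywords, abstract); the paper's actual feature set may differ, and the names `FIELDS`, `feature_combinations`, and `combine_text` are hypothetical, chosen for illustration. The sketch only enumerates the combinations and builds the input text for each one; it does not reproduce the classifiers or the reported results.

```python
from itertools import combinations

# Assumed field names: the abstract mentions title, keyword, and abstract
# as example metadata; the paper's real feature set may be different.
FIELDS = ["title", "keywords", "abstract"]


def feature_combinations(fields):
    """Return every non-empty subset of fields, smallest subsets first."""
    return [
        combo
        for size in range(1, len(fields) + 1)
        for combo in combinations(fields, size)
    ]


def combine_text(record, combo):
    """Join the selected fields of one paper record into a single string."""
    return " ".join(record.get(field, "") for field in combo)


combos = feature_combinations(FIELDS)

# Hypothetical record, used only to show the shape of the data.
paper = {"title": "Graph indexing", "keywords": "graphs search", "abstract": "We study indexes."}
texts = {combo: combine_text(paper, combo) for combo in combos}
```

Each entry in `texts` would then be vectorized and passed to a classifier, one experiment per combination, which is how a comparison such as "title and keywords versus the other combinations" could be set up.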

Details

Language :
English
ISSN :
2169-3536
Volume :
9
Database :
Directory of Open Access Journals
Journal :
IEEE Access
Publication Type :
Academic Journal
Accession number :
edsdoj.3a8915ab641a472ba162e25ed57084e0
Document Type :
article
Full Text :
https://doi.org/10.1109/ACCESS.2021.3115148