
Scientific papers citation analysis using textual features and SMOTE resampling techniques

Authors :
Muhammad Umer
Malik Muhammad Saad Missen
Saima Sadiq
Zahid Aslam
Muhammad Abubakar Siddique
Michele Nappi
Zahid Hameed
Publication Year :
2021

Abstract

Ascertaining the impact of research is significant for the research community and academia of all disciplines. The only prevalent measure for quantifying research quality is the citation count. Although the number of citations plays a significant role in academic research, citations can sometimes be biased or made only to discuss the weaknesses and shortcomings of the research. Considering the sentiment of citations and recognizing patterns in text can aid in understanding the opinion of the peer research community and help quantify the quality of research articles. Efficient feature representation combined with machine learning classifiers has yielded significant improvement in text classification. However, the effectiveness of such combinations has not been analyzed for citation sentiment analysis. This study aims to investigate pattern recognition using machine learning models in combination with frequency-based and prediction-based feature representation techniques, with and without the Synthetic Minority Oversampling Technique (SMOTE), on a publicly available citation sentiment dataset. The sentiment of each citation instance is classified as positive, negative, or neutral. Results indicate that the Extra Tree classifier in combination with Term Frequency-Inverse Document Frequency (TF-IDF) features achieved 98.26% accuracy on the SMOTE-balanced dataset.
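
The abstract names SMOTE but does not describe how it balances the classes. As a rough illustration only, and not the authors' implementation, the sketch below shows the core interpolation step: a synthetic minority sample is placed at a random point on the line segment between a minority sample and one of its nearest minority neighbours. The function name `smote_oversample`, the parameters `k` and `seed`, the use of Euclidean distance, and the toy feature vectors are all assumptions made for this example; a real pipeline would more likely rely on an established library implementation applied to TF-IDF vectors.

```python
import math
import random


def smote_oversample(minority, n_new, k=5, seed=0):
    """Create n_new synthetic points by interpolating between minority samples.

    minority: list of equal-length feature vectors (lists of floats).
    k: number of nearest minority neighbours to choose from (hypothetical default).
    """
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)
    if k < 1:
        raise ValueError("need at least two minority samples")

    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # Rank the other minority samples by Euclidean distance to the base point.
        others = [p for p in minority if p is not base]
        others.sort(key=lambda p: math.dist(base, p))
        neighbour = rng.choice(others[:k])
        # Place the new point a random fraction of the way towards the neighbour.
        gap = rng.random()
        synthetic.append([b + gap * (n - b) for b, n in zip(base, neighbour)])
    return synthetic


# Toy stand-in for feature vectors of an under-represented sentiment class.
negative = [[0.0, 1.0], [0.2, 0.8], [0.1, 0.9]]
extra = smote_oversample(negative, n_new=4, k=2)
print(len(negative) + len(extra))  # size of the class after oversampling
```

Because every synthetic point is a convex combination of two existing minority samples, it always lies within the bounding box of the minority class. Oversampling of this kind is normally applied to the training split only, so that synthetic samples do not leak into evaluation data.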

Details

Language :
English
Database :
OpenAIRE
Accession number :
edsair.doi.dedup.....8e20bb6ff512dd8cae88cfcf0a53cd88