1. Scientific papers citation analysis using textual features and SMOTE resampling techniques.
- Author
- Umer, Muhammad; Sadiq, Saima; Missen, Malik Muhammad Saad; Hameed, Zahid; Aslam, Zahid; Siddique, Muhammad Abubakar; Nappi, Michele
- Subjects
- *CITATION analysis, *CONTENT analysis, *MACHINE learning, *SENTIMENT analysis, *PATTERN recognition systems, *USER-generated content
- Abstract
• Explore qualitative aspects of citations to measure the influence of a research article.
• Apply a feature representation technique in combination with machine learning models to find the sentiment of a citation.
• Classify the sentiment of citation instances as positive, negative, or neutral.
• Analyze the efficacy of SMOTE in balancing the citation sentiment dataset.

Ascertaining the impact of research is significant for the research community and academia of all disciplines. The only prevalent measure associated with the quantification of research quality is the citation count. Although the number of citations plays a significant role in academic research, citations can sometimes be biased or made only to discuss the weaknesses and shortcomings of the research. Considering the sentiment of citations and recognizing patterns in text can aid in understanding the opinion of the peer research community and can also help in quantifying the quality of research articles. Efficient feature representation combined with machine learning classifiers has yielded significant improvement in text classification. However, the effectiveness of such combinations has not been analyzed for citation sentiment analysis. This study aims to investigate pattern recognition using machine learning models in combination with frequency-based and prediction-based feature representation techniques, with and without the Synthetic Minority Oversampling Technique (SMOTE), on a publicly available citation sentiment dataset. The sentiment of citation instances is classified as positive, negative, or neutral. Results indicate that the Extra Tree classifier in combination with Term Frequency-Inverse Document Frequency achieved 98.26% accuracy on the SMOTE-balanced dataset. [ABSTRACT FROM AUTHOR]
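
The abstract names SMOTE as the balancing step but does not describe how it works. As a rough illustration only, and not the authors' implementation, the core idea can be sketched in plain Python: each synthetic minority sample is placed at a random point on the line segment between an existing minority sample and one of its nearest minority-class neighbours. The function name `smote_oversample`, its parameters, and the toy vectors (standing in for TF-IDF rows of an under-represented sentiment class) are all hypothetical.

```python
import math
import random


def smote_oversample(minority, n_new, k=2, seed=0):
    """Create n_new synthetic points by interpolating between minority samples.

    minority: list of equal-length feature vectors (lists of floats).
    k: number of nearest minority neighbours considered for each base sample.
    """
    if len(minority) < 2:
        raise ValueError("need at least two minority samples")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)
    synthetic = []
    for _ in range(n_new):
        i = rng.randrange(len(minority))
        base = minority[i]
        # Rank the other minority samples by Euclidean distance to the base.
        others = [j for j in range(len(minority)) if j != i]
        others.sort(key=lambda j: math.dist(base, minority[j]))
        neighbour = minority[rng.choice(others[:k])]
        # The new point lies at a random position on the segment base -> neighbour.
        gap = rng.random()
        synthetic.append([b + gap * (n - b) for b, n in zip(base, neighbour)])
    return synthetic


# Toy vectors standing in for TF-IDF rows of a minority class (assumed data).
minority_rows = [[0.0, 1.0], [0.2, 0.8], [0.1, 0.9], [1.0, 0.0]]
new_rows = smote_oversample(minority_rows, n_new=6, k=2)
```

In practice a maintained library implementation would normally be preferred, and oversampling is usually applied to the training split only so that synthetic points do not leak into evaluation data; the record does not say how the study handled this.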
- Published
- 2021