Improving in-text citation reason extraction and classification using supervised machine learning techniques.

Authors :
Ihsan, Imran
Rahman, Hameedur
Shaikh, Asadullah
Sulaiman, Adel
Rajab, Khairan
Rajab, Adel
Source :
Computer Speech & Language. Jul2023, Vol. 82, pN.PAG-N.PAG. 1p.
Publication Year :
2023

Abstract

In the last decade, automatic extraction and classification of in-text citations have gained immense popularity and have become one of the most frequently used techniques to evaluate research. Due to the large volume of in-text citations in various digital libraries such as Web of Science, Scopus, Google Scholar, and Microsoft Academic, machine learning models and natural language processing techniques are being used to extract, classify, and analyze them. Typical automatic in-text citation classification techniques use sentiment-based classes (Positive, Negative, and Neutral). However, there are also cognitive-based schemes that classify in-text citations based on the author's perspective. In such schemes, extracting citation reasons with high recall is challenging. To address this challenge, we have used the eight citation context and reason classes defined by CCRO (Citation's Context and Reasons Ontology) to develop a machine learning model that achieves high recall without compromising on precision. We have worked on the Association for Computational Linguistics corpus with over 7000 in-text citations, randomly annotated by experts in CCRO classes. Afterwards, an array of machine learning models is implemented on the annotated dataset: Support Vector Machine (SVM), Naïve Bayesian (NB), and Random Forest (RF). We have used various parts of speech (Nouns, Verbs, Adverbs, and Adjectives) as novel features. Our results show that our approach outperforms the three comparative models, achieving 91% accuracy. [ABSTRACT FROM AUTHOR]
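
The abstract frames its goal in terms of recall, precision, and accuracy over several citation classes. As a rough illustration of how those three measures are commonly computed for a multi-class labelling task, the following is a minimal sketch in plain Python. It is not the authors' evaluation code; the function names and the labels "A", "B", and "C" are hypothetical placeholders and are not CCRO class names.

```python
from collections import Counter


def per_class_precision_recall(y_true, y_pred):
    """Return {label: (precision, recall)} for every label seen in either list."""
    tp, fp, fn = Counter(), Counter(), Counter()
    for gold, pred in zip(y_true, y_pred):
        if gold == pred:
            tp[gold] += 1      # correct prediction for this class
        else:
            fp[pred] += 1      # predicted class was wrongly assigned
            fn[gold] += 1      # true class was missed
    scores = {}
    for label in sorted(set(y_true) | set(y_pred)):
        predicted = tp[label] + fp[label]   # times the class was predicted
        actual = tp[label] + fn[label]      # times the class truly occurred
        precision = tp[label] / predicted if predicted else 0.0
        recall = tp[label] / actual if actual else 0.0
        scores[label] = (precision, recall)
    return scores


def accuracy(y_true, y_pred):
    """Fraction of items whose predicted label equals the gold label."""
    return sum(g == p for g, p in zip(y_true, y_pred)) / len(y_true)


# Hypothetical toy data: placeholder labels, not taken from the paper.
gold = ["A", "A", "B", "B", "C", "C"]
pred = ["A", "B", "B", "B", "C", "A"]

scores = per_class_precision_recall(gold, pred)
for label, (p, r) in scores.items():
    print(f"{label}: precision={p:.2f} recall={r:.2f}")
print(f"accuracy={accuracy(gold, pred):.2f}")
```

In this toy data, class "B" has perfect recall but lower precision because one "A" item was also labelled "B", which is the kind of trade-off the abstract refers to when it mentions achieving high recall without compromising on precision.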

Details

Language :
English
ISSN :
0885-2308
Volume :
82
Database :
Academic Search Index
Journal :
Computer Speech & Language
Publication Type :
Academic Journal
Accession number :
164855572
Full Text :
https://doi.org/10.1016/j.csl.2023.101526