Word Embedding for Rhetorical Sentence Categorization on Scientific Articles

Authors :
Ghoziyah Haitan Rachman
Masayu Leylia Khodra
Dwi Hendratmo Widyantoro
Source :
Journal of ICT Research and Applications, Vol 12, Iss 2 (2018)
Publication Year :
2018
Publisher :
ITB Journal Publisher, 2018.

Abstract

Summarization of scientific articles commonly relies on the rhetorical structure of sentences. Determining the rhetorical category of a sentence is itself a text categorization task. To improve performance, some previous work in text categorization has employed word embedding. This paper presents rhetorical sentence categorization of scientific articles using word embedding to capture semantically similar words. A comparison of Word2Vec and GloVe is shown. First, two experiments were evaluated using five classifiers, namely Naïve Bayes, Linear SVM, IBK, J48, and Maximum Entropy. Then, the best classifier from the first two experiments was employed. This research showed that Word2Vec CBOW performed better than Skip-Gram and GloVe. The best experimental result was obtained with Word2Vec CBOW for 20,155 resource papers from ACL-ARC, features from Teufel, and the previous label feature. In this experiment, Linear SVM produced the highest F-measure, at 43.44%.
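
The abstract does not state how word vectors are turned into sentence-level features. As a minimal sketch only, assuming a sentence is represented by the mean of its word vectors and the previous sentence's label is one-hot encoded, the feature construction might look like the following. The toy embeddings, the label set, and the function names are all hypothetical and are not taken from the paper.

```python
from typing import Dict, List, Optional

# Toy 3-dimensional embeddings with made-up values; real vectors would
# come from a trained Word2Vec or GloVe model.
EMBEDDINGS: Dict[str, List[float]] = {
    "we": [0.1, 0.3, 0.0],
    "propose": [0.7, 0.1, 0.2],
    "method": [0.4, 0.2, 0.6],
}
DIM = 3

# Hypothetical label set; the record does not list the rhetorical categories.
LABELS = ["AIM", "BACKGROUND", "OTHER"]


def sentence_vector(tokens: List[str]) -> List[float]:
    """Average the embeddings of known tokens; return zeros if none are known."""
    known = [EMBEDDINGS[t] for t in tokens if t in EMBEDDINGS]
    if not known:
        return [0.0] * DIM
    return [sum(column) / len(known) for column in zip(*known)]


def features(tokens: List[str], previous_label: Optional[str]) -> List[float]:
    """Concatenate the sentence vector with a one-hot previous-label feature."""
    one_hot = [1.0 if previous_label == label else 0.0 for label in LABELS]
    return sentence_vector(tokens) + one_hot


# "a" has no embedding in the toy table, so it is skipped when averaging.
vec = features(["we", "propose", "a", "method"], "BACKGROUND")
```

A vector built this way could then be passed to any of the classifiers named in the abstract, such as a linear SVM. Averaging is only one of several possible ways to pool word vectors.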

Details

Language :
English
ISSN :
2337-5787 and 2338-5499
Volume :
12
Issue :
2
Database :
Directory of Open Access Journals
Journal :
Journal of ICT Research and Applications
Publication Type :
Academic Journal
Accession number :
edsdoj.9e39707f97de4046b42d1da3a79d1bf6
Document Type :
article
Full Text :
https://doi.org/10.5614/itbj.ict.res.appl.2018.12.2.5