
Topic Segmentation using Transformer Model for Indonesian Text.

Authors :
Sonata, Ilvico
Heryadi, Yaya
Tho, Cuk
Source :
Procedia Computer Science; 2023, Vol. 227, p159-167, 9p
Publication Year :
2023

Abstract

The increasing number of articles available in digital libraries requires considerable time and accuracy when sorting articles according to user needs. Using artificial neural networks to find the right articles through topic segmentation helps this process in terms of both speed and accuracy. Neural network models previously used in topic segmentation applications include Recurrent Neural Networks (RNN), Convolutional Neural Networks (CNN), and Long Short-Term Memory (LSTM). Since its introduction in 2017 for Natural Language Processing (NLP), the Transformer model has achieved better accuracy than the RNN, CNN, and LSTM models. Research on the Transformer model for topic segmentation, especially for articles in the Indonesian language, is still very limited. This paper discusses the use of the Transformer model for topic segmentation of Indonesian-language articles. The experimental results show that the proposed Transformer model is more accurate than the LSTM baseline: the WindowDiff value of the proposed model is 0.249 versus 0.363 for the LSTM baseline, and the Pk value of the proposed model is 0.279 versus 0.394 for the LSTM baseline. [ABSTRACT FROM AUTHOR]
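WindowDiff and Pk are error metrics for text segmentation, so lower values indicate that the predicted topic boundaries are closer to the reference segmentation. As a rough, hypothetical illustration of how such scores are computed (this is not the paper's code or data), the sketch below uses NLTK's reference implementations of both metrics on made-up boundary strings:

```python
# Illustrative sketch only: computing Pk and WindowDiff with NLTK.
# The boundary strings are toy data, not the paper's dataset.
from nltk.metrics.segmentation import pk, windowdiff

# Segmentations encoded as boundary strings: '1' marks a sentence that
# starts a new topic segment, '0' marks a continuation of the segment.
reference  = "0100100000"   # gold-standard boundaries (hypothetical)
hypothesis = "0100010000"   # model-predicted boundaries (hypothetical)

# Window size k is conventionally half the average reference segment length.
k = max(1, round(len(reference) / (reference.count("1") + 1) / 2))

# Both metrics slide a window of size k over the text and count windows
# where reference and hypothesis disagree; lower is better.
print("Pk        :", pk(reference, hypothesis, k=k))
print("WindowDiff:", windowdiff(reference, hypothesis, k))
```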

Details

Language :
English
ISSN :
18770509
Volume :
227
Database :
Supplemental Index
Journal :
Procedia Computer Science
Publication Type :
Academic Journal
Accession number :
173853914
Full Text :
https://doi.org/10.1016/j.procs.2023.10.513