Small2BERT for extractive text summarization.

Authors :
Tantius, Cornelius
Shintaro, Chrismorgan
Soelistio, Elizabeth Ann
Kristanto, Jonathan
Nelson, Rico
Girsang, Abba Suganda
Source :
AIP Conference Proceedings. 3/26/2024, Vol. 2927 Issue 1, p1-5. 5p.
Publication Year :
2024

Abstract

Bidirectional Encoder Representations from Transformers (BERT) is a machine learning technique for solving various natural language processing (NLP) problems, including summarization. A common drawback of BERT is that training the model demands considerable time and resources. This paper implements a small BERT architecture, called Small2BERT, which is expected to summarize the data efficiently. The study therefore compares pre-trained Small2BERT with Summarization using Word Frequency (SWF) on extractive text summarization. The research uses the Indian News Summary dataset, which includes news articles from The Hindu, the Indian Times, and The Guardian from February to August 2017. Small2BERT surpasses SWF in precision, whereas SWF achieves higher overall F1-scores: Small2BERT earns a 0.307 F1-score on ROUGE-1, compared to 0.35 for SWF, and scores 0.19 and 0.304 on ROUGE-2 and ROUGE-L, respectively, while SWF achieves 0.304 and 0.31. [ABSTRACT FROM AUTHOR]
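
The abstract contrasts a pre-trained Small2BERT encoder against Summarization using Word Frequency (SWF), scored with ROUGE F1. As a rough illustration only, below is a minimal Python sketch of a generic word-frequency extractive summarizer and a unigram ROUGE-1 F1 check; the stopword list, sentence scoring, and function names are assumptions for illustration, not the paper's implementation or its evaluation pipeline.

    # Illustrative sketch of word-frequency extractive summarization (SWF-style)
    # and a ROUGE-1 F1 check. Stopwords, scoring, and names are assumptions,
    # not the paper's code.
    import re
    from collections import Counter

    STOPWORDS = {"the", "a", "an", "is", "are", "of", "to", "in", "and", "on", "for"}

    def swf_summarize(text: str, n_sentences: int = 2) -> str:
        """Score sentences by normalized word frequency; keep top n in order."""
        sentences = re.split(r"(?<=[.!?])\s+", text.strip())
        words = re.findall(r"[a-z']+", text.lower())
        freq = Counter(w for w in words if w not in STOPWORDS)
        if not freq:
            return text
        top = freq.most_common(1)[0][1]  # highest raw frequency, for normalization
        scores = []
        for i, s in enumerate(sentences):
            toks = re.findall(r"[a-z']+", s.lower())
            scores.append((sum(freq.get(t, 0) / top for t in toks), i))
        # Pick the n best-scoring sentences, then restore document order.
        keep = sorted(sorted(scores, reverse=True)[:n_sentences], key=lambda p: p[1])
        return " ".join(sentences[i] for _, i in keep)

    def rouge1_f1(candidate: str, reference: str) -> float:
        """Unigram-overlap ROUGE-1 F1 between candidate and reference summaries."""
        c = Counter(re.findall(r"[a-z']+", candidate.lower()))
        r = Counter(re.findall(r"[a-z']+", reference.lower()))
        overlap = sum((c & r).values())  # clipped unigram matches
        if overlap == 0:
            return 0.0
        precision = overlap / sum(c.values())
        recall = overlap / sum(r.values())
        return 2 * precision * recall / (precision + recall)

    if __name__ == "__main__":
        doc = ("BERT models give strong summaries. Training BERT is expensive. "
               "Word-frequency methods are cheap. Summaries are judged with ROUGE.")
        summary = swf_summarize(doc, n_sentences=2)
        print(summary)
        print(rouge1_f1(summary, doc))

ROUGE-2 and ROUGE-L, also reported in the abstract, follow the same precision/recall/F1 pattern but count bigram overlaps and longest-common-subsequence matches, respectively, instead of unigrams.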

Details

Language :
English
ISSN :
0094-243X
Volume :
2927
Issue :
1
Database :
Academic Search Index
Journal :
AIP Conference Proceedings
Publication Type :
Conference
Accession number :
176251425
Full Text :
https://doi.org/10.1063/5.0205231