Small2BERT for extractive text summarization.
- Author
- Tantius, Cornelius; Shintaro, Chrismorgan; Soelistio, Elizabeth Ann; Kristanto, Jonathan; Nelson, Rico; Girsang, Abba Suganda
- Subjects
- AUTOMATIC summarization; LANGUAGE models; TEXT summarization; NATURAL language processing; WORD frequency; MACHINE learning
- Abstract
Bidirectional Encoder Representations from Transformers (BERT) is a machine learning technique for solving various natural language processing (NLP) problems, including summarization. A common problem with BERT is that training the model demands considerable time and computational resources. This paper implements a small BERT architecture, called Small2BERT, for extractive summarization, and compares the pre-trained Small2BERT against Summarization using Word Frequency (SWF). The study uses the Indian News Summary dataset, which contains news articles from The Hindu, the Indian Times, and The Guardian published between February and August 2017. Small2BERT surpasses SWF in precision, whereas SWF achieves higher overall F1-scores: Small2BERT earns a 0.307 F1-score on ROUGE-1, compared to 0.35 for SWF, and on ROUGE-2 and ROUGE-L, Small2BERT scores 0.19 and 0.304 while SWF achieves 0.304 and 0.31, respectively. [ABSTRACT FROM AUTHOR]
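The comparison above rests on ROUGE precision, recall, and F1. As a quick illustration of what these overlap metrics measure, here is a minimal unigram ROUGE-1 sketch (not the paper's implementation; published evaluations normally use a standard ROUGE package):

```python
from collections import Counter

def rouge1_scores(reference: str, candidate: str) -> dict:
    """Simplified ROUGE-1: unigram overlap between a candidate summary
    and a reference summary, reported as precision, recall, and F1."""
    ref_counts = Counter(reference.lower().split())
    cand_counts = Counter(candidate.lower().split())
    # Clipped overlap: each candidate unigram is credited at most as
    # many times as it appears in the reference.
    overlap = sum((ref_counts & cand_counts).values())
    precision = overlap / max(sum(cand_counts.values()), 1)
    recall = overlap / max(sum(ref_counts.values()), 1)
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return {"precision": precision, "recall": recall, "f1": f1}

scores = rouge1_scores("the cat sat on the mat", "the cat lay on the mat")
```

ROUGE-2 applies the same clipped-overlap idea to bigrams, and ROUGE-L scores the longest common subsequence instead of fixed n-grams.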
- Published
- 2024