Construction of Text Summarization Corpus in Economics Domain and Baseline Models.
- Source :
- Journal of Information & Communication Convergence Engineering; Mar2024, Vol. 22 Issue 1, p33-43, 11p
- Publication Year :
- 2024
Abstract
- Automated text summarization (ATS) systems rely on language resources as datasets. However, creating these datasets is a complex and labor-intensive task that requires linguists to annotate the data extensively. Consequently, publicly accessible ATS datasets for languages such as Thai are relatively scarce compared to those for more widely used languages. The primary objective of ATS is to condense large volumes of text into shorter summaries, thereby reducing the time required to extract information from extensive textual data. This study introduces ThEconSum, an ATS architecture designed specifically for the Thai language, using economy-related data. The evaluation revealed significant remaining tasks and limitations for the Thai language. [ABSTRACT FROM AUTHOR]
- Subjects :
- TEXT summarization
- NATURAL language processing
- DATA mining
- TRANSFORMER models
Details
- Language :
- English
- ISSN :
- 22348255
- Volume :
- 22
- Issue :
- 1
- Database :
- Complementary Index
- Journal :
- Journal of Information & Communication Convergence Engineering
- Publication Type :
- Academic Journal
- Accession number :
- 176346464
- Full Text :
- https://doi.org/10.56977/jicce.2024.22.1.33