Study on the Domain Adaption of Korean Speech Act using Daily Conversation Dataset and Petition Corpus

Authors :
Youngsook Song
Won Ik Cho
Source :
Journal of Data Mining and Digital Humanities, Vol NLP4DH, Iss Dataset (2024)
Publication Year :
2024
Publisher :
Nicolas Turenne, 2024.

Abstract

In Korean, quantitative speech act studies have usually been conducted on single utterances with unspecified sources. In this study, we annotate sentences from the National Institute of Korean Language's Messenger Corpus and the National Petition Corpus, as well as example sentences from an academic paper on contemporary Korean vlogging, and check the discrepancy between human annotation and model prediction. In particular, for sentences with differences in locutionary and illocutionary forces, we analyze the causes of errors to see if stylistic features used in a particular domain affect the correct inference of speech act. Through this, we see the necessity to build and analyze a balanced corpus in various text domains, taking into account cases with different usage roles, e.g., messenger conversations belonging to private conversations and petition corpus/vlogging script that have an unspecified audience.

Details

Language :
English
ISSN :
2416-5999
Volume :
NLP4DH
Issue :
Dataset
Database :
Directory of Open Access Journals
Journal :
Journal of Data Mining and Digital Humanities
Publication Type :
Academic Journal
Accession number :
edsdoj.ffdc6c2185704653bde7c90e33f516dc
Document Type :
article
Full Text :
https://doi.org/10.46298/jdmdh.13145