Panini: a transformer-based grammatical error correction method for Bangla.

Authors :
Hossain, Nahid
Bijoy, Mehedi Hasan
Islam, Salekul
Shatabda, Swakkhar
Source :
Neural Computing & Applications. Mar 2024, Vol. 36 Issue 7, p3463-3477. 15p.
Publication Year :
2024

Abstract

The Bangla grammatical error correction (BGEC) task aims to automatically identify and correct syntactic, morphological, semantic, and punctuation mistakes in written Bangla text using computational models, improving the precision and fluency of the language. The task matters because it supports language learning, aids effective communication, and keeps written text clear and accurate, reducing the risk of ambiguity or unintended meanings. Prior work has tried to overcome the limitations of rule-based and statistical methods with machine learning and deep learning methods, aiming to improve accuracy by capturing intricate linguistic patterns, contextual cues, and semantic nuances. In this study, we address the absence of a baseline for the task by developing a large-scale parallel corpus of 7.7M source-target pairs and exploring the untapped potential of transformers. Alongside the corpus, we introduce Panini, an efficient Vaswani-style monolingual transformer-based Bangla grammatical error corrector that leverages transfer learning. Panini sets the state of the art for the task, surpassing BanglaT5 and T5-Small by 18.81% and 23.8% in accuracy and by 11.5 and 15.6 in SacreBLEU score, respectively. These empirical results indicate that it captures intricate linguistic rules and patterns better than the other approaches. We also evaluate the proposed method on the Bangla paraphrase task, where it outperforms the previous state-of-the-art method as well. The BanglaGEC corpus and Panini, along with the baselines for BGEC and the Bangla paraphrase task, are publicly accessible at https://tinyurl.com/BanglaGEC.
[ABSTRACT FROM AUTHOR]

Details

Language :
English
ISSN :
0941-0643
Volume :
36
Issue :
7
Database :
Academic Search Index
Journal :
Neural Computing & Applications
Publication Type :
Academic Journal
Accession number :
175359169
Full Text :
https://doi.org/10.1007/s00521-023-09211-7