Authorship Attribution in Bangla literature using Character-level CNN

Authors :
Khatun, Aisha
Rahman, Anisur
Islam, Md. Saiful
Marium-E-Jannat
Publication Year :
2020

Abstract

Characters are the smallest unit of text from which stylometric signals can be extracted to determine the author of a text. In this paper, we investigate the effectiveness of character-level signals in authorship attribution of Bangla literature and show that the results are promising but improvable. The proposed model is much more time- and memory-efficient than its word-level counterparts, but its accuracy is 2-5% lower than that of the best-performing word-level models. We compare it against various word-based models and show that the proposed model performs increasingly better with larger datasets. We also analyze the effect of pre-training character embeddings for the diverse Bangla character set on authorship attribution and find that pre-training improves performance by up to 10%. We use two datasets with 6 to 14 authors, balance them before training, and compare the results.
Comment: 5 pages

Details

Database :
OAIster
Publication Type :
Electronic Resource
Accession number :
edsoai.on1228386041
Document Type :
Electronic Resource