Likelihood corpus distribution: an efficient topic modelling scheme for Bengali document class identification.

Authors :
Das Dawn, Debapratim
Khan, Abhinandan
Shaikh, Soharab Hossain
Pal, Rajat Kumar
Source :
Sādhanā: Academy Proceedings in Engineering Sciences. Sep 2024, Vol. 49, Issue 3, pp. 1-19 (19 pages).
Publication Year :
2024

Abstract

The quality of human learning depends on the capacity for contemplation. Textual documents form a large part of the literature that supports contemplation and readily shapes perception. Automatic document class identification, or organisation, is a machine learning task that aims to capture the psychological and emotional content of a text concisely. The problem spans library science, information science and artificial intelligence. Research on document class identification has progressed in many of the most widely spoken languages, and numerous works have been published for European and Asian languages. However, there is a gap in the literature for low-resource languages, especially Bengali. This work therefore presents an efficient topic modelling approach for Bengali document class identification. It proposes a Dirichlet-polynomial clustering model, likelihood corpus distribution (LCD), which is based on a Bayesian numerical prototype. Experiments are conducted to demonstrate the efficiency of LCD relative to several topic modelling algorithms: latent Dirichlet allocation (LDA), LDA with bag-of-words (LDA-BOW), latent semantic indexing (LSI), and hierarchical Dirichlet process (HDP). For performance evaluation, five real-world Bengali corpora are considered: science, sports, computer, season, and epic. The coherence scores of the modelling algorithms are compared to find the best model for each dataset separately. [ABSTRACT FROM AUTHOR]

Details

Language :
English
ISSN :
0256-2499
Volume :
49
Issue :
3
Database :
Academic Search Index
Journal :
Sādhanā: Academy Proceedings in Engineering Sciences
Publication Type :
Academic Journal
Accession number :
178527531
Full Text :
https://doi.org/10.1007/s12046-024-02470-7