Keyword Extraction – Comparison of Latent Dirichlet Allocation and Latent Semantic Analysis

Authors :
Bhuvaneshwari Kondeti
Jyothirani S. A
Haragopal V. V
Source :
European Journal of Mathematics and Statistics. 3:40-47
Publication Year :
2022
Publisher :
European Open Science Publishing, 2022.

Abstract

The main aim of the present study is to compare the keywords extracted from the abstracts and from the full-length text of scientific research papers. In addition, we compare Latent Semantic Analysis (LSA) and Latent Dirichlet Allocation (LDA) to identify which performs better for keyword extraction. The comparative study is divided into three levels. In the first level, scientific research articles on topics such as Indian economic growth, GDP, and economic slowdown were collected; the abstracts and full-length text were extracted from the sources and pre-processed to remove words and characters that do not help in obtaining the semantic structures or patterns needed for a meaningful corpus. In the second level, the pre-processed data were converted into a bag of words, and the TF-IDF (Term Frequency – Inverse Document Frequency) statistic was used to assess how relevant a word is to a document in the corpus. In the third level, to study the feasibility of these Natural Language Processing (NLP) techniques, LSA and LDA were applied to the resulting corpus.
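
The second level described in the abstract (bag of words followed by TF-IDF weighting) can be illustrated with a minimal sketch. This is not the authors' code: the toy corpus, the stop-word list, the helper names (`tokenize`, `tfidf`, `top_keywords`), and the particular TF-IDF variant (relative term frequency times the natural log of N/df) are all assumptions made for illustration, since the abstract does not specify them.

```python
import math
from collections import Counter

# Hypothetical toy corpus; the study's actual documents are not reproduced here.
docs = [
    "gdp growth slowed as the economy weakened",
    "economic slowdown reduced gdp growth in india",
    "inflation and interest rates shape the economy",
]

# Tiny illustrative stop-word list (an assumption, not the study's list).
STOPWORDS = {"the", "as", "in", "and"}


def tokenize(text):
    """Lowercase, split on whitespace, and drop stop words."""
    return [w for w in text.lower().split() if w not in STOPWORDS]


def tfidf(corpus):
    """Return one {term: score} dict per document.

    Assumed variant: tf = count / document length, idf = ln(N / df).
    """
    bags = [Counter(tokenize(d)) for d in corpus]  # bag-of-words counts
    n_docs = len(bags)
    df = Counter()  # number of documents containing each term
    for bag in bags:
        df.update(bag.keys())
    scores = []
    for bag in bags:
        total = sum(bag.values())
        scores.append(
            {t: (c / total) * math.log(n_docs / df[t]) for t, c in bag.items()}
        )
    return scores


def top_keywords(scores, k=3):
    """Top-k terms per document; ties are broken alphabetically."""
    return [sorted(s, key=lambda t: (-s[t], t))[:k] for s in scores]


if __name__ == "__main__":
    for i, kws in enumerate(top_keywords(tfidf(docs))):
        print(i, kws)
```

Terms that occur in fewer documents receive a larger inverse document frequency, so they rank above terms shared across the corpus. LSA and LDA, the third level, would then operate on a document-term matrix built from such weights or counts; they are not implemented in this sketch.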

Details

ISSN :
2736-5484
Volume :
3
Database :
OpenAIRE
Journal :
European Journal of Mathematics and Statistics
Accession number :
edsair.doi...........f3747613e4e02f329fe598d8f353241d
Full Text :
https://doi.org/10.24018/ejmath.2022.3.3.119