Parallel dynamic topic modeling via evolving topic adjustment and term weighting scheme
- Source :
- Information Sciences. 585:176-193
- Publication Year :
- 2022
- Publisher :
- Elsevier BV, 2022.
Abstract
- The parallel Hierarchical Dirichlet Process (pHDP) is an efficient topic model that exploits the equivalence between the generative processes of the Hierarchical Dirichlet Process (HDP) and the Gamma-Gamma-Poisson Process (G2PP) to achieve parallelism at the topic level. Unfortunately, pHDP loses the non-parametric feature of HDP: the number of topics in pHDP is predetermined and fixed. Furthermore, under the bootstrap structure of pHDP, topic-indiscriminate words have a high probability of being assigned to different topics, which degrades the quality of the extracted topics. To achieve parallelism without sacrificing the non-parametric feature of HDP, and to improve the quality of the extracted topics, we propose a parallel dynamic topic model that adds a mechanism for adjusting evolving topics and reduces the sampling probabilities of topic-indiscriminate words. Both supervised and unsupervised experiments on benchmark datasets show the competitive performance of our model.
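The abstract does not say how the sampling probabilities of topic-indiscriminate words are reduced. As a rough illustration only, the sketch below assumes an entropy-based weight: a word spread evenly across topics gets a weight near 0, and a word concentrated in one topic gets a weight near 1. The function name `term_weights`, its input layout, and the entropy formula are all hypothetical and are not taken from the paper.

```python
import math


def term_weights(topic_word_counts):
    """Return a weight in [0, 1] for every word (illustrative assumption).

    topic_word_counts: list of dicts, one per topic, mapping word -> count.
    Weight = 1 - (entropy of the word's topic distribution / max entropy),
    so evenly spread (topic-indiscriminate) words score close to 0.
    """
    num_topics = len(topic_word_counts)
    vocab = set().union(*topic_word_counts)
    # With a single topic the entropy is always 0; avoid dividing by log(1) = 0.
    max_entropy = math.log(num_topics) if num_topics > 1 else 1.0

    weights = {}
    for word in vocab:
        counts = [topic.get(word, 0) for topic in topic_word_counts]
        total = sum(counts)
        entropy = 0.0
        for count in counts:
            if count > 0:
                p = count / total
                entropy -= p * math.log(p)
        weights[word] = 1.0 - entropy / max_entropy
    return weights


# Toy counts: "the" appears equally in both topics, the other words in one.
toy_counts = [
    {"the": 10, "gene": 9},
    {"the": 10, "market": 8},
]
toy_weights = term_weights(toy_counts)
```

In a sampler, a weight like this could multiply a word's contribution to the topic-assignment probability, so that evenly spread words influence topic assignments less. Whether the paper's scheme works this way is not stated in this record.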
- Subjects :
- Hierarchical Dirichlet process
Topic model
Information Systems and Management
Computer science
Computer Science Applications
Theoretical Computer Science
Dynamic topic model
Weighting
ComputingMethodologies_PATTERNRECOGNITION
Artificial Intelligence
Control and Systems Engineering
Benchmark (computing)
Feature (machine learning)
Parallelism (grammar)
Data mining
Equivalence (measure theory)
Software
Details
- ISSN :
- 0020-0255
- Volume :
- 585
- Database :
- OpenAIRE
- Journal :
- Information Sciences
- Accession number :
- edsair.doi...........001358d8b44342a72863e7eb3c793e96