Back to Search Start Over

Enhanced TextNetTopics for Text Classification Using the G-S-M Approach with Filtered fastText-Based LDA Topics and RF-Based Topic Scoring: fasTNT.

Authors :
Voskergian, Daniel
Jayousi, Rashid
Yousef, Malik
Source :
Applied Sciences (2076-3417); Oct2024, Vol. 14 Issue 19, p8914, 24p
Publication Year :
2024

Abstract

TextNetTopics is a novel topic modeling-based topic selection approach that finds highly ranked discriminative topics for training text classification models, where a topic is a set of semantically related words. However, it suffers from several limitations, including the retention of redundant or irrelevant features within topics, a computationally intensive topic-scoring mechanism, and a lack of explicit semantic modeling. In order to address these shortcomings, this paper proposes fasTNT, an enhanced version of TextNetTopics grounded in the Grouping–Scoring–Modeling approach. FasTNT aims to improve the topic selection process by preserving only informative features within topics, reforming LDA topics using fastText word embeddings, and introducing an efficient scoring method that considers topic interactions using Random Forest feature importance. Experimental results on four diverse datasets demonstrate that fasTNT outperforms the original TextNetTopics method in classification performance and feature reduction. [ABSTRACT FROM AUTHOR]

Details

Language :
English
ISSN :
20763417
Volume :
14
Issue :
19
Database :
Complementary Index
Journal :
Applied Sciences (2076-3417)
Publication Type :
Academic Journal
Accession number :
180273531
Full Text :
https://doi.org/10.3390/app14198914