A novel Tag Score (T_S) model with improved K-means for clustering tweets.
- Source :
- Sādhanā: Academy Proceedings in Engineering Sciences. 2020, Vol. 45 Issue 1, p1-12. 12p.
- Publication Year :
- 2020
Abstract
- Clustering tweets is useful for analyzing people's attitudes towards a particular product, and companies can use this analysis to modify their products to meet people's needs. K-means clustering has recently been widely used to cluster tweets with a bag of words as the feature set. The key factors contributing to cluster quality and clustering performance are dimensionality reduction and the initial selection of centroids. This paper addresses these issues using a newly proposed Tag Score (T_S) model with improved K-means, in which semantically similar features from the bag of words are grouped into tags, scores are modified based on sentiment polarity values, and the initial centroids are selected with the help of sentiment scores. The performance of the proposed T_S model with improved K-means is compared with the T_S model with random K-means and with conventional word vectors with random K-means on three labeled datasets and three unlabeled datasets. The results show that the proposed method produces significant results in approximately 70% of the cases in terms of purity, F-measure, intra-cluster distance and inter-cluster distance. [ABSTRACT FROM AUTHOR]
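
The abstract does not spell out how sentiment scores guide the choice of initial centroids, so the following is only a minimal sketch of the general idea of replacing random K-means initialization with a sentiment-guided one. Everything in it is an assumption made for illustration: the function names, the toy tag-score vectors, the per-tweet sentiment scores, and the seeding rule itself (sort tweets by sentiment, cut the ordering into k slices, average each slice). It is not the authors' procedure.

```python
from math import dist


def mean_vector(points):
    """Component-wise mean of a non-empty list of equal-length vectors."""
    return [sum(col) / len(points) for col in zip(*points)]


def sentiment_seeded_centroids(vectors, sentiment_scores, k):
    """Hypothetical seeding rule (an assumption, not the paper's method):
    sort points by sentiment score, cut the ordering into k contiguous
    slices, and use the mean of each slice as an initial centroid.
    Assumes k <= len(vectors) so that every slice is non-empty."""
    order = sorted(range(len(vectors)), key=lambda i: sentiment_scores[i])
    n = len(order)
    centroids = []
    for j in range(k):
        chunk = order[j * n // k:(j + 1) * n // k]
        centroids.append(mean_vector([vectors[i] for i in chunk]))
    return centroids


def kmeans(vectors, initial_centroids, max_iter=100):
    """Plain Lloyd's algorithm started from the supplied centroids."""
    centroids = [list(c) for c in initial_centroids]  # do not mutate the input
    labels = None
    for _ in range(max_iter):
        # Assignment step: nearest centroid by Euclidean distance.
        new_labels = [
            min(range(len(centroids)), key=lambda c: dist(v, centroids[c]))
            for v in vectors
        ]
        if new_labels == labels:
            break  # assignments are stable, so the centroids are too
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for c in range(len(centroids)):
            members = [v for v, lab in zip(vectors, labels) if lab == c]
            if members:
                centroids[c] = mean_vector(members)
    return labels, centroids


# Toy data (made up): four tweets as 2-dimensional tag-score vectors,
# each with an illustrative sentiment score in [-1, 1].
vectors = [[0.9, 0.1], [0.8, 0.2], [0.1, 0.9], [0.2, 0.8]]
sentiment = [0.7, 0.6, -0.5, -0.8]

seeds = sentiment_seeded_centroids(vectors, sentiment, k=2)
labels, centroids = kmeans(vectors, seeds)
print(labels)
```

Because the seeds are derived deterministically from the data, repeated runs give the same clustering, whereas random initialization can produce different results from run to run.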
Details
- Language :
- English
- ISSN :
- 02562499
- Volume :
- 45
- Issue :
- 1
- Database :
- Academic Search Index
- Journal :
- Sādhanā: Academy Proceedings in Engineering Sciences
- Publication Type :
- Academic Journal
- Accession number :
- 143238645
- Full Text :
- https://doi.org/10.1007/s12046-020-01359-5