Optimization on machine learning based approaches for sentiment analysis on HPV vaccines related tweets.

Authors :
Jingcheng Du
Jun Xu
Hsingyi Song
Xiangyu Liu
Cui Tao
Source :
Journal of Biomedical Semantics; 3/3/2017, Vol. 8, p1-7, 7p
Publication Year :
2017

Abstract

Background: Analysing public opinion on HPV vaccines on social media with machine learning-based approaches can help us understand the reasons behind low vaccine coverage and devise strategies to improve vaccine uptake.

Objective: To propose a machine learning system that can extract comprehensive public sentiment on HPV vaccines from Twitter with satisfactory performance.

Method: We collected and manually annotated 6,000 HPV vaccine-related tweets as a gold standard. A support vector machine (SVM) model was chosen, and a hierarchical classification method was proposed and evaluated. Additional feature sets were evaluated and model parameters were optimized to maximize the performance of the machine learning model.

Results: A hierarchical classification scheme containing 10 categories was built to assess public opinion toward HPV vaccines comprehensively. A gold-standard corpus of 6,000 annotated tweets, with a Kappa annotation agreement of 0.851, was created and made publicly available. Compared with the baseline model, the hierarchical classification model with optimized feature sets and model parameters increased the micro-averaged and macro-averaged F scores from 0.6732 and 0.3967 to 0.7442 and 0.5883, respectively.

Conclusions: Our work provides a systematic way to improve machine learning model performance on the highly unbalanced corpus of HPV vaccine-related tweets. Our system can be further applied to a large tweet corpus to extract large-scale public opinion toward HPV vaccines. [ABSTRACT FROM AUTHOR]
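The abstract reports both micro-averaged and macro-averaged F scores, and the gap between them is typical of an unbalanced corpus. The following is a minimal sketch of how the two averages are commonly computed for single-label classification; it is not the authors' evaluation code. The function name `f_scores`, the class labels, and the toy data are all hypothetical and chosen only for illustration.

```python
from collections import Counter


def f_scores(gold, pred):
    """Return (micro_f1, macro_f1) for single-label multi-class predictions."""
    labels = sorted(set(gold) | set(pred))
    tp, fp, fn = Counter(), Counter(), Counter()
    for g, p in zip(gold, pred):
        if g == p:
            tp[g] += 1
        else:
            fp[p] += 1  # the predicted class gains a false positive
            fn[g] += 1  # the gold class gains a false negative

    def f1(t, p, n):
        # F1 = 2TP / (2TP + FP + FN); defined as 0 when the class never occurs
        denom = 2 * t + p + n
        return 2 * t / denom if denom else 0.0

    # Micro: pool the counts over all classes, then compute one F1.
    micro = f1(sum(tp.values()), sum(fp.values()), sum(fn.values()))
    # Macro: compute F1 per class, then take the unweighted mean.
    macro = sum(f1(tp[c], fp[c], fn[c]) for c in labels) / len(labels)
    return micro, macro


# Made-up, unbalanced toy data: one dominant class and two rare ones.
gold = ["Neutral"] * 8 + ["Positive"] + ["Negative"]
pred = ["Neutral"] * 8 + ["Neutral"] + ["Negative"]

micro, macro = f_scores(gold, pred)
print(f"micro-F1 = {micro:.4f}, macro-F1 = {macro:.4f}")
```

In this toy case the single missed rare-class tweet barely affects the micro average but pulls the macro average down sharply, because each class counts equally in the macro average regardless of its size.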

Details

Language :
English
ISSN :
2041-1480
Volume :
8
Database :
Complementary Index
Journal :
Journal of Biomedical Semantics
Publication Type :
Academic Journal
Accession number :
121675245
Full Text :
https://doi.org/10.1186/s13326-017-0120-6