Drifted Twitter Spam Classification Using Multiscale Detection Test on K-L Divergence

Authors :
Xuesong Wang
Qi Kang
Jing An
Mengchu Zhou
Source :
IEEE Access, Vol 7, Pp 108384-108394 (2019)
Publication Year :
2019
Publisher :
IEEE, 2019.

Abstract

Twitter spam classification is a difficult challenge for social media platforms and cybersecurity companies. Twitter spam containing illegal links may evolve over time to deceive filtering models, causing disastrous losses to both users and the whole network. We define this distributional evolution as a concept drift scenario. To build an effective model, we adopt K-L divergence to represent the spam distribution and use a multiscale drift detection test (MDDT) to localize possible drifts therein. A base classifier is then retrained based on the detection result to improve performance. Comprehensive experiments show that K-L divergence exhibits highly consistent change patterns across features when a drift occurs. The MDDT is also shown to be effective in improving the final classification results in terms of accuracy, recall, and F-measure.
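
The idea of tracking a feature's distribution with K-L divergence can be illustrated with a minimal sketch. This is not the authors' MDDT: it is a simplified, single-scale threshold test, and the function names, bin edges, smoothing, and threshold are all assumptions made for illustration. It compares the histogram of one feature in each incoming window against a reference window and flags windows whose divergence exceeds the threshold, which is where a base classifier might be retrained.

```python
import math

def histogram(values, edges):
    """Bin values into a smoothed probability vector (hypothetical helper).

    Add-one smoothing keeps every bin non-zero so the K-L divergence is finite.
    Values outside [edges[0], edges[-1]] are ignored.
    """
    n_bins = len(edges) - 1
    counts = [1.0] * n_bins
    for v in values:
        for i in range(n_bins):
            is_last = i == n_bins - 1
            if edges[i] <= v < edges[i + 1] or (is_last and v == edges[-1]):
                counts[i] += 1.0
                break
    total = sum(counts)
    return [c / total for c in counts]

def kl_divergence(p, q):
    """D_KL(p || q) = sum_i p_i * log(p_i / q_i), in nats."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

def detect_drift(reference, windows, edges, threshold):
    """Return indices of windows whose divergence from the reference exceeds threshold.

    Assumption: a fixed threshold on a single scale stands in for a real
    drift detection test.
    """
    ref_hist = histogram(reference, edges)
    flagged = []
    for idx, window in enumerate(windows):
        if kl_divergence(histogram(window, edges), ref_hist) > threshold:
            flagged.append(idx)
    return flagged

# Made-up feature values (not data from the paper): the second window shifts upward.
reference = [0.1, 0.2, 0.15, 0.3, 0.25, 0.1, 0.2, 0.3]
windows = [[0.1, 0.2, 0.3, 0.2], [0.8, 0.9, 0.7, 0.95]]
flagged = detect_drift(reference, windows, edges=[0.0, 0.5, 1.0], threshold=0.5)
print(flagged)
```

In this toy run only the second window is flagged. A multiscale test would repeat such a comparison at several window sizes to localize the drift point more precisely; that part is omitted here.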

Details

Language :
English
ISSN :
21693536
Volume :
7
Database :
Directory of Open Access Journals
Journal :
IEEE Access
Publication Type :
Academic Journal
Accession number :
edsdoj.412edf1b3504c7680e9a564a40fe2ec
Document Type :
article
Full Text :
https://doi.org/10.1109/ACCESS.2019.2932018