Content Based Spam Text Classification: An Empirical Comparison between English and Chinese

Authors :
Yichuan Wang
Jianfeng Ma
Liumei Zhang
Source :
INCoS
Publication Year :
2013
Publisher :
IEEE, 2013.

Abstract

Spam text, including e-mail and SMS messages, is a real and growing problem, primarily due to the availability of digital handsets and the Internet. Filtering spam text has become a prominent topic across various research areas. The text bodies of different forms of communication provide a channel for spammers. In this study, text datasets in English and Chinese are pre-processed. Classical classifiers are then applied to the pre-processed datasets to evaluate the accuracy of the same classifier on each language. The behavior of the classifiers on English and Chinese text is compared, and the experimental results are discussed. In addition, whereas most existing spam text detection methods are based on English, the study finds that classifiers suited to English text classification are insufficient for Chinese text classification.

Details

Database :
OpenAIRE
Journal :
2013 5th International Conference on Intelligent Networking and Collaborative Systems
Accession number :
edsair.doi...........54736257aec5def3126739b1df52210a
Full Text :
https://doi.org/10.1109/incos.2013.21