CiTIUS-COLE at SemEval-2019 Task 5: Combining Linguistic Features to Identify Hate Speech Against Immigrants and Women on Multilingual Tweets
- Source :
- SemEval@NAACL-HLT, Scopus-Elsevier
- Publication Year :
- 2019
- Publisher :
- Association for Computational Linguistics, 2019.
Abstract
- This article describes the strategy submitted by the CiTIUS-COLE team to SemEval 2019 Task 5, a binary classification task in which the system predicts whether a tweet in English or Spanish is hateful against women or immigrants. The proposed strategy relies on combining linguistic features to improve the classifier's performance. More precisely, the method combines textual and lexical features, joining word embeddings with the bag of words in a Term Frequency-Inverse Document Frequency (TF-IDF) representation. The system reaches an F1 of about 81% on the training dataset, but on the official test dataset its F1 for the hate speech class drops to 36% for English and 64% for Spanish.
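
The feature combination described in the abstract can be illustrated with a minimal sketch. This is not the CiTIUS-COLE implementation: the tokenizer, the smoothed IDF formula, the two lexical features (lexicon hits and token count), and every function name here are assumptions chosen for illustration, and the word-embedding component is left out. The sketch only shows the general idea of concatenating a TF-IDF bag-of-words vector with a few lexicon-based features before passing the result to a classifier.

```python
import math
from collections import Counter


def tokenize(text):
    # Assumption: lowercase whitespace tokenization; a real system would
    # likely use a tweet-aware tokenizer.
    return text.lower().split()


def fit_idf(docs):
    # Smoothed inverse document frequency: log((1 + N) / (1 + df)) + 1.
    # This particular smoothing is an assumption, not taken from the paper.
    n_docs = len(docs)
    doc_freq = Counter()
    for doc in docs:
        doc_freq.update(set(tokenize(doc)))
    vocab = sorted(doc_freq)
    idf = {t: math.log((1 + n_docs) / (1 + doc_freq[t])) + 1.0 for t in vocab}
    return vocab, idf


def tfidf_vector(doc, vocab, idf):
    # Raw term count times IDF, one entry per vocabulary term.
    counts = Counter(tokenize(doc))
    return [counts[t] * idf[t] for t in vocab]


def lexical_features(doc, lexicon):
    # Hypothetical lexical features: number of tokens found in a lexicon,
    # and the total number of tokens.
    tokens = tokenize(doc)
    hits = sum(1 for t in tokens if t in lexicon)
    return [float(hits), float(len(tokens))]


def combined_vector(doc, vocab, idf, lexicon):
    # Concatenate the bag-of-words TF-IDF vector with the lexical features.
    return tfidf_vector(doc, vocab, idf) + lexical_features(doc, lexicon)


if __name__ == "__main__":
    # Toy, neutral placeholder texts; not data from the shared task.
    docs = ["good morning everyone", "bad bad day"]
    lexicon = {"bad"}  # hypothetical lexicon
    vocab, idf = fit_idf(docs)
    for doc in docs:
        print(doc, combined_vector(doc, vocab, idf, lexicon))
```

Each resulting vector has one entry per vocabulary term plus the two lexical features, so it could be fed to any standard classifier. In this design, an embedding vector would be one more list appended in `combined_vector`.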
Details
- Database :
- OpenAIRE
- Journal :
- Proceedings of the 13th International Workshop on Semantic Evaluation
- Accession number :
- edsair.doi.dedup.....1f6fd9548992f69744cae6fd7beb0c0d
- Full Text :
- https://doi.org/10.18653/v1/s19-2068