TABHATE: A Target-based hate speech detection dataset in Hindi

Authors :
Sharma, Deepawali
Singh, Vivek Kumar
Gupta, Vedika
Source :
Social Network Analysis and Mining; December 2024, Vol. 14 Issue: 1
Publication Year :
2024

Abstract

Social media has become a platform for expressing opinions and emotions, but some people also use it to spread hate, targeting individuals, groups, communities, or countries. There is therefore a need to identify such content and take corrective action. Over the last few years, several techniques have been developed to automatically detect hate speech and offensive and abusive content on social media platforms. However, the majority of these studies focus on hate speech detection in English-language texts only. The non-availability of suitable datasets is a major reason for the lack of research in other languages. Hindi is one such widely spoken language for which such datasets are not available. This work attempts to bridge this research gap by presenting a curated and annotated dataset for target-based hate speech (TABHATE) in the Hindi language. The suitability of the dataset is explored by applying some standard deep learning and transformer-based models to the task of hate speech detection. The experimental results show that the dataset can be used for experimental work on hate speech detection in Hindi-language texts.

Details

Language :
English
ISSN :
1869-5450 and 1869-5469
Volume :
14
Issue :
1
Database :
Supplemental Index
Journal :
Social Network Analysis and Mining
Publication Type :
Periodical
Accession number :
ejs67455238
Full Text :
https://doi.org/10.1007/s13278-024-01355-1