Online Multilingual Hate Speech Detection: Experimenting with Hindi and English Social Media
- Source :
- Information, Volume 12, Issue 1, Article 5 (2021)
- Publication Year :
- 2020
- Publisher :
- Multidisciplinary Digital Publishing Institute, 2020.
Abstract
- The last two decades have seen an exponential increase in the use of the Internet and social media, which has changed basic human interaction. This has led to many positive outcomes, but it has also brought risks and harms. The volume of harmful content online, such as hate speech, is too large for humans to moderate manually, and interest in the academic community in automated hate speech detection has grown accordingly. In this study, we analyse six publicly available datasets by combining them into a single homogeneous dataset. Having classified the data into three classes (abusive, hateful or neither), we create a baseline model and improve its performance scores using various optimisation techniques. After attaining a competitive performance score, we create a tool that identifies and scores a page with an effective metric in near-real-time and uses the resulting feedback to re-train our model. We demonstrate the competitive performance of our multilingual model in two languages, English and Hindi, where it achieves comparable or superior performance to most monolingual models.
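The dataset-combination step described in the abstract can be illustrated with a minimal sketch. This is not the authors' pipeline: the dataset names, native labels, and mappings below are hypothetical placeholders, since this record does not list the six datasets or their original label schemes. Only the three target classes (abusive, hateful, neither) come from the abstract.

```python
from collections import Counter

# Target label scheme named in the abstract.
TARGET_LABELS = ("abusive", "hateful", "neither")

# Hypothetical per-dataset mappings from native labels to the unified
# scheme; the real datasets and their labels are not given in this record.
LABEL_MAPS = {
    "dataset_a": {"offensive": "abusive", "hate": "hateful", "none": "neither"},
    "dataset_b": {"1": "hateful", "0": "neither"},
}


def harmonise(datasets):
    """Merge (text, native_label) rows from several datasets into one list
    of (text, unified_label, source) rows, skipping unmapped labels."""
    merged = []
    for name, rows in datasets.items():
        mapping = LABEL_MAPS[name]
        for text, native in rows:
            unified = mapping.get(native)
            if unified is None:
                # Native label has no counterpart in the three-class scheme.
                continue
            merged.append((text, unified, name))
    return merged


# Toy rows standing in for real social media posts.
datasets = {
    "dataset_a": [
        ("example one", "offensive"),
        ("example two", "none"),
        ("example three", "spam"),  # unmapped, so it is dropped
    ],
    "dataset_b": [
        ("example four", "1"),
        ("example five", "0"),
    ],
}

merged = harmonise(datasets)
label_counts = Counter(label for _, label, _ in merged)
```

Keeping the source name on each row is a design choice that makes it possible to check class balance per dataset before training a baseline model on the merged data.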
- Subjects :
- 050101 languages & linguistics
text classification
Computer science
social media
hate speech
02 engineering and technology
020204 information systems
0202 electrical engineering, electronic engineering, information engineering
artificial_intelligence_robotics
0501 psychology and cognitive sciences
Hindi
Voice activity detection
lcsh:T58.5-58.64
lcsh:Information technology
05 social sciences
Baseline model
Homogeneous
Academic community
The Internet
Metric (unit)
Artificial intelligence
Natural language processing
Information Systems
Details
- Language :
- English
- ISSN :
- 2078-2489
- Database :
- OpenAIRE
- Journal :
- Information
- Accession number :
- edsair.doi.dedup.....786dc6ad003a07c7469ebb9724f56ebb
- Full Text :
- https://doi.org/10.3390/info12010005