Classification of toxic comments unified through diverse internet forums.
- Source :
- AIP Conference Proceedings. 2023, Vol. 2754 Issue 1, p1-6. 6p.
- Publication Year :
- 2023
Abstract
- In the last half-decade, India has seen exponential growth in the Internet and social media. This huge growth resulted in better communication among friends and families and freely spread information, content, opinions, and ideas. Some users misuse this freedom and make social media platforms intolerable. The magnitude of detrimental content online, such as toxic comments or content, is not manageable by humans. This study creates a homogeneous dataset by manually labelling comments taken from social platforms and combining them with some publicly available datasets. We have classified them into two category labels, toxic and non-toxic. This work presents our unified dataset, including a wide spectrum of comments and an approach to classify Hinglish comments using the BERT transformer model. The study also includes training baseline models and depicting their performance based on selected evaluation criteria. The BERT model outperformed the baseline and other models trained on the unified dataset. This study gives importance to Hinglish comments and provides an implementation for classifying them to make internet platforms much more secure and friendly for regional language users. [ABSTRACT FROM AUTHOR]
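The dataset-unification step described in the abstract (manually labelled comments combined with public datasets under two labels, toxic and non-toxic) might look roughly like the sketch below. It is illustrative only and not the authors' code: the source label vocabulary, the sample comments, the `Comment` type, and the helpers `to_binary` and `unify` are all hypothetical. Dropping repeated texts is an added assumption that the abstract does not mention.

```python
from dataclasses import dataclass

# Hypothetical set of source labels treated as toxic; the abstract does not
# say which labels the public datasets originally used.
TOXIC_SOURCE_LABELS = {"toxic", "abusive", "hate", "offensive", "1"}


@dataclass(frozen=True)
class Comment:
    text: str
    toxic: bool


def to_binary(label: str) -> bool:
    """Map a source-specific label onto the two-class scheme."""
    return label.strip().lower() in TOXIC_SOURCE_LABELS


def unify(*sources):
    """Merge (text, label) pairs from several sources into one list.

    Texts are compared case-insensitively with whitespace collapsed, and
    the first occurrence wins (an assumption, not taken from the paper).
    """
    seen = set()
    unified = []
    for source in sources:
        for text, label in source:
            key = " ".join(text.lower().split())
            if not key or key in seen:
                continue
            seen.add(key)
            unified.append(Comment(text=text.strip(), toxic=to_binary(label)))
    return unified


# Made-up Hinglish samples standing in for the two kinds of source.
manual = [("Tum bahut acche ho", "non-toxic"), ("tu bilkul bekaar hai", "toxic")]
public = [("Tu bilkul  bekaar hai", "abusive"), ("great video bhai", "0")]
dataset = unify(manual, public)
```

A list like `dataset` could then be split into training and evaluation sets for a classifier such as the BERT model the abstract mentions; that step is not sketched here.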
- Subjects :
- *LANGUAGE models
*SOCIAL media
*INTERNET forums
*CLASSIFICATION
Details
- Language :
- English
- ISSN :
- 0094243X
- Volume :
- 2754
- Issue :
- 1
- Database :
- Academic Search Index
- Journal :
- AIP Conference Proceedings
- Publication Type :
- Conference
- Accession number :
- 171390447
- Full Text :
- https://doi.org/10.1063/5.0169608