1. Towards comprehensive cyberbullying detection: A dataset incorporating aggressive texts, repetition, peerness, and intent to harm.
- Author
-
Ejaz, Naveed; Razi, Fakhra; and Choudhury, Salimur
- Subjects
- *AFFINITY groups, *VIOLENCE, *MACHINE learning, *ACCESS to information, *AUTOMATION, *CYBERBULLYING, *TEXT messages, *INTENTION
- Abstract
The increasing use of social media networks has raised concerns about the growing frequency of cyberbullying incidents. The definition of cyberbullying lacks universal consensus, yet according to several authors, cyberbullying is characterized by aggressive, repetitive, and intentional communication among peers. However, existing cyberbullying detection datasets often focus solely on classifying texts as aggressive or non-aggressive, neglecting the other aspects of cyberbullying and thus hindering research progress. To address this gap, this paper proposes a framework for designing a new dataset that incorporates all four aspects of cyberbullying. The text messages are sourced from a real dataset, while the users' data is generated synthetically. The resulting dataset contains messages exchanged randomly among different pairs of users, thus introducing repetition. Additionally, a degree of peerness is defined and calculated to measure the likelihood that two users are peers. The intent to harm is quantified as a numeric value using the ratios of aggression and repetition. As a result, the proposed dataset encompasses all four aspects of cyberbullying by providing repeated aggressive messages among users along with quantitative values of the degree of peerness and intent to harm. The proposed dataset is adaptable, with adjustable threshold values for peerness, repetition, and intent to harm, offering flexibility for various applications. The paper concludes by presenting the results of some baseline machine-learning methods on the proposed dataset.
• Generation of a semi-synthetic dataset for automatic cyberbullying detection.
• The dataset includes all aspects of cyberbullying: aggression, repetition, intent to harm, and occurrence among peers.
• Presentation of baseline results on the proposed dataset.
[ABSTRACT FROM AUTHOR]
- Published
- 2024