1. Study of violence against women and its characteristics through the application of text mining techniques
- Author
Stephanie, E. M. A., Ruiz, L. G. B., Vila, M. A., and Pegalajar, M. C.
- Abstract
The Internet provides a wide variety of information that can be collected and studied, creating a massive data repository. Among the data available on the Internet, we can find articles about Violence against Women (VAW) published in the digital press, which are of great societal interest. In this work, we used Web scraping techniques to gather VAW-related news from the Internet. Applying Text Mining techniques, we conducted a study of VAW and its characteristics. Our work comprises an exploratory analysis and the application of Topic Modelling to VAW events to identify latent topics and their semantic structures. We employed classification algorithms on a set of VAW press articles to determine the type of violence they refer to, namely physical, psychological, sexual, or a combination of these. We proposed two methodologies to assign target labels to the data: the first is based on dictionaries of VAW types, while the second extends the first by using the predominant type of violence to identify other associated types. Furthermore, we implemented two feature selection techniques: TF-IDF and Chi2. We then applied Support Vector Machine, Decision Tree, Bayesian Networks, XGBoost Classifier, Random Forest, and Artificial Neural Networks. The results showed that the classifiers performed better when using Chi2. The XGBoost Classifier demonstrated the best performance, followed by Random Forest.
- Published
2024
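As an illustration of the dictionary-based labelling mentioned in the abstract, the following is a minimal sketch only, not the authors' implementation. The term lists in `VAW_DICTIONARIES`, the function names, and the rule "most dictionary hits wins" for the predominant type are all assumptions made for this example; the abstract does not specify the actual dictionaries or rules.

```python
import re

# Hypothetical mini-dictionaries, invented for illustration.
# The dictionaries used in the paper are not given in the abstract.
VAW_DICTIONARIES = {
    "physical": {"beat", "hit", "punched", "stabbed", "injured"},
    "psychological": {"threatened", "insulted", "humiliated", "harassed"},
    "sexual": {"raped", "abused", "molested"},
}


def count_matches(text):
    """Count how many tokens of the text appear in each type's dictionary."""
    tokens = re.findall(r"[a-z]+", text.lower())
    return {
        label: sum(1 for tok in tokens if tok in terms)
        for label, terms in VAW_DICTIONARIES.items()
    }


def label_article(text):
    """Return every violence type with at least one dictionary hit.

    More than one label corresponds to a 'combination' of types.
    """
    counts = count_matches(text)
    return sorted(label for label, n in counts.items() if n > 0)


def predominant_type(text):
    """Return the type with the most hits, or None if nothing matches.

    Assumption: 'predominant' is taken to mean the highest hit count.
    """
    counts = count_matches(text)
    best = max(counts, key=counts.get)
    return best if counts[best] > 0 else None
```

For example, `label_article("He threatened her and then hit her; she was badly injured.")` yields both the physical and psychological labels, while `predominant_type` on the same text picks the physical one because it has more dictionary hits. Labels produced this way could then serve as targets for the classifiers listed in the abstract.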