1. Data Science Kitchen at GermEval 2021: A Fine Selection of Hand-Picked Features, Delivered Fresh from the Oven
- Author
-
Hildebrandt, Niclas, Boenninghoff, Benedikt, Orth, Dennis, and Schymura, Christopher
- Subjects
Computer Science - Computation and Language ,Computer Science - Artificial Intelligence ,Computer Science - Machine Learning - Abstract
This paper presents the contribution of the Data Science Kitchen at GermEval 2021 shared task on the identification of toxic, engaging, and fact-claiming comments. The task aims at extending the identification of offensive language, by including additional subtasks that identify comments which should be prioritized for fact-checking by moderators and community managers. Our contribution focuses on a feature-engineering approach with a conventional classification backend. We combine semantic and writing style embeddings derived from pre-trained deep neural networks with additional numerical features, specifically designed for this task. Classifier ensembles are used to derive predictions for each subtask via a majority voting scheme. Our best submission achieved macro-averaged F1-scores of 66.8\%,\,69.9\% and 72.5\% for the identification of toxic, engaging, and fact-claiming comments., Comment: Accepted at 17th Conference on Natural Language Processing (KONVENS 2021)
- Published
- 2021