1. NELA-GT-2022: A Large Multi-Labelled News Dataset for The Study of Misinformation in News Articles
- Author
Gruppi, Maurício, Horne, Benjamin D., and Adalı, Sibel
- Subjects
Social and Information Networks (cs.SI), FOS: Computer and information sciences, Computer Science - Computers and Society, Computer Science - Machine Learning, Computer Science - Computation and Language, Computers and Society (cs.CY), Computer Science - Social and Information Networks, Computation and Language (cs.CL), Machine Learning (cs.LG)
- Abstract
In this paper, we present the fifth installment of the NELA-GT datasets, NELA-GT-2022. The dataset contains 1,778,361 articles from 361 outlets between January 1st, 2022 and December 31st, 2022. Just as in past releases of the dataset, NELA-GT-2022 includes outlet-level veracity labels from Media Bias/Fact Check and tweets embedded in collected news articles. The NELA-GT-2022 dataset can be found at: https://doi.org/10.7910/DVN/AMCV2H
Comment: Technical report documenting the NELA-GT recent update (NELA-GT-2022). arXiv admin note: substantial text overlap with arXiv:2102.04567
- Published
2022