Dataset of discourses about COVID-19 and financial markets from Twitter.

Authors :
Ngo VM
Source :
Data in brief [Data Brief] 2022 Jun 30; Vol. 43, pp. 108428. Date of Electronic Publication: 2022 Jun 30 (Print Publication: 2022).
Publication Year :
2022

Abstract

In this data article, a collection of 11,625,887 tweets on the topic of the COVID-19 pandemic is provided. The tweets were collected through the Twitter API from January 2020 to June 2020. In addition, we provide subsets of tweets containing discourse on both COVID-19 and financial topics. To facilitate research on sentiment analysis, the Sentiment140 dataset, which contains 1,600,000 tweets annotated as positive or negative in sentiment, is also provided (Go et al., 2009). We used the Term Frequency-Inverse Document Frequency (TF-IDF) algorithm to transform documents into numeric vectors and a logistic regression classifier to train on and predict the sentiment of tweets. These datasets may be of interest to researchers in data science, economics, social science, natural language processing, epidemiology, and public health.

Competing Interests: The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

(© 2022 The Author(s). Published by Elsevier Inc.)
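The TF-IDF and logistic regression pipeline named in the abstract can be illustrated with a minimal sketch. This is not the author's code: the four training tweets, the labels (1 = positive, 0 = negative), the whitespace tokenizer, and all function names are assumptions made for illustration, and the hyperparameters are arbitrary. It uses only the Python standard library; for a corpus the size of Sentiment140, a library implementation such as scikit-learn's `TfidfVectorizer` and `LogisticRegression` would be the more practical choice.

```python
import math
from collections import Counter

def tokenize(text):
    # Simplistic whitespace tokenizer (an assumption; real tweets need more cleaning).
    return text.lower().split()

def fit_idf(docs):
    # Smoothed inverse document frequency: ln((1 + n) / (1 + df)) + 1.
    n = len(docs)
    df = Counter(term for doc in docs for term in set(tokenize(doc)))
    return {t: math.log((1 + n) / (1 + c)) + 1.0 for t, c in df.items()}

def tfidf(doc, idf):
    # Sparse TF-IDF vector (term -> weight), L2-normalised; unseen terms are dropped.
    counts = Counter(t for t in tokenize(doc) if t in idf)
    vec = {t: c * idf[t] for t, c in counts.items()}
    norm = math.sqrt(sum(v * v for v in vec.values())) or 1.0
    return {t: v / norm for t, v in vec.items()}

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def predict_proba(vec, weights, bias):
    # Probability that the tweet is positive under the logistic model.
    return sigmoid(bias + sum(weights.get(t, 0.0) * v for t, v in vec.items()))

def train_logreg(vectors, labels, epochs=200, lr=0.5):
    # Plain stochastic gradient descent on the logistic loss, no regularisation.
    weights, bias = {}, 0.0
    for _ in range(epochs):
        for vec, y in zip(vectors, labels):
            err = predict_proba(vec, weights, bias) - y
            bias -= lr * err
            for t, v in vec.items():
                weights[t] = weights.get(t, 0.0) - lr * err * v
    return weights, bias

# Hypothetical toy data standing in for labelled Sentiment140 tweets.
train = [
    ("markets rally on vaccine hopes", 1),
    ("great gains for stocks today", 1),
    ("stocks crash as cases surge", 0),
    ("terrible losses and panic selling", 0),
]
texts = [text for text, _ in train]
labels = [label for _, label in train]

idf = fit_idf(texts)
weights, bias = train_logreg([tfidf(t, idf) for t in texts], labels)

for tweet in ["stocks rally with great gains", "panic as markets crash"]:
    p = predict_proba(tfidf(tweet, idf), weights, bias)
    print(f"{tweet!r}: P(positive) = {p:.2f}")
```

In this sketch a probability above 0.5 would be read as positive sentiment and below 0.5 as negative; the threshold is likewise an assumption.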

Details

Language :
English
ISSN :
2352-3409
Volume :
43
Database :
MEDLINE
Journal :
Data in brief
Publication Type :
Academic Journal
Accession number :
35818354
Full Text :
https://doi.org/10.1016/j.dib.2022.108428