1. What Should We Do about Source Selection in Event Data? Challenges, Progress, and Possible Solutions
- Author
- Thomas V. Maher and J. Craig Jenkins
- Subjects
Selection bias; 021110 strategic, defence & security studies; business.industry; Event (computing); Computer science; media_common.quotation_subject; 05 social sciences; Big data; 0211 other engineering and technologies; General Social Sciences; Complex event processing; 02 engineering and technology; Data science; Representativeness heuristic; Field (computer science); 0506 political science; 050602 political science & public administration; Econometrics; The Internet; business; Selection (genetic algorithm); media_common
- Abstract
The prospect of using the Internet and other Big Data methods to construct event data promises to transform the field but is stymied by the lack of a coherent strategy for addressing the problem of selection. Past studies have shown that event data have significant selection problems. In terms of conventional standards of representativeness, all event data have some unknown level of selection no matter how many sources are included. We summarize recent studies of news selection and outline a strategy for reducing the risks of possible selection bias, including techniques for generating multisource event inventories, estimating larger populations, and controlling for nonrandomness. These build on a relativistic strategy for addressing event selection and the recognition that no event data set can ever be declared completely free of selection bias.
- Published
- 2016