Online estimation of individual-level effects using streaming shrinkage factors
- Source :
- Computational Statistics & Data Analysis, 137, 16-32. Elsevier
- Publication Year :
- 2019
Abstract
- It has become increasingly easy to collect data from individuals over long periods of time. Examples include smartphone applications that track movements with GPS, web-log data tracking individuals' browsing behavior, and longitudinal (cohort) studies in which many individuals are monitored over an extended period. All these datasets cover a large number of individuals and collect data on the same individuals repeatedly, which gives the data a nested structure. Moreover, the data collection is never 'finished', as new data keep streaming in. It is well known that predictions of an individual-level effect that combine that individual's own data with the data of all other individuals are better in terms of squared error than predictions that use only the individual mean. However, when the data are both nested and streaming and the outcome variable is binary, computing these individual-level predictions can be computationally challenging. Five computationally efficient estimation methods are developed and evaluated; they do not revisit "old" data but do account for the nested data structure. The methods are based on existing shrinkage factors. A shrinkage factor is used to predict an individual-level effect (i.e., the probability of scoring a 1) by weighting the individual mean and the mean over all data points. The performance of the existing and the newly developed shrinkage factors is compared in a simulation study. While the existing methods differ in their prediction accuracy, the differences in accuracy between the novel shrinkage factors and the existing methods are extremely small. The novel methods are, however, computationally much more appealing. © 2019 Elsevier B.V. All rights reserved.
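The idea of weighting an individual mean against the overall mean while processing one observation at a time can be illustrated with a minimal Python sketch. This is not one of the five methods from the paper: the weight `n_j / (n_j + prior_strength)` is a generic, assumed shrinkage form chosen only for illustration, and the class name `StreamingShrinkage` and the constant `prior_strength` are hypothetical. The sketch only shows that such a prediction can be maintained from running counts, without revisiting earlier data points.

```python
from collections import defaultdict


class StreamingShrinkage:
    """Illustrative online shrinkage estimate of per-individual probabilities.

    Assumption: the weight n_j / (n_j + prior_strength) is a generic
    shrinkage form, NOT one of the shrinkage factors studied in the paper.
    """

    def __init__(self, prior_strength=5.0):
        # Hypothetical tuning constant: larger values pull predictions
        # more strongly toward the mean over all data points.
        self.prior_strength = prior_strength
        self.total_n = 0              # number of observations seen so far
        self.total_sum = 0            # number of 1s seen so far
        self.n = defaultdict(int)     # observations per individual
        self.s = defaultdict(int)     # number of 1s per individual

    def update(self, individual, y):
        """Process one binary observation; only running counts are stored."""
        self.total_n += 1
        self.total_sum += y
        self.n[individual] += 1
        self.s[individual] += y

    def predict(self, individual):
        """Weighted combination of the individual mean and the overall mean."""
        if self.total_n == 0:
            raise ValueError("no data seen yet")
        grand_mean = self.total_sum / self.total_n
        n_j = self.n.get(individual, 0)
        if n_j == 0:
            # Unseen individual: fall back to the overall mean.
            return grand_mean
        individual_mean = self.s[individual] / n_j
        weight = n_j / (n_j + self.prior_strength)
        return weight * individual_mean + (1 - weight) * grand_mean


# Usage: feed (individual, outcome) pairs as they arrive.
model = StreamingShrinkage(prior_strength=2.0)
for person, outcome in [("a", 1), ("a", 1), ("b", 0), ("b", 0), ("b", 0), ("b", 1)]:
    model.update(person, outcome)
print(model.predict("a"), model.predict("b"), model.predict("c"))
```

Individuals with few observations are pulled strongly toward the overall mean, while individuals with many observations stay close to their own mean; each update costs constant time and memory grows only with the number of individuals.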
- Subjects :
- Statistics and Probability
Concept drift
Mean squared error
Computer science
James–Stein estimator
Inference
01 natural sciences
010104 statistics & probability
0502 economics and business
0101 mathematics
050205 econometrics
Shrinkage
Data collection
Data stream mining
Applied Mathematics
Multilevel
05 social sciences
Computational Mathematics
Data point
Shrinkage factors
Computational Theory and Mathematics
Online learning
Data mining
Nested data
Data streams
Details
- Language :
- English
- ISSN :
- 0167-9473
- Volume :
- 137
- Database :
- OpenAIRE
- Journal :
- Computational Statistics & Data Analysis
- Accession number :
- edsair.doi.dedup.....27d9c530d8fa2254eea45c2d5a3763dd