Understanding the effects of dichotomization of continuous outcomes on geostatistical inference
- Publication Year :
- 2021
Abstract
- Diagnosis is often based on whether a continuous health indicator exceeds a predefined cut-off value, so as to classify patients as positive or negative for the disease under investigation. In this paper, we investigate the effects of dichotomization of spatially-referenced continuous outcome variables on geostatistical inference. Although this issue has been extensively studied in other fields, dichotomization is still common practice in epidemiological studies. Furthermore, the effects of this practice in the context of prevalence mapping are not fully understood. Here, we demonstrate how spatial correlation affects the loss of information due to dichotomization, how linear geostatistical models can be used to map disease prevalence and thus avoid dichotomization, and finally, how dichotomization affects our predictive inference on prevalence. To pursue these objectives, we develop a metric, based on the composite likelihood, which can be used to quantify the potential loss of information after dichotomization without requiring the fitting of binomial geostatistical models. Through a simulation study and two applications on disease mapping in Africa, we show that, as the thresholds used for dichotomization move further away from the mean of the underlying process, the performance of binomial geostatistical models deteriorates substantially. We also find that dichotomization can lead to the loss of fine-scale features of disease prevalence and to increased uncertainty in the parameter estimates, especially in the presence of a large noise-to-signal ratio. These findings strongly support the conclusions of previous studies that dichotomization should always be avoided whenever feasible.
- Comment: 18 pages, 5 figures, to be published in the journal Spatial Statistics
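The abstract's claim that information loss grows as the threshold moves away from the mean can be illustrated with a much simpler, non-spatial calculation. The sketch below is not the composite-likelihood metric developed in the paper. It assumes independent observations Y ~ N(mu, sigma^2) with known sigma, and computes the share of Fisher information about mu that survives when Y is replaced by the indicator 1{Y > c}. The function names and the standardized threshold t = (c - mu) / sigma are illustrative choices, not taken from the paper.

```python
import math


def normal_pdf(t: float) -> float:
    """Standard normal density."""
    return math.exp(-0.5 * t * t) / math.sqrt(2.0 * math.pi)


def normal_cdf(t: float) -> float:
    """Standard normal distribution function, via the error function."""
    return 0.5 * (1.0 + math.erf(t / math.sqrt(2.0)))


def relative_efficiency(t: float) -> float:
    """Fraction of Fisher information about the mean retained by 1{Y > c}.

    t is the standardized threshold (c - mu) / sigma. The continuous
    outcome carries information 1 / sigma^2 about mu; the binary indicator
    carries phi(t)^2 / (sigma^2 * Phi(t) * (1 - Phi(t))). The ratio below
    therefore does not depend on sigma.
    """
    p = normal_cdf(t)
    return normal_pdf(t) ** 2 / (p * (1.0 - p))


if __name__ == "__main__":
    # Thresholds further from the mean retain less information.
    for t in (0.0, 0.5, 1.0, 1.5, 2.0):
        print(f"t = {t:.1f}: retained information = {relative_efficiency(t):.3f}")
```

Under these assumptions the retained share is 2/pi (about 64%) when the threshold sits exactly at the mean, and it shrinks as the threshold moves into either tail. Spatial correlation, which the paper treats explicitly, is ignored here.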
- Subjects :
- Statistics and Probability
FOS: Computer and information sciences
Quasi-maximum likelihood
Computer science
0208 environmental biotechnology
Inference
Context (language use)
02 engineering and technology
Management, Monitoring, Policy and Law
01 natural sciences
Statistics - Applications
Outcome (probability)
020801 environmental engineering
Methodology (stat.ME)
010104 statistics & probability
Predictive inference
Statistics
Applications (stat.AP)
Metric (unit)
0101 mathematics
Computers in Earth Sciences
Scale (map)
Statistics - Methodology
Details
- Language :
- English
- ISSN :
- 2211-6753
- Database :
- OpenAIRE
- Accession number :
- edsair.doi.dedup.....b6ccf44c32d3add9800ce5d7c31d92f2
- Full Text :
- https://doi.org/10.1016/j.spasta.2020.100424