1. Evaluation of Logistic Regression Applied to Respondent-Driven Samples: Simulated and Real Data
- Author
Sperandei, Sandro, Bastos, Leonardo S., Ribeiro-Alves, Marcelo, Reis, Arianne, and Bastos, Francisco I.
- Subjects
Statistics - Methodology
- Abstract
Objective: To investigate the impact of different logistic regression estimators applied to respondent-driven sampling (RDS) samples obtained from simulated and real data. Methods: Four simulated populations were created by combining different connectivity models, levels of clustering, and infection processes. Each subject in the population received two attributes, only one of which was related to the infection process. From each population, RDS samples of different sizes were obtained. Similarly, RDS samples were obtained from a real-world dataset. Three logistic regression estimators were applied to assess the association between the attributes and infection status, and the observed coverage of each was then measured. Results: The type of connectivity had more impact on estimator performance than the clustering level. In simulated datasets, unweighted logistic regression estimators emerged as the best option, although all estimators performed fairly well. In the real dataset, the weighted estimators showed some instability, making them a risky option. Conclusion: An unweighted logistic regression estimator is a reliable option for RDS samples, performing similarly to how it does on random samples, and should therefore be the preferred option.
- Comment
24 pages, 8 figures, 1 table
- Published
- 2021