1. Sampling from networks: respondent-driven sampling.
- Author
-
Yauck, Mamadou, Moodie, Erica E.M., Apelian, Herak, Peet, Marc-Messier, Lambert, Gilles, Grace, Daniel, Lachowsky, Nathan J., Hart, Trevor A., and Cox, Joseph
- Subjects
-
*SOCIAL networks, *CROSS-sectional method, *DESCRIPTIVE statistics, *STATISTICAL sampling, *MEN who have sex with men, *STATISTICAL models, *GAY people
- Abstract
Respondent-Driven Sampling (RDS) is a variant of link-tracing, a sampling technique for surveying hard-to-reach communities that takes advantage of community members' social networks to reach potential participants. While the RDS sampling mechanism and associated methods of adjusting for the sampling at the analysis stage are well-documented in the statistical sciences literature, methodological focus has largely been restricted to estimation of population means and proportions, while giving little to no consideration to the estimation of population network parameters. As a network-based sampling method, RDS is faced with the fundamental problem of sampling from population networks where features such as homophily (the tendency for individuals with similar traits to share social ties) and differential activity (the ratio of the average number of connections by attribute) are sensitive to the choice of a sampling method. Many simple approaches exist to generate simulated RDS data, with specific levels of network features (mainly homophily and differential activity), where the focus is on estimating means and proportions (Gile 2011; Gile et al. 2015; Spiller et al. 2018). However, recent findings on the inconsistency of estimators of network features such as homophily in partially observed networks (Crawford et al. 2017; Shalizi and Rinaldo 2013) raise the question of whether those target features can be recovered using the observed RDS data alone – as recovering information about these features is critical if we wish to condition upon them. In this paper, we conduct a simulation study to assess the accuracy of existing RDS simulation methods, in terms of their abilities to generate RDS samples with the desired levels of two network parameters: homophily and differential activity. 
The results show that (1) homophily cannot be consistently estimated from simulated RDS samples and (2) differential activity estimators are more precise when groups, defined by traits, are equally active and equally represented in the population. We use this approach to mimic features of the Engage Study, an RDS sample of gay, bisexual and other men who have sex with men in Montréal, Canada. In this paper, we highlight that it is possible, in some cases, to simulate population networks by mimicking the characteristics of real-world RDS data while retaining accuracy and precision for target network features in the samples. [ABSTRACT FROM AUTHOR]
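As an illustration of the two network features named in the abstract, the following is a minimal sketch on a toy, fully observed network. Differential activity follows the abstract's definition (the ratio of the average number of connections between two attribute groups). The homophily index used here, the share of ties that join two members of the same group, is only one simple choice and is an assumption; the paper may define homophily differently. The function names, the toy data and the group labels "A" and "B" are all hypothetical.

```python
from collections import defaultdict


def mean_degree_by_group(edges, group):
    """Average number of ties per node, computed separately for each attribute value."""
    degree = defaultdict(int)
    for u, v in edges:
        degree[u] += 1
        degree[v] += 1
    totals, counts = defaultdict(int), defaultdict(int)
    for node, g in group.items():
        totals[g] += degree[node]  # nodes with no ties contribute a degree of 0
        counts[g] += 1
    return {g: totals[g] / counts[g] for g in counts}


def differential_activity(edges, group, a, b):
    """Ratio of the mean degree of group `a` to the mean degree of group `b`."""
    means = mean_degree_by_group(edges, group)
    return means[a] / means[b]


def same_group_tie_share(edges, group):
    """Assumed homophily index: fraction of ties whose two endpoints share a group."""
    same = sum(1 for u, v in edges if group[u] == group[v])
    return same / len(edges)


# Hypothetical toy population: six people, one binary attribute, undirected ties.
group = {1: "A", 2: "A", 3: "A", 4: "B", 5: "B", 6: "B"}
edges = [(1, 2), (1, 3), (2, 3), (4, 5), (3, 4), (1, 4)]

print(differential_activity(edges, group, "A", "B"))
print(same_group_tie_share(edges, group))
```

Both quantities are computed here from the full network. In an RDS sample only part of the network is observed, which is the setting in which the abstract reports that homophily cannot be consistently estimated.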
- Published
- 2021