Participant identification in genetic association studies: improved methods and practical implications.
- Source :
- International Journal of Epidemiology; Dec2011, Vol. 40 Issue 6, p1629-1642, 14p
- Publication Year :
- 2011
Abstract
- Background: In a recent paper by Homer et al. (Resolving individuals contributing trace amounts of DNA to highly complex mixtures using high-density SNP genotyping microarrays. PLoS Genet 2008;4:e1000167), a method for detecting whether a given individual is a contributor to a particular genomic mixture was proposed. This prompted grave concern about the public dissemination of aggregate statistics from genome-wide association studies. It is of clear scientific importance that such data be shared widely, but the confidentiality of study participants must not be compromised. The issue of what summary genomic data can safely be posted on the web is only addressed satisfactorily when the theoretical underpinnings of the proposed method are clarified and its performance evaluated in terms of dependence on underlying assumptions.
Methods: The original method raised a number of concerns and several alternatives have since been proposed, including a simple linear regression approach. In our proposed generalized estimating equation approach, we maintain the simplicity of the linear regression model but obtain inferences that are more robust to approximation of the variance/covariance structure and can accommodate linkage disequilibrium.
Results: We affirm that, in principle, it is possible to determine that a 'candidate' individual has participated in a study, given a subset of aggregate statistics from that study. However, the methods depend critically on a number of key factors including: the ancestry of participants in the study; the absolute and relative numbers of cases and controls; and the number of single nucleotide polymorphisms.
Conclusions: Simple guidelines for publication that are based on a single criterion are therefore unlikely to suffice. In particular, 'directed' summary statistics should not be posted openly on the web but could be protected by an internet-based access check as proposed by the P3G Consortium et al. (Public access to genome-wide data: five views on balancing research with privacy and protection. PLoS Genet 2009;5:e1000665). [ABSTRACT FROM AUTHOR]
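Illustrative note (not part of the published abstract): the abstract mentions a simple linear regression approach without giving the model. The sketch below shows one plausible reading of such a membership test, assuming the candidate's deviation from reference allele frequencies is regressed on the study's deviation from the same reference. The function names `simulate` and `regression_slope_z`, the simulated data, and the assumption of independent SNPs (no linkage disequilibrium) are all hypothetical. It is not the authors' generalized estimating equation method.

```python
import math
import random


def simulate(n_snps=10000, n_study=50, seed=0):
    """Simulate independent SNPs (an assumption: no linkage disequilibrium)."""
    rng = random.Random(seed)
    # Hypothetical reference-population allele frequencies.
    ref_freq = [rng.uniform(0.1, 0.9) for _ in range(n_snps)]

    def person():
        # Allele dosage scaled to 0, 0.5 or 1 at each SNP.
        return [((rng.random() < p) + (rng.random() < p)) / 2 for p in ref_freq]

    study = [person() for _ in range(n_study)]
    # Aggregate statistic a study might publish: per-SNP allele frequency.
    study_freq = [sum(ind[j] for ind in study) / n_study for j in range(n_snps)]
    # Return one true participant and one non-participant as candidates.
    return ref_freq, study_freq, study[0], person()


def regression_slope_z(candidate, study_freq, ref_freq):
    """Ordinary least squares of (candidate - ref) on (study - ref).

    Returns the slope and its z-score. Under these assumptions a slope
    near 1 suggests participation and a slope near 0 suggests none.
    """
    x = [m - p for m, p in zip(study_freq, ref_freq)]
    y = [c - p for c, p in zip(candidate, ref_freq)]
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    rss = sum((yi - mean_y - slope * (xi - mean_x)) ** 2 for xi, yi in zip(x, y))
    std_err = math.sqrt(rss / (n - 2) / sxx)
    return slope, slope / std_err


if __name__ == "__main__":
    ref, freq, member, outsider = simulate()
    print("participant:", regression_slope_z(member, freq, ref))
    print("non-participant:", regression_slope_z(outsider, freq, ref))
```

Because the standard error here treats SNPs as independent, it would likely be too small for real genotype data with linkage disequilibrium, which is the weakness the abstract says the generalized estimating equation approach is meant to address.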
- Subjects :
- DNA
GENETIC polymorphisms
DNA microarrays
STATISTICS
REGRESSION analysis
LINKAGE disequilibrium
IDENTIFICATION
CASE-control method
MEDICAL research ethics
COMPARATIVE studies
EXPERIMENTAL design
GENETIC techniques
LONGITUDINAL method
RESEARCH methodology
MEDICAL cooperation
RESEARCH
RESEARCH funding
RESEARCH ethics
EVALUATION research
GENETIC privacy
GENOTYPES
Details
- Language :
- English
- ISSN :
- 03005771
- Volume :
- 40
- Issue :
- 6
- Database :
- Complementary Index
- Journal :
- International Journal of Epidemiology
- Publication Type :
- Academic Journal
- Accession number :
- 69709118
- Full Text :
- https://doi.org/10.1093/ije/dyr149