Identifying and exploiting trait-relevant tissues with multiple functional annotations in genome-wide association studies
- Source :
- PLoS Genetics, Vol 14, Iss 1, p e1007186 (2018), PLoS Genetics
- Publication Year :
- 2018
- Publisher :
- Public Library of Science (PLoS), 2018.
Abstract
- Genome-wide association studies (GWASs) have identified many disease-associated loci, the majority of which have unknown biological functions. Understanding the mechanism underlying trait associations requires identifying trait-relevant tissues and investigating associations in a trait-specific fashion. Here, we extend the widely used linear mixed model to incorporate multiple SNP functional annotations from omics studies with GWAS summary statistics to facilitate the identification of trait-relevant tissues, with which to further construct powerful association tests. Specifically, we rely on a generalized estimating equation-based algorithm for parameter inference, a mixture modeling framework for trait-tissue relevance classification, and a weighted sequence kernel association test constructed based on the identified trait-relevant tissues for powerful association analysis. We refer to our analytic procedure as the Scalable Multiple Annotation integration for trait-Relevant Tissue identification and usage (SMART). With extensive simulations, we show how our method can make use of multiple complementary annotations to improve the accuracy of identifying trait-relevant tissues. In addition, our procedure allows us to make use of the inferred trait-relevant tissues, for the first time, to construct more powerful SNP set tests. We apply our method for an in-depth analysis of 43 traits from 28 GWASs using tissue-specific annotations in 105 tissues derived from ENCODE and Roadmap. Our results reveal new trait-tissue relevance, pinpoint important annotations that are informative of trait-tissue relationships, and illustrate how we can use the inferred trait-relevant tissues to construct more powerful association tests in the Wellcome Trust Case Control Consortium study.
Author summary
- Identifying trait-relevant tissues is an important step towards understanding disease etiology. Computational methods have recently been developed to integrate SNP functional annotations generated from omics studies into genome-wide association studies (GWASs) to infer trait-relevant tissues. However, two important questions remain to be answered. First, with the increasing number and types of functional annotations now available, how do we integrate multiple annotations jointly into GWASs in a trait-specific fashion? Doing so would allow us to take advantage of the complementary information contained in these annotations to optimize the performance of trait-relevant tissue inference. Second, what should we do with the inferred trait-relevant tissues? Here, we develop a new statistical method and software to make progress on both fronts. For the first question, we extend the commonly used linear mixed model, with new algorithms and inference strategies, to incorporate multiple annotations in a trait-specific fashion to improve trait-relevant tissue inference accuracy. For the second question, we rely on the close relationship between our proposed method and the widely used sequence kernel association test, and use the inferred trait-relevant tissues, for the first time, to construct more powerful association tests. We illustrate the benefits of our method through extensive simulations and applications to a wide range of real data sets.
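The abstract mentions a weighted sequence kernel association test built from annotations in the inferred trait-relevant tissue. As a rough illustration only, the sketch below computes a generic weighted SKAT-type statistic, Q = sum_j w_j (g_j' r)^2, on simulated data. It is not the SMART software: the function names, the exponential mapping from annotations to weights, the coefficient values, and the permutation p-value are all assumptions made for this example, and the generalized estimating equation inference and mixture modeling steps described above are not shown.

```python
import numpy as np


def annotation_weights(annotations, coefs, intercept=0.0):
    """Hypothetical mapping from per-SNP annotations to positive weights.

    The exponential link is an assumption for this sketch, chosen only
    because it keeps every weight strictly positive.
    """
    return np.exp(intercept + annotations @ coefs)


def weighted_skat_statistic(genotypes, phenotype, weights):
    """Weighted SKAT-type statistic Q = sum_j w_j * (g_j' r)^2.

    r is the phenotype residual under an intercept-only null model.
    """
    resid = phenotype - phenotype.mean()
    scores = genotypes.T @ resid  # per-SNP score statistics
    return float(np.sum(weights * scores ** 2))


def permutation_pvalue(genotypes, phenotype, weights, n_perm=200, seed=0):
    """Permutation p-value, used here instead of an analytic null distribution."""
    rng = np.random.default_rng(seed)
    observed = weighted_skat_statistic(genotypes, phenotype, weights)
    exceed = sum(
        weighted_skat_statistic(genotypes, rng.permutation(phenotype), weights)
        >= observed
        for _ in range(n_perm)
    )
    return (1 + exceed) / (1 + n_perm)


# Simulated toy data: 200 individuals, 10 SNPs, 2 binary annotations per SNP.
rng = np.random.default_rng(0)
n_samples, n_snps = 200, 10
G = rng.binomial(2, 0.3, size=(n_samples, n_snps)).astype(float)
A = rng.binomial(1, 0.5, size=(n_snps, 2)).astype(float)
y = rng.normal(size=n_samples)

coefs = np.array([1.0, 0.5])  # made-up annotation coefficients
w = annotation_weights(A, coefs)
q = weighted_skat_statistic(G, y, w)
p_value = permutation_pvalue(G, y, w)
print(f"Q = {q:.2f}, permutation p-value = {p_value:.3f}")
```

The statistic equals the quadratic form r' G W G' r with W = diag(w), which is the sense in which SNP-level weights define a kernel over individuals. SNPs carrying annotations with larger coefficients contribute more to Q.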
- Subjects :
- 0301 basic medicine
Cancer Research
Genomics Statistics
Databases, Factual
Computer science
Normal Distribution
Test Statistics
Social Sciences
Inference
Genome-wide association study
Biochemistry
Histones
0302 clinical medicine
Mathematical and Statistical Techniques
Sociology
Consortia
Child
Genetics (clinical)
0303 health sciences
Simulation and Modeling
Genomics
Identification (information)
Phenotype
Molecular Sequence Annotation
Physical Sciences
Histone Methyltransferases
Statistics (Mathematics)
Algorithms
Research Article
Adult
lcsh:QH426-470
Quantitative Trait Loci
Locus (genetics)
Computational biology
Biology
Research and Analysis Methods
ENCODE
Polymorphism, Single Nucleotide
Molecular Genetics
03 medical and health sciences
DNA-binding proteins
Genome-Wide Association Studies
Genetics
Humans
Computer Simulation
Relevance (information retrieval)
Statistical Methods
Molecular Biology
Ecology, Evolution, Behavior and Systematics
030304 developmental biology
Statistical hypothesis testing
Genetic association
Mechanism (biology)
Biology and Life Sciences
Computational Biology
Proteins
Human Genetics
Histone-Lysine N-Methyltransferase
Genome Analysis
Probability Theory
Probability Distribution
Microarray Analysis
lcsh:Genetics
030104 developmental biology
030217 neurology & neurosurgery
Mathematics
Genome-Wide Association Study
Details
- ISSN :
- 15537404
- Volume :
- 14
- Database :
- OpenAIRE
- Journal :
- PLOS Genetics
- Accession number :
- edsair.doi.dedup.....d4a90217d7bb3f2cee0a12b990583178