Smooth-Threshold Multivariate Genetic Prediction with Unbiased Model Selection
- Source :
- Genetic Epidemiology. 40:233-243
- Publication Year :
- 2016
- Publisher :
- Wiley, 2016.
Abstract
- We develop a new genetic prediction method, smooth-threshold multivariate genetic prediction, using single nucleotide polymorphism (SNP) data from genome-wide association studies (GWASs). The method consists of two stages. In the first stage, unlike the discontinuous SNP screening used in the gene score method, our method screens SNPs continuously based on the output of a standard univariate analysis of the marginal association of each SNP. In the second stage, the predictive model is built by a generalized ridge regression that uses all screened SNPs simultaneously, with each SNP’s weight determined by the strength of its marginal association. Continuous SNP screening by smooth thresholding not only stabilizes prediction but also yields a closed-form expression for the generalized degrees of freedom (GDF). The GDF in turn gives Stein’s unbiased risk estimate (SURE), which enables a data-dependent choice of the optimal SNP screening cutoff without cross-validation. The method is fast because the computationally expensive genome-wide scan is required only once, in contrast to penalized regression methods such as the lasso and elastic net. Simulation studies that mimic real GWAS data with quantitative and binary traits demonstrate that the proposed method outperforms the gene score method and genomic best linear unbiased prediction (GBLUP), and performs comparably to, and sometimes better than, the lasso and elastic net, which are known to have good predictive ability but heavy computational cost. Application to whole-genome sequencing (WGS) data from the Alzheimer’s Disease Neuroimaging Initiative (ADNI) shows that the proposed method has higher predictive power than the gene score and GBLUP methods.
- Subjects :
- 0301 basic medicine
Elastic net regularization
Multivariate statistics
Epidemiology
Single-nucleotide polymorphism
Best linear unbiased prediction
Polymorphism, Single Nucleotide
01 natural sciences
Article
010104 statistics & probability
03 medical and health sciences
Quantitative Trait, Heritable
Lasso (statistics)
Alzheimer Disease
Statistics
Humans
0101 mathematics
Genetics (clinical)
Mathematics
Genetic association
Models, Genetic
Genome, Human
business.industry
Model selection
Reproducibility of Results
Pattern recognition
Genomics
Regression
Phenotype
030104 developmental biology
Research Design
Regression Analysis
Artificial intelligence
business
Algorithms
Genome-Wide Association Study
Details
- ISSN :
- 07410395
- Volume :
- 40
- Database :
- OpenAIRE
- Journal :
- Genetic Epidemiology
- Accession number :
- edsair.doi.dedup.....0611bb9cd0d60cb8a9612c8ffcc70ea9
- Full Text :
- https://doi.org/10.1002/gepi.21958
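The two-stage procedure summarized in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation. The abstract does not give the smooth-threshold weight or the ridge penalty, so the forms used here (`smooth_threshold_weights`, the penalty `lam * (1 - w) / w`, and the parameters `cutoff`, `gamma`, and `lam`) are hypothetical choices that only mimic the described behavior: weights grow continuously from zero as the marginal association strengthens, and weakly associated SNPs are shrunk harder. The cutoff is fixed by hand; the SURE-based cutoff selection is not reproduced.

```python
import numpy as np


def marginal_z(X, y):
    """Stage 1 input: t-statistic of the simple-regression slope for each SNP."""
    n = X.shape[0]
    Xc = X - X.mean(axis=0)
    yc = y - y.mean()
    # Pearson correlation of each SNP column with the trait
    r = (Xc.T @ yc) / (np.linalg.norm(Xc, axis=0) * np.linalg.norm(yc))
    return r * np.sqrt((n - 2) / (1.0 - r ** 2))


def smooth_threshold_weights(z, cutoff, gamma=1.0):
    """Assumed weight (not from the paper): 0 at or below the cutoff,
    then rising continuously toward 1 as |z| grows. Requires cutoff > 0."""
    absz = np.maximum(np.abs(z), 1e-12)
    return np.clip(1.0 - (cutoff / absz) ** (1.0 + gamma), 0.0, 1.0)


def fit_generalized_ridge(X, y, w, lam=1.0):
    """Stage 2: joint ridge fit on screened SNPs with SNP-specific penalties."""
    beta = np.zeros(X.shape[1])
    keep = w > 0
    if not keep.any():
        return y.mean(), beta
    Xk = X[:, keep] - X[:, keep].mean(axis=0)
    yc = y - y.mean()
    # Assumed penalty: larger for SNPs with weaker marginal association
    penalty = lam * (1.0 - w[keep]) / w[keep]
    A = Xk.T @ Xk + np.diag(penalty)
    beta[keep] = np.linalg.solve(A, Xk.T @ yc)
    intercept = y.mean() - X.mean(axis=0) @ beta
    return intercept, beta


# Toy data: 200 individuals, 50 SNPs coded 0/1/2, three causal SNPs
rng = np.random.default_rng(0)
n, p = 200, 50
X = rng.binomial(2, 0.3, size=(n, p)).astype(float)
beta_true = np.zeros(p)
beta_true[:3] = 0.8
y = X @ beta_true + rng.normal(size=n)

z = marginal_z(X, y)                        # single genome-wide scan
w = smooth_threshold_weights(z, cutoff=2.0)  # continuous screening
intercept, beta = fit_generalized_ridge(X, y, w)
y_hat = intercept + X @ beta
print("SNPs retained:", int((w > 0).sum()))
```

In this sketch the marginal scan runs once, and changing `cutoff` only requires recomputing the weights and the ridge solve, which is the source of the speed advantage the abstract describes relative to the lasso and elastic net.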