Back to Search Start Over

Alternative strategies for selecting subsets of predicting SNPs by LASSO-LARS procedure

Authors :
M Graziano Usai
Sara Casu
Antonello Carta
Source :
BMC Proceedings
Publisher :
Springer Nature

Abstract

Background The least absolute shrinkage and selection operator (LASSO) can be used to predict SNP effects. This operator has the desirable feature of including in the model only a subset of explanatory SNPs, which can be useful both in QTL detection and GWS studies. LASSO solutions can be obtained by the least angle regression (LARS) algorithm. The big issue with this procedure is to define the best constraint (t), i.e. the upper bound of the sum of absolute value of the SNP effects which roughly corresponds to the number of SNPs to be selected. Usai et al. (2009) dealt with this problem by a cross-validation approach and defined t as the average number of selected SNPs overall replications. Nevertheless, in small size populations, such estimator could give underestimated values of t. Here we propose two alternative ways to define t and compared them with the "classical" one. Methods The first (strategy 1), was based on 1,000 cross-validations carried out by randomly splitting the reference population (2,000 individuals with performance) into two halves. The value of t was the number of SNPs which occurred in more than 5% of replications. The second (strategy 2), which did not use cross-validations, was based on the minimization of the Cp-type selection criterion which depends on the number of selected SNPs and the expected residual variance. Results The size of the subset of selected SNPs was 46, 189 and 64 for the classical approach, strategy 1 and 2 respectively. Classical and strategy 2 gave similar results and indicated quite clearly the regions were QTL with additive effects were located. Strategy 1 confirmed such regions and added further positions which gave a less clear scenario. Correlation between GEBVs estimated with the three strategies and TBVs in progenies without phenotypes were 0.9237, 0.9000 and 0.9240 for classical, strategy 1 and 2 respectively. Conclusions This suggests that the Cp-type selection criterion is a valid alternative to the cross-validations to define the best constraint for selecting subsets of predicting SNPs by LASSO-LARS procedure.

Details

Language :
English
ISSN :
17536561
Volume :
6
Issue :
Suppl 2
Database :
OpenAIRE
Journal :
BMC Proceedings
Accession number :
edsair.doi.dedup.....1ded3b5def64f8e3480ce2b247be9edc
Full Text :
https://doi.org/10.1186/1753-6561-6-s2-s9