
Semisupervised inference for explained variance in high dimensional linear regression and its applications.

Authors :
Cai, T. Tony
Guo, Zijian
Source :
Journal of the Royal Statistical Society: Series B (Statistical Methodology); Apr 2020, Vol. 82 Issue 2, p391-419, 29p
Publication Year :
2020

Abstract

Summary: The paper considers statistical inference for the explained variance $\beta^{\top}\Sigma\beta$ under the high dimensional linear model $Y = X\beta + \varepsilon$ in the semisupervised setting, where $\beta$ is the regression vector and $\Sigma$ is the design covariance matrix. A calibrated estimator, which efficiently integrates both labelled and unlabelled data, is proposed. It is shown that the estimator achieves the minimax optimal rate of convergence in the general semisupervised framework. The optimality result characterizes how the unlabelled data contribute to the estimation accuracy. Moreover, the limiting distribution for the proposed estimator is established and the unlabelled data have also proved useful in reducing the length of the confidence interval for the explained variance. The method proposed is extended to semisupervised inference for the unweighted quadratic functional $\|\beta\|_2^2$. The inference results obtained are then applied to a range of high dimensional statistical problems, including signal detection and global testing, prediction accuracy evaluation and confidence ball construction. The numerical improvement of incorporating the unlabelled data is demonstrated through simulation studies and an analysis of estimating heritability for a yeast segregant data set with multiple traits. [ABSTRACT FROM AUTHOR]
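
Illustration (not from the paper): the sketch below only makes the target quantity concrete. It simulates the model Y = Xβ + ε with a known β and Σ, evaluates the explained variance β^TΣβ, and forms a covariance estimate from labelled and unlabelled covariates stacked together, which is one place where unlabelled data can plausibly help. It is not the calibrated estimator proposed in the paper, whose construction the abstract does not describe. The function names, the AR(1)-type covariance, the sparse β, and all sample sizes are assumptions chosen for illustration.

```python
import numpy as np


def explained_variance(beta, sigma):
    """Return the quadratic form beta^T Sigma beta."""
    beta = np.asarray(beta, dtype=float)
    sigma = np.asarray(sigma, dtype=float)
    return float(beta @ sigma @ beta)


def pooled_covariance(x_labelled, x_unlabelled):
    """Sample covariance of the covariates, using labelled and unlabelled rows together."""
    x_all = np.vstack([x_labelled, x_unlabelled])
    return np.cov(x_all, rowvar=False)


def simulate(n_labelled=200, n_unlabelled=2000, p=50, noise_sd=1.0, seed=0):
    """Draw data from Y = X beta + eps (all settings here are hypothetical)."""
    rng = np.random.default_rng(seed)
    idx = np.arange(p)
    # Assumed design covariance: Sigma[i, j] = 0.5 ** |i - j|
    sigma = 0.5 ** np.abs(idx[:, None] - idx[None, :])
    # Assumed sparse regression vector: five nonzero coefficients
    beta = np.zeros(p)
    beta[:5] = 0.5
    mean = np.zeros(p)
    x = rng.multivariate_normal(mean, sigma, size=n_labelled)
    x_extra = rng.multivariate_normal(mean, sigma, size=n_unlabelled)
    y = x @ beta + noise_sd * rng.standard_normal(n_labelled)
    return x, y, x_extra, beta, sigma


if __name__ == "__main__":
    x, y, x_extra, beta, sigma = simulate()
    truth = explained_variance(beta, sigma)
    # Oracle plug-in: true beta with a covariance estimated from all covariates.
    # This isolates the effect of estimating Sigma; beta is unknown in practice.
    oracle = explained_variance(beta, pooled_covariance(x, x_extra))
    print("true explained variance:", truth)
    print("oracle plug-in with pooled covariance:", oracle)
    # Under the model, Var(Y) = beta^T Sigma beta + noise variance.
    print("sample variance of Y:", float(np.var(y, ddof=1)))
```

Because the plug-in above uses the true β, it shows only how the covariance part of the quantity responds to extra unlabelled covariates; estimating β in high dimensions is the hard part that the paper addresses.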

Details

Language :
English
ISSN :
1369-7412
Volume :
82
Issue :
2
Database :
Complementary Index
Journal :
Journal of the Royal Statistical Society: Series B (Statistical Methodology)
Publication Type :
Academic Journal
Accession number :
142312654
Full Text :
https://doi.org/10.1111/rssb.12357