Semisupervised inference for explained variance in high dimensional linear regression and its applications.
- Source :
- Journal of the Royal Statistical Society: Series B (Statistical Methodology); Apr 2020, Vol. 82 Issue 2, p391-419, 29p
- Publication Year :
- 2020
Abstract
- Summary: The paper considers statistical inference for the explained variance $\beta^{\mathsf{T}}\Sigma\beta$ under the high dimensional linear model $Y = X\beta + \varepsilon$ in the semisupervised setting, where $\beta$ is the regression vector and $\Sigma$ is the design covariance matrix. A calibrated estimator, which efficiently integrates both labelled and unlabelled data, is proposed. It is shown that the estimator achieves the minimax optimal rate of convergence in the general semisupervised framework. The optimality result characterizes how the unlabelled data contribute to the estimation accuracy. Moreover, the limiting distribution for the proposed estimator is established and the unlabelled data have also proved useful in reducing the length of the confidence interval for the explained variance. The method proposed is extended to semisupervised inference for the unweighted quadratic functional $\|\beta\|_2^2$. The inference results obtained are then applied to a range of high dimensional statistical problems, including signal detection and global testing, prediction accuracy evaluation and confidence ball construction. The numerical improvement of incorporating the unlabelled data is demonstrated through simulation studies and an analysis of estimating heritability for a yeast segregant data set with multiple traits. [ABSTRACT FROM AUTHOR]
- Subjects :
- SIGNAL detection
MATHEMATICAL statistics
VARIANCES
FORECASTING
STATISTICAL accuracy
Details
- Language :
- English
- ISSN :
- 1369-7412
- Volume :
- 82
- Issue :
- 2
- Database :
- Complementary Index
- Journal :
- Journal of the Royal Statistical Society: Series B (Statistical Methodology)
- Publication Type :
- Academic Journal
- Accession number :
- 142312654
- Full Text :
- https://doi.org/10.1111/rssb.12357