Estimation and Inference with Proxy Data and its Genetic Applications

Authors :
Li, Sai
Cai, T. Tony
Li, Hongzhe
Publication Year :
2022

Abstract

Existing high-dimensional statistical methods are largely established for analyzing individual-level data. In this work, we study estimation and inference for high-dimensional linear models where we only observe "proxy data", which include the marginal statistics and sample covariance matrix that are computed based on different sets of individuals. We develop a rate-optimal method for estimation and inference for the regression coefficient vector and its linear functionals based on the proxy data. Moreover, we show the intrinsic limitations in proxy-data-based inference: the minimax optimal rate for estimation is slower than that in the conventional case where individual-level data are observed; the power for testing and multiple testing does not go to one as the signal strength goes to infinity. These findings are illustrated through simulation studies and an analysis of a dataset concerning the genetic associations of hindlimb muscle weight in a mouse population.

Details

Database :
arXiv
Publication Type :
Report
Accession number :
edsarx.2201.03727
Document Type :
Working Paper