1. Leveraging a surrogate outcome to improve inference on a partially missing target outcome.
- Author
- McCaw, Zachary R.; Gaynor, Sheila M.; Sun, Ryan; Lin, Xihong
- Subjects
- *MISSING data (Statistics), *LOCUS (Genetics), *FALSE positive error, *CONDITIONAL expectations, *REGRESSION analysis, *EXPECTATION-maximization algorithms
- Abstract
- Sample sizes vary substantially across tissues in the Genotype‐Tissue Expression (GTEx) project, where considerably fewer samples are available from certain inaccessible tissues, such as the substantia nigra (SSN), than from accessible tissues, such as blood. This severely limits power for identifying tissue‐specific expression quantitative trait loci (eQTL) in undersampled tissues. Here we propose Surrogate Phenotype Regression Analysis (Spray) for leveraging information from a correlated surrogate outcome (eg, expression in blood) to improve inference on a partially missing target outcome (eg, expression in SSN). Rather than regarding the surrogate outcome as a proxy for the target outcome, Spray jointly models the target and surrogate outcomes within a bivariate regression framework. Unobserved values of either outcome are treated as missing data. We describe and implement an expectation conditional maximization algorithm for performing estimation in the presence of bilateral outcome missingness. Spray estimates the same association parameter estimated by standard eQTL mapping and controls the type I error even when the target and surrogate outcomes are truly uncorrelated. We demonstrate analytically and empirically, using simulations and GTEx data, that in comparison with marginally modeling the target outcome, jointly modeling the target and surrogate outcomes increases estimation precision and improves power. [ABSTRACT FROM AUTHOR]
- Published
- 2023