Contrastive linear regression

Authors :
Zhang, Boyang
Nyquist, Sarah
Jones, Andrew
Engelhardt, Barbara E.
Li, Didong
Publication Year :
2024

Abstract

Contrastive dimension reduction methods have been developed for case-control study data to identify variation that is enriched in the foreground (case) data X relative to the background (control) data Y. Here, we develop contrastive regression for the setting when there is a response variable r associated with each foreground observation. This situation occurs frequently when, for example, the unaffected controls do not have a disease grade or intervention dosage but the affected cases do, as in autism severity, solid tumor stages, polyp sizes, or warfarin dosages. Our contrastive regression model captures shared low-dimensional variation between the predictors in the case and control groups, and then explains the case-specific response variables through the variance that remains in the predictors after shared variation is removed. We show that, in one single-nucleus RNA sequencing dataset on autism severity in postmortem brain samples from donors with and without autism and in another single-cell RNA sequencing dataset on cellular differentiation in chronic rhinosinusitis with and without nasal polyps, our contrastive linear regression performs feature ranking and identifies biologically informative predictors associated with the response that cannot be identified using other approaches.
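
The two-step idea in the abstract (remove variation shared with the controls, then regress the case-only response on what remains) can be illustrated with a rough sketch. This is not the authors' model, which the record does not specify in detail; it swaps in a plain PCA projection followed by least squares. The function name, the choice of k, and the synthetic data are all assumptions made for illustration.

```python
import numpy as np

def contrastive_regression_sketch(X, Y, r, k):
    """Illustrative stand-in, NOT the paper's estimator.

    X: (n, p) foreground (case) predictors
    Y: (m, p) background (control) predictors
    r: (n,)   response observed for cases only
    k: number of shared directions to remove (an assumed tuning choice)
    """
    Xc = X - X.mean(axis=0)
    Yc = Y - Y.mean(axis=0)
    rc = r - r.mean()

    # Treat the top-k principal directions of the controls as "shared" variation.
    _, _, Vt = np.linalg.svd(Yc, full_matrices=True)
    V_shared = Vt[:k].T   # (p, k) directions to remove
    Q = Vt[k:].T          # (p, p - k) orthonormal basis of the remaining directions

    # Regress the response on the case predictors restricted to the remaining directions.
    gamma, *_ = np.linalg.lstsq(Xc @ Q, rc, rcond=None)
    beta = Q @ gamma      # coefficients expressed in the original feature space
    return beta, V_shared

# Synthetic data (assumed): two shared directions S, one case-specific direction w.
rng = np.random.default_rng(0)
n, m, p, k = 200, 200, 6, 2
basis, _ = np.linalg.qr(rng.normal(size=(p, 3)))
S, w = basis[:, :2], basis[:, 2]
Y = 5 * rng.normal(size=(m, 2)) @ S.T + 0.1 * rng.normal(size=(m, p))
t = rng.normal(size=n)
X = 5 * rng.normal(size=(n, 2)) @ S.T + np.outer(t, w) + 0.1 * rng.normal(size=(n, p))
r = 2.0 * t

beta, V_shared = contrastive_regression_sketch(X, Y, r, k)
ranking = np.argsort(-np.abs(beta))  # features ordered by absolute coefficient
```

In this sketch the fitted coefficients are orthogonal to the removed directions by construction, so a feature ranking based on `beta` reflects only variation left after the control-derived directions are projected out. Taking a full SVD yields a p-by-p matrix, so this form is only practical for a modest number of features.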

Subjects

Subjects :
Statistics - Methodology

Details

Database :
arXiv
Publication Type :
Report
Accession number :
edsarx.2401.03106
Document Type :
Working Paper