
Prediction-based structured variable selection through the receiver operating characteristic curves.

Authors :
Wang Y
Chen H
Li R
Duan N
Lewis-Fernández R
Source :
Biometrics [Biometrics] 2011 Sep; Vol. 67 (3), pp. 896-905. Date of Electronic Publication: 2010 Dec 22.
Publication Year :
2011

Abstract

In many clinical settings, a commonly encountered problem is to assess accuracy of a screening test for early detection of a disease. In these applications, predictive performance of the test is of interest. Variable selection may be useful in designing a medical test. An example is a research study conducted to design a new screening test by selecting variables from an existing screener with a hierarchical structure among variables: there are several root questions followed by their stem questions. The stem questions will only be asked after a subject has answered the root question. It is therefore unreasonable to select a model that only contains stem variables but not their root variable. In this work, we propose methods to perform variable selection with structured variables when predictive accuracy of a diagnostic test is the main concern of the analysis. We take a linear combination of individual variables to form a combined test. We then maximize a direct summary measure of the predictive performance of the test, the area under a receiver operating characteristic curve (AUC of an ROC), subject to a penalty function to control for overfitting. Since maximizing empirical AUC of the ROC of a combined test is a complicated nonconvex problem (Pepe, Cai, and Longton, 2006, Biometrics 62, 221-229), we explore the connection between the empirical AUC and a support vector machine (SVM). We cast the problem of maximizing predictive performance of a combined test as a penalized SVM problem and apply a reparametrization to impose the hierarchical structure among variables. We also describe a penalized logistic regression variable selection procedure for structured variables and compare it with the ROC-based approaches. We use simulation studies based on real data to examine performance of the proposed methods.
Finally, we apply the developed methods to design a structured screener to be used in primary care clinics to refer potentially psychotic patients for further specialty diagnostics and treatment. (© 2011, The International Biometric Society.)
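
The link between the empirical AUC and an SVM-type objective mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the toy data, the plain L1 penalty, and the tuning value `lam` are all assumptions made for illustration, and the hierarchical reparametrization described in the abstract is not reproduced here. The sketch computes the empirical AUC of a linear combined test as the fraction of correctly ordered (diseased, non-diseased) pairs, and then a convex surrogate that replaces the pairwise 0-1 indicator with a hinge loss on pairwise score differences.

```python
import numpy as np


def empirical_auc(scores_pos, scores_neg):
    """Fraction of (diseased, non-diseased) pairs ranked correctly; ties count 1/2."""
    # diff[i, j] = score of diseased subject i minus score of non-diseased subject j
    diff = scores_pos[:, None] - scores_neg[None, :]
    return float(np.mean((diff > 0) + 0.5 * (diff == 0)))


def pairwise_hinge_objective(beta, x_pos, x_neg, lam):
    """Mean hinge loss over pairwise score differences plus an L1 penalty.

    The hinge term is a convex stand-in for the nonconvex 1 - AUC;
    the L1 penalty (an assumption here) discourages overfitting.
    """
    margins = (x_pos @ beta)[:, None] - (x_neg @ beta)[None, :]
    hinge = np.maximum(0.0, 1.0 - margins)
    return float(hinge.mean() + lam * np.abs(beta).sum())


# Hypothetical toy data: two variables, two diseased and two non-diseased subjects.
x_pos = np.array([[2.0, 0.0], [3.0, 1.0]])
x_neg = np.array([[1.0, 0.0], [0.0, 1.0]])
beta = np.array([1.0, 0.0])  # combined test uses only the first variable

auc = empirical_auc(x_pos @ beta, x_neg @ beta)
objective = pairwise_hinge_objective(beta, x_pos, x_neg, lam=0.1)
```

In this toy case every diseased subject scores at least one unit above every non-diseased subject, so the hinge term vanishes and only the penalty contributes to the objective. Minimizing such an objective over `beta` with a convex solver is one way to approximate AUC maximization; imposing the root-before-stem structure would require the reparametrization the paper describes.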

Details

Language :
English
ISSN :
1541-0420
Volume :
67
Issue :
3
Database :
MEDLINE
Journal :
Biometrics
Publication Type :
Academic Journal
Accession number :
21175555
Full Text :
https://doi.org/10.1111/j.1541-0420.2010.01533.x