Use and misuse of random forest variable importance metrics in medicine: demonstrations through incident stroke prediction.

Authors :
Wallace ML
Mentch L
Wheeler BJ
Tapia AL
Richards M
Zhou S
Yi L
Redline S
Buysse DJ
Source :
BMC medical research methodology [BMC Med Res Methodol] 2023 Jun 19; Vol. 23 (1), pp. 144. Date of Electronic Publication: 2023 Jun 19.
Publication Year :
2023

Abstract

Background: Machine learning tools such as random forests provide important opportunities for modeling large, complex modern data generated in medicine. Unfortunately, when it comes to understanding why machine learning models are predictive, applied research continues to rely on 'out of bag' (OOB) variable importance metrics (VIMPs) that are known to have considerable shortcomings within the statistics community. After explaining the limitations of OOB VIMPs - including bias towards correlated features and limited interpretability - we describe a modern approach called 'knockoff VIMPs' and explain its advantages.

Methods: We first evaluate current VIMP practices through an in-depth literature review of 50 recent random forest manuscripts. Next, we recommend organized and interpretable strategies for analysis with knockoff VIMPs, including computing them for groups of features and considering multiple model performance metrics. To demonstrate methods, we develop a random forest to predict 5-year incident stroke in the Sleep Heart Health Study and compare results based on OOB and knockoff VIMPs.

Results: Nearly all papers in the literature review contained substantial limitations in their use of VIMPs. In our demonstration, using OOB VIMPs for individual variables suggested two highly correlated lung function variables (forced expiratory volume, forced vital capacity) as the best predictors of incident stroke, followed by age and height. Using an organized analytic approach that considered knockoff VIMPs of both groups of features and individual features, the largest contributions to model sensitivity were medications (especially cardiovascular) and measured medical risk factors, while the largest contributions to model specificity were age, diastolic blood pressure, self-reported medical risk factors, polysomnography features, and pack-years of smoking. Thus, we reach very different conclusions about stroke risk factors using OOB VIMPs versus knockoff VIMPs.

Conclusions: The near-ubiquitous reliance on OOB VIMPs may provide misleading results for researchers who use such methods to guide their research. Given the rapid pace of scientific inquiry using machine learning, it is essential to bring modern knockoff VIMPs that are interpretable and unbiased into widespread applied practice to steer researchers using random forest machine learning toward more meaningful results.

(© 2023. The Author(s).)
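The idea of reporting importance per feature group and per performance metric can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' procedure: the function names (`sensitivity_specificity`, `group_metric_drops`) are hypothetical, the labels and predictions are made-up toy values rather than Sleep Heart Health Study results, and the "reduced" predictions are assumed to come from a model refit after a feature group was replaced by uninformative stand-ins (the abstract does not describe how knockoffs are generated, so that step is omitted here).

```python
def sensitivity_specificity(y_true, y_pred):
    """Return (sensitivity, specificity) for binary labels coded 0/1."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one positive and one negative case")
    return tp / (tp + fn), tn / (tn + fp)


def group_metric_drops(y_true, pred_full, pred_without):
    """Drop in each metric when a feature group is made uninformative.

    pred_without maps a group name to predictions from a model in which
    that group was replaced by stand-ins (assumed to be computed elsewhere).
    """
    base_sens, base_spec = sensitivity_specificity(y_true, pred_full)
    drops = {}
    for group, preds in pred_without.items():
        sens, spec = sensitivity_specificity(y_true, preds)
        drops[group] = {
            "sensitivity_drop": base_sens - sens,
            "specificity_drop": base_spec - spec,
        }
    return drops


# Toy, made-up values purely for illustration (not study data).
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
pred_full = [1, 1, 1, 0, 0, 0, 0, 1]
pred_without = {
    "medications": [1, 0, 0, 0, 0, 0, 0, 1],  # hypothetical group
    "age": [1, 1, 1, 0, 1, 1, 0, 1],          # hypothetical single feature
}

drops = group_metric_drops(y_true, pred_full, pred_without)
for group, d in drops.items():
    print(group, d)
```

In this toy setup one group matters only for sensitivity and the other only for specificity, which is the kind of distinction a single aggregate importance score would hide.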

Details

Language :
English
ISSN :
1471-2288
Volume :
23
Issue :
1
Database :
MEDLINE
Journal :
BMC medical research methodology
Publication Type :
Academic Journal
Accession number :
37337173
Full Text :
https://doi.org/10.1186/s12874-023-01965-x