Maliazurina B Saad, PhD, Lingzhi Hong, MD, Muhammad Aminu, MD, Natalie I Vokes, MD, Pingjun Chen, PhD, Morteza Salehjahromi, PhD, Kang Qin, MD, Sheeba J Sujit, PhD, Xuetao Lu, PhD, Elliana Young, MS, Qasem Al-Tashi, PhD, Rizwan Qureshi, PhD, Prof Carol C Wu, MD, Prof Brett W Carter, MD, Prof Steven H Lin, MD, Prof Percy P Lee, MD, Saumil Gandhi, MD, Prof Joe Y Chang, MD, Ruijiang Li, PhD, Michael F Gensheimer, MD, Prof Heather A Wakelee, MD, Joel W Neal, MD, Hyun-Sung Lee, MD, Chao Cheng, PhD, Prof Vamsidhar Velcheti, MD, Yanyan Lou, MD, Milena Petranovic, MD, Waree Rinsurongkawong, PhD, Xiuning Le, MD, Vadeerat Rinsurongkawong, PhD, Amy Spelman, PhD, Yasir Y Elamin, MD, Marcelo V Negrao, MD, Ferdinandos Skoulidis, MD, Carl M Gay, MD, Tina Cascone, MD, Mara B Antonoff, MD, Boris Sepesi, MD, Jeff Lewis, BS, Prof Ignacio I Wistuba, MD, Prof John D Hazle, PhD, Caroline Chung, MD, Prof David Jaffray, PhD, Prof Don L Gibbons, MD, Prof Ara Vaporciyan, MD, Prof J Jack Lee, PhD, Prof John V Heymach, MD, Jianjun Zhang, MD, and Jia Wu, PhD
Summary

Background: Only around 20–30% of patients with non-small-cell lung cancer (NSCLC) have durable benefit from immune checkpoint inhibitors. Although tissue-based biomarkers (eg, PD-L1) are limited by suboptimal performance, tissue availability, and tumour heterogeneity, radiographic images might holistically capture the underlying cancer biology. We aimed to investigate the application of deep learning on chest CT scans to derive an imaging signature of response to immune checkpoint inhibitors and evaluate its added value in the clinical context.

Methods: In this retrospective modelling study, 976 patients with metastatic, EGFR/ALK-negative NSCLC treated with immune checkpoint inhibitors at MD Anderson and Stanford were enrolled from Jan 1, 2014, to Feb 29, 2020. We built and tested an ensemble deep learning model on pretreatment CTs (Deep-CT) to predict overall survival and progression-free survival after treatment with immune checkpoint inhibitors. We also evaluated the added predictive value of the Deep-CT model in the context of existing clinicopathological and radiological metrics.

Findings: Our Deep-CT model demonstrated robust stratification of patient survival in the MD Anderson testing set, which was validated in the external Stanford set. The performance of the Deep-CT model remained significant in subgroup analyses stratified by PD-L1, histology, age, sex, and race. In univariate analysis, Deep-CT outperformed the conventional risk factors, including histology, smoking status, and PD-L1 expression, and remained an independent predictor after multivariate adjustment. Integrating the Deep-CT model with conventional risk factors significantly improved prediction performance, with the overall survival C-index increasing from 0·70 (clinical model) to 0·75 (composite model) during testing. The deep learning risk scores correlated with some radiomics features, but radiomics alone could not reach the performance level of deep learning, indicating that the deep learning model effectively captured additional imaging patterns beyond known radiomics features.

Interpretation: This proof-of-concept study shows that automated profiling of radiographic scans through deep learning can provide orthogonal information independent of existing clinicopathological biomarkers, bringing the goal of precision immunotherapy for patients with NSCLC closer.

Funding: National Institutes of Health, Mark Foundation Damon Runyon Foundation Physician Scientist Award, MD Anderson Strategic Initiative Development Program, MD Anderson Lung Moon Shot Program, Andrea Mugnaini, and Edward L C Smith.
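For readers unfamiliar with the C-index reported in the Findings, the following is a minimal sketch of Harrell's concordance index for right-censored survival data. It is an illustration only and not the authors' evaluation code: the function name `concordance_index`, the simplified handling of tied times (such pairs are skipped), and the toy data are all assumptions. The index is the fraction of comparable patient pairs in which the patient with the higher predicted risk has the shorter observed survival; 0.5 corresponds to random ordering and 1.0 to perfect ordering.

```python
from itertools import combinations


def concordance_index(times, events, risk_scores):
    """Harrell's C-index for right-censored data (illustrative sketch).

    times       -- observed follow-up time per patient
    events      -- 1 if the event (eg, death) was observed, 0 if censored
    risk_scores -- model output; higher means higher predicted risk
    """
    concordant = 0.0
    comparable = 0
    for i, j in combinations(range(len(times)), 2):
        # Order the pair so that `a` is the patient with the shorter time.
        a, b = (i, j) if times[i] < times[j] else (j, i)
        # A pair is comparable only if the shorter time is an observed event.
        # Simplification (assumption): pairs with tied times are skipped.
        if times[a] == times[b] or not events[a]:
            continue
        comparable += 1
        if risk_scores[a] > risk_scores[b]:
            concordant += 1.0   # higher risk, shorter survival: concordant
        elif risk_scores[a] == risk_scores[b]:
            concordant += 0.5   # tied risk scores count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Hypothetical toy cohort (not study data): months of follow-up,
# event indicator, and a made-up risk score per patient.
toy_times = [5, 10, 12, 20, 30]
toy_events = [1, 1, 0, 1, 0]
toy_risk = [0.9, 0.4, 0.7, 0.5, 0.1]
toy_c_index = concordance_index(toy_times, toy_events, toy_risk)
```

Under this definition, comparing a clinical model with a composite model amounts to computing the index twice on the same test cohort, once per set of risk scores. Established libraries provide tested implementations with fuller tie handling and would be preferable in practice.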