1. Pulmonary emphysema subtypes defined by unsupervised machine learning on CT scans
- Author
Angelini, Elsa D, Yang, Jie, Balte, Pallavi P, Hoffman, Eric A, Manichaikul, Ani W, Sun, Yifei, Shen, Wei, Austin, John H M, Allen, Norrina B, Bleecker, Eugene R, Bowler, Russell, Cho, Michael H, Cooper, Christopher S, Couper, David, Dransfield, Mark T, Garcia, Christine Kim, Han, MeiLan K, Hansel, Nadia N, Hughes, Emlyn, Jacobs, David R, Kasela, Silva, Kaufman, Joel Daniel, Kim, John Shinn, Lappalainen, Tuuli, Lima, Joao, Malinsky, Daniel, Martinez, Fernando J, Oelsner, Elizabeth C, Ortega, Victor E, Paine, Robert, Post, Wendy, Pottinger, Tess D, Prince, Martin R, Rich, Stephen S, Silverman, Edwin K, Smith, Benjamin M, Swift, Andrew J, Watson, Karol E, Woodruff, Prescott G, Laine, Andrew F, and Barr, R Graham
- Abstract
Background: Treatment and preventative advances for chronic obstructive pulmonary disease (COPD) have been slow due, in part, to limited subphenotypes. We tested whether unsupervised machine learning on CT images would discover CT emphysema subtypes with distinct characteristics, prognoses and genetic associations.
Methods: New CT emphysema subtypes were identified by unsupervised machine learning on only the texture and location of emphysematous regions on CT scans from 2853 participants in the Subpopulations and Intermediate Outcome Measures in COPD Study (SPIROMICS), a COPD case–control study, followed by data reduction. Subtypes were compared with symptoms and physiology among 2949 participants in the population-based Multi-Ethnic Study of Atherosclerosis (MESA) Lung Study and with prognosis among 6658 MESA participants. Associations with genome-wide single-nucleotide polymorphisms were examined.
Results: The algorithm discovered six reproducible (interlearner intraclass correlation coefficient, 0.91–1.00) CT emphysema subtypes. The most common subtype in SPIROMICS, the combined bronchitis-apical subtype, was associated with chronic bronchitis, accelerated lung function decline, hospitalisations, deaths, incident airflow limitation and a gene variant near DRD1, which is implicated in mucin hypersecretion (p=1.1×10⁻⁸). The second, the diffuse subtype, was associated with lower weight, respiratory hospitalisations and deaths, and incident airflow limitation. The third was associated with age only. The fourth and fifth visually resembled combined pulmonary fibrosis emphysema and had distinct symptoms, physiology, prognosis and genetic associations. The sixth visually resembled vanishing lung syndrome.
Conclusion: Large-scale unsupervised machine learning on CT scans defined six reproducible, familiar CT emphysema subtypes that suggest paths to specific diagnosis and personalised therapies in COPD and pre-COPD.
- Published
- 2023