Faraz Faghri, Fabian Brunn, Anant Dadu, Elisabetta Zucchi, Ilaria Martinelli, Letizia Mazzini, Rosario Vasta, Antonio Canosa, Cristina Moglia, Andrea Calvo, Michael A Nalls, Roy H Campbell, Jessica Mandrioli, Bryan J Traynor, Adriano Chiò, Umberto Manera, Francesca Palumbo, Alessandro Bombaci, Maurizio Grassano, Maura Brunetti, Federico Casale, Giuseppe Fuda, Paolina Salamone, Barbara Iazzolino, Laura Peotta, Paolo Cugnasco, Giovanni De Marco, Maria Claudia Torrieri, Salvatore Gallone, Marco Barberis, Luca Sbaiz, Salvatore Gentile, Alessandro Mauro, Fabiola De Marchi, Lucia Corrado, Sandra D'Alfonso, Antonio Bertolotto, Daniele Imperiale, Marco De Mattei, Salvatore Amarù, Cristoforo Comi, Carmelo Labate, Fabio Poglio, Luigi Ruiz, Lucia Testa, Eugenia Rota, Paolo Ghiglione, Nicola Launaro, Alessia Di Sapio, Nicola Fini, Giulia Gianferrari, Cecilia Simonini, Stefano Meletti, Rocco Liguori, Veria Vacchiano, Fabrizio Salvi, Ilaria Bartolomei, Roberto Michelucci, Pietro Cortelli, Rita Rinaldi, Anna Maria Borghi, Andrea Zini, Elisabetta Sette, Valeria Tugnoli, Maura Pugliatti, Elena Canali, Luca Codeluppi, Franco Valzania, Lucia Zinno, Giovanni Pavesi, Doriana Medici, Giovanna Pilurzi, Emilio Terlizzi, Donata Guidetti, Silvia De Pasqua, Mario Santangelo, Patrizia De Massis, Martina Bracaglia, Mario Casmiro, Pietro Querzani, Simonetta Morresi, Marco Longoni, Alberto Patuelli, Susanna Malagù, Marco Currò Dossi, Simone Vidale, Salvatore Ferro
Amyotrophic lateral sclerosis (ALS) is known to represent a collection of overlapping syndromes. Various classification systems based on empirical observations have been proposed, but it is unclear to what extent they reflect ALS population substructures. We aimed to use machine-learning techniques to identify the number and nature of ALS subtypes to obtain a better understanding of this heterogeneity, enhance our understanding of the disease, and improve clinical care.

In this retrospective study, we applied unsupervised (Uniform Manifold Approximation and Projection [UMAP]) modelling, semi-supervised (neural network UMAP) modelling, and supervised (ensemble learning based on LightGBM) modelling to a population-based discovery cohort of patients who were diagnosed with ALS while living in the Piedmont and Valle d'Aosta regions of Italy, for whom detailed clinical data, such as age at symptom onset, were available. We excluded patients with missing Revised ALS Functional Rating Scale (ALSFRS-R) feature values from the unsupervised and semi-supervised steps. We replicated our findings in an independent population-based cohort of patients who were diagnosed with ALS while living in the Emilia Romagna region of Italy.

Between Jan 1, 1995, and Dec 31, 2015, 2858 patients were entered in the discovery cohort. After excluding 497 (17%) patients with missing ALSFRS-R feature values, data for 42 clinical features across 2361 (83%) patients were available for the unsupervised and semi-supervised analysis. We found that semi-supervised machine learning produced the optimum clustering of the patients with ALS. These clusters roughly corresponded to the six clinical subtypes defined by the Chiò classification system (ie, bulbar, respiratory, flail arm, classical, pyramidal, and flail leg ALS). Between Jan 1, 2009, and March 1, 2018, 1097 patients were entered in the replication cohort. After excluding 108 (10%) patients with missing ALSFRS-R feature values, data for 42 clinical features across 989 patients were available for the unsupervised and semi-supervised analysis. All 1097 patients were included in the supervised analysis. The same clusters were identified in the replication cohort. By contrast, other ALS classification schemes, such as the El Escorial categories, Milano-Torino clinical staging, and King's clinical stages, did not adequately label the clusters. Supervised learning identified 11 clinical parameters that predicted ALS clinical subtypes with high accuracy (area under the curve 0·982 [95% CI 0·980-0·983]).

Our data-driven study provides insight into the ALS population substructure and confirms that the Chiò classification system successfully identifies ALS subtypes. Additional validation is required to determine the accuracy and clinical use of these algorithms in assigning clinical subtypes. Nevertheless, our algorithms offer a broad insight into the clinical heterogeneity of ALS and help to determine the actual subtypes of disease that exist within this fatal neurodegenerative syndrome. The systematic identification of ALS subtypes will improve clinical care and clinical trial design.

Funding: US National Institute on Aging, US National Institutes of Health, Italian Ministry of Health, European Commission, University of Torino Rita Levi Montalcini Department of Neurosciences, Emilia Romagna Regional Health Authority, and Italian Ministry of Education, University, and Research.

For the Italian and German translations of the abstract, see the Supplementary Materials section.
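
The cohort accounting reported above (patients entered, patients excluded for missing ALSFRS-R feature values, and patients remaining for the unsupervised and semi-supervised steps) can be checked with a few lines of Python. This is only an illustrative sketch, not the authors' code: the `CohortFlow` class, the `split_by_completeness` helper, and the record field names are hypothetical, and only the patient counts are taken from the abstract.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CohortFlow:
    """Patient counts for one cohort (values taken from the abstract)."""
    enrolled: int   # patients entered in the cohort
    excluded: int   # patients with missing ALSFRS-R feature values

    @property
    def analysed(self) -> int:
        # Patients left for the unsupervised and semi-supervised steps.
        return self.enrolled - self.excluded

    def pct(self, count: int) -> int:
        # Percentage of the enrolled cohort, rounded to a whole number.
        return round(100 * count / self.enrolled)


def split_by_completeness(records, required_fields):
    """Hypothetical helper: separate records that have a value for every
    required field from those with at least one missing (None) value."""
    complete, incomplete = [], []
    for record in records:
        has_all = all(record.get(field) is not None for field in required_fields)
        (complete if has_all else incomplete).append(record)
    return complete, incomplete


discovery = CohortFlow(enrolled=2858, excluded=497)
replication = CohortFlow(enrolled=1097, excluded=108)

for name, cohort in (("discovery", discovery), ("replication", replication)):
    print(
        f"{name}: {cohort.analysed} analysed ({cohort.pct(cohort.analysed)}%), "
        f"{cohort.excluded} excluded ({cohort.pct(cohort.excluded)}%)"
    )

# Toy records with made-up field names, only to show the exclusion rule.
toy_records = [
    {"id": 1, "alsfrs_speech": 4, "alsfrs_walking": 3},
    {"id": 2, "alsfrs_speech": None, "alsfrs_walking": 2},
]
kept, dropped = split_by_completeness(toy_records, ["alsfrs_speech", "alsfrs_walking"])
```

The rounded percentages agree with the reported 17% and 83% for the discovery cohort and 10% for the replication cohort. The exclusion applies only to the unsupervised and semi-supervised steps; the abstract states that all 1097 replication patients were included in the supervised analysis.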