Edward De Brouwer, Thijs Becker, Lorin Werthen-Brabants, Pieter Dewulf, Dimitrios Iliadis, Cathérine Dekeyser, Guy Laureys, Bart Van Wijmeersch, Veronica Popescu, Tom Dhaene, Dirk Deschrijver, Willem Waegeman, Bernard De Baets, Michiel Stock, Dana Horakova, Francesco Patti, Guillermo Izquierdo, Sara Eichau, Marc Girard, Alexandre Prat, Alessandra Lugaresi, Pierre Grammond, Tomas Kalincik, Raed Alroughani, Francois Grand'Maison, Olga Skibina, Murat Terzi, Jeannette Lechner-Scott, Oliver Gerlach, Samia J Khoury, Elisabetta Cartechini, Vincent Van Pesch, Maria José Sà, Bianca Weinstock-Guttman, Yolanda Blanco, Radek Ampapa, Daniele Spitaleri, Claudio Solaro, Davide Maimone, Aysun Soysal, Gerardo Iuliano, Riadh Gouider, Tamara Castillo-Triviño, José Luis Sánchez-Menoyo, Anneke van der Walt, Jiwon Oh, Eduardo Aguera-Morales, Ayse Altintas, Abdullah Al-Asmi, Koen de Gans, Yara Fragoso, Tunde Csepany, Suzanne Hodgkinson, Norma Deri, Talal Al-Harbi, Bruce Taylor, Orla Gray, Patrice Lalive, Csilla Rozsa, Chris McGuigan, Allan Kermode, Angel Pérez Sempere, Simu Mihaela, Magdolna Simo, Todd Hardy, Danny Decoo, Stella Hughes, Nikolaos Grigoriadis, Attila Sas, Norbert Vella, Yves Moreau, and Liesbet Peeters
Background
Disability progression is a key milestone in the disease evolution of people with multiple sclerosis (PwMS). Prediction models of the probability of disability progression have not yet reached the level of trust needed to be adopted in the clinic. A common benchmark to assess model development in multiple sclerosis is also currently lacking.

Methods
Data from adult PwMS with a follow-up of at least three years, collected by the MSBase consortium from 146 MS centers spread over 40 countries, were used. After applying basic inclusion criteria for data quality, the cohort comprised 15,240 PwMS. External validation was performed and repeated five times to assess the significance of the results. Transparent Reporting for Individual Prognosis Or Diagnosis (TRIPOD) guidelines were followed. Confirmed disability progression after two years was predicted, with a confirmation window of six months. Only routinely collected variables were used, such as the Expanded Disability Status Scale (EDSS), treatment, relapse information, and MS course. To learn the probability of disability progression, state-of-the-art machine learning models were investigated. The discrimination performance of the models was evaluated with the area under the receiver operating characteristic curve (ROC-AUC) and the area under the precision-recall curve (AUC-PR), and their calibration with the Brier score and the expected calibration error. All our preprocessing and model code is available at https://gitlab.com/edebrouwer/ms_benchmark, making this task an ideal benchmark for predicting disability progression in MS.

Findings
Machine learning models achieved a ROC-AUC of 0.71 ± 0.01, an AUC-PR of 0.26 ± 0.02, a Brier score of 0.1 ± 0.01 and an expected calibration error of 0.07 ± 0.04.
The history of disability progression was identified as being more predictive of future disability progression than treatment or relapse history.

Conclusions
Good discrimination and calibration performance on an external validation set was achieved using only routinely collected variables. This suggests that machine learning models can reliably inform clinicians about the future occurrence of progression and are mature enough for a clinical impact study.
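
As an illustration of the four evaluation metrics named in the Methods, the following is a minimal NumPy sketch, not the authors' implementation (their code is in the repository linked above). The function names, the toy labels and probabilities, the use of average precision as the AUC-PR estimator, and the choice of 10 equal-width bins for the expected calibration error are all assumptions made for this example; the abstract does not specify them.

```python
import numpy as np


def roc_auc(y_true, y_prob):
    """ROC-AUC via the rank-sum (Mann-Whitney U) identity, using average ranks for ties."""
    y_true = np.asarray(y_true, dtype=bool)
    y_prob = np.asarray(y_prob, dtype=float)
    order = np.argsort(y_prob, kind="mergesort")
    sorted_prob = y_prob[order]
    ranks = np.empty(len(y_prob), dtype=float)
    i = 0
    while i < len(sorted_prob):
        j = i
        # Extend j over the run of tied scores starting at i.
        while j + 1 < len(sorted_prob) and sorted_prob[j + 1] == sorted_prob[i]:
            j += 1
        ranks[order[i:j + 1]] = (i + j) / 2.0 + 1.0  # average 1-based rank
        i = j + 1
    n_pos = int(y_true.sum())
    n_neg = len(y_true) - n_pos
    u_stat = ranks[y_true].sum() - n_pos * (n_pos + 1) / 2.0
    return float(u_stat / (n_pos * n_neg))


def average_precision(y_true, y_prob):
    """Average precision, one common estimator of AUC-PR (ties are ordered arbitrarily)."""
    y_true = np.asarray(y_true, dtype=float)
    order = np.argsort(-np.asarray(y_prob, dtype=float), kind="mergesort")
    hits = y_true[order]
    precision_at_k = np.cumsum(hits) / np.arange(1, len(hits) + 1)
    return float((precision_at_k * hits).sum() / hits.sum())


def brier_score(y_true, y_prob):
    """Mean squared difference between predicted probability and binary outcome."""
    y_true = np.asarray(y_true, dtype=float)
    y_prob = np.asarray(y_prob, dtype=float)
    return float(np.mean((y_prob - y_true) ** 2))


def expected_calibration_error(y_true, y_prob, n_bins=10):
    """ECE with equal-width bins: bin-size-weighted mean of |observed rate - mean prediction|."""
    y_true = np.asarray(y_true, dtype=float)
    y_prob = np.asarray(y_prob, dtype=float)
    # Map each probability to a bin index; p == 1.0 falls into the last bin.
    bin_ids = np.minimum((y_prob * n_bins).astype(int), n_bins - 1)
    ece = 0.0
    for b in range(n_bins):
        mask = bin_ids == b
        if mask.any():
            gap = abs(y_true[mask].mean() - y_prob[mask].mean())
            ece += mask.mean() * gap
    return float(ece)


# Toy example (made-up values, for illustration only).
y = [0, 0, 1, 1]
p = [0.1, 0.4, 0.35, 0.8]
print(roc_auc(y, p), average_precision(y, p), brier_score(y, p),
      expected_calibration_error(y, p))
```

ROC-AUC and AUC-PR depend only on how the predictions rank patients, whereas the Brier score and the expected calibration error depend on the predicted probabilities themselves, which is why both kinds of metric are reported.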