1. Identifying mortality factors from Machine Learning using Shapley values - a case of COVID19
- Author
- Francisco Alvarez and Matthew Smith
- Subjects
0209 industrial biotechnology; 2019-20 coronavirus outbreak; Coronavirus disease 2019 (COVID-19); Computer science; COVID19; Severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2); Short Communication; 02 engineering and technology; Machine learning; 020901 industrial engineering & automation; Artificial intelligence; 0202 electrical engineering, electronic engineering, information engineering; Shapley Values; General Engineering; Computer Science Applications; Coronavirus; Variable (computer science); Mortality factors; 020201 artificial intelligence & image processing; Marginal impact
- Abstract
In this paper we apply a series of Machine Learning models to a recently published, unique dataset on the mortality of COVID19 patients. The dataset consists of blood samples from 375 patients admitted to a hospital in the region of Wuhan, China, of whom 201 survived hospitalisation and 174 died whilst in hospital. The focus of the paper is not only on which Machine Learning model obtains the highest accuracy but, more importantly, on the interpretation of what the Machine Learning models provide. We find that age, days in hospital, lymphocytes and neutrophils are important and robust predictors of a patient's mortality. Furthermore, the algorithms we use allow us to observe the marginal impact of each variable at a case-by-case patient level, which might help practitioners to easily detect anomalous patterns. This paper analyses the global and local interpretation of the Machine Learning models on patients with COVID19.
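To illustrate what a per-patient "marginal impact" means, the following is a minimal sketch of exact Shapley values for a single case. It is not the authors' pipeline: the function `toy_risk`, its coefficients, and the patient and baseline values are all hypothetical and chosen only to make the arithmetic easy to follow. A real analysis would apply the same idea to a trained model.

```python
from itertools import combinations
from math import factorial


def shapley_values(predict, x, baseline):
    """Exact Shapley values for one instance `x` relative to `baseline`.

    Features outside a coalition are held at their baseline value.
    The cost grows as 2**n, so this is only practical for a few features.
    """
    n = len(x)
    features = list(range(n))

    def value(subset):
        # Coalition members take the instance's values; the rest stay at baseline.
        z = [x[i] if i in subset else baseline[i] for i in features]
        return predict(z)

    phi = [0.0] * n
    for i in features:
        others = [j for j in features if j != i]
        for k in range(n):
            # Standard Shapley weight for coalitions of size k.
            weight = factorial(k) * factorial(n - k - 1) / factorial(n)
            for coalition in combinations(others, k):
                s = set(coalition)
                phi[i] += weight * (value(s | {i}) - value(s))
    return phi


def toy_risk(z):
    """Hypothetical risk score (an assumption, not taken from the paper)."""
    age, lymphocyte, neutrophils = z
    return (0.02 * age - 0.5 * lymphocyte + 0.1 * neutrophils
            + 0.01 * age * neutrophils)


# Hypothetical values for illustration only.
patient = [70.0, 0.8, 9.0]
baseline = [50.0, 1.5, 4.0]

phi = shapley_values(toy_risk, patient, baseline)
for name, contribution in zip(["age", "lymphocyte", "neutrophils"], phi):
    print(f"{name}: {contribution:+.3f}")
```

The contributions sum to the difference between the patient's score and the baseline score. This additivity property is what allows a single prediction to be decomposed variable by variable.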
- Published
- 2021