1. CONTAMINATION-SOURCE BASED K-SAMPLE CLUSTERING
- Author
Milhaud, Xavier; Pommeret, Denys; Salhi, Yahia; Vandekerkhove, Pierre; Aix Marseille Université (AMU), Institut de Mathématiques de Marseille (I2M), Aix Marseille Université (AMU)-École Centrale de Marseille (ECM)-Centre National de la Recherche Scientifique (CNRS), Université Claude Bernard Lyon 1 (UCBL), Université de Lyon, Laboratoire de Sciences Actuarielles et Financières [Lyon] (LSAF), Institut de Science Financière et d'Assurances (ISFA), Université Gustave Eiffel, and Chaire DIALog
- Subjects
Contamination, [MATH.MATH-ST] Mathematics [math]/Statistics [math.ST], Hypothesis Testing, Semiparametric Mixture, Clustering
- Abstract
We investigate the K-sample clustering of populations arising from a contamination phenomenon. A contamination model is a two-component mixture model in which one component is known (standard behaviour) while the other, modelling a departure from the standard behaviour, is unknown. When K populations from such a model are observed, we propose a semiparametric clustering methodology to detect, for the purposes of coordinated diagnosis and/or sharing of best practices, which populations are affected by the same type of contamination. We prove the consistency of our approach when true clusters exist and assess the performance of our methodology through an extensive Monte Carlo study. Finally, we apply our methodology, implemented in the admix R package, to a dataset of COVID-19 excess mortality in European countries, with the aim of clustering countries similarly affected by the pandemic across age classes.
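As a sketch of the contamination model described in the abstract (the symbols below are illustrative assumptions, not notation taken from this record), the density of the k-th population can be written as
\[
  l_k(x) = p_k \, f_k(x) + (1 - p_k) \, g_k(x), \qquad k = 1, \dots, K,
\]
where g_k is the known component (standard behaviour), f_k is the unknown contaminating component, and p_k \in (0, 1) is the unknown mixing weight. Under this notation, two populations i and j would fall in the same cluster when f_i = f_j.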
- Published
- 2023