Jean-Christophe Komorowski, Jean-Bernard de Chabalier, Alexis Falcin, Roberto Moretti, G. Ucciani, Arnaud Lemarchand, Jean-Philippe Métaxian, Jerome Mars, Céline Dessert, François Beauducel, Jean-Marie Saurel, Eléonore Stutzmann, Marielle Malfante, Arnaud Burtin, Institut de Physique du Globe de Paris (IPGP (UMR_7154)), Institut national des sciences de l'Univers (INSU - CNRS)-Université de La Réunion (UR)-Institut de Physique du Globe de Paris (IPG Paris)-Centre National de la Recherche Scientifique (CNRS)-Université Paris Cité (UPCité), Institut des Sciences de la Terre (ISTerre), Institut national des sciences de l'Univers (INSU - CNRS)-Institut de recherche pour le développement [IRD] : UR219-Université Savoie Mont Blanc (USMB [Université de Savoie] [Université de Chambéry])-Centre National de la Recherche Scientifique (CNRS)-Université Gustave Eiffel-Université Grenoble Alpes (UGA), GIPSA - Signal Images Physique (GIPSA-SIGMAPHY), GIPSA Pôle Sciences des Données (GIPSA-PSD), Grenoble Images Parole Signal Automatique (GIPSA-lab), Centre National de la Recherche Scientifique (CNRS)-Université Grenoble Alpes (UGA)-Institut polytechnique de Grenoble - Grenoble Institute of Technology (Grenoble INP ), Université Grenoble Alpes (UGA)-Centre National de la Recherche Scientifique (CNRS)-Université Grenoble Alpes (UGA)-Institut polytechnique de Grenoble - Grenoble Institute of Technology (Grenoble INP ), Université Grenoble Alpes (UGA)-Grenoble Images Parole Signal Automatique (GIPSA-lab), Université Grenoble Alpes (UGA), Observatoire Volcanologique et Sismologique de Guadeloupe (OVSG), Institut de Physique du Globe de Paris (IPG Paris), Département d'Architectures, Conception et Logiciels Embarqués-LIST (DACLE-LIST), Laboratoire d'Intégration des Systèmes et des Technologies (LIST (CEA)), Direction de Recherche Technologique (CEA) (DRT (CEA)), Commissariat à l'énergie atomique et aux énergies alternatives (CEA)-Commissariat à l'énergie atomique et aux énergies 
alternatives (CEA)-Direction de Recherche Technologique (CEA) (DRT (CEA)), Commissariat à l'énergie atomique et aux énergies alternatives (CEA)-Commissariat à l'énergie atomique et aux énergies alternatives (CEA), Institut de Recherche pour le Développement (IRD), Institut de Physique du Globe de Paris (IPGP), Institut national des sciences de l'Univers (INSU - CNRS)-IPG PARIS-Université de La Réunion (UR)-Centre National de la Recherche Scientifique (CNRS)-Université de Paris (UP), Institut de Physique du Globe de Paris, Laboratoire d'Intégration des Systèmes et des Technologies (LIST), Falcin, A., Metaxian, J. -P., Mars, J., Stutzmann, E., Komorowski, J. -C., Moretti, R., Malfante, M., Beauducel, F., Saurel, J. -M., Dessert, C., Burtin, A., Ucciani, G., de Chabalier, J. -B., and Lemarchand, A.
The classification of seismo-volcanic signals is performed manually at La Soufrière Volcano, which is time-consuming and can be biased by the subjectivity of the operator. We propose here a machine-learning-based model for the classification of these signals, to handle large datasets and provide objective and reproducible results. To describe the properties of the signals, we used 104 statistical, entropy, and shape descriptor features computed from the time waveform, the spectrum, and the cepstrum. First, we trained a random forest classifier with a dataset provided by the Observatoire Volcanologique et Sismologique de Guadeloupe that consisted of 845 labeled events recorded from 2013 to 2018: 542 volcano-tectonic (VT), 217 Nested, and 86 long-period (LP). We obtained an overall accuracy of 72%. We determined that the VT class includes a variety of signals that cover the VT, Nested, and LP classes. After visual inspection of the waveforms and spectral characteristics of the dataset, we introduced two new classes: Hybrid and Tornillo. A new random forest classifier was trained with this new information, and we obtained a higher overall accuracy of 82%. The model performs well for all event classes except Hybrid events (67% accuracy, 70% precision). Hybrid events are often considered to be a mix of VT and LP events, so this weaker performance can be explained by the nature of the class and by physical processes that include both fracturing and resonating components with different modal frequencies. By analyzing the feature weights and training a model with the most important features, we show that a subset of the 14 best features is sufficient to obtain a performance close to that of the model with the whole feature set. However, these best features differ from the 13 best features obtained for another volcano in Peru, with only one feature common to both sets. 
This suggests either that the model is not universal and must be trained separately for each volcano, or that it is too specific to the single station used here.
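
As a rough illustration of the feature-extraction step described above, the following sketch computes a handful of statistical, shape, and entropy descriptors in the three signal representations (time waveform, spectrum, cepstrum). It is not the authors' implementation: the function names, the choice of these five descriptors, and the energy-based entropy definition are assumptions made for brevity, and the naive DFT stands in for a proper FFT routine.

```python
import cmath
import math


def dft(x):
    """Naive O(n^2) discrete Fourier transform (adequate for short examples)."""
    n = len(x)
    return [
        sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]


def magnitude_spectrum(x):
    """Magnitude of the DFT of a real signal."""
    return [abs(c) for c in dft(x)]


def real_cepstrum(x, eps=1e-12):
    """Real cepstrum: inverse DFT of the log-magnitude spectrum.

    eps is an assumed small constant that avoids log(0).
    """
    log_mag = [math.log(m + eps) for m in magnitude_spectrum(x)]
    n = len(log_mag)
    # For a real input sequence, the inverse DFT equals conj(DFT) / n,
    # and its real part is the real part of DFT / n.
    return [c.real / n for c in dft(log_mag)]


def descriptors(v):
    """Five illustrative descriptors of a sequence (assumed subset)."""
    n = len(v)
    mean = sum(v) / n
    std = math.sqrt(sum((a - mean) ** 2 for a in v) / n)
    if std > 0:
        skewness = sum((a - mean) ** 3 for a in v) / (n * std ** 3)
        kurtosis = sum((a - mean) ** 4 for a in v) / (n * std ** 4)
    else:
        skewness = kurtosis = 0.0
    # Shannon entropy (bits) of the normalized energy distribution.
    energy = [a * a for a in v]
    total = sum(energy)
    if total > 0:
        entropy = -sum(
            (e / total) * math.log2(e / total) for e in energy if e > 0
        )
    else:
        entropy = 0.0
    return {
        "mean": mean,
        "std": std,
        "skewness": skewness,
        "kurtosis": kurtosis,
        "entropy": entropy,
    }


def extract_features(x):
    """Compute the descriptors in each of the three representations."""
    domains = {
        "waveform": list(x),
        "spectrum": magnitude_spectrum(x),
        "cepstrum": real_cepstrum(x),
    }
    features = {}
    for domain_name, values in domains.items():
        for desc_name, value in descriptors(values).items():
            features[f"{domain_name}_{desc_name}"] = value
    return features


if __name__ == "__main__":
    # Synthetic 64-sample sinusoid standing in for a recorded event.
    signal = [math.sin(2 * math.pi * 4 * t / 64) for t in range(64)]
    for name, value in extract_features(signal).items():
        print(f"{name}: {value:.4f}")
```

Each event then becomes one fixed-length feature vector (15 values here, 104 in the study), which is the form of input a random forest classifier expects.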