
A new tool for multi-block PLS discriminant analysis of metabolomic data: application to systems epidemiology

Authors :
Brandolini-Bunlon, Marion
Pétéra, Mélanie
Gaudreau, Pierrette
Comte, Blandine
Bougeard, Stephanie
Pujos-Guillot, Estelle
Unité de Nutrition Humaine (UNH)
Institut National de la Recherche Agronomique (INRA)-Université Clermont Auvergne [2017-2020] (UCA [2017-2020])
MetaboHUB
Centre de Recherche du Centre Hospitalier de l’Université de Montréal (CR CHUM)
Centre Hospitalier de l'Université de Montréal (CHUM)
Université de Montréal (UdeM)-Université de Montréal (UdeM)
Centre Hospitalier Universitaire de Montréal
Département de médecine
Centre Léon Bérard [Lyon]
Agence nationale de sécurité sanitaire de l'alimentation, de l'environnement et du travail (ANSES)
Unité de Nutrition Humaine - Clermont Auvergne (UNH)
Institut National de la Recherche Agronomique (INRA)-Université Clermont Auvergne (UCA)
Centre de Recherche du CHUM
Source :
12. Journées Scientifiques du Réseau Francophone de Métabolomique et Fluxomique RFMF, May 2019, Clermont-Ferrand, France
Publication Year :
2019
Publisher :
HAL CCSD, 2019.

Abstract

Metabolomics is a powerful phenotyping tool in nutrition and health research, generating massive and complex data that need dedicated treatments to enrich our knowledge of biological systems. In particular, to investigate more deeply the relations between environmental factors, phenotypes and metabolism, discriminant statistical analyses performed separately on metabolomic datasets are often complemented by associations with metadata (anthropometric, clinical, nutritional and physical activity data, etc.). Another relevant strategy is to perform a multi-block partial least squares discriminant analysis (MBPLSDA), which simultaneously analyses data available from different sources. This makes it possible to determine the importance of variables and variable blocks in discriminating groups of subjects, taking into account the structure of the data in thematic blocks.

In order to propose a full open-source standalone tool, the present objective was to develop an R package covering all steps of MBPLSDA for the joint analysis of metabolomic and additional data.

The tool was based on the mbpls function of the ade4 R package, enriched with different functionalities, including some dedicated to discriminant analysis. The indicators provided help to determine the optimal number of components, to check the validity of the MBPLSDA model, and to evaluate the variability of its parameters and predictions. To illustrate the potential of the proposed tool and the associated procedure, MBPLSDA was applied to a real case study involving metabolomic, nutritional and clinical data from a human cohort.

The availability of the different functionalities in a single R package allowed parameters to be optimized for an efficient joint analysis of metabolomic and epidemiological data, to obtain new insights into multidimensional phenotypes. In particular, we highlighted the impact of filtering the metabolomic variables beforehand, and the relevance of an MBPLSDA approach in comparison to a standard PLS-discriminant analysis method.
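The multi-block idea described in the abstract can be illustrated with a minimal sketch. It is not the authors' R package and does not reproduce the ade4 mbpls function. It assumes one common formulation in which each block is autoscaled and divided by the square root of its number of variables, so that every block carries the same total variance, after which a PLS regression is fitted on the concatenated blocks against dummy-coded class labels. The function names (fit_mbplsda, predict_mbplsda), the block-importance measure (share of squared first-component weights per block) and the synthetic data are all assumptions made for illustration.

```python
import numpy as np

def fit_mbplsda(blocks, labels, n_components=2):
    """Illustrative multi-block PLS-DA via block-scaled concatenation."""
    labels = np.asarray(labels)
    classes, y_idx = np.unique(labels, return_inverse=True)
    Y = np.eye(len(classes))[y_idx]          # dummy-coded class membership
    stats, scaled = [], []
    for B in blocks:
        B = np.asarray(B, dtype=float)
        mu, sd = B.mean(axis=0), B.std(axis=0, ddof=1)
        sd[sd == 0] = 1.0                    # guard against constant variables
        stats.append((mu, sd))
        # Autoscale, then divide by sqrt(p) so each block has equal total variance
        scaled.append((B - mu) / sd / np.sqrt(B.shape[1]))
    E = np.hstack(scaled)
    y_mean = Y.mean(axis=0)
    F = Y - y_mean
    W, P, Q = [], [], []
    for _ in range(n_components):
        # PLS weight vector: first left singular vector of the cross-product E'F
        w = np.linalg.svd(E.T @ F, full_matrices=False)[0][:, 0]
        t = E @ w                            # component scores
        p, q = E.T @ t / (t @ t), F.T @ t / (t @ t)
        E, F = E - np.outer(t, p), F - np.outer(t, q)   # deflate X and Y
        W.append(w); P.append(p); Q.append(q)
    W, P, Q = np.column_stack(W), np.column_stack(P), np.column_stack(Q)
    coef = W @ np.linalg.solve(P.T @ W, Q.T)  # regression coefficients
    # Share of squared first-component weights held by each block (sums to 1)
    cuts = np.cumsum([S.shape[1] for S in scaled])[:-1]
    importance = np.array([np.sum(w_b ** 2) for w_b in np.split(W[:, 0], cuts)])
    return {"classes": classes, "stats": stats, "coef": coef,
            "y_mean": y_mean, "block_importance": importance}

def predict_mbplsda(model, blocks):
    """Assign each subject to the class with the highest predicted score."""
    X = np.hstack([(np.asarray(B, dtype=float) - mu) / sd / np.sqrt(len(mu))
                   for B, (mu, sd) in zip(blocks, model["stats"])])
    scores = X @ model["coef"] + model["y_mean"]
    return model["classes"][np.argmax(scores, axis=1)]

# Synthetic example (invented data): block1 separates the groups, block2 is noise
rng = np.random.default_rng(0)
labels = np.repeat(["A", "B"], 30)
block1 = rng.normal(size=(60, 5))
block1[30:] += 3.0
block2 = rng.normal(size=(60, 8))
model = fit_mbplsda([block1, block2], labels)
pred = predict_mbplsda(model, [block1, block2])
```

In this sketch the predictions are made on the training data, so the resulting accuracy is optimistic. Choosing the number of components and checking model validity, as the abstract describes, would require cross-validation or permutation testing on top of this.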

Details

Language :
English
Database :
OpenAIRE
Journal :
12. Journées Scientifiques du Réseau Francophone de Métabolomique et Fluxomique RFMF, May 2019, Clermont-Ferrand, France
Accession number :
edsair.dedup.wf.001..d4158d1ec7ae0afa9a5b8b3c45527c5c