1. Identifying latent disease factors differently expressed in patient subgroups using group factor analysis
- Author
Ferreira, Fabio S., Ashburner, John, Bouzigues, Arabella, Suksasilp, Chatrin, Russell, Lucy L., Foster, Phoebe H., Ferry-Bolder, Eve, van Swieten, John C., Jiskoot, Lize C., Seelaar, Harro, Sanchez-Valle, Raquel, Laforce, Robert, Graff, Caroline, Galimberti, Daniela, Vandenberghe, Rik, de Mendonca, Alexandre, Tiraboschi, Pietro, Santana, Isabel, Gerhard, Alexander, Levin, Johannes, Sorbi, Sandro, Otto, Markus, Pasquier, Florence, Ducharme, Simon, Butler, Chris R., Ber, Isabelle Le, Finger, Elizabeth, Tartaglia, Maria C., Masellis, Mario, Rowe, James B., Synofzik, Matthis, Moreno, Fermin, Borroni, Barbara, Kaski, Samuel, Rohrer, Jonathan D., and Mourao-Miranda, Janaina
- Subjects
Statistics - Machine Learning, Computer Science - Machine Learning
- Abstract
In this study, we propose a novel approach to uncover subgroup-specific and subgroup-common latent factors, addressing the challenges posed by the heterogeneity of neurological and mental disorders, which hinder disease understanding, treatment development, and outcome prediction. The proposed approach, sparse Group Factor Analysis (GFA) with regularised horseshoe priors, was implemented with probabilistic programming and can uncover associations (or latent factors) among multiple data modalities differentially expressed in sample subgroups. Synthetic data experiments showed the robustness of our sparse GFA by correctly inferring latent factors and model parameters. When applied to the Genetic Frontotemporal Dementia Initiative (GENFI) dataset, which comprises patients with frontotemporal dementia (FTD) with genetically defined subgroups, the sparse GFA identified latent disease factors differentially expressed across the subgroups, distinguishing between "subgroup-specific" latent factors within homogeneous groups and "subgroup-common" latent factors shared across subgroups. The latent disease factors captured associations between brain structure and non-imaging variables (i.e., questionnaires assessing behaviour and disease severity) across the different genetic subgroups, offering insights into disease profiles. Importantly, two latent factors were more pronounced in the two more homogeneous FTD patient subgroups (progranulin (GRN) and microtubule-associated protein tau (MAPT) mutation), showcasing the method's ability to reveal subgroup-specific characteristics. These findings underscore the potential of sparse GFA for integrating multiple data modalities and identifying interpretable latent disease factors that can improve the characterization and stratification of patients with neurological and mental health disorders.
Comment: 38 pages, 14 figures
- Published
- 2024