Federated Learning for multi-omics: a performance evaluation in Parkinson's disease.

Authors :
Danek B
Makarious MB
Dadu A
Vitale D
Lee PS
Nalls MA
Sun J
Faghri F
Source :
BioRxiv : the preprint server for biology [bioRxiv] 2024 Feb 12. Date of Electronic Publication: 2024 Feb 12.
Publication Year :
2024

Abstract

While machine learning (ML) research has recently grown in popularity, its application in the omics domain is constrained by access to the sufficiently large, high-quality datasets needed to train ML models. Federated Learning (FL) represents an opportunity to enable collaborative curation of such datasets among participating institutions. We compare the simulated performance of several models trained using FL against classically trained ML models on the task of multi-omics Parkinson's Disease prediction. We find that FL model performance tracks that of centrally trained ML models: the most performant FL model achieves an AUC-PR of 0.876 ± 0.009, which is 0.014 ± 0.003 lower than its centrally trained variation. We also determine that the dispersion of samples within a federation plays a meaningful role in model performance. Our study implements several open source FL frameworks and aims to highlight some of the challenges and opportunities when applying these collaborative methods in multi-omics studies.

Competing Interests: B.D., A.D., D.V., M.A.N., and F.F. declare no competing non-financial interests but the following competing financial interests: their participation in this project was part of a competitive contract awarded to Data Tecnica LLC by the National Institutes of Health to support open science research. M.A.N. also currently serves on the scientific advisory board for Character Bio and is an advisor to Neuron23 Inc. The study's funders had no role in the study design, data collection, data analysis, data interpretation, or writing of the report. M.B.M., P.S.L., and J.S. declare no competing financial or non-financial interests. All authors and the public can access all data and statistical programming code used in this project for the analyses and results generation. F.F. takes final responsibility for the decision to submit the paper for publication.

Details

Language :
English
ISSN :
2692-8205
Database :
MEDLINE
Journal :
BioRxiv : the preprint server for biology
Publication Type :
Academic Journal
Accession number :
37986893
Full Text :
https://doi.org/10.1101/2023.10.04.560604