1. A validated generally applicable approach using the systematic assessment of disease modules by GWAS reveals a multi-omic module strongly associated with risk factors in multiple sclerosis
- Author
Zelmina Lubovac-Pilav, Lars Alfredsson, Tejaswi V. S. Badam, Hendrik Arnold de Weerd, Mika Gustafsson, Ingrid Kockum, Maja Jagodic, Tomas Olsson, and David Martínez-Enguita
- Subjects
Epigenomics, Computer science, Network modules, Genome-wide association study, Disease, QH426-470, Benchmark, computer.software_genre, Methylomics, 0302 clinical medicine, Gene Regulatory Networks, Multi-omics, 0303 health sciences, 3. Good health, Identification (information), Benchmark (computing), Data integration, Network analysis, DNA microarray, Medical Genetics, Research Article, Biotechnology, Multiple sclerosis, Risk factors, Disease modules, Protein network analysis, Transcriptomics, Genome-wide association analysis, Multiple Sclerosis, Bioinformatik och systembiologi, Computational biology, Biology, 03 medical and health sciences, Genetics, Humans, Relevance (information retrieval), Epigenetics, Medicinsk genetik, 030304 developmental biology, Bioinformatics and Systems Biology, Immunology in the medical area, Omics, Workflow, Immunologi inom det medicinska området, computer, TP248.13-248.65, 030217 neurology & neurosurgery, Genome-Wide Association Study
- Abstract
Background: There exist few, if any, practical guidelines for predictive and falsifiable multi-omic data integration that systematically integrate existing knowledge. Disease modules are popular concepts for interpreting genome-wide studies in medicine but have so far not been systematically evaluated and may lead to corroborating multi-omic modules. Results: We assessed eight module identification methods in 57 previously published expression and methylation studies of 19 diseases using GWAS enrichment analysis. Next, we applied the same strategy for multi-omic integration of 20 datasets of multiple sclerosis (MS), and further validated the resulting module using both GWAS and risk-factor-associated genes from several independent cohorts. Our benchmark of modules showed that, in immune-associated diseases, modules inferred from clique-based methods were the most enriched for GWAS genes. The multi-omic case study using MS data revealed the robust identification of a module of 220 genes. Strikingly, most genes of the module were differentially methylated upon the action of one or several environmental risk factors in MS (n = 217, P = 10^-47) and were also independently validated for association with five different risk factors of MS, which further stressed the high genetic and epigenetic relevance of the module for MS. Conclusions: We believe our analysis provides a workflow for selecting modules and our benchmark study may help further improvement of disease module methods. Moreover, we also stress that our methodology is generally applicable for combining and assessing the performance of multi-omic approaches for complex diseases.
CC BY 4.0 © 2021, The Author(s). This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data.
Correspondence: mika.gustafsson@liu.se
This work was supported by the Swedish Research Council (grant 2015–03807 (M.G.), grant 2018–02638 (M.J.)), the Swedish Foundation for Strategic Research (grant SB16–0095 (M.G.)), the Center for Industrial IT (CENIIT) (M.G.), European Union Horizon 2020/European Research Council Consolidator grant (Epi4MS, grant 818170 (M.J.)), Knut and Alice Wallenberg Foundation (grant 2019.0089 (M.J.)) and the Knowledge Foundation (grant 20170298 (Z.L.)). Computational resources were granted by Swedish National Infrastructure for Computing (SNIC; SNIC 2020/5–177, LiU-2018-12 and LiU-2019-25). The funding bodies had no role in the study and collection, analysis, and interpretation of data and in writing the manuscript. Open Access funding provided by Linköping University.
- Published
- 2021