bioGWAS: A Simple and Flexible Tool for Simulating GWAS Datasets.

Authors :
Changalidis, Anton I.
Alexeev, Dmitry A.
Nasykhova, Yulia A.
Glotov, Andrey S.
Barbitoff, Yury A.
Source :
Biology (2079-7737). Jan2024, Vol. 13 Issue 1, p10. 12p.
Publication Year :
2024

Abstract

Simple Summary: Genome-wide association studies (GWAS) are a powerful tool for the identification of genes affecting human traits. Still, the interpretation of GWAS results is complicated, and new tools are actively being developed. Due to the scarcity of available datasets, simulation of GWAS data with known genetic effects is important as it enables accurate evaluation of such tools. In this study, we developed a flexible tool, bioGWAS, that provides a set of important functionalities for simulating GWAS results. We demonstrate that bioGWAS can efficiently generate GWAS results with predefined causal genes and biological processes and is capable of recapitulating the results of published GWAS studies. We thus believe that bioGWAS is an excellent method for testing bioinformatics software for GWAS results processing, as well as for the generation of datasets for educational purposes.

Genome-wide association studies (GWAS) have proven to be a powerful tool for the identification of genetic susceptibility loci affecting human complex traits. In addition to pinpointing individual genes involved in a particular trait, GWAS results can be used to discover relevant biological processes for these traits. The development of new tools for extracting such information from GWAS results requires large-scale datasets with known biological ground truth. Simulation of GWAS results is a powerful method that may provide such datasets and facilitate the development of new methods. In this work, we developed bioGWAS, a simple and flexible pipeline for the simulation of genotypes, phenotypes, and GWAS summary statistics. Unlike existing methods, bioGWAS can be used to generate GWAS results for simulated quantitative and binary traits with a predefined set of causal genetic variants and/or molecular pathways. We demonstrate that the proposed method can recapitulate complete GWAS datasets using a set of reported genome-wide associations. We also used our method to benchmark several tools for gene set enrichment analysis for GWAS data. Taken together, our results suggest that bioGWAS provides an important set of functionalities that would aid the development of new methods for downstream processing of GWAS results. [ABSTRACT FROM AUTHOR]

Details

Language :
English
ISSN :
20797737
Volume :
13
Issue :
1
Database :
Academic Search Index
Journal :
Biology (2079-7737)
Publication Type :
Academic Journal
Accession number :
175058706
Full Text :
https://doi.org/10.3390/biology13010010