
An automated multi-modal graph-based pipeline for mouse genetic discovery

Authors :
Gary Peltz
Zhuoqing Fang
Source :
Bioinformatics
Publication Year :
2022
Publisher :
Oxford University Press (OUP), 2022.

Abstract

Motivation: Our ability to identify causative genetic factors for mouse genetic models of human diseases and biomedical traits has been limited because true causative factors are often obscured by the many false positive genetic associations produced by a genome-wide association study (GWAS).

Results: To accelerate the pace of genetic discovery, we developed a graph neural network (GNN)-based automated pipeline (GNNHap) that can rapidly analyze mouse genetic model data and identify high-probability causal genetic factors for the analyzed traits. After assessing the strength of allelic associations with the strain response pattern, this pipeline analyzes 29M published papers to assess candidate gene–phenotype relationships and incorporates information obtained from a protein–protein interaction network and protein sequence features into the analysis. The GNN model produces markedly improved results relative to those of a simple linear neural network. We demonstrate that GNNHap can identify novel causative genetic factors for murine models of diabetes/obesity and for cataract formation, which were validated by the phenotypes appearing in previously analyzed gene knockout mice. The diabetes/obesity results indicate how characterization of the underlying genetic architecture enables new therapies to be discovered and tested by applying ‘precision medicine’ principles to murine models.

Availability and implementation: The GNNHap source code is freely available at https://github.com/zqfang/gnnhap, and the new version of the HBCGM program is available at https://github.com/zqfang/haplomap.

Supplementary information: Supplementary data are available at Bioinformatics online.

Details

ISSN :
1367-4811 and 1367-4803
Volume :
38
Database :
OpenAIRE
Journal :
Bioinformatics
Accession number :
edsair.doi.dedup.....5e176ab617277f623c190b9492d4f5a6
Full Text :
https://doi.org/10.1093/bioinformatics/btac356