SANA: cross-species prediction of Gene Ontology GO annotations via topological network alignment

Authors :
Siyue Wang
Giles R. S. Atkinson
Wayne B. Hayes
Source :
npj Systems Biology and Applications, Vol 8, Iss 1, Pp 1-17 (2022)
Publication Year :
2022
Publisher :
Nature Portfolio, 2022.

Abstract

Topological network alignment aims to align two networks node-wise in order to maximize the observed common connection (edge) topology between them. The topological alignment of two protein–protein interaction (PPI) networks should thus expose protein pairs with similar interaction partners allowing, for example, the prediction of common Gene Ontology (GO) terms. Unfortunately, no network alignment algorithm based on topology alone has been able to achieve this aim, though those that include sequence similarity have seen some success. We argue that this failure of topology alone is due to the sparsity and incompleteness of the PPI network data of almost all species, which provides the network topology with a small signal-to-noise ratio that is effectively swamped when sequence information is added to the mix. Here we show that the weak signal can be detected using multiple stochastic samples of “good” topological network alignments, which allows us to observe regions of the two networks that are robustly aligned across multiple samples. The resulting network alignment frequency (NAF) strongly correlates with GO-based Resnik semantic similarity and enables the first successful cross-species predictions of GO terms based on topology-only network alignments. Our best predictions have an AUPR of about 0.4, which is competitive with state-of-the-art algorithms, even when there is no observable sequence similarity and no known homology relationship. While our results provide only a “proof of concept” on existing network data, we hypothesize that predicting GO terms from topology-only network alignments will become increasingly practical as the volume and quality of PPI network data increase.
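The idea of counting how often a node pair is aligned across multiple sampled alignments can be illustrated with a minimal Python sketch. It is not the authors' implementation: it assumes each sampled alignment is given as a mapping from nodes of one network to nodes of the other, and it takes the frequency of a pair to be the fraction of samples in which that pair appears. The function names `alignment_frequency` and `robust_pairs`, the toy node labels, and the 0.6 cutoff are all hypothetical.

```python
from collections import Counter
from typing import Dict, Hashable, Iterable, List, Tuple

# Assumed representation: one alignment maps each node of network 1
# to the node of network 2 it is aligned with.
Alignment = Dict[Hashable, Hashable]
Pair = Tuple[Hashable, Hashable]


def alignment_frequency(samples: Iterable[Alignment]) -> Dict[Pair, float]:
    """Fraction of sampled alignments in which each node pair is aligned."""
    counts: Counter = Counter()
    n_samples = 0
    for alignment in samples:
        n_samples += 1
        # Each (node1, node2) entry counts once per sampled alignment.
        counts.update(alignment.items())
    if n_samples == 0:
        raise ValueError("need at least one sampled alignment")
    return {pair: count / n_samples for pair, count in counts.items()}


def robust_pairs(naf: Dict[Pair, float], threshold: float) -> List[Pair]:
    """Pairs whose frequency meets a (hypothetical) robustness cutoff."""
    return sorted(pair for pair, freq in naf.items() if freq >= threshold)


# Toy data: three hypothetical stochastic alignments of nodes a* onto nodes b*.
samples = [
    {"a1": "b1", "a2": "b2", "a3": "b3"},
    {"a1": "b1", "a2": "b3", "a3": "b2"},
    {"a1": "b1", "a2": "b2", "a3": "b4"},
]
naf = alignment_frequency(samples)
stable = robust_pairs(naf, threshold=0.6)
```

In this toy input, the pair `("a1", "b1")` appears in all three samples and `("a2", "b2")` in two of them, so only those two pairs pass the 0.6 cutoff. Pairs that recur across many samples are the ones a downstream step might consider for transferring GO terms.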

Subjects

Subjects :
Biology (General)
QH301-705.5

Details

Language :
English
ISSN :
2056-7189
Volume :
8
Issue :
1
Database :
Directory of Open Access Journals
Journal :
npj Systems Biology and Applications
Publication Type :
Academic Journal
Accession number :
edsdoj.0a06cd6758946399fd0f5c8722203b7
Document Type :
article
Full Text :
https://doi.org/10.1038/s41540-022-00232-x