1. Network embedding unveils the hidden interactions in the mammalian virome
- Author
Poisot, Timothée, Ouellet, Marie-Andrée, Mollentze, Nardus, Farrell, Maxwell J., Becker, Daniel J., Brierly, Liam, Albery, Gregory F., Gibb, Rory J., Seifert, Stephanie N., and Carlson, Colin J.
- Subjects
Quantitative Biology - Quantitative Methods
- Abstract
At most 1-2% of the global virome has been sampled to date. Recent work has shown that predicting which host-virus interactions are possible but undiscovered or unrealized is, fundamentally, a network science problem. Here, we develop a novel method that combines a coarse recommender system (Linear Filtering; LF) with an imputation algorithm based on low-rank graph embedding (Singular Value Decomposition; SVD) to infer host-virus associations. This combination of techniques results in informed initial guesses based on directly measurable network properties (density, degree distribution) that are refined through SVD (which is able to leverage emerging features). Using this method, we recovered highly plausible undiscovered interactions with a strong signal of viral coevolutionary history, and revealed a global hotspot of unusually unique but unsampled (or unrealized) host-virus interactions in the Amazon rainforest. We develop several tests for quantifying the bias and realism of these predictions, and show that the LF-SVD method is robust in each aspect. We finally show that graph embedding of the imputed network can be used to improve predictions of human infection from viral genome features, showing that the global structure of the mammal-virus network provides additional insights into human disease emergence.
- Comment
35 pages, 2 figures, 9 extended data figures, 7 extended data tables
- Published
2021
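- Illustration
The abstract describes a two-step idea: a linear filter gives an initial guess for an unobserved host-virus pair from simple network properties, and a low-rank SVD approximation then refines that guess. The sketch below is only a minimal illustration of that general idea on a toy matrix, not the authors' implementation. The function names (`linear_filter`, `low_rank`, `impute_cell`), the weights in `alpha`, the rank, and the iteration settings are all assumptions chosen for the example.

```python
import numpy as np


def linear_filter(Y, alpha=(0.0, 1.0, 1.0, 1.0)):
    """Score every cell as a weighted mix of its own value, its row mean,
    its column mean, and the overall density (weights are illustrative)."""
    a = np.asarray(alpha, dtype=float)
    a = a / a.sum()  # normalise so the weights sum to 1
    row_mean = Y.mean(axis=1, keepdims=True)
    col_mean = Y.mean(axis=0, keepdims=True)
    density = Y.mean()
    return a[0] * Y + a[1] * row_mean + a[2] * col_mean + a[3] * density


def low_rank(Y, rank):
    """Rank-`rank` approximation of Y from its truncated SVD."""
    U, s, Vt = np.linalg.svd(Y, full_matrices=False)
    return (U[:, :rank] * s[:rank]) @ Vt[:rank, :]


def impute_cell(Y, i, j, initial, rank=2, n_iter=20, tol=1e-6):
    """Seed cell (i, j) with `initial`, then repeatedly replace it with
    its value in the low-rank approximation until it stops changing."""
    work = Y.astype(float).copy()
    work[i, j] = initial
    for _ in range(n_iter):
        new_value = low_rank(work, rank)[i, j]
        converged = abs(new_value - work[i, j]) < tol
        work[i, j] = new_value
        if converged:
            break
    return work[i, j]


# Toy 4 x 4 host-by-virus matrix (1 = observed association); hypothetical data.
Y = np.array(
    [
        [1, 1, 0, 0],
        [1, 1, 1, 0],
        [0, 1, 1, 1],
        [0, 0, 1, 1],
    ],
    dtype=float,
)

scores = linear_filter(Y)                           # initial guesses
value = impute_cell(Y, 0, 2, scores[0, 2], rank=2)  # refined guess for one pair
```

In this sketch, a higher `value` for an unobserved pair would be read as a more plausible missing association; how scores are thresholded or ranked is left out because the record above does not specify it.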