1. Optimal Network Pairwise Comparison
- Author
-
Jin, Jiashun, Ke, Zheng Tracy, Luo, Shengming, and Ma, Yucong
- Subjects
Statistics - Methodology ,62H30, 91C20 - Abstract
We are interested in the problem of two-sample network hypothesis testing: given two networks with the same set of nodes, we wish to test whether the underlying Bernoulli probability matrices of the two networks are the same or not. We propose Interlacing Balance Measure (IBM) as a new two-sample testing approach. We consider the {\it Degree-Corrected Mixed-Membership (DCMM)} model for undirected networks, where we allow severe degree heterogeneity, mixed-memberships, flexible sparsity levels, and weak signals. In such a broad setting, how to find a test that has a tractable limiting null and optimal testing performances is a challenging problem. We show that IBM is such a test: in a broad DCMM setting with only mild regularity conditions, IBM has $N(0,1)$ as the limiting null and achieves the optimal phase transition. While the above is for undirected networks, IBM is a unified approach and is directly implementable for directed networks. For a broad directed-DCMM (extension of DCMM for directed networks) setting, we show that IBM has $N(0, 1/2)$ as the limiting null and continues to achieve the optimal phase transition. We have also applied IBM to the Enron email network and a gene co-expression network, with interesting results., Comment: 92 pages
- Published
- 2024