Back to Search Start Over

Benefits of Bias in Crawl-Based Network Sampling for Identifying Key Node Set

Authors :
Sho Tsugawa
Hiroyuki Ohsaki
Source :
IEEE Access, Vol 8, Pp 75370-75380 (2020)
Publication Year :
2020
Publisher :
IEEE, 2020.

Abstract

We study the problem of identifying a set of key nodes from a network when limited knowledge about its structure is available. Most studies assume complete knowledge of the given network when identifying a set of key nodes, but in current practice, networks of interest are often too huge to obtain their entire topological structures. When the complete structure of a network is not available, network sampling strategies are often used to obtain a partial structure of the network. We investigate how network sampling strategies affect the problem of identifying a key node set. Specifically, we investigate the effect of conventional network sampling strategies on the solutions found for two types of key node set identification problems: the minimum $p$ -median problem and the influence maximization problem. Our results show that when the network is obtained using crawl-based network sampling strategies, both the minimum $p$ -median and the influence maximization problems are effectively solved by simple heuristic algorithms with sampling ratios in the 10-20% range. We also find that among three conventional sampling strategies (random sampling, random walk sampling, and sample edge counts) checked in this paper, random walk sampling is the most robust strategy in terms of effectively identifying the key node sets of diverse types of networks.

Details

Language :
English
ISSN :
21693536
Volume :
8
Database :
Directory of Open Access Journals
Journal :
IEEE Access
Publication Type :
Academic Journal
Accession number :
edsdoj.94606d7f2fe448db8fb0ad06111f4f84
Document Type :
article
Full Text :
https://doi.org/10.1109/ACCESS.2020.2988910