Abstract

Random walk-based graph sampling methods can effectively estimate feature values in large-scale social networks wherein the node IDs are unknown. Real social networks are sampled by repeatedly querying their APIs to acquire the lists of adjacent nodes. These queries can then become a bottleneck in the sampling process because nearly all social network services restrict the rate at which queries can be issued. However, most existing graph sampling studies have not focused on the number of queries but have instead compared methods based on sample size. Therefore, such graph sampling methods cannot be recommended for estimating feature values in actual social networks. This study presents an approach to assess graph sampling methods that focus on the number of queries. This describes the time taken by algorithms for typical random walk-based graph sampling methods, such as a simple random walk with re-weighting (SRW-rw), a non-backtracking random walk with re-weighting (NBRW-rw), and a Metropolis ? Hastings random walk (MHRW), which require queries. The graph sampling precision was then experimentally evaluated based on sample size and query number standards using actual social networks, and the types of changes that occur were observed.

Full Text
Published version (Free)

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call