In this paper we study the problem of walk-specific information spread in directed complex social networks. Classical models usually analyze the “explosive” spread of information on social networks (e.g., Twitter): a broadcast or epidemiological model focusing on the dynamics of a given source node “infecting” multiple targets. Less studied, but of equal importance, is the case of single-track information flow, wherein the focus is on the node-by-node trajectory of information transfer, in which the next node is not necessarily a newly visited one. An important and motivating example is the sequence of physicians visited by a given patient over a presumed course of treatment or health event. This is the so-called referral sequence, which manifests as a path in a network of physicians. In this case the patient (and her health record) is a source of “information” from one physician to the next. With this motivation in mind, we build a Bayesian Personalized Ranking (BPR) model to predict the next node on a walk of a given network navigator using network science features. The problem is related to, but different from, the well-investigated link prediction problem. We present experiments on a dataset of several million nodes derived from several years of U.S. patient referral records, showing that the application of network science measures in the BPR framework improves hit rate and mean percentile rank for the task of next-node prediction. We then move beyond the simple information walk to consider the derived network space of all information walks within a period, in which a node represents an information walk and two information walks are connected if they have nodes in common from the original (social) network. To evaluate the utility of such a network of information walks, we simulate outlier information walks and distinguish them from normal information walks, using five distance metrics computed between the derived feature vectors of two information walks.
The experimental results of this proof-of-concept application show the utility of the derived information walk network for outlier monitoring of information flow on an intelligent network.