The monitoring of data streams with a network structure have drawn increasing attention due to its wide applications in modern process control. In these applications, high-dimensional sensor nodes are interconnected with an underlying network topology. In such a case, abnormalities occurring to any node may propagate dynamically across the network and cause changes of other nodes over time. Furthermore, high dimensionality of such data significantly increased the cost of resources for data transmission and computation, such that only partial observations can be transmitted or processed in practice. Overall, how to quickly detect abnormalities in such large networks with resource constraints remains a challenge, especially due to the sampling uncertainty under the dynamic anomaly occurrences and network-based patterns. In this paper, we incorporate network structure information into the monitoring and adaptive sampling methodologies for quick anomaly detection in large networks where only partial observations are available. We develop a general monitoring and adaptive sampling method and further extend it to the case with memory constraints, both of which exploit network distance and centrality information for better process monitoring and identification of abnormalities. Theoretical investigations of the proposed methods demonstrate their sampling efficiency on balancing between exploration and exploitation, as well as the detection performance guarantee. Numerical simulations and a case study on power network have demonstrated the superiority of the proposed methods in detecting various types of shifts. <italic xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink">Note to Practitioners</i> —Continuous monitoring of networks for anomalous events is critical for a large number of applications involving power networks, computer networks, epidemiological surveillance, social networks, etc. This paper aims at addressing the challenges in monitoring large networks in cases where monitoring resources are limited such that only a subset of nodes in the network is observable. Specifically, we integrate network structure information of nodes for constructing sequential detection methods via effective data augmentation, and for designing adaptive sampling algorithms to observe suspicious nodes that are likely to be abnormal. Then, the method is further generalized to the case that the memory of the computation is also constrained due to the network size. The developed method is greatly beneficial and effective for various anomaly patterns, especially when the initial anomaly randomly occurs to nodes in the network. The proposed methods are demonstrated to be capable of quickly detecting changes in the network and dynamically changes the sampling priority based on online observations in various cases, as shown in the theoretical investigation, simulations and case studies.
Read full abstract