Abstract
How can we estimate local triangle counts accurately in a graph stream without storing the whole graph? How can we handle duplicate edges in local triangle counting for a graph stream? Local triangle counting, which computes the number of triangles attached to each node in a graph, is an important problem with applications in social network analysis, anomaly detection, and web mining. In this article, we propose algorithms for local triangle counting in a graph stream based on edge sampling: MASCOT for a simple graph, and MULTIBMASCOT and MULTIWMASCOT for a multigraph. To develop MASCOT, we first present two naive local triangle counting algorithms in a graph stream, called MASCOT-C and MASCOT-A. MASCOT-C is based on constant edge sampling, and MASCOT-A improves its accuracy by using more memory space. MASCOT achieves both the accuracy and the memory efficiency of the two algorithms by unconditional triangle counting for a new edge, regardless of whether it is sampled or not. Extending the idea to a multigraph, we develop two algorithms, MULTIBMASCOT and MULTIWMASCOT. MULTIBMASCOT enables local triangle counting on the corresponding simple graph of a streamed multigraph without explicit graph conversion; MULTIWMASCOT considers repeated occurrences of an edge as its weight and counts each triangle as the product of its three edge weights. In contrast to the existing algorithm, which requires prior knowledge of the target graph and appropriately set parameters, our proposed algorithms require only one parameter: the edge sampling probability. Through extensive experiments, we show that for the same number of edges sampled, MASCOT provides the best accuracy compared to the existing algorithm as well as MASCOT-C and MASCOT-A. We also demonstrate that MULTIBMASCOT on a multigraph is comparable to MASCOT-C on the counterpart simple graph, and MULTIWMASCOT becomes more accurate for higher degree nodes.
Thanks to MASCOT, we also discover interesting anomalous patterns in real graphs, including core-peripheries in the web, a bimodal call pattern in a phone call history, and intensive collaboration in DBLP.
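The idea of unconditional triangle counting for a new edge can be illustrated with a minimal sketch. It is not the authors' implementation: the names `mascot_sketch`, `edge_stream`, and `seed` are hypothetical, the input is assumed to be a simple graph stream with no duplicate edges, and the scaling factor 1/p² is an assumption derived from the fact that a triangle is seen when its last edge arrives only if both of its earlier edges were sampled, each with probability p.

```python
import random
from collections import defaultdict


def mascot_sketch(edge_stream, p, seed=None):
    """Estimate local triangle counts from a stream of undirected edges.

    Assumptions (not taken from the abstract): edges are distinct, and each
    triangle found is weighted by 1 / p**2 to compensate for sampling.
    """
    if not 0.0 < p <= 1.0:
        raise ValueError("p must be in (0, 1]")
    rng = random.Random(seed)
    sampled = defaultdict(set)     # adjacency sets of the sampled edges only
    estimate = defaultdict(float)  # node -> estimated number of triangles
    scale = 1.0 / (p * p)

    for u, v in edge_stream:
        if u == v:
            continue  # self-loops cannot form triangles
        # Unconditional counting: every arriving edge is checked against the
        # sampled graph, whether or not the edge itself ends up being stored.
        for w in sampled[u] & sampled[v]:
            estimate[u] += scale
            estimate[v] += scale
            estimate[w] += scale
        # Edge sampling: keep the new edge with probability p.
        if rng.random() < p:
            sampled[u].add(v)
            sampled[v].add(u)
    return dict(estimate)
```

With `p = 1.0` every edge is stored, so the sketch reduces to exact local triangle counting; smaller values of `p` trade accuracy for memory, since only about a fraction `p` of the edges is kept.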