Sliding Window Top-K Monitoring over Distributed Data Streams

Ben Chen, Zhijin Lv, Yang Liu, Xiaohui Yu

doi:10.1007/s41019-017-0053-1

Abstract

Most of the traditional top-k algorithms are based on a single-server setting. They may be highly inefficient and/or cause huge communication overhead when applied to a distributed system environment. Therefore, the problem of top-k monitoring in distributed environments has been intensively investigated recently. This paper studies how to monitor the top-k data objects with the largest aggregate numeric values from distributed data streams within a fixed-size monitoring window W, while minimizing communication cost across the network. We propose a novel algorithm, which adaptively reallocates numeric values of data objects among distributed nodes by assigning revision factors when local constraints are violated and keeps the local top-k result at distributed nodes in line with the global top-k result. We also develop a framework that combines a distributed data stream monitoring architecture with a sliding window model. Based on this framework, extensive experiments are conducted on top of Apache Storm to verify the efficiency and scalability of the proposed algorithm.
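The abstract does not spell out how revision factors or local constraints are defined, so the following is only a minimal sketch under stated assumptions. It assumes that each node adds a per-object revision factor to its local value, that the factors for any one object sum to zero across nodes, and that a node's local constraint holds when every object in the current top-k set has an adjusted value at least as large as every object outside it. The function name `violated_pairs` and its parameters are hypothetical.

```python
def violated_pairs(local_values, revision, top_k):
    """Return (top_object, other_object) pairs that break the local constraint.

    Assumption (not from the paper): the constraint at a node is
    adjusted[t] >= adjusted[o] for every t in top_k and every o outside it,
    where adjusted = local value + revision factor.
    """
    objects = set(local_values) | set(revision) | set(top_k)
    adjusted = {o: local_values.get(o, 0) + revision.get(o, 0) for o in objects}
    return sorted(
        (t, o)
        for t in top_k
        for o in objects
        if o not in top_k and adjusted[t] < adjusted[o]
    )


# One node's view: raw local values plus the revision factors assigned to it.
local_values = {"a": 5, "b": 1, "c": 4}
revision = {"b": 4, "c": -2}
print(violated_pairs(local_values, revision, {"a", "b"}))
```

Under these assumptions, a node needs to contact the coordinator only when the returned list is non-empty. As long as every node reports no violation, summing the adjusted values over all nodes gives back the true aggregates, so the global top-k set is unchanged.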

Highlights

The study of distributed top-k monitoring is significant in a variety of application scenarios, such as network monitoring, sensor data analysis, web usage logs, and market surveillance
We propose a novel algorithm, which adaptively reallocates numeric values of data objects among distributed nodes by assigning revision factors when local constraints are violated and keeps the local top-k result at distributed nodes in line with the global top-k result
We propose a novel algorithm for top-k monitoring over distributed data streams, which achieves a significant reduction in communication cost

Summary

Introduction

The study of distributed top-k monitoring is significant in a variety of application scenarios, such as network monitoring, sensor data analysis, web usage logs, and market surveillance. Consider a system that monitors a large network for distributed denial of service (DDoS) attacks. A DDoS attack may issue an unusually large number of Domain Name System (DNS) lookup requests to distributed DNS servers from a single IP address, so it is necessary to monitor DNS lookup requests for potentially suspicious behavior. In this case, the monitoring infrastructure continuously reports the top-k IP addresses with the largest number of requests across the distributed servers within a recent time window. Since requests arrive frequently and rapidly at the distributed DNS servers, forwarding all of them to a central location for processing is infeasible because it would incur huge communication overhead.
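To make the monitored quantity concrete, the sketch below computes the top-k IP addresses by total request count over a sliding window, using the naive centralized approach that the paragraph above describes as too expensive. It is a baseline for illustration only, not the paper's algorithm. The class `SlidingWindowCounter`, the function `global_top_k`, and the integer timestamps are assumptions made for this example.

```python
import heapq
from collections import Counter, deque


class SlidingWindowCounter:
    """Per-node request counts over the last `window` time units (hypothetical helper)."""

    def __init__(self, window):
        self.window = window
        self.events = deque()      # (timestamp, ip) in arrival order
        self.counts = Counter()    # ip -> number of requests still in the window

    def add(self, timestamp, ip):
        self.events.append((timestamp, ip))
        self.counts[ip] += 1
        self.expire(timestamp)

    def expire(self, now):
        # Drop events whose timestamp is at or before now - window.
        while self.events and self.events[0][0] <= now - self.window:
            _, old_ip = self.events.popleft()
            self.counts[old_ip] -= 1
            if self.counts[old_ip] == 0:
                del self.counts[old_ip]


def global_top_k(nodes, k, now):
    """Naive baseline: ship every node's full counts to one place and merge them."""
    total = Counter()
    for node in nodes:
        node.expire(now)
        total.update(node.counts)
    # Ties on count are broken by the IP string, purely for determinism.
    return heapq.nlargest(k, total.items(), key=lambda kv: (kv[1], kv[0]))


node_a = SlidingWindowCounter(window=10)
node_b = SlidingWindowCounter(window=10)
for ts, ip in [(1, "10.0.0.1"), (2, "10.0.0.1"), (3, "10.0.0.2")]:
    node_a.add(ts, ip)
for ts, ip in [(2, "10.0.0.1"), (4, "10.0.0.3"), (5, "10.0.0.3")]:
    node_b.add(ts, ip)

print(global_top_k([node_a, node_b], k=2, now=5))
```

Every call to `global_top_k` transfers each node's complete count table, which is the communication cost a distributed monitoring scheme tries to avoid.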


Journal: Data Science and Engineering	Publication Date: Nov 15, 2017
Citations: 7	License type: open-access


Similar Papers

Top-k query processing over uncertain data in distributed environments
Yongjiao Sun ... Ye Yuan
World Wide Web | VOL. 15
23 Aug 2011

Sketch-based querying of distributed sliding-window data streams
Odysseas Papapetrou ... Antonios Deligiannakis
Proceedings of the VLDB Endowment | VOL. 5
01 Jun 2012

Sketching distributed sliding-window data streams
Odysseas Papapetrou ... Minos Garofalakis
The VLDB Journal: The International Journal on Very Large Data Bases | VOL. 24
10 Mar 2015
