Abstract

Low latency and high throughput are two of the most critical performance requirements for big data stream computing systems. As multi-source high-speed data streams arrive in real time, it is essential to study latency-aware and resource-aware scheduling to reduce latency and increase throughput. In this paper, we propose a latency- and resource-aware scheduling framework (Lr-Stream) targeting stream-oriented big data applications. Our key contributions can be summarized as follows: (1) a stream topology model and a resource scheduling model for Lr-Stream are proposed, aimed at optimizing latency and throughput; (2) a latency-aware scheduling strategy and a resource-aware scheduling strategy are proposed; (3) Lr-Stream, together with its monitor, calculator, and deployment function modules, is implemented and integrated into Apache Storm; (4) system metrics are thoroughly evaluated from latency- and resource-aware perspectives on a typical distributed stream computing platform. Experimental results demonstrate that the proposed Lr-Stream yields significant performance improvements in terms of reducing system latency and increasing system throughput.
