Abstract

This paper is motivated by the need of deadline-bounded applications in live mobile network environments for a guaranteed and appropriate share of the input and output (I/O) data rate. At present, however, data processing frameworks only support requests for memory and computing capacity. We propose a solution that enables control of disk I/O and network I/O for data processing applications in the YARN and Mesos frameworks. Experimental results show that our tool can provision I/O data rate shares among competing data processing applications.

Highlights

  • When an application submits a job, a data processing framework such as Apache Hadoop [1, 26], Hadoop YARN [25], or Mesos [2] reserves and allocates the computing resources necessary to execute the job

  • Since a computing cluster can be built up of heterogeneous hardware and software components, the input and output (I/O) data rate perceived by applications is unpredictable due to the contention for the resources of the physical servers, and there is a need to monitor applications running on these platforms as well [12]

  • We demonstrate that the proposed functionalities can be integrated into two popular data processing frameworks, Mesos and YARN, to control the I/O data rates of applications, which may ease the burden on service providers of integrating schedulers into existing frameworks


Summary

Introduction

When an application submits a job, a data processing framework such as Apache Hadoop [1, 26], Hadoop YARN [25], or Mesos [2] reserves and allocates the computing resources necessary to execute the job. Since a computing cluster can be built from heterogeneous hardware and software components, the I/O data rate perceived by applications is unpredictable due to contention for the resources of the physical servers, and applications running on these platforms also need to be monitored [12]. Telecommunication operators now often use such frameworks to regularly process big data sets with specific deadlines in their computing clusters. When applications compete for a resource at the hardware and network level, which is hidden from programmers (and from applications), they may suffer I/O performance degradation.

