Abstract

A distributed file system (DFS) is a core component to implement big data applications. On the one hand, a DFS is capable of managing a large volume of data with desirable properties that strike the balance between high availability, reliability, and so on. On the other hand, a DFS relies on underlying storage systems (e.g., hard drives, solid state drives, etc.) and suffer from slow read/write operations. In big data era, large-scale data processing applications start to leverage the in-memory processing to improve the performance by reducing the inhibitive cost of I/O operations. However, it is still inevitable to read input data from or write outputs to the storage system. Slow I/O operations are often the main bottleneck of emerging big data applications. In particular, while these applications often use DFSs to store their results for the high availability and reliability, the unmanaged I/O bandwidth contention results in the QoS violation of high priority applications when multiple applications share the same DFS. To enable I/O management and allocation on big-data platforms, we propose a Cluster-Level I/O Bandwidth Enforcement (CLIBE) approach that consists of a cluster-level I/O bandwidth quota manager, multiple node-level I/O bandwidth controllers, and a feedback-based quota reallocator. The quota manager splits and distributes the I/O bandwidth quota of an application to the active nodes that are serving this application. The bandwidth controller on a node ensures that the I/O bandwidth used by an application would not exceed its bandwidth quota on the node. For an application affected by slow or overloaded nodes, the quota reallocator reallocates the idle I/O bandwidth on underloaded nodes to this application to guarantee its throughput. Our experiment on a real-system cluster shows that CLIBE is able to precisely control the I/O bandwidth used by an application at the cluster level, with the deviation smaller than 2.51%.

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call

Disclaimer: All third-party content on this website/platform is and will remain the property of their respective owners and is provided on "as is" basis without any warranties, express or implied. Use of third-party content does not indicate any affiliation, sponsorship with or endorsement by them. Any references to third-party content is to identify the corresponding services and shall be considered fair use under The CopyrightLaw.