Customized Filesystem with Dynamic Stripe Strategies on Lustre-Based Hadoop

Hongbo Li, Nong Xiao, Zhiguang Chen, Yuxuan Xing, Yutong Lu

doi:10.1007/978-981-10-6442-5_52

Abstract

Large-scale data is growing so quickly that the traditional big data processing framework Hadoop has hit a bottleneck at its data storage layer. Running Hadoop on modern HPC clusters has attracted much attention because of its data processing and analysis capabilities. The Lustre file system is a promising parallel file system that has occupied the HPC file system market for many years. A Lustre-based Hadoop platform therefore brings many new opportunities and challenges in today's data era. In this paper, we customize a LustreFileSystem class, which inherits from the FileSystem class in the Hadoop source code, to build our Lustre-based Hadoop. To make full use of the high performance of the Lustre file system, we propose a novel dynamic stripe strategy that optimizes the stripe size while data is written to Lustre. Our results indicate that, compared with the original Hadoop, throughput (MB/s) improves by about 3x for writes and 11x for reads, and the average I/O rate (MB/s) improves by at least 3x. In addition, compared with existing Lustre-based Hadoop, our dynamic stripe strategy smooths read operations and slightly improves write performance.
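
The abstract does not say how the stripe size is chosen, so the following is only a minimal sketch of the general idea of sizing stripes per file, not the authors' strategy. It assumes a hypothetical heuristic that grows the stripe size with the expected file size, and it only builds an `lfs setstripe` command line without running it. The function names, the 1 MiB and 64 MiB bounds, and the example path are all assumptions made for illustration.

```python
# Hypothetical heuristic, NOT the strategy from the paper: pick a Lustre
# stripe size from the expected file size, then build the matching
# `lfs setstripe` command line (the command is only built, never executed).

MIB = 1024 * 1024


def choose_stripe_size(file_size_bytes, stripe_count,
                       min_stripe=1 * MIB, max_stripe=64 * MIB):
    """Return a stripe size in bytes (a power-of-two multiple of 1 MiB)."""
    if file_size_bytes < 0 or stripe_count < 1:
        raise ValueError("file size must be >= 0 and stripe count >= 1")
    # Each storage target receives roughly this many bytes of the file.
    per_target = file_size_bytes // stripe_count
    size = min_stripe
    # Double the stripe size while it still fits in each target's share
    # and stays within the assumed upper bound.
    while size * 2 <= per_target and size * 2 <= max_stripe:
        size *= 2
    return size


def setstripe_command(path, file_size_bytes, stripe_count):
    """Build an `lfs setstripe` argument list for the chosen stripe size."""
    size = choose_stripe_size(file_size_bytes, stripe_count)
    return ["lfs", "setstripe", "-S", f"{size // MIB}M",
            "-c", str(stripe_count), path]
```

Under these assumptions, a small file keeps the minimum stripe size, while a large file is capped at the upper bound, so the layout is decided per file before the first write rather than fixed for the whole file system.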
