Abstract

An enormous volume of information is generated every second by digital transactions and applications, and many frameworks and tools are available to organize and analyze this Big Data. The first generation of Hadoop, introduced in 2006, provided distributed storage and the MapReduce programming model. Because of the scalability and resource-sharing limitations of Hadoop MapReduce 1, the open source community introduced the next generation of MapReduce, called Yet Another Resource Negotiator (YARN). YARN is a generic resource-management platform that allocates cluster resources to the applications running across a Hadoop cluster. It offers superior resource management, multi-tenancy, and linear-scale storage. The schedulers in YARN play a vital role: they control the order of job execution and assign resources to jobs according to their resource requests. This paper is a detailed study of YARN and of its default and pluggable schedulers. It discusses the job scheduling issues associated with each scheduler and compares the scheduling improvements of Hadoop YARN against Hadoop 1. It also describes the advances in data locality that YARN makes over MapReduce 1.
