Abstract

With the emergence of big data, data scheduling is becoming an important research field in distributed computing. Software data schedulers often rely on user-defined data management policies and provide high-level features. Such advanced features are now necessary to execute data-intensive applications, which implies that data and task schedulers should cooperate closely to address the problem of processing large data sets and ensure an optimal distribution of data-intensive applications. In this paper, we propose XtremDew, a cooperative data and task scheduling platform. We address the large-scale distribution of optical character recognition (OCR) and show, in particular, the benefit of focusing on data scheduling when distributing our OCR application. We build this data-driven distribution platform by combining two existing middleware systems: BitDew as the data scheduler and XtremWeb-HEP as the task scheduler. By taking advantage of both, XtremDew provides new features. To evaluate the efficiency of our approach, we compare different strategies for scheduling tasks and data, and we present several scenarios that illustrate the benefits of using XtremDew to execute data-intensive applications.

