Abstract

Data grids face two major challenges: placing large volumes of data and scheduling jobs. The strategy proposed here builds each of the two on the other in order to offer high availability of storage space. The aim is to reduce access latency and improve the use of resources such as network bandwidth, storage, and computing power. Combining the two in a dynamic replica placement and job scheduling strategy, called ClusOptimizer, which uses MapReduce-driven clustering to place replicas, appears to be an appropriate answer to these needs, since it allows the data to be distributed over all the machines of the platform. The major factors that can influence its efficiency are the mean job execution time, the use of storage resources, and the number of active sites. A comparative study between strategies is then performed to show the importance of the solution for replica placement according to job frequency and database size in the case of biological data.
