Abstract
MapReduce is a promising approach to supporting data-intensive applications on Volunteer Computing systems. Existing middleware such as BitDew allows MapReduce applications to run in a Desktop Grid environment. When a Desktop Grid is deployed over the Internet under the Volunteer Computing paradigm, it harnesses untrusted, volatile and heterogeneous resources, and the results produced by MapReduce applications can be subject to sabotage. Implementing large-scale MapReduce in this setting presents significant challenges with respect to the state of the art in Desktop Grids. A key issue is the design of result certification, the operation that verifies that malicious volunteers have not tampered with the results of computations. Because the volume of data produced and processed is too large to be sent back to the server, result certification cannot be centralized as it currently is in Desktop Grid systems. In this paper we present a distributed result checker based on the Majority Voting method. We evaluate the efficiency of our approach using a model that characterizes errors and sabotage in the MapReduce paradigm. With this model, we can compute the aggregated probability with which a MapReduce implementation produces an erroneous result. The challenge is to capture this aggregated probability for the entire system, which is composed of the probabilities arising from the two phases of computation: Map and Reduce. We provide a detailed analysis of the performance of the result verification method and discuss the overhead generated by managing security. We also give guidelines on how the result verification phase should be configured for a given MapReduce application.
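To make the idea of an aggregated error probability concrete, the following is a minimal sketch and not the model developed in the paper. It assumes, purely for illustration, that each replica of a task is faulty independently with probability f, that faulty replicas all return the same wrong value, that every Map and Reduce task is replicated r times and decided by strict majority, and that the job is erroneous if any single task is. The names `majority_error` and `job_error` and all parameters are hypothetical.

```python
from math import comb


def majority_error(f: float, r: int) -> float:
    """Probability that a strict majority of r replicas is faulty.

    Assumption: each replica is faulty independently with probability f,
    and all faulty replicas agree on the same wrong result.
    """
    need = r // 2 + 1  # smallest number of faulty replicas that wins the vote
    return sum(comb(r, k) * f**k * (1 - f) ** (r - k) for k in range(need, r + 1))


def job_error(f: float, r: int, n_map: int, n_reduce: int) -> float:
    """Probability that at least one Map or Reduce task is wrongly certified.

    Assumption: tasks fail independently and both phases use the same
    replication factor r.
    """
    e = majority_error(f, r)
    return 1 - (1 - e) ** (n_map + n_reduce)


if __name__ == "__main__":
    # Hypothetical configuration: 10% faulty replicas, 100 Map and 10 Reduce tasks.
    for r in (1, 3, 5):
        print(r, job_error(0.1, r, n_map=100, n_reduce=10))
```

Under these simplifying assumptions, raising the replication factor r lowers the per-task error at the cost of proportionally more computation, which is the trade-off the configuration guidelines address.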