Abstract

Security and privacy are major concerns on cloud computing platforms, where users risk leakage of their private data. Leakage can occur while data is at rest (in storage), in processing, or in transit within a cloud or between cloud infrastructures, e.g., from a private to a public cloud. This paper focuses on protecting data "in processing". For big data applications, the MapReduce framework has proven to be an efficient solution and has been widely deployed, e.g., in healthcare and business data analysis. We propose a trust-based framework for MapReduce in big data processing tasks. Specifically, we first quantify and assign sensitivity values to data and trust values to map and reduce slots. We then compute the trust value of each resource employed in a big data processing task. Each task requires a level of trust determined by the sensitivity of its data (i.e., more sensitive data requires servers/slots with a higher trust level). The MapReduce scheduling problem is then formulated as a maximum weighted matching problem on a bipartite graph, which aims to maximize the total trust value over all possible assignments subject to the trust requirements of the individual tasks. The problem is known to be NP-hard. To tackle it, we observe that, within a computing node (VM), all slots share the same trust value, granted by the secured transformation phase. This reduces the number of slot nodes in the weighted bipartite graph. Leveraging this fact, we propose an efficient heuristic algorithm that achieves 94.7 percent of the optimal solution obtained via exhaustive search. Extensive simulations show that the trust-based scheduling scheme provides much higher protection for sensitive data while ensuring good performance for big data applications.
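
The scheduling idea can be illustrated with a minimal sketch. This is not the heuristic proposed in the paper, whose details the abstract does not give; it is a simple greedy stand-in that only shows two ingredients named above: slots are grouped per VM because they share one trust value, and a task may only be placed on a slot whose trust meets its requirement. The `VM` record, the function name `greedy_trust_schedule`, and the integer 0 to 100 trust scale are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class VM:
    name: str
    trust: int        # trust value shared by every slot on this VM (assumed 0-100 scale)
    free_slots: int   # number of unassigned map/reduce slots on this VM


def greedy_trust_schedule(tasks, vms):
    """Greedy illustration of trust-constrained task placement.

    tasks: dict mapping task id -> required trust level (hypothetical scale)
    vms:   list of VM records; slots are grouped per VM since they share trust
    Returns (dict task id -> VM name, total trust of the used slots).
    Tasks whose requirement no free slot can meet are left unassigned.
    """
    remaining = {vm.name: vm.free_slots for vm in vms}
    by_trust = sorted(vms, key=lambda vm: vm.trust, reverse=True)
    assignment = {}
    total = 0
    # Serve the most demanding tasks first so that high-trust slots are
    # not consumed by tasks that could have run on lower-trust VMs.
    for task, required in sorted(tasks.items(), key=lambda kv: kv[1], reverse=True):
        for vm in by_trust:
            if remaining[vm.name] > 0 and vm.trust >= required:
                assignment[task] = vm.name
                remaining[vm.name] -= 1
                total += vm.trust
                break
    return assignment, total


# Usage with made-up values: three tasks, three VMs with four slots in total.
tasks = {"t1": 90, "t2": 50, "t3": 20}
vms = [VM("vm-a", 95, 1), VM("vm-b", 60, 1), VM("vm-c", 30, 2)]
assignment, total = greedy_trust_schedule(tasks, vms)
print(assignment, total)  # {'t1': 'vm-a', 't2': 'vm-b', 't3': 'vm-c'} 185
```

Grouping slots by VM means the inner loop runs over VMs rather than individual slots, which mirrors the reduction in slot nodes described in the abstract. A greedy pass like this gives no optimality guarantee; it is only meant to make the constraint structure concrete.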
