Abstract

Electronic mail (email), or paperless mail, is becoming the most widely accepted, fastest and cheapest way of sharing formal and informal information between users. Around 500 billion mails are sent each day, and the count is expected to keep increasing. Today, even sensitive and private information is shared through email, making it a primary target for attackers. In addition, companies that run their own mail servers rely on cloud systems to store mail at lower cost and with less maintenance. This affects the privacy of users, because their search patterns are visible to the cloud provider. To remedy this, a secure architecture is needed for storing emails and retrieving them in response to user queries. The data, as well as the queries and the computations used to retrieve the relevant mails, should be hidden from the third party. This article proposes a modified homomorphic encryption (MHE) technique to secure the mails. MHE makes homomorphic encryption practical, and incorporating the MapReduce parallel programming model substantially reduces the execution time. Well-known information retrieval techniques, such as the vector space model and Term Frequency – Inverse Document Frequency (TF-IDF), are used to find the mails relevant to a query. Analysis on the dataset shows that the method is efficient in terms of execution time and in ensuring the security of the data and the privacy of the users.

Highlights

  • Data is growing at an enormous rate, and cloud computing has paved the way for economical and easy storage of Big Data

  • Term Frequency – Inverse Document Frequency is a statistical measure used to evaluate the importance of a word to a document in a corpus

  • Term Frequency is the number of occurrences of each word in a document, and Inverse Document Frequency reflects the importance of a word across the entire corpus
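
The two quantities in the highlights above can be illustrated with a minimal plaintext sketch. It assumes one common TF-IDF variant (term frequency normalized by document length, IDF as the natural log of N/df) and whitespace tokenization. The function name `tf_idf` and the sample mails are hypothetical, and the paper's own weighting and its encrypted computation may differ.

```python
import math
from collections import Counter

def tf_idf(corpus):
    """Return one {term: weight} dict per document (plaintext, unencrypted)."""
    tokenized = [doc.lower().split() for doc in corpus]
    n_docs = len(tokenized)
    # Document frequency: number of documents that contain each term.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    weights = []
    for tokens in tokenized:
        counts = Counter(tokens)
        # TF = count / document length; IDF = ln(N / df). Assumed variant.
        weights.append({
            term: (count / len(tokens)) * math.log(n_docs / df[term])
            for term, count in counts.items()
        })
    return weights

# Hypothetical sample mails, for illustration only.
mails = [
    "meeting schedule for monday",
    "budget meeting notes",
    "holiday schedule",
]
weights = tf_idf(mails)
```

In this sketch a word that appears in only one mail ("monday") receives a higher weight than one shared across mails ("meeting"), and a word present in every mail receives a weight of zero.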


Summary

A Technique for Privacy Preserving Big Data Search

Division of Computer Science, School of Engineering, Cochin University of Science & Technology, CUSAT, Kochi, Kerala, India.

Companies that run their own mail servers rely on cloud systems to store mail at lower cost and with less maintenance. This affects the privacy of users, because their search patterns are visible to the cloud provider. A secure architecture is needed for storing emails and retrieving them in response to user queries. This article proposes a modified homomorphic encryption (MHE) technique to secure the mails. Well-known information retrieval techniques, such as the vector space model and Term Frequency – Inverse Document Frequency (TF-IDF), are used to find the mails relevant to a query.
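
As a rough illustration of how a vector space model ranks mails against a query, the sketch below scores each mail by cosine similarity between term-count vectors. This is a plaintext example under stated assumptions: cosine similarity, raw term counts rather than TF-IDF weights, and whitespace tokenization are all choices made here for brevity. The names `cosine` and `rank` and the sample mails are hypothetical. It does not reproduce the MHE scheme, which performs retrieval without exposing data or queries.

```python
import math
from collections import Counter

def cosine(u, v):
    """Cosine similarity between two sparse {term: weight} vectors."""
    dot = sum(w * v.get(term, 0.0) for term, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)

def rank(query, mails):
    """Return indices of matching mails, most similar to the query first."""
    q = Counter(query.lower().split())
    scored = [
        (cosine(q, Counter(mail.lower().split())), index)
        for index, mail in enumerate(mails)
    ]
    # Sort by descending score; ties keep the original mail order.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [index for score, index in scored if score > 0]

# Hypothetical sample mails, for illustration only.
mails = [
    "meeting schedule for monday",
    "budget meeting notes",
    "holiday schedule",
]
ranking = rank("budget meeting", mails)
```

Here the second mail shares both query terms and ranks first, the first mail shares one term and ranks second, and the third mail shares none and is omitted.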

INTRODUCTION
RELATED WORK
Vector Space Model
BACKGROUND
TF-IDF Calculation
Proof of Correctness for MHE Algorithm
SYSTEM DESIGN
Secure Mail Storage for Secure Retrieval
Secure and Ranked Mail Retrieval
Improving the Ranking of Mails
SECURITY ANALYSIS
Security of the Homomorphic Encryption Scheme used for Securing the Index
Security of the Index Creation and Information Retrieval Scheme
ACCELERATING MHE IMPLEMENTATION USING MAP REDUCE
Dataset
Performance Analysis
EXPERIMENTAL SETUP AND EVALUATIONS
CONCLUSION AND FUTURE WORKS
Communication Overhead
