Enhancing Cloud Security by Performing Deduplication Using Serial Cascaded Autoencoder With GRU and Optimal Key‐Based Data Sanitization

Abstract
De-duplication is critically important for cloud computing because it permits the detection of repeated data within the cloud system with fewer resources and lower costs. De-duplication removes redundant data from cloud data centers and helps identify the appropriate owner of stored material: a single stored copy of each piece of data may be owned by many cloud users. Prior deduplication models do not handle the dynamic nature of cloud resources, and they require considerable computing power to accurately determine whether duplicate files exist in the cloud system. In addition, earlier models split files into chunks to identify similar files, which degrades data quality. To overcome these difficulties, an adaptive deep learning-based data deduplication model is developed using an optimization algorithm. The main innovation of the proposed research is to rapidly detect duplicate records in cloud data while providing high-level security and maintaining the operational efficiency of the cloud system. The proposed model acts as an efficient attack-resistance system and ensures data availability more rapidly. Deduplication here involves detecting and examining patterns within records to precisely identify and eliminate repeated, identical information. The data corresponding to the input pattern is therefore given to the Serial Cascaded Autoencoder with Gated Recurrent Unit (SCA-GRU) for the deduplication process. After deduplication, redundant data is removed so that resources are consumed only to store unique data. To maintain data security, an optimal key-based data sanitization process is performed, in which the key is generated with the aid of a Mutated Fitness-Based Krill Herd Optimization Algorithm (MF-KHO).
This encoded data is then safely kept in the cloud, which protects it from illegal access and potential security breaches. The outcome of the suggested approach is validated against previous data deduplication systems to show the efficiency of the developed model. The experimental results show that the recommended deduplication approach reaches an accuracy of 95.37%. Through efficient data deduplication, the storage requirement is greatly reduced, which facilitates cost reduction and resource optimization within the cloud system and substantially improves storage capacity utilization.
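The SCA-GRU detector and MF-KHO sanitization described above are learned components specific to the paper; the underlying storage-side idea, keeping a single copy of identical data while tracking every owner, can be illustrated with a plain content-hash sketch (the `DedupStore` class and its methods are illustrative, not part of the paper):

```python
import hashlib

class DedupStore:
    """Toy content-addressed store: identical data is kept only once,
    while every owner of that data is still tracked."""
    def __init__(self):
        self.blocks = {}   # digest -> stored data (one copy per unique content)
        self.owners = {}   # digest -> set of owner ids sharing that copy

    def put(self, owner: str, data: bytes) -> str:
        digest = hashlib.sha256(data).hexdigest()
        if digest not in self.blocks:      # first copy: actually store it
            self.blocks[digest] = data
        self.owners.setdefault(digest, set()).add(owner)  # record ownership
        return digest

store = DedupStore()
d1 = store.put("alice", b"report-2024")
d2 = store.put("bob", b"report-2024")      # duplicate: no new block stored
assert d1 == d2 and len(store.blocks) == 1
```

Hash-based detection like this is exact-match only; the paper's learned model targets duplicate *records* and pairs detection with key-based sanitization, which a content hash alone does not provide.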

Similar Papers
  • Research Article
  • Cited by 33
  • 10.1016/j.jksuci.2020.10.015
A multi-objective privacy preservation model for cloud security using hybrid Jaya-based shark smell optimization
  • Oct 22, 2020
  • Journal of King Saud University - Computer and Information Sciences
  • Danish Ahamad + 2 more

  • Research Article
  • Cited by 59
  • 10.1016/j.jksuci.2018.04.007
Multilevel thresholding for image segmentation using Krill Herd Optimization algorithm
  • Apr 18, 2018
  • Journal of King Saud University - Computer and Information Sciences
  • K.P Baby Resma + 1 more

  • Research Article
  • Cited by 47
  • 10.1109/jsen.2020.3027778
Cyber Attack Detection Process in Sensor of DC Micro-Grids Under Electric Vehicle Based on Hilbert–Huang Transform and Deep Learning
  • Sep 29, 2020
  • IEEE Sensors Journal
  • Hao Cui + 5 more

In this article, a new procedure based on the Hilbert-Huang Transform and deep learning is proposed for detecting cyber-attacks in direct current (DC) micro-grids (MGs), including attacks on distributed generation (DG) units and their sensors. An advanced selective ensemble deep learning method with the Krill Herd Optimization (KHO) algorithm is proposed. First, the Hilbert-Huang Transform is used to extract signal features. Second, these features are fed to multiple deep base models built to automatically capture salient traits from the raw fluctuation signals. Third, to ensure diversity among the base models, a linear decoder, a denoising autoencoder, and a sparse autoencoder are used to construct different deep autoencoders. Fourth, Bootstrap sampling is applied to create separate training data subsets for each base model. Fifth, to implement selective ensemble learning, a combination strategy of enhanced weighted voting (EWV) with class-specific thresholds is studied. Finally, the KHO algorithm is applied to adaptively select the optimal class-specific thresholds. In the proposed approach, a DC micro-grid is first operated and controlled without any false data injection attacks (FDIAs) to collect sufficient normal-operation data for training the deep learning networks. Notably, during data generation, load variation is also considered so that distinct datasets are available for cyber-attack scenarios and load variations. To make the method more realistic, a smart plug-in electric vehicle is also included in the model. Simulation results in various scenarios verify the benefits of the proposed procedure.
The results show that the proposed procedure can recognize various types of false data injection attacks more accurately and robustly, with a true detection rate above 93.76%.
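The weighted-voting step with class-specific thresholds can be sketched roughly as follows; the exact EWV rule and the KHO-tuned thresholds in the paper may differ, and all names here are illustrative:

```python
import numpy as np

def weighted_vote(probs, weights, thresholds):
    """probs: (n_models, n_classes) per-class scores from each base model.
    weights: (n_models,) base-model weights.
    thresholds: (n_classes,) class-specific cutoffs (in the paper these
    would be tuned by the KHO algorithm)."""
    combined = weights @ probs / weights.sum()   # weighted average score per class
    passing = combined >= thresholds             # classes that clear their threshold
    if not passing.any():
        return int(combined.argmax())            # fallback: best raw score
    # among classes that pass, pick the one with the largest margin
    margin = np.where(passing, combined - thresholds, -np.inf)
    return int(margin.argmax())

probs = np.array([[0.7, 0.3], [0.6, 0.4], [0.2, 0.8]])  # three models, two classes
weights = np.array([1.0, 1.0, 2.0])
thresholds = np.array([0.6, 0.5])
label = weighted_vote(probs, weights, thresholds)
```

Class-specific thresholds let an imbalanced attack class be flagged at a lower score than the normal class, which a single global cutoff cannot express.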

  • Research Article
  • Cited by 24
  • 10.1016/j.biosystems.2020.104211
Deep learning based genome analysis and NGS-RNA LL identification with a novel hybrid model
  • Aug 11, 2020
  • Biosystems
  • Madhumitha Ramamurthy + 3 more

  • Conference Article
  • Cited by 5
  • 10.1109/iemcon.2019.8936222
Secure Textual Data Deduplication Scheme Based on Data Encoding and Compression
  • Oct 1, 2019
  • Ali Miri + 1 more

As the need for storage has grown exponentially in recent years, cloud storage has met this need by providing users with expanded capacity and access. Providing adequate security and privacy while lowering storage costs are among the key challenges facing this solution. A common practice used by cloud service providers (CSPs), data deduplication, identifies identical copies of users' data and removes all but one copy to lower the required storage overhead. However, this can raise serious privacy concerns. In this paper, we formulate a new secure deduplication scheme for textual data. Our proposed method uses data encoding and compression techniques that not only reduce the required storage space but also save transmission bandwidth. Security of the data against a semi-honest CSP and malicious users is ensured by using the Burrows-Wheeler Transform encoding scheme. The encoded data is further compressed to gain effective savings in storage and reduced data size. Data encoding and data compression are combined to realize secure and efficient data deduplication. Through our scheme, the CSP not only achieves large storage savings through data compression and deduplication, but can also provide users a satisfactory level of security for their data in the cloud.
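The Burrows-Wheeler Transform at the core of the encoding step can be sketched naively as follows; the paper's full encoding and compression pipeline is more involved, and this quadratic rotation-sort version is for illustration only:

```python
def bwt_encode(s: str, eos: str = "\x00") -> str:
    """Naive Burrows-Wheeler transform: sort all rotations of s+eos and
    take the last column. The output groups repeated characters together,
    which makes the data far more compressible."""
    s += eos
    rotations = sorted(s[i:] + s[:i] for i in range(len(s)))
    return "".join(r[-1] for r in rotations)

def bwt_decode(t: str, eos: str = "\x00") -> str:
    """Invert the transform by iteratively rebuilding the sorted rotation
    table one column at a time."""
    table = [""] * len(t)
    for _ in range(len(t)):
        table = sorted(c + row for c, row in zip(t, table))
    return next(row for row in table if row.endswith(eos))[:-1]

assert bwt_decode(bwt_encode("banana")) == "banana"
```

Production implementations build the transform from a suffix array in O(n log n) rather than materializing every rotation.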

  • Conference Article
  • Cited by 4
  • 10.1109/iceca.2018.8474932
Improving the Performance of System in Cloud by Using Selective Deduplication
  • Mar 1, 2018
  • Nishant N Pachpor + 1 more

Cloud computing is very popular today because of its large data storage capacity and fast data access over the network. However, storing and accessing data in the cloud raises several issues: data theft, data loss, privacy, infected applications, data location, vendor-level and user-level security, and data duplication. A recent study estimates that about 7 zettabytes (ZB) of data exist across different storage locations, and within 5 years this volume is expected to grow roughly fivefold. For better system performance, different data deduplication methods are used, such as selective, performance-oriented data deduplication. In this paper, we propose removing data redundancy from available offline or online data storage while also providing data security, which helps improve system performance. Deleting duplicate data from a file automatically reduces its size, which in turn reduces traffic on the network.

  • Research Article
  • Cited by 9
  • 10.3233/ida-150798
Fuzzy Krill Herd (FKH): An improved optimization algorithm
  • Jan 18, 2016
  • Intelligent Data Analysis
  • Edris Fattahi + 2 more

The Krill Herd (KH) optimization algorithm was recently proposed, based on the herding behavior of krill individuals in nature, for solving optimization problems. In this paper, we develop the Standard Krill Herd (SKH) algorithm and propose the Fuzzy Krill Herd (FKH) optimization algorithm, which dynamically adjusts the balance between exploration and exploitation by observing the progress of the solution at each step. To evaluate the proposed FKH algorithm, we use standard benchmark functions as well as an inventory control problem. Experimental results indicate the superiority of the proposed FKH optimization algorithm over the standard KH optimization algorithm.
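The exploration/exploitation trade-off that FKH adjusts can be illustrated with a deliberately minimal population-search loop. This is not the full krill herd dynamics (induced motion, foraging, physical diffusion); it only shows the general shape, with a pull toward the best-known position (exploitation) and a decaying random term (exploration):

```python
import random

def kh_sketch(f, dim=2, pop=20, iters=200, seed=0):
    """Minimal population-search sketch in the spirit of Krill Herd:
    each individual drifts toward the best-known position while a random
    diffusion term, shrinking over time, keeps the search exploring."""
    rng = random.Random(seed)
    X = [[rng.uniform(-5, 5) for _ in range(dim)] for _ in range(pop)]
    best = list(min(X, key=f))                 # copy of the best individual
    for t in range(iters):
        w = 1 - t / iters                      # exploration weight decays to 0
        for x in X:
            for d in range(dim):
                x[d] += 0.1 * (best[d] - x[d]) + w * rng.uniform(-0.5, 0.5)
        cand = min(X, key=f)
        if f(cand) < f(best):
            best = list(cand)                  # copy so later motion can't corrupt it
    return best

best = kh_sketch(lambda x: sum(v * v for v in x))   # minimize the sphere function
```

FKH's contribution is precisely that the decay schedule above is not fixed but adjusted by a fuzzy controller watching the search progress.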

  • Research Article
  • Cited by 128
  • 10.1109/tdsc.2018.2791432
Providing Task Allocation and Secure Deduplication for Mobile Crowdsensing via Fog Computing
  • May 1, 2020
  • IEEE Transactions on Dependable and Secure Computing
  • Jianbing Ni + 4 more

Mobile crowdsensing enables a crowd of individuals to cooperatively collect data for special-interest customers using their mobile devices. The success of mobile crowdsensing largely depends on the participating mobile users. The broader the participation, the more sensing data are collected; nevertheless, more duplicate data may be generated, bringing unnecessarily heavy communication overhead. Hence it is critical to eliminate duplicate data to improve communication efficiency, a.k.a. data deduplication. Unfortunately, sensing data is usually protected, making its deduplication challenging. In this paper, we propose a fog-assisted mobile crowdsensing framework, enabling fog nodes to allocate tasks based on user mobility to improve the accuracy of task assignment. Further, a fog-assisted secure data deduplication scheme (Fo-SDD) is introduced to improve communication efficiency while guaranteeing data confidentiality. Specifically, a BLS-oblivious pseudo-random function is designed to enable fog nodes to detect and remove duplicate data in sensing reports without exposing the content of the reports. To protect the privacy of mobile users, we further extend Fo-SDD to hide users' identities during data collection. In doing so, a Chameleon hash function is leveraged to achieve contribution claim and reward retrieval for anonymous mobile users. Finally, we demonstrate that both schemes achieve secure and efficient data deduplication.
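The tag-based duplicate detection at the fog node can be illustrated in simplified form. The real Fo-SDD scheme uses a BLS-oblivious PRF so that no single party holds the tagging key; this sketch substitutes a plain HMAC and is only meant to show how equal tags reveal duplicates without revealing content:

```python
import hmac
import hashlib

def dedup_tag(key: bytes, report: bytes) -> str:
    """Deterministic keyed tag over report content: equal reports yield
    equal tags, so duplicates can be spotted by comparing tags alone."""
    return hmac.new(key, report, hashlib.sha256).hexdigest()

# Illustration only: in the real scheme the fog node never holds this key.
key = b"shared-prf-key"
seen = set()
kept = []
for report in [b"temp=21", b"temp=22", b"temp=21"]:
    t = dedup_tag(key, report)
    if t not in seen:          # duplicate tag -> drop without reading content
        seen.add(t)
        kept.append(t)
assert len(kept) == 2
```

A keyed tag (rather than a bare hash) prevents an outsider from confirming guesses about report contents by hashing candidate values.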

  • Preprint Article
  • 10.32920/ryerson.14668344
Privacy-preserving public auditing with data deduplication in cloud computing
  • May 24, 2021
  • Naelah Abdulrahman Alkhojandi

Storage represents one of the most commonly used cloud services. Data integrity and storage efficiency are two key requirements when storing users' data. Public auditability, where users can employ a Third Party Auditor (TPA) to ensure data integrity, and efficient data deduplication, which can be used to eliminate duplicate data and their corresponding authentication tags before sending the data to the cloud, offer possible solutions to address these requirements. In this thesis, we propose a privacy-preserving public auditing scheme with data deduplication. We also present an extension of our proposed scheme that enables the TPA to perform multiple auditing tasks at the same time. Our analytical and experimental results show the efficiency of batch auditing in reducing the number of pairing operations needed for the auditing. We then extend our work to support user revocation, for the case where a user wants to leave the enterprise.

  • Research Article
  • 10.46647/ijetms.2025.v09i02.110
Secure and Efficient Data Deduplication in Joint Cloud Storage
  • Jan 1, 2025
  • International Journal of Engineering Technology and Management Sciences
  • Poluka Venkata Hymasree + 1 more

Data deduplication can efficiently eliminate data redundancies in cloud storage and reduce the bandwidth requirement of users. However, most previous schemes depend on the help of a trusted key server (KS) and are vulnerable and limited because they suffer from information leakage, poor resistance to attacks, great computational overhead, etc. In particular, if the trusted KS fails, the whole system stops working, i.e., a single point of failure. In this paper, we propose a Secure and Efficient data Deduplication scheme (named SED) in a Joint Cloud storage system which provides global services via collaboration among various clouds. SED also supports dynamic data update and sharing without the help of the trusted KS. Moreover, SED can overcome the single point of failure that commonly occurs in classic cloud storage systems. According to the theoretical analyses, SED ensures semantic security in the random oracle model and has strong anti-attack ability, including brute-force attack resistance and collusion attack resistance. Besides, SED can effectively eliminate data redundancies with low computational complexity and low communication and storage overhead. The efficiency and functionality of SED improve usability on the client side. Finally, the comparative results show that the performance of our scheme is superior to that of existing schemes.

  • Book Chapter
  • Cited by 4
  • 10.1007/978-3-319-17040-4_3
Privacy-Preserving Public Auditing in Cloud Computing with Data Deduplication
  • Jan 1, 2015
  • Naelah Alkhojandi + 1 more

Storage represents one of the most commonly used cloud services. Data integrity and storage efficiency are two key requirements when storing users' data. Public auditability, where users can employ a Third Party Auditor (TPA) to ensure data integrity, and efficient data deduplication, which can be used to eliminate duplicate data and their corresponding authentication tags before sending the data to the cloud, offer possible solutions to address these requirements. In this paper, we propose a privacy-preserving public auditing scheme with data deduplication. We also present an extension of our proposed scheme that enables the TPA to perform multiple auditing tasks at the same time. Security and computational analyses for both cases are also presented.

  • Conference Article
  • Cited by 1
  • 10.1109/eiecs53707.2021.9587982
Secure Data Deduplication And Sharing Method Based On UMLE And CP-ABE
  • Sep 23, 2021
  • Chunbo Wang + 6 more

In the era of big data, more and more users store data in the cloud. Massive amounts of data have brought huge storage costs to cloud storage providers, and data deduplication technology has emerged in response. To protect the confidentiality of user data, it should be encrypted before being stored in the cloud; deduplication of encrypted data has therefore become a research hotspot. Cloud storage also provides users with data sharing services, and the sharing of encrypted data is another research hotspot. The combination of encrypted-data deduplication and sharing will inevitably become a future trend. The current best-performing updatable block-level message-locked encryption (UMLE) deduplication scheme does not support data sharing, and the performance of encrypted-data deduplication schemes that introduce data sharing is not as good as that of UMLE. This paper introduces the ciphertext-policy attribute-based encryption (CP-ABE) sharing mechanism on top of UMLE, applying CP-ABE to encrypt the master key generated by UMLE, to achieve secure and efficient data deduplication and sharing. We propose a permission verification method based on bilinear mapping and, under the security model defined in the security analysis phase, prove that this permission verification method is secure. Theoretical analysis and simulation results show that this scheme has more complete functionality and better performance than existing schemes, and that the proposed authorization verification method is also secure.

  • Research Article
  • 10.52783/jisem.v10i28s.4994
Achieving Enhanced Space Efficiency and Crash Resilience in Cloud-based Garbage Collection Systems for Optimized Resource Management
  • Mar 29, 2025
  • Journal of Information Systems Engineering and Management
  • Anushree Goud

For cloud-based apps to remain scalable and performant, effective resource management is essential. Garbage collection, a fundamental tool for managing underutilized resources, faces particular difficulties in cloud systems, including high storage costs, resource contention, and system resilience. To maximize resource use in cloud-based systems, this study proposes an enhanced garbage collection architecture that improves space efficiency and crash resilience. To minimize system downtime and lower memory and storage needs, our method incorporates adaptive garbage collection techniques such as object compaction, data deduplication, and incremental cleaning. To address crash resilience, we implement features such as fault-tolerant replication, transaction logging, and periodic checkpoints, guaranteeing quick recovery and data integrity in the event of failures. Thorough testing and analysis show that our proposed architecture delivers notable gains in resilience and space efficiency, resulting in lower memory and storage consumption and faster crash recovery. The study shows that our method offers a solid means to efficiently manage resources in large-scale, multi-tenant cloud applications, opening the door to more durable and reasonably priced cloud infrastructure.
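The checkpoint-plus-transaction-log recovery idea mentioned above can be sketched minimally; the `recover` function and the JSON log format are illustrative assumptions, not the paper's implementation:

```python
import json

def recover(checkpoint: dict, log: list[str]) -> dict:
    """Crash-recovery sketch: restore the last periodic checkpoint, then
    replay the transaction-log entries recorded after that checkpoint."""
    state = dict(checkpoint)         # start from the checkpointed state
    for entry in log:                # replay logged operations in order
        op = json.loads(entry)
        if op["type"] == "put":
            state[op["key"]] = op["value"]
        elif op["type"] == "del":
            state.pop(op["key"], None)
    return state

state = recover({"a": 1},
                ['{"type": "put", "key": "b", "value": 2}',
                 '{"type": "del", "key": "a"}'])
assert state == {"b": 2}
```

Periodic checkpoints bound how much of the log must be replayed, trading checkpoint I/O cost against recovery time.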

  • Research Article
  • 10.22632/ccs-2017-252-42
Modified Secure Data Deduplication Computing in Cloud based Environment
  • Aug 20, 2017
  • Circulation in Computer Science
  • X Alphonse Inbaraj + 1 more

Security has been a concern since the early days of computing, when a computer was isolated in a room and a threat could be posed only by malicious insiders. To support authorized data deduplication in cloud computing, data is encrypted before being outsourced. Data deduplication stores only a single copy of identical data in cloud storage and consumes little bandwidth. Third-party control generates a spectrum of concerns caused by the lack of transparency and limited user control. For example, a cloud provider may subcontract some resources from a third party whose level of trust is questionable. There are cases in which subcontractors failed to maintain customer data, and cases in which the third party was not a subcontractor but a hardware supplier and the loss of data was caused by poor-quality storage devices [12]. To overcome the problems of integrity and security, this paper makes a first attempt at applying data coloring and watermarking techniques to shared data objects, then applying a Merkle Hash Tree [11] to tighten access control for sensitive data in both private and public clouds.
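A Merkle hash tree, as referenced above, condenses many data blocks into a single root hash so that tampering with any block is detectable; a minimal sketch of computing the root:

```python
import hashlib

def merkle_root(blocks: list[bytes]) -> str:
    """Pairwise-hash the leaf digests up to a single root; changing any
    block changes the root, supporting integrity checks on shared data."""
    level = [hashlib.sha256(b).digest() for b in blocks]
    while len(level) > 1:
        if len(level) % 2:               # odd level: duplicate the last node
            level.append(level[-1])
        level = [hashlib.sha256(level[i] + level[i + 1]).digest()
                 for i in range(0, len(level), 2)]
    return level[0].hex()

root = merkle_root([b"a", b"b", b"c", b"d"])
tampered = merkle_root([b"a", b"b", b"x", b"d"])
assert root != tampered
```

Beyond whole-file integrity, a Merkle tree lets a verifier check one block with only O(log n) sibling hashes, which is what makes it useful for access control over large shared objects.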
