Abstract

In distributed storage systems, replication is commonly used to ensure system reliability and data availability. Random replication is widely used in cloud storage systems to prevent data loss. Copyset Replication (CR) is a replication strategy that makes a nearly optimal trade-off between the number of nodes across which data is scattered and the probability of data loss. Compared with random replication, CR greatly reduces the probability of data loss caused by node failures. However, because CR selects copysets at random, it cannot easily choose the best copyset for data with particular characteristics, such as computation or storage requirements. To address this problem, this paper proposes Optimal Copyset Replication (OCR), which selects the optimal copyset according to the specified data characteristics and the condition of the corresponding nodes. Combined with Cyberspace Mimicry Defense (CMD), we implemented OCR in a distributed object storage system and conducted experiments. When the amount of computation-type data reaches 300,000, the experimental results show that, compared with CR's random copyset selection, OCR reduces data processing time by nearly 10% by selecting the optimal copyset. By setting the relevant parameters, OCR can also keep the data distribution across nodes relatively uniform and avoid data skew.
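The difference between random and characteristic-aware copyset selection can be illustrated with a minimal sketch. It is not the OCR algorithm from the paper: the scoring function, the node metrics `cpu_free` and `disk_free`, the weights `w_cpu` and `w_disk`, and the function names are all hypothetical, chosen only to show the idea of ranking candidate copysets by node condition instead of picking one at random.

```python
import random

def score(copyset, stats, w_cpu, w_disk):
    # Hypothetical score: weighted sum of spare CPU and spare disk
    # fractions over all nodes in the copyset.
    return sum(w_cpu * stats[n]["cpu_free"] + w_disk * stats[n]["disk_free"]
               for n in copyset)

def pick_random(copysets, primary, rng=random):
    # CR-style baseline: choose at random among copysets containing the primary node.
    return rng.choice([c for c in copysets if primary in c])

def pick_optimal(copysets, primary, stats, w_cpu=1.0, w_disk=0.0):
    # Characteristic-aware choice: among copysets containing the primary node,
    # take the one with the highest score for the given weights.
    candidates = [c for c in copysets if primary in c]
    return max(candidates, key=lambda c: score(c, stats, w_cpu, w_disk))

# Illustrative data only (assumed, not from the paper).
copysets = [(1, 2, 3), (1, 4, 5), (2, 4, 6)]
stats = {
    1: {"cpu_free": 0.5, "disk_free": 0.5},
    2: {"cpu_free": 0.1, "disk_free": 0.9},
    3: {"cpu_free": 0.2, "disk_free": 0.8},
    4: {"cpu_free": 0.9, "disk_free": 0.2},
    5: {"cpu_free": 0.8, "disk_free": 0.1},
    6: {"cpu_free": 0.4, "disk_free": 0.4},
}

compute_choice = pick_optimal(copysets, 1, stats, w_cpu=1.0, w_disk=0.0)
storage_choice = pick_optimal(copysets, 1, stats, w_cpu=0.0, w_disk=1.0)
```

With these assumed numbers, computation-type data on primary node 1 is steered to the copyset whose nodes have the most spare CPU, while storage-type data goes to the copyset with the most spare disk. A real system would also need a term that penalizes already heavily loaded nodes in order to limit data skew.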
