A Strategy of Data Synchronization in Distributed System with Read Separating from Write

Jintao Gao, Wenjie Liu, Zhanhuai Li

doi:10.1051/jnwpu/20203810209

Abstract

Separating reads from writes is a strategy that NewSQL databases adopt to combine the advantages of traditional relational databases and NoSQL databases. Under this architecture, baseline data is split into multiple partitions stored on distributed physical nodes, while delta data is stored on a single transaction node. To reduce the load on the transaction node and improve query performance, delta data must be synchronized to the storage nodes. Current strategies trigger data synchronization per partition, so unchanged partitions also take part in synchronization, which consumes extra network bandwidth, local I/O, and storage space. To improve synchronization efficiency while reducing space consumption, a fine-grained data synchronization strategy is proposed. Its main ideas are: (1) fine-grained logical partitions are established on top of the original coarse-grained partitions, providing a more precise unit of synchronization; (2) a delta-sensing strategy is introduced, which records the mapping between changed partitions and their delta data; (3) instead of being partition driven, synchronization is driven by a delta-broadcasting mechanism, so that only changed partitions participate. The strategy is implemented on Oceanbase, a distributed database that separates reads from writes, and the results show that it outperforms other strategies in synchronization efficiency and space utilization.
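The three ideas in the abstract (logical partitions, delta sensing, and delta-driven synchronization) can be illustrated with a minimal Python sketch. This is not the authors' Oceanbase implementation: the class `DeltaTracker`, the function `synchronize`, the integer keys, and the fixed `rows_per_logical_partition` range mapping are all hypothetical names and assumptions chosen for illustration.

```python
from collections import defaultdict


class DeltaTracker:
    """Hypothetical transaction-node helper (an assumption, not Oceanbase code).

    It records, for each changed logical partition, the delta rows that
    belong to it, i.e. the mapping the abstract calls delta sensing.
    """

    def __init__(self, rows_per_logical_partition):
        self.rows_per_logical_partition = rows_per_logical_partition
        # logical partition id -> {key: new value}
        self.delta = defaultdict(dict)

    def logical_partition_of(self, key):
        # Assumed mapping: integer keys are range-partitioned.
        return key // self.rows_per_logical_partition

    def write(self, key, value):
        self.delta[self.logical_partition_of(key)][key] = value

    def changed_partitions(self):
        return sorted(self.delta)


def synchronize(baseline, tracker):
    """Merge delta data into the baseline, visiting only changed partitions."""
    touched = []
    for pid in tracker.changed_partitions():
        baseline.setdefault(pid, {}).update(tracker.delta[pid])
        touched.append(pid)
    tracker.delta.clear()  # the delta is now part of the baseline
    return touched


# Usage: three logical partitions, writes land in partitions 0 and 2 only.
baseline = {0: {0: "a", 1: "b"}, 1: {100: "c"}, 2: {200: "d"}}
tracker = DeltaTracker(rows_per_logical_partition=100)
tracker.write(1, "B")
tracker.write(205, "e")
touched = synchronize(baseline, tracker)
```

In this sketch the synchronization loop iterates over the tracker's changed partitions rather than over every baseline partition, so partition 1 is never read or rewritten.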

Highlights

The fine-grained data synchronization strategy is implemented on Oceanbase, a distributed database that separates reads from writes, and the results show that it outperforms other strategies in synchronization efficiency and space utilization.
https://github.com/Level/levelup/blob/master/README.md

Summary

Introduction

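Journal of Northwestern Polytechnical University (西北工业大学学报) https://doi.org/10.1051/jnwpu/20203810209

Abstract: Separating reads from writes is a common strategy by which NewSQL databases combine the respective advantages of traditional relational databases and NoSQL databases. Under this architecture, baseline data is split into multiple partitions distributed across different storage nodes, while changed (delta) data is stored on a single transaction node. To relieve pressure on the transaction node and improve query efficiency, the delta data must be periodically synchronized to the storage nodes. Current strategies synchronize at partition granularity, so partitions with no changed data also take part in synchronization, consuming extra network cost, local I/O cost, memory, and disk space. To improve synchronization efficiency and reduce space consumption, a fine-grained data synchronization strategy is proposed: fine-grained logical partitions are built on top of the original partitions to provide a more precise unit of synchronization; a change-sensing strategy is introduced to record the changed partitions and their corresponding delta data; and a change-publishing mechanism drives synchronization, restricting participation to partitions that have changed. The strategy is validated on Oceanbase, a distributed system with read/write separation, and the results show that both its synchronization efficiency and its space usage are better than those of other strategies.

For traditional centralized database systems such as PostgreSQL[7], MySQL[8], and DB2[9], updates happen locally and involve no cross-node data synchronization. NoSQL database systems such as BigTable[10] and HBase[11] sacrifice consistency in pursuit of high scalability. HBase splits a large table into multiple sub-tables distributed across multiple physical nodes; it supports only row-level transactions, but can apply updates locally within a node. NewSQL database systems that emerged in the big-data era, such as MemSQL[2], Oceanbase[4], and VoltDB[3], combine big-data analytics with traditional transaction processing, but they face performance problems in data synchronization; under read/write separation in particular, improving performance is a major challenge. In Oceanbase, for example, data synchronization proceeds partition by partition: the storage node first pulls the changed data from the transaction node over the network, then loads all the data of the partition (256 MB by default) into memory and performs a merge, and finally writes the merged data to disk.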
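The partition-driven procedure described for Oceanbase (pull the delta, load the whole partition, merge, write back) can be sketched roughly as follows, to show where the extra cost comes from. This is a simplified model under stated assumptions, not Oceanbase code: the function `partition_driven_sync`, the dictionary-based partitions, and the cost accounting are hypothetical; only the 256 MB default partition size comes from the text.

```python
PARTITION_SIZE_MB = 256  # default partition size stated in the text


def partition_driven_sync(partitions, delta_by_partition):
    """Model of per-partition synchronization (hypothetical helper).

    Every partition is processed, whether or not it has delta data.
    Returns (loaded_mb, needed_mb): the volume actually loaded versus the
    volume that belonged to partitions with at least one change.
    """
    loaded_mb = 0
    needed_mb = 0
    for pid, rows in partitions.items():
        # Step 1: pull this partition's delta from the transaction node.
        delta = delta_by_partition.get(pid, {})
        # Step 2: load the entire partition into memory.
        in_memory = dict(rows)
        loaded_mb += PARTITION_SIZE_MB
        if delta:
            needed_mb += PARTITION_SIZE_MB
        # Step 3: merge the delta into the in-memory copy.
        in_memory.update(delta)
        # Step 4: write the merged partition back (modelled as reassignment).
        partitions[pid] = in_memory
    return loaded_mb, needed_mb


# Usage: four partitions, only partition 2 has changed data.
partitions = {0: {1: "a"}, 1: {2: "b"}, 2: {3: "c"}, 3: {4: "d"}}
delta_by_partition = {2: {3: "C"}}
loaded_mb, needed_mb = partition_driven_sync(partitions, delta_by_partition)
```

With these inputs the model loads four partitions although only one of them changed, which is the overhead the fine-grained strategy aims to avoid.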

Journal: Xibei Gongye Daxue Xuebao/Journal of Northwestern Polytechnical University
Publication Date: Feb 1, 2020
Citations: 5
License type: CC BY 4.0
