Near-accurate Multiset Reconciliation

Lailong Luo, Ori Rottenstreich, Xueshan Luo, Deke Guo, Jie Wu, Xiang Zhao

doi:10.1109/tkde.2018.2849997

Abstract

The mission of set reconciliation (also called set synchronization) is to identify those elements that appear in exactly one of two given sets. In this paper, we extend the set reconciliation problem with three design rationales: (i) multiset support; (ii) near 100 percent reconciliation accuracy; and (iii) communication-friendly and time-saving operation. These three rationales, if realized, will lead to unprecedented benefits for the set reconciliation paradigm. Prior reconciliation methods are mainly designed for simple sets and thus remain inapplicable to multisets. Methods based on probabilistic data structures, e.g., the Counting Bloom Filter (CBF), support efficient representation and multiplicity queries, and thereby enable approximate multiset reconciliation. However, they often cannot achieve satisfactory accuracy because of potential hash collisions. Reconciliation based on logs or lists incurs high time complexity and communication overhead. Therefore, existing reconciliation methods fail to realize the three rationales simultaneously. To this end, we redesign the Trie and the Fenwick Tree (FT) to near-accurately represent and reconcile two types of multisets, which we refer to as unsorted and sorted multisets, respectively. Moreover, to further reduce the communication overhead during the reconciliation process, we design a partial transmission strategy for exchanging two Tries or FTs. Comprehensive evaluations are conducted to quantify the performance of our proposals. The trace-driven evaluations demonstrate that Trie and FT achieve near-accurate multiset reconciliation while running 4.31 and 2.96 times faster than the CBF-based method, respectively. The simulations based on synthetic datasets further indicate that our proposals outperform the CBF-based method in terms of accuracy and communication overhead most of the time.
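To make the CBF baseline mentioned in the abstract concrete, the following is a minimal sketch of a generic Counting Bloom Filter used for multiplicity queries, next to an exact multiset difference for comparison. It is not the authors' Trie or FT design, and it is not their evaluation code. The class name `CountingBloomFilter`, the helper `exact_difference`, and the parameters `num_counters` and `num_hashes` are assumptions chosen for illustration.

```python
import hashlib
from collections import Counter


class CountingBloomFilter:
    """Generic CBF sketch (hypothetical names, not the paper's code)."""

    def __init__(self, num_counters=64, num_hashes=3):
        self.m = num_counters
        self.k = num_hashes
        self.counters = [0] * num_counters

    def _positions(self, item):
        # Derive k counter positions from salted SHA-256 digests.
        for i in range(self.k):
            digest = hashlib.sha256(f"{i}:{item}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.m

    def add(self, item, count=1):
        # Record `count` copies of `item` in every counter it maps to.
        for pos in self._positions(item):
            self.counters[pos] += count

    def multiplicity(self, item):
        # The minimum counter is an upper bound on the true multiplicity;
        # hash collisions can inflate it but never lower it.
        return min(self.counters[pos] for pos in self._positions(item))


def exact_difference(a, b):
    """Exact per-element multiplicity gap between two multisets."""
    ca, cb = Counter(a), Counter(b)
    return {x: ca[x] - cb[x] for x in ca.keys() | cb.keys() if ca[x] != cb[x]}


if __name__ == "__main__":
    host_a = ["x", "x", "y"]
    host_b = ["x", "y", "y", "z"]

    # Host B summarizes its multiset in a CBF and sends it to host A.
    cbf_b = CountingBloomFilter()
    for item in host_b:
        cbf_b.add(item)

    # Host A estimates the gap for its own elements from the received CBF.
    local = Counter(host_a)
    estimated = {x: local[x] - cbf_b.multiplicity(x) for x in local}
    print("estimated:", estimated)
    print("exact:    ", exact_difference(host_a, host_b))
```

Because a CBF query returns the minimum of several shared counters, the estimate can overstate an element's multiplicity whenever other elements hash to the same counters. This is the collision-induced inaccuracy that the abstract attributes to CBF-based reconciliation.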


Journal: IEEE Transactions on Knowledge and Data Engineering
Publication Date: May 1, 2019
Citations: 9

