Set reconciliation between different hosts to hold the same dataset is an important prerequisite in numerous distributed applications. Sending the entire dataset to achieve set reconciliation is inefficient if the majority of the data owned by each host is the same. Each host is required to send exclusive elements uniquely included in its set to the other to minimize communication complexity. This paper proposes an efficient algorithm for a host to identify its exclusive elements using the recursive comparison of ternary Bloom filters, each representing the signature of a subset of elements and used for filtering out subsets with identical elements. Hence, subsets with exclusive elements are identified, and elements included in the identified subsets are programmed to an invertible Bloom filter (IBF) to be sent to the other host. Thus, the number of elements programmed in the IBF is significantly reduced. Simulation results show that the proposed algorithm provides excellent performance compared with existing set reconciliation algorithms under the constraint of the same amount of data communication.
Read full abstract