SemanticPeer: A distributional semantic peer-to-peer lookup protocol for large content spaces at internet-scale

Tarek Zaarour, Edward Curry

doi:10.1016/j.future.2022.02.016

Abstract

Peer-to-peer networks offer a solid foundation for wide-scale resource sharing, collaborative computing, and data distribution. Such networks have commonly been used for group communication by overlaying a publish/subscribe service atop their routing substrate. In this work, we focus on a group communication service that disseminates unstructured content, such as images and videos, at internet scale. The decoupled nature of publish/subscribe systems, exacerbated by the decentralized and large-scale nature of peer-to-peer networks, creates a semantic boundary between publishers and subscribers. More precisely, the large semantic space of human-level recognition produces a very large content space of object labels, attributes, and relationships. At this scale, it is nearly impossible for participants to agree on a bounded set of terms with which subscribers can express their exact interests. We identify an inherent limitation of peer-to-peer networks that lies in the exact-match property of their key-based routing primitives. We propose an approximate matching model in which participants agree on a distributional model of word meaning that maps terms to a vector space. We overcome the exact-match limitation with a novel distributed lookup protocol and algorithm for constructing a peer-to-peer network and routing content. We replace conventional logical key spaces with a high-dimensional vector space that preserves the semantic properties of the data being mapped. Experiments show that the proposed model achieves more than 97% recall in routing accuracy, that is, in locating the node responsible for storing a data item within a few routing hops. Results also show that the network achieves over 90% recall in approximately matching two semantically related terms via rendezvous routing.
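
To illustrate the exact-match limitation and the vector-space alternative described in the abstract, the following is a minimal sketch and not the SemanticPeer protocol itself. The three-dimensional embeddings, the node names and positions, and the functions `exact_match_key` and `semantic_lookup` are all hypothetical and chosen for illustration; cosine similarity is assumed as the closeness measure, and the lookup is performed centrally rather than by multi-hop routing.

```python
import hashlib
import math

# Hypothetical 3-dimensional embeddings. A real distributional model of word
# meaning would supply vectors with far more dimensions.
EMBEDDINGS = {
    "car":        (0.90, 0.10, 0.05),
    "automobile": (0.85, 0.15, 0.10),
    "dog":        (0.10, 0.90, 0.20),
    "puppy":      (0.15, 0.85, 0.25),
}

# Hypothetical node identifiers placed in the same vector space as the terms.
NODES = {
    "node-A": (1.0, 0.0, 0.0),
    "node-B": (0.0, 1.0, 0.0),
    "node-C": (0.0, 0.0, 1.0),
}


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def exact_match_key(term):
    """Conventional key-based routing: hash the term into a logical key space.

    Related terms receive unrelated keys, so they do not meet at one node
    unless publisher and subscriber use exactly the same term.
    """
    return int(hashlib.sha1(term.encode("utf-8")).hexdigest(), 16)


def semantic_lookup(term):
    """Vector-space lookup: pick the node whose identifier is most similar
    to the term's embedding (assumption: cosine similarity)."""
    vec = EMBEDDINGS[term]
    return max(NODES, key=lambda name: cosine(vec, NODES[name]))


if __name__ == "__main__":
    for a, b in [("car", "automobile"), ("dog", "puppy")]:
        same_key = exact_match_key(a) == exact_match_key(b)
        same_node = semantic_lookup(a) == semantic_lookup(b)
        print(f"{a}/{b}: same hash key={same_key}, same semantic node={same_node}")
```

In this toy setting, a subscription for "car" and a publication labelled "automobile" hash to different keys but resolve to the same rendezvous node under the vector-space lookup, which is the behaviour the approximate matching model aims for.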

Full Text