The decentralized and highly scalable nature of structured peer-to-peer networks based on distributed hash tables (DHTs) makes them well suited to facilitating interaction and information exchange between dynamic, geographically dispersed autonomous entities. The recent emergence of multimedia-based services and applications in the Internet of Things (IoT) has led to a noticeable shift in the data traffic generated by sensing devices, from structured textual and numerical content to unstructured and bulky multimedia content. The wide semantic spectrum of human-recognizable concepts that can be derived from multimedia data, e.g., video and audio, introduces a very large semantic content space. The scale of this content space poses a semantic boundary between data consumers and producers in large-scale peer-to-peer publish/subscribe systems. The exact-match query model of DHTs falls short when participants use different terms to describe the same semantic concept. In this work, we present OpenPubSub, a peer-to-peer content-based approximate semantic publish/subscribe system. We propose a hybrid event routing model that combines rendezvous routing and gossiping over a structured peer-to-peer network. The network is built on a high-dimensional semantic vector space rather than a conventional logical key space. We propose methods to partition the space, construct a semantic DHT via bootstrapping, perform approximate semantic lookup operations, and cluster nodes based on their shared interests. Results show that, against an upper-bound recall of 56.7% for approximate event matching, rendezvous-based routing achieves up to 54% recall while decreasing the messaging overhead by 44%, whereas the hybrid routing approach achieves up to 43.8% recall while decreasing the messaging overhead by 59%.
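To illustrate why exact-match lookup falls short and what an approximate lookup over a vector space might look like, the following is a minimal sketch, not the OpenPubSub implementation. It assumes cosine similarity as the matching measure and a nearest-centroid rule for picking a rendezvous node; the toy embeddings, node names, centroids, and the 0.95 threshold are all hypothetical values chosen for illustration.

```python
import math

# Hypothetical 3-dimensional term embeddings (real semantic spaces are
# high-dimensional; these values are made up for illustration).
EMBEDDINGS = {
    "car": (0.9, 0.1, 0.0),
    "automobile": (0.85, 0.15, 0.05),
    "dog": (0.1, 0.9, 0.1),
    "puppy": (0.15, 0.85, 0.2),
}

# Hypothetical nodes, each assumed to be responsible for the region of the
# space closest to its centroid.
NODE_CENTROIDS = {
    "node-vehicles": (1.0, 0.0, 0.0),
    "node-animals": (0.0, 1.0, 0.0),
}


def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def exact_match(subscription_term, event_term):
    """Classic DHT-style matching: the terms must be identical."""
    return subscription_term == event_term


def approximate_match(subscription_term, event_term, threshold=0.95):
    """Assumed approximate matching: terms match if their vectors are close."""
    similarity = cosine(EMBEDDINGS[subscription_term], EMBEDDINGS[event_term])
    return similarity >= threshold


def rendezvous_node(term):
    """Pick the node whose centroid is most similar to the term's vector."""
    vector = EMBEDDINGS[term]
    return max(NODE_CENTROIDS, key=lambda node: cosine(vector, NODE_CENTROIDS[node]))


if __name__ == "__main__":
    print(exact_match("car", "automobile"))
    print(approximate_match("car", "automobile"))
    print(rendezvous_node("car"), rendezvous_node("automobile"))
```

In this sketch a subscriber to "car" and a publisher of "automobile" never meet under exact matching, but both terms map to the same hypothetical rendezvous node and pass the similarity threshold, which is the kind of behavior an approximate semantic lookup aims for.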