Abstract

A mechanism for distributed semantic similarity search over resources in P2P networks is proposed. The mechanism is based on the content addressable network (CAN), a type of P2P network that naturally supports semantic similarity search when resources are represented with the semantic vector space model (SVSM). However, there is a mismatch between the low dimensionality of the CAN and the high dimensionality of the resource vectors, which calls for a dimensionality reduction algorithm. For semantic similarity search in the distributed environment of CAN, such an algorithm must meet two specific requirements: it must preserve the semantic similarity of the resources' SVSM representations, and it must run as a distributed computation over large and dynamic data. This problem has not been well researched. A distributed algorithm called D-PCA is proposed, based on the statistical characteristics of the resources held at each node. It extracts the principal components of the original high-dimensional SVSM to reduce the dimension in a distributed way. D-PCA is used as a novel hash function that projects the high-dimensional SVSM into the low-dimensional space of the distributed hash table (DHT) in CAN. A semantic indexing and searching process based on the semantic DHT in CAN is simulated to show the applicability of D-PCA and the effectiveness of semantic similarity search.
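
The abstract does not give the steps of D-PCA, so the following is only a minimal sketch of one generic way to compute PCA from per-node statistics, not the authors' algorithm. It assumes each node summarizes its local SVSM vectors as a count, a sum, and a sum of outer products; that these summaries are merged by addition; and that the leading principal components of the merged covariance are used to project a vector to low-dimensional coordinates that could serve as a CAN key. All function names (`local_stats`, `merge_stats`, `covariance`, `top_components`, `project`) are hypothetical, and power iteration with deflation is a stand-in for whatever eigen-solver a real system would use.

```python
import math
import random

def local_stats(vectors):
    """Per-node summary (assumed form): count, sum, and sum of outer products."""
    dim = len(vectors[0])
    total = [0.0] * dim
    outer = [[0.0] * dim for _ in range(dim)]
    for v in vectors:
        for i in range(dim):
            total[i] += v[i]
            for j in range(dim):
                outer[i][j] += v[i] * v[j]
    return len(vectors), total, outer

def merge_stats(stats):
    """Combine node summaries by element-wise addition."""
    dim = len(stats[0][1])
    n, total = 0, [0.0] * dim
    outer = [[0.0] * dim for _ in range(dim)]
    for count, s, o in stats:
        n += count
        for i in range(dim):
            total[i] += s[i]
            for j in range(dim):
                outer[i][j] += o[i][j]
    return n, total, outer

def covariance(n, total, outer):
    """Global mean and covariance: E[x x^T] - mean mean^T."""
    dim = len(total)
    mean = [x / n for x in total]
    cov = [[outer[i][j] / n - mean[i] * mean[j] for j in range(dim)]
           for i in range(dim)]
    return mean, cov

def top_components(cov, k, iters=500):
    """Leading k eigenvectors via power iteration with deflation."""
    dim = len(cov)
    a = [row[:] for row in cov]
    rng = random.Random(0)  # fixed seed so runs are repeatable
    comps = []
    for _ in range(k):
        v = [rng.random() + 0.1 for _ in range(dim)]
        norm = math.sqrt(sum(x * x for x in v))
        v = [x / norm for x in v]
        for _ in range(iters):
            w = [sum(a[i][j] * v[j] for j in range(dim)) for i in range(dim)]
            norm = math.sqrt(sum(x * x for x in w))
            if norm == 0.0:
                break  # no variance left in the deflated matrix
            v = [x / norm for x in w]
        # Rayleigh quotient gives the eigenvalue estimate for v.
        lam = sum(v[i] * sum(a[i][j] * v[j] for j in range(dim))
                  for i in range(dim))
        comps.append(v)
        # Deflate: remove the found component before searching for the next.
        for i in range(dim):
            for j in range(dim):
                a[i][j] -= lam * v[i] * v[j]
    return comps

def project(vec, mean, comps):
    """Map a high-dimensional vector to len(comps) low-dimensional coordinates."""
    return [sum((vec[i] - mean[i]) * c[i] for i in range(len(vec)))
            for c in comps]
```

Under these assumptions, only the fixed-size summaries travel between nodes, never the raw vectors, which is what makes the computation distributable. The sign of each component is arbitrary, so all nodes would have to share the same set of components for their projected coordinates to be comparable.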
