Abstract

Classical Information Retrieval (IR) approaches rely on word-based representations of the query and of the documents in the collection. The user's information need is specified entirely by the words that appear in the original query, and the system retrieves documents containing those words. Such approaches are limited by the absence of relevant keywords and by term variation between documents and the user's query. The purpose of this paper is to present a new approach to Semantic Information Retrieval (SIR) that addresses these limitations. Concretely, we propose a novel method, SIRWWO (Semantic Information Retrieval using Wikipedia, WordNet, and domain Ontologies), which combines multiple knowledge sources: Wikipedia, WordNet, and Description Logic (DL) ontologies. To illustrate SIRWWO, we first present the notion of a Labeled Dynamic Semantic Network (LDSN), which extends the notions of dynamic semantic network and extended semantic net based on WordNet (and the DAML ontology library). From the LDSN we derive the notion of a Weighted Dynamic Semantic Network (WDSN), in which, intuitively, each edge is assigned a number in the interval [0, 1], and we give a method for constructing a WDSN from Wikipedia, WordNet, and a DL ontology. We then propose a novel metric that measures the semantic relatedness between concepts based on the WDSN. Lastly, we investigate SIRWWO by using the semantic relatedness between the user's query keywords and digital documents. The experimental results show that our proposals achieve performance comparable to or better than that of the traditional IR system Lucene.
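To make the idea of a weighted semantic network more concrete, the following is a minimal sketch in Python. It is not the SIRWWO metric or the WDSN construction method, neither of which is specified in this abstract. It assumes, purely for illustration, that relatedness between two concepts is the largest product of edge weights along any connecting path, and that a document is scored by averaging, over the query keywords, the best relatedness to any document term. The function names (`add_edge`, `relatedness`, `score_document`) and the example weights are hypothetical.

```python
import heapq


def add_edge(graph, a, b, weight):
    """Add an undirected edge with a weight in [0, 1] (illustrative helper)."""
    if not 0.0 <= weight <= 1.0:
        raise ValueError("edge weight must lie in [0, 1]")
    graph.setdefault(a, {})[b] = weight
    graph.setdefault(b, {})[a] = weight


def relatedness(graph, source, target):
    """Assumed metric: maximum product of edge weights over all paths.

    Because every weight is at most 1, path scores never increase as a
    path grows, so a Dijkstra-style best-first search is sufficient.
    """
    if source == target:
        return 1.0
    best = {source: 1.0}
    heap = [(-1.0, source)]  # max-heap emulated with negated scores
    while heap:
        neg_score, node = heapq.heappop(heap)
        score = -neg_score
        if node == target:
            return score
        if score < best.get(node, 0.0):
            continue  # stale heap entry
        for neighbor, weight in graph.get(node, {}).items():
            candidate = score * weight
            if candidate > best.get(neighbor, 0.0):
                best[neighbor] = candidate
                heapq.heappush(heap, (-candidate, neighbor))
    return 0.0  # no path: treat the concepts as unrelated


def score_document(graph, query_terms, doc_terms):
    """Average, over query terms, of the best relatedness to any document term."""
    if not query_terms or not doc_terms:
        return 0.0
    total = sum(
        max(relatedness(graph, q, d) for d in doc_terms) for q in query_terms
    )
    return total / len(query_terms)


# Hypothetical weights; a real network would derive them from knowledge sources.
graph = {}
add_edge(graph, "car", "automobile", 0.9)
add_edge(graph, "automobile", "vehicle", 0.8)
add_edge(graph, "car", "vehicle", 0.5)
add_edge(graph, "vehicle", "bicycle", 0.6)
```

Under these assumptions, a query for "automobile" can still score a document that mentions only "car", which is the kind of term variation that purely word-based matching misses.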
