Abstract

Information Retrieval (IR) systems are designed to fetch the content most relevant to a user’s information needs from a pool of information. Users expect IR results to reflect the conceptual content of a query rather than its keywords. Traditional IR approaches, however, index documents by the terms they contain and ignore semantic descriptions of document content. This causes a vocabulary gap when queries and documents use different terms to describe the same concept. To address this problem and improve the performance of IR systems, we have designed SNNOntoSDI, a novel approach to semantic document indexing based on a Shallow Neural Network and an ontology. SNNOntoSDI identifies the concepts that represent a document using the word2vec model (a Shallow Neural Network) and a domain ontology. The relevance of a concept to the document is measured by assigning the concept a weight based on its statistical, semantic, and scientific Named Entity features. The parameters of these feature weights are calculated using the Analytic Hierarchy Process (AHP). Finally, concepts are ranked in order of relevance. To evaluate SNNOntoSDI empirically, a series of experiments was carried out on five standard, publicly available datasets. The results demonstrate that SNNOntoSDI outperformed state-of-the-art methods, with average improvements of 29% in average accuracy and 25% in F-measure.
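The weighting and ranking steps can be illustrated with a minimal sketch. It is not the authors' implementation: the pairwise comparison matrix, the feature scores, and all function names are hypothetical, and the row geometric-mean method is used here as a common approximation of the AHP priority vector. The abstract does not specify how the weights are actually derived or combined; a weighted sum is assumed.

```python
import math

def ahp_weights(pairwise):
    """Approximate AHP priority weights with the row geometric-mean method."""
    n = len(pairwise)
    geo_means = [math.prod(row) ** (1.0 / n) for row in pairwise]
    total = sum(geo_means)
    return [g / total for g in geo_means]

def consistency_ratio(pairwise, weights):
    """Saaty's consistency ratio; values below 0.1 are conventionally acceptable."""
    n = len(pairwise)
    aw = [sum(pairwise[i][j] * weights[j] for j in range(n)) for i in range(n)]
    lambda_max = sum(aw[i] / weights[i] for i in range(n)) / n
    ci = (lambda_max - n) / (n - 1)
    random_index = {3: 0.58, 4: 0.90, 5: 1.12}  # Saaty's random index values
    return ci / random_index[n]

def rank_concepts(features, weights):
    """Score each concept as a weighted sum of its features, highest first."""
    scored = {
        concept: sum(w * f for w, f in zip(weights, values))
        for concept, values in features.items()
    }
    return sorted(scored.items(), key=lambda item: item[1], reverse=True)

# Hypothetical pairwise judgements, ordered as:
# statistical, semantic, named-entity.
# Entry [i][j] states how much more important feature i is than feature j.
pairwise = [
    [1.0, 1 / 3, 1 / 2],
    [3.0, 1.0, 2.0],
    [2.0, 1 / 2, 1.0],
]

# Hypothetical per-concept feature scores in [0, 1], same feature order.
features = {
    "information retrieval": (0.8, 0.9, 0.0),
    "word2vec": (0.4, 0.7, 1.0),
    "ontology": (0.6, 0.5, 0.0),
}

weights = ahp_weights(pairwise)
cr = consistency_ratio(pairwise, weights)
ranking = rank_concepts(features, weights)

for concept, score in ranking:
    print(f"{concept}: {score:.3f}")
```

With these illustrative judgements the semantic feature receives the largest weight, and the consistency ratio indicates whether the pairwise comparisons are coherent enough to trust the resulting weights.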
