Abstract

Recent research has explored using Knowledge Bases (KBs) to represent documents as subgraphs of a KB concept graph and to define metrics that characterize the semantic relatedness of documents in terms of properties of these document concept graphs. However, no study so far has examined to what degree such metrics capture user-perceived relatedness of documents. Given users' explanations of how pairs of documents are related, we aim to identify concepts in a KB graph that express the same notion of document relatedness. Our algorithm generates paths through the KB graph that originate from terms in the two documents. The KB concepts where these paths intersect capture the semantic relatedness of the two starting terms and, by extension, of the two documents. We then examine how these intersecting concepts relate to the concepts in the users' explanations: the higher the users' concepts appear in the ranked list of intersecting concepts, the better the method captures the users' notion of document relatedness. Our experiments show that our approach outperforms a simpler graph method that uses properties of the concept nodes alone.
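The path-intersection idea can be illustrated with a minimal sketch. It is not the authors' implementation: the toy graph `TOY_KB`, the function names, the depth limit `max_depth`, and the ranking rule (shortest combined path length first) are all assumptions made for illustration, since the abstract does not specify how paths are generated or how intersecting concepts are ranked.

```python
from collections import deque

# Hypothetical toy KB: each term or concept points to broader concepts.
TOY_KB = {
    "jaguar": ["felidae", "car_brand"],
    "lion": ["felidae"],
    "felidae": ["mammal"],
    "car_brand": ["company"],
    "mammal": ["animal"],
}


def distances_from(graph, start, max_depth):
    """Breadth-first search: hop count to every concept within max_depth of start."""
    dist = {start: 0}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if dist[node] == max_depth:
            continue  # do not expand beyond the depth limit
        for neighbour in graph.get(node, ()):
            if neighbour not in dist:
                dist[neighbour] = dist[node] + 1
                queue.append(neighbour)
    return dist


def rank_intersections(graph, terms_a, terms_b, max_depth=3):
    """Rank concepts reachable from a term in each document.

    Assumed scoring: the shortest combined path length over all term pairs;
    ties are broken alphabetically.
    """
    best = {}
    for a in terms_a:
        da = distances_from(graph, a, max_depth)
        for b in terms_b:
            db = distances_from(graph, b, max_depth)
            for concept in da.keys() & db.keys():
                total = da[concept] + db[concept]
                if concept not in best or total < best[concept]:
                    best[concept] = total
    return sorted(best, key=lambda c: (best[c], c))


def rank_of(ranked, concept):
    """1-based position of a user-supplied concept in the ranked list, or None."""
    return ranked.index(concept) + 1 if concept in ranked else None


ranked = rank_intersections(TOY_KB, ["jaguar"], ["lion"])
```

In this toy setting, a user explanation that mentions "mammal" would be evaluated by `rank_of(ranked, "mammal")`; a smaller rank means the intersecting concepts agree more closely with the user's notion of relatedness.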
