Abstract

We propose and evaluate a method for obtaining more accurate results in Extensible Markup Language (XML) fragment search, a search that returns only the relevant fragments, or portions, of an XML document. Existing approaches generate a list of XML fragments ranked in descending order of relevance to a search query; however, they often extract irrelevant fragments and overlook more relevant ones. To address these problems, our approach first extracts relevant XML fragments by considering both the size of the fragments and the relationships between them. Next, we score the extracted fragments to generate a refined ranked list in which fragments that are informative with respect to the user's information need are ranked high. In particular, each XML fragment is scored using the statistics of its descendant and ancestor fragments. Our experimental evaluations show that the proposed method outperforms BM25E, a conventional approach that neither reconstructs XML fragments nor uses descendant and ancestor statistics.
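
To make the setting concrete, the following is a minimal sketch of element-level, BM25-style scoring of XML fragments. It is not the proposed method and not a faithful implementation of BM25E: it simply treats every element as a candidate fragment, lets each fragment's text include the text of its descendants, and normalizes by fragment length so that very large fragments do not dominate. The function names (`tokenize`, `element_tokens`, `score_fragments`), the whitespace tokenizer, and the parameter values `k1=1.2` and `b=0.75` are assumptions made for illustration.

```python
import math
import xml.etree.ElementTree as ET
from collections import Counter


def tokenize(text):
    # Hypothetical tokenizer: lowercase and split on whitespace.
    return text.lower().split()


def element_tokens(elem):
    # itertext() yields the element's own text and that of all descendants,
    # so a fragment's statistics include its descendant fragments.
    return tokenize(" ".join(elem.itertext()))


def score_fragments(xml_string, query, k1=1.2, b=0.75):
    """Return (tag, score) pairs sorted by descending BM25-style score."""
    root = ET.fromstring(xml_string)
    fragments = list(root.iter())  # every element is a candidate fragment
    token_lists = [element_tokens(e) for e in fragments]
    n = len(fragments)
    # Average fragment length; fall back to 1.0 if no fragment has text.
    avg_len = (sum(len(t) for t in token_lists) / n) or 1.0

    # Number of fragments that contain each term.
    frag_freq = Counter()
    for tokens in token_lists:
        frag_freq.update(set(tokens))

    results = []
    for elem, tokens in zip(fragments, token_lists):
        tf = Counter(tokens)
        score = 0.0
        for term in tokenize(query):
            if tf[term] == 0:
                continue
            idf = math.log(1 + (n - frag_freq[term] + 0.5) / (frag_freq[term] + 0.5))
            # Length normalization penalizes fragments larger than average.
            denom = tf[term] + k1 * (1 - b + b * len(tokens) / avg_len)
            score += idf * tf[term] * (k1 + 1) / denom
        results.append((elem.tag, score))

    results.sort(key=lambda pair: pair[1], reverse=True)
    return results


if __name__ == "__main__":
    doc = (
        "<article>"
        "<sec><p>xml fragment search</p></sec>"
        "<sec><p>cooking recipes</p></sec>"
        "</article>"
    )
    for tag, score in score_fragments(doc, "xml search"):
        print(f"{tag}: {score:.3f}")
```

In this toy document, the small fragments that contain the query terms score higher than the enclosing `article` element, which contains the same terms but is longer. A baseline of this kind scores each fragment independently; it does not reconstruct fragments or use ancestor statistics, which is where the approach described above differs.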

