Abstract

With the growth of Web information, traditional search engines built on text-based search technology are unable to meet users' demands for Web search. Because many queries are time-related and most Web pages contain time information, developing time-aware Web search engines has become an important issue. In this paper we therefore study indexing mechanisms for the temporal information in Web pages. Our work assumes that each Web page has exactly one primary time, which is used in time-based Web search. We present a new index structure called the BT+-tree, which is based on the MAP21-tree. Unlike the MAP21-tree's double-tree structure, the BT+-tree uses a single tree. Furthermore, the BT+-tree handles duplicated keys effectively, whereas the MAP21-tree gives little consideration to them. After discussing the index structure and the manipulation algorithms of the BT+-tree, we design a testing program to measure its performance. The experimental results show that the BT+-tree is effective for indexing temporal information in Web pages.

Keywords: Temporal Information, Index Structure, Index Size
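
As a rough illustration of the two ideas stated in the abstract (one primary time per page, and many pages sharing the same time key), the following Python sketch groups page ids under each distinct primary time and answers time-range queries. It is not the BT+-tree: a sorted list of keys stands in for the tree, and the class name `PrimaryTimeIndex`, its methods, and the sample page ids are hypothetical names introduced only for this example.

```python
from bisect import bisect_left, bisect_right, insort
from datetime import date


class PrimaryTimeIndex:
    """Illustrative stand-in for a time index (NOT the BT+-tree itself).

    Each distinct primary time is stored once; pages that share that
    time are kept together in one bucket, so duplicated keys do not
    produce duplicated index entries.
    """

    def __init__(self):
        self._keys = []     # sorted list of distinct primary times
        self._buckets = {}  # primary time -> list of page ids

    def insert(self, primary_time, page_id):
        bucket = self._buckets.get(primary_time)
        if bucket is None:
            # First page with this time: add a new key in sorted order.
            insort(self._keys, primary_time)
            self._buckets[primary_time] = [page_id]
        else:
            # Duplicated key: append to the existing bucket.
            bucket.append(page_id)

    def range_query(self, start, end):
        """Return page ids whose primary time lies in [start, end]."""
        lo = bisect_left(self._keys, start)
        hi = bisect_right(self._keys, end)
        return [p for k in self._keys[lo:hi] for p in self._buckets[k]]

    def key_count(self):
        return len(self._keys)


index = PrimaryTimeIndex()
index.insert(date(2008, 8, 8), "page-a")
index.insert(date(2008, 8, 8), "page-b")  # same primary time as page-a
index.insert(date(2009, 1, 1), "page-c")
index.insert(date(2007, 5, 1), "page-d")

pages_2008 = index.range_query(date(2008, 1, 1), date(2008, 12, 31))
```

Here four pages produce only three keys, because the two pages dated 2008-08-08 share one bucket. A real tree-based index would replace the sorted list with tree nodes to keep insertions cheap on large collections.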
