Abstract

Time plays an important role in Web search, because most Web pages contain time information and many Web queries are time-related. However, traditional search engines such as Google give little consideration to the time information in Web pages. In particular, they do not take the time information of Web pages into account when ranking search results. In this paper, we present a new time-aware ranking algorithm for Web search, called CT-Rank (Content-Time-based Ranking). The algorithm uses three factors of a Web page to sort search results: the PageRank value, the title ranking score, and the time-constrained keyword ranking score. We develop a two-stage algorithm to realize the time-based ranking. We conduct a comprehensive experiment on 6,500 Web pages that were manually collected through Google, and compare the performance of CT-Rank with four competing algorithms: PageRank, vector space model based ranking, update time based ranking, and Google’s ranking algorithm. The experimental results show that CT-Rank performs best across different temporal textual queries.
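
To make the three-factor idea concrete, the following is a minimal sketch of how such scores might be combined to order results. It is not the CT-Rank formula: the abstract does not state how the factors are combined, so the weighted sum, the weight values, the assumption that each score lies in [0, 1], and the names `Page`, `combined_score`, and `rank` are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass
class Page:
    url: str
    pagerank: float            # link-based score, assumed normalized to [0, 1]
    title_score: float         # query match against the title, assumed in [0, 1]
    time_keyword_score: float  # time-constrained keyword score, assumed in [0, 1]


def combined_score(page, weights=(0.2, 0.3, 0.5)):
    """Weighted sum of the three factors (the weights are illustrative assumptions)."""
    w_pr, w_title, w_time = weights
    return (w_pr * page.pagerank
            + w_title * page.title_score
            + w_time * page.time_keyword_score)


def rank(pages, weights=(0.2, 0.3, 0.5)):
    """Return pages sorted by descending combined score."""
    return sorted(pages, key=lambda p: combined_score(p, weights), reverse=True)


if __name__ == "__main__":
    pages = [
        Page("a", pagerank=0.9, title_score=0.1, time_keyword_score=0.1),
        Page("b", pagerank=0.1, title_score=0.5, time_keyword_score=0.9),
    ]
    for p in rank(pages):
        print(p.url, round(combined_score(p), 3))
```

With these assumed weights, a page that matches the temporal constraint of the query well can outrank a page with a higher link-based score, which is the kind of behavior a time-aware ranking aims for.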
