Abstract

Web search engines are becoming a major platform through which the general public accesses information. Because the search patterns of search engine users are correlated with emerging events, search engine query logs have been suggested as a resource for trend surveillance, such as monitoring outbreaks of epidemics. Many trend surveillance studies have investigated the use of query logs and have sought to identify query terms suitable for trend surveillance. Most of these works select representative query terms by consulting domain experts or by preparing a large text corpus for feature selection. These approaches, however, are too costly for the resulting trend surveillance methods to be adapted to different topics. In this paper, we propose an adaptive trend surveillance method. We developed a simple and effective feature selection algorithm, called TF-LTR, which leverages the documents returned by search engines and the frequency of the terms in those documents to select representative query terms for trending topics. Specifically, we investigated pairwise learning-to-rank models to measure a term's discriminative power, that is, how strongly the term contributes to a document ranking higher in the returned document list. The discriminative power is combined with the term frequency, which denotes the on-topic degree of a term, to measure how representative the term is of a trending topic. Representative terms and their query frequencies are fed into a state-of-the-art data mining model to enhance the effectiveness of trend surveillance. The experimental results, based on trending topics from different domains, show that our trend surveillance method performs well and that the ranking information of search engines is helpful for trend surveillance. In light of this, the proposed method can provide effective support for government officials and authorities, helping them respond to fast-changing events and topics and make appropriate decisions.
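The idea of combining term frequency with a ranking-derived discriminative power can be illustrated with a minimal sketch. It is not the TF-LTR implementation: the abstract does not specify the learning-to-rank model, the features, or the combination rule, so the pairwise perceptron, the binary term-presence features, and the product `tf * max(weight, 0)` below are all assumptions made for illustration, as is the function name `term_representativeness`.

```python
from collections import Counter
from itertools import combinations


def term_representativeness(ranked_docs, epochs=20, lr=0.1):
    """Score terms from a ranked result list (best-ranked document first).

    Hypothetical sketch: a pairwise perceptron stands in for the
    learning-to-rank model, and the combination rule is assumed.
    """
    vocab = sorted({t for doc in ranked_docs for t in doc})
    # Binary presence features: 1 if the term occurs in the document.
    present = [set(doc) for doc in ranked_docs]
    weights = {t: 0.0 for t in vocab}

    # Pairwise training: for every pair (i, j) with i ranked above j,
    # we want weights . (x_i - x_j) > 0; update on violations.
    for _ in range(epochs):
        for i, j in combinations(range(len(ranked_docs)), 2):
            diff = {t: (t in present[i]) - (t in present[j]) for t in vocab}
            margin = sum(weights[t] * diff[t] for t in vocab)
            if margin <= 0:
                for t in vocab:
                    weights[t] += lr * diff[t]

    # Relative term frequency over all returned documents (on-topic degree).
    tf = Counter(t for doc in ranked_docs for t in doc)
    total = sum(tf.values())

    # Assumed combination: frequency times the non-negative ranking weight.
    return {t: (tf[t] / total) * max(weights[t], 0.0) for t in vocab}


if __name__ == "__main__":
    # Toy, made-up result list for illustration only.
    docs = [
        ["flu", "outbreak", "flu"],
        ["flu", "vaccine"],
        ["weather", "news"],
        ["sports", "news"],
    ]
    scores = term_representativeness(docs)
    for term, score in sorted(scores.items(), key=lambda kv: -kv[1]):
        print(f"{term}\t{score:.4f}")
```

In this toy setting, terms that are both frequent and concentrated in higher-ranked documents receive the largest scores, while terms that mostly appear in lower-ranked documents are clipped to zero. A real system would replace the perceptron with a proper pairwise learning-to-rank model and would need to choose the combination rule empirically.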
