Abstract

Microblogging has become a global hot spot. Information retrieval (IR) technologies are necessary for accessing the massive amount of valuable user-generated content in the microblog sphere. The challenge in searching for relevant microblogs is that they are usually very short, with sparse vocabulary, and may therefore fail to match queries. Pseudo-relevance feedback (PRF) via query expansion has been shown in previous studies to increase the number of matches in IR. However, a critical problem of PRF is that the pseudo-relevant feedback may not be truly relevant, and thus may introduce noise into query expansion. In this paper, we exploit the dynamic nature of microblogs to address this problem. We first present a novel dynamic PRF technique, which expands queries with truly relevant keywords by extracting representative terms based on the query’s temporal profile. Next, we present query expansion from external knowledge sources based on negative and positive feedback. We further consider that the choice of PRF strategy is query-dependent, and present a two-level microblog search framework. At the high level, a temporal profile is constructed and categorized for each query; at the low level, hybrid PRF query expansion combining dynamic and external PRF is adopted based on the query category. Experiments on a real data set demonstrate that the proposed method significantly improves microblog search performance compared with several traditional retrieval models, various query expansion methods, and state-of-the-art recency-based models for microblog searching.
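To make the idea of temporal-profile-driven PRF concrete, the following is a minimal illustrative sketch, not the method of the paper. It assumes that the temporal profile is a simple histogram of the timestamps of the top-ranked results, that feedback documents are taken only from the peak time bin, and that expansion terms are chosen by document frequency. The names `temporal_profile` and `dynamic_prf_expand`, and the parameters `bin_size`, `top_k` and `n_terms`, are hypothetical.

```python
from collections import Counter


def temporal_profile(timestamps, bin_size):
    """Histogram of result timestamps: time-bin index -> number of results."""
    return Counter(int(t // bin_size) for t in timestamps)


def dynamic_prf_expand(query_terms, ranked_docs, bin_size=3600, top_k=10, n_terms=3):
    """Expand a query with terms from the busiest time bin of the top results.

    ranked_docs: list of (timestamp, tokens) pairs in initial retrieval order.
    Assumption (not from the paper): only documents in the peak bin of the
    temporal profile are treated as pseudo-relevant.
    """
    top = ranked_docs[:top_k]
    profile = temporal_profile((ts for ts, _ in top), bin_size)
    if not profile:
        return list(query_terms)

    # Peak bin: highest count; ties are broken in favour of the earliest bin.
    peak_bin = max(profile, key=lambda b: (profile[b], -b))

    # Count, per candidate term, how many peak-bin documents contain it.
    term_counts = Counter()
    for ts, tokens in top:
        if int(ts // bin_size) == peak_bin:
            term_counts.update(t for t in set(tokens) if t not in query_terms)

    # Sort by count (descending), then alphabetically for determinism.
    ranked_terms = sorted(term_counts.items(), key=lambda kv: (-kv[1], kv[0]))
    expansion = [term for term, _ in ranked_terms[:n_terms]]
    return list(query_terms) + expansion


docs = [
    (100, ["quake", "japan", "tsunami"]),
    (200, ["quake", "tsunami", "alert"]),
    (5000, ["quake", "movie"]),
    (300, ["quake", "tsunami", "japan"]),
]
expanded = dynamic_prf_expand(["quake"], docs, bin_size=3600, n_terms=2)
```

In this toy example, three of the four results fall in the same one-hour bin, so the off-topic result from a different time bin contributes no expansion terms. A real system would use a retrieval score or a term-weighting model rather than raw document frequency.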
