Effectiveness of Aggregation Methods in Blog Distillation

Mostafa Keikha, Fabio Crestani

doi:10.1007/978-3-642-04957-6_14

Abstract

This paper addresses the blog distillation problem: given a user query, find the blogs that are most related to the query topic. We model each post as evidence of the relevance of a blog to the query, and use aggregation methods such as Ordered Weighted Averaging (OWA) operators to combine the evidence. We show that using only the highly relevant evidence (posts) for each blog can result in an effective retrieval system. We evaluate our methods on the TREC'06 blog collection with the two standard query sets of TREC'07 and TREC'08. Our experiments on the TREC'07 query set show a 35% improvement in Mean Average Precision and a 22% improvement in Precision@10 over the best fusion method previously applied to blog distillation. Similar results are obtained on the TREC'08 query set, with a 31% improvement in Mean Average Precision and a 20% improvement in Precision@10 over the baseline.
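To make the aggregation idea concrete, the following is a minimal sketch of an OWA operator used to score blogs from per-post relevance scores. It is an illustration under stated assumptions, not the authors' implementation: the function names (`owa`, `top_k_weights`, `rank_blogs`), the uniform top-k weighting, and the example scores are all hypothetical. An OWA operator sorts its inputs in descending order and takes a weighted sum by position, so placing all weight on the first k positions uses only the k most relevant posts of each blog.

```python
from typing import Dict, List, Sequence, Tuple


def owa(scores: Sequence[float], weights: Sequence[float]) -> float:
    """Ordered Weighted Averaging: weights apply to sorted positions, not to items."""
    if len(scores) != len(weights):
        raise ValueError("scores and weights must have the same length")
    if abs(sum(weights) - 1.0) > 1e-9:
        raise ValueError("OWA weights must sum to 1")
    ordered = sorted(scores, reverse=True)  # highest-scoring evidence first
    return sum(w * s for w, s in zip(weights, ordered))


def top_k_weights(n: int, k: int) -> List[float]:
    """Assumed weighting: uniform over the top-k positions, zero elsewhere."""
    if k < 1:
        raise ValueError("k must be at least 1")
    k = min(k, n)
    return [1.0 / k] * k + [0.0] * (n - k)


def rank_blogs(post_scores: Dict[str, List[float]], k: int) -> List[Tuple[str, float]]:
    """Score each blog by OWA over its post scores and sort blogs by that score."""
    ranked = []
    for blog, scores in post_scores.items():
        if not scores:  # a blog with no retrieved posts contributes no evidence
            continue
        ranked.append((blog, owa(scores, top_k_weights(len(scores), k))))
    return sorted(ranked, key=lambda pair: pair[1], reverse=True)


# Hypothetical post relevance scores for two blogs.
example = {
    "blog_a": [0.9, 0.1, 0.1],  # one highly relevant post
    "blog_b": [0.5, 0.5, 0.5],  # uniformly moderate posts
}
ranking_top1 = rank_blogs(example, k=1)  # only the best post counts
ranking_all = rank_blogs(example, k=3)   # plain average over all posts
```

In this toy example the choice of k changes the ranking: with k=1, blog_a ranks first because of its single strong post, while averaging over all posts favours blog_b. This is the kind of trade-off that restricting aggregation to highly relevant posts is meant to control.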

Full Text