Due to the increasing popularity of social media platforms, the number of messages (posts) related to public events, especially posts sharing multimedia content, is steadily increasing. Shared images can contribute to rich, live coverage of an event. Yet, although some posts are valuable and interesting, there is also a great deal of spam and redundancy, which makes it challenging to select the most important and characteristic posts for the event. In this work, we describe MGraph, a summarization framework that, given a set of social media posts about an event, selects a subset of the shared images, simultaneously maximizing their relevance and minimizing their visual redundancy. MGraph employs a multimodal topic modelling technique to capture the relevance of posts to event topics, and a graph-based ranking algorithm to produce a diverse ranking of the selected high-relevance images. A user-centred evaluation on a dataset comprising a variety of real-world events demonstrates that MGraph considerably outperforms a number of state-of-the-art summarization algorithms in terms of relevance and diversity (improvements of 25% and 7%, respectively).
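To make the relevance versus redundancy trade-off concrete, the following is a minimal sketch of a greedy, Maximal-Marginal-Relevance-style selection. It is not the graph-based ranking that MGraph uses, whose details are not given here; it only illustrates the general idea of preferring relevant images while penalizing visual similarity to images already chosen. The function names, the `trade_off` parameter, and the use of cosine similarity over feature vectors are all assumptions made for illustration.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length feature vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def select_diverse(relevance, features, k, trade_off=0.7):
    """Greedily pick up to k image indices (hypothetical helper).

    relevance: one relevance score per image (assumed to come from
               some upstream model, e.g. a topic model).
    features:  one visual feature vector per image (assumed given).
    trade_off: weight on relevance; (1 - trade_off) weights redundancy.
    """
    selected = []
    candidates = set(range(len(relevance)))

    while candidates and len(selected) < k:
        def score(i):
            # Redundancy is the highest similarity to any image already picked.
            redundancy = max(
                (cosine(features[i], features[j]) for j in selected),
                default=0.0,
            )
            return trade_off * relevance[i] - (1 - trade_off) * redundancy

        # Sorting makes tie-breaking deterministic (lowest index wins).
        best = max(sorted(candidates), key=score)
        selected.append(best)
        candidates.remove(best)

    return selected
```

For example, with `relevance = [0.9, 0.85, 0.5]` and `features = [[1, 0], [1, 0.01], [0, 1]]`, calling `select_diverse(relevance, features, 2)` returns `[0, 2]`: the second image is nearly as relevant as the first but almost visually identical to it, so the less relevant but visually distinct third image is chosen instead.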