Abstract

This paper presents a multimodal latent topic analysis method for constructing image collection summaries. Given a query, the method automatically selects a set of prototypical images from a large set of retrieved images. We define an image collection summary as a subset of images from a collection that is visually and semantically representative of that collection. To build such a summary we propose MICS (Multimodal Image Collection Summarization), a method that combines textual and visual modalities in a common latent space, which makes it possible to find a subset of images from which the whole collection can be reconstructed. Experiments on two collections of tagged images demonstrate the ability of the approach to build summaries with representative visual and semantic content. The method was evaluated using objective measures, namely reconstruction error and diversity of the summary, and shows competitive results when compared with other summarization approaches.
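
To make the idea of a summary "from which the whole collection can be reconstructed" concrete, the following is a minimal sketch and not the MICS algorithm itself. It assumes that a truncated SVD of concatenated visual and text features can stand in for the latent topic model, that reconstruction is a least-squares linear combination of the summary images, and that diversity is the mean pairwise Euclidean distance within the summary. All function names (joint_latent, reconstruction_error, diversity, greedy_summary) are hypothetical and chosen for illustration only.

```python
import numpy as np


def joint_latent(visual, text, rank):
    """Project both modalities into one latent space.

    Assumption: truncated SVD of the concatenated, norm-balanced features
    is used here as a stand-in for a latent topic model.
    """
    v = visual / np.linalg.norm(visual)   # balance the two modalities
    t = text / np.linalg.norm(text)
    joint = np.hstack([v, t])             # (n_images, d_visual + d_text)
    u, s, _ = np.linalg.svd(joint, full_matrices=False)
    return u[:, :rank] * s[:rank]         # (n_images, rank)


def reconstruction_error(X, summary_idx):
    """Relative Frobenius error when every row of X is approximated by a
    least-squares linear combination of the summary rows."""
    S = X[summary_idx]                    # (k, d)
    # Solve S.T @ W.T ~= X.T for the combination weights W.T of shape (k, n).
    w_t, *_ = np.linalg.lstsq(S.T, X.T, rcond=None)
    residual = X - w_t.T @ S
    return np.linalg.norm(residual) / np.linalg.norm(X)


def diversity(X, summary_idx):
    """Mean pairwise Euclidean distance between summary items (needs k >= 2)."""
    S = X[summary_idx]
    dists = np.linalg.norm(S[:, None, :] - S[None, :, :], axis=-1)
    upper = np.triu_indices(len(summary_idx), k=1)
    return dists[upper].mean()


def greedy_summary(X, k):
    """Greedily add the image that most reduces the reconstruction error."""
    chosen = []
    for _ in range(k):
        candidates = [i for i in range(len(X)) if i not in chosen]
        best = min(candidates,
                   key=lambda i: reconstruction_error(X, chosen + [i]))
        chosen.append(best)
    return chosen


# Usage on synthetic features (random data, purely illustrative).
rng = np.random.default_rng(0)
visual_feats = rng.random((30, 8))        # e.g. visual descriptors
text_feats = rng.random((30, 5))          # e.g. tag occurrence counts
Z = joint_latent(visual_feats, text_feats, rank=4)
summary = greedy_summary(Z, k=4)
print(summary, reconstruction_error(Z, summary), diversity(Z, summary))
```

The greedy loop is a design shortcut: it evaluates every remaining candidate at each step, which is simple to follow but costly for large collections, and it carries no claim about how MICS selects its prototypes.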
