Abstract

The rapid increase in the distribution of multimedia content presents a challenging problem for content-based image retrieval systems. Image contents, such as the position and shape of objects, along with contextual features such as the background, can be used to retrieve visually similar images. Variations in contrast, color, intensity, and texture among contextually similar images make this an interesting research problem. This paper presents MaxNet, a deep convolutional neural network model for content-based image retrieval. The proposed system bypasses the reliance on handcrafted features and extracts deep features directly from the images, which are then used to retrieve contextually similar images from the database. The MaxNet model is built by stacking updated inception modules in a hierarchical fashion. Features extracted from the various pipelines within each inception module are aggregated after each inception block by taking the maximum of the feature values. This novel aggregation step yields a model that adapts to a variety of datasets; several other types of aggregation are also discussed in this study. The model mitigates over-fitting by placing a dropout layer after each inception block and just before the output layer. The system outputs softmax probabilities, which are stored in the feature database and used to compute a similarity index for retrieving images similar to the query image. The MaxNet model is evaluated on four popular image retrieval datasets, namely Corel-1k, Corel-5k, Corel-10k, and Caltech-101, where it outperforms state-of-the-art methods on key performance indicators.
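To make the aggregation and retrieval steps concrete, the following is a minimal sketch, not the authors' implementation. It assumes three convolutional pipelines (1x1, 3x3, and 5x5 kernels), an element-wise maximum as the aggregation, cosine similarity as the similarity index, and a dropout rate of 0.4; none of these specifics are fixed by the abstract.

```python
# Illustrative sketch of the two ideas described in the abstract:
# (1) merging inception-branch features by an element-wise maximum,
# (2) retrieval by a similarity index over stored softmax feature vectors.
# Branch widths, kernel sizes, dropout rate, and the choice of cosine
# similarity are assumptions, not details taken from the paper.
import torch
import torch.nn as nn
import torch.nn.functional as F


class MaxInceptionBlock(nn.Module):
    """Inception-style block whose parallel pipelines are merged by an
    element-wise maximum rather than the usual channel concatenation."""

    def __init__(self, in_ch: int, out_ch: int):
        super().__init__()
        self.b1 = nn.Conv2d(in_ch, out_ch, kernel_size=1)
        self.b3 = nn.Conv2d(in_ch, out_ch, kernel_size=3, padding=1)
        self.b5 = nn.Conv2d(in_ch, out_ch, kernel_size=5, padding=2)
        # Dropout after each inception block, as the abstract describes.
        self.dropout = nn.Dropout(p=0.4)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Stack the branch outputs along a new axis and take the
        # element-wise max across branches; channel count stays fixed.
        branches = torch.stack([self.b1(x), self.b3(x), self.b5(x)], dim=0)
        return self.dropout(branches.max(dim=0).values)


def retrieve(query_feat: torch.Tensor, db_feats: torch.Tensor, k: int = 5):
    """Rank database images by similarity between softmax feature vectors.

    query_feat: (D,) softmax output for the query image.
    db_feats:   (M, D) stored softmax outputs for the database images.
    Returns the indices of the k most similar images.
    """
    sims = F.cosine_similarity(query_feat.unsqueeze(0), db_feats, dim=1)
    return sims.topk(k).indices


# Example: run one block on a dummy RGB image batch.
# feats = MaxInceptionBlock(3, 64)(torch.randn(1, 3, 224, 224))
```

One consequence of this design choice: unlike the channel concatenation used in the original inception module, an element-wise maximum keeps the channel count fixed, so blocks can be stacked hierarchically without the network widening at every stage.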
