Abstract

Cross-modal feature fusion is an important multifeature fusion technique whose purpose is to implicitly discover the relationships between samples from different modalities, i.e., to retrieve images encoded with similar semantics given an example image. Over the past decade, cross-modal image retrieval has become a research hotspot, and it is now a significant tool for further improving image retrieval performance. We propose a long short-term memory (LSTM)-based feature fusion model. First, motivated by the competitiveness of non-hybrid deep architectures for image retrieval, we introduce the mechanism of the LSTM in detail; ground-truth-based methods are used to strengthen cross-modal correspondence. We observe that the LSTM can closely mimic human visual understanding of image semantics. To improve the accuracy of cross-modal image retrieval, we adopt binary representations that improve cross-modal similarity measurement and the effectiveness of information recovery. Second, we use a quality model to assess the commonly used low- and high-level visual features of images, discarding disqualified features accordingly; this yields an optimal set of highly descriptive features for image retrieval. Finally, we combine the LSTM with the refined visual features to build a biologically inspired model for image retrieval in which multimodal features are optimally fused at the temporal level. Extensive experiments on multiple well-known image datasets demonstrate the superiority of our method.
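As a concrete illustration of the quality-based feature selection step described above, the sketch below scores each candidate visual feature channel and discards the disqualified ones. The variance-based quality score, the threshold, and all names here are assumptions for illustration only; the abstract does not specify the authors' actual quality measure.

```python
import numpy as np

def quality_filter(features, threshold=0.1):
    """Keep only descriptive feature channels.

    features: (n_samples, n_features) array of low-/high-level
    visual descriptors extracted from images.
    Hypothetical quality score: per-channel variance across samples
    (a low-variance channel carries little discriminative information).
    """
    quality = features.var(axis=0)      # one quality score per channel
    keep = quality > threshold          # disqualified channels are dropped
    return features[:, keep], keep

# Usage: 100 images, 64 candidate feature channels
feats = np.random.rand(100, 64)
refined, mask = quality_filter(feats)
print(refined.shape, mask.sum())        # refined, highly descriptive subset
```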
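And a minimal sketch of the LSTM-based fusion itself, assuming PyTorch: the refined features of each modality are fed to the LSTM as successive time steps, so that multimodal cues combine at the temporal level and the final hidden state serves as the fused retrieval embedding. Layer sizes and names are illustrative, not the authors' architecture.

```python
import torch
import torch.nn as nn

class LSTMFusion(nn.Module):
    """Fuse refined multimodal features at the temporal level:
    each modality's feature vector is one LSTM time step, and the
    final hidden state is projected to a fused retrieval embedding."""

    def __init__(self, feat_dim=512, hidden_dim=256, embed_dim=128):
        super().__init__()
        self.lstm = nn.LSTM(feat_dim, hidden_dim, batch_first=True)
        self.embed = nn.Linear(hidden_dim, embed_dim)

    def forward(self, feats):
        # feats: (batch, n_modalities, feat_dim) -- refined features,
        # ordered as a short sequence over modalities.
        _, (h_n, _) = self.lstm(feats)       # h_n: (1, batch, hidden_dim)
        return self.embed(h_n.squeeze(0))    # fused embedding for retrieval

# Usage: 8 images, 3 feature modalities, 512-dim features each
model = LSTMFusion()
fused = model(torch.randn(8, 3, 512))
print(fused.shape)                           # torch.Size([8, 128])
```

For the binary representation mentioned in the abstract, such a fused embedding could be binarized (e.g., by taking its sign) to enable efficient Hamming-distance retrieval, though the abstract does not detail the authors' binarization scheme.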
