Abstract
Existing image annotation methods that employ convolutional deep-learning features in the Internet of Things (IoT) have several limitations, including complex training and high space and time costs during annotation. Accordingly, this paper proposes a method in which the visual features of an image are represented by the intermediate-layer features of a deep learning model, while semantic concepts are represented by mean vectors of positive samples. First, low-level visual features are taken directly from the convolutional output of an intermediate layer of a pre-trained deep learning model, and the image is then represented by sparse coding in the IoT. Second, the positive-mean-vector method constructs a visual feature vector for each text vocabulary item, yielding a visual feature vector database. Finally, the similarity between the visual feature vector of a test image and that of every vocabulary item is computed, and the items with the highest similarity are selected in the IoT as the annotation words. Experiments on multiple datasets demonstrate the effectiveness of the proposed method; in terms of F1 score, its performance on the Corel5k and IAPR TC-12 datasets is superior to that of MBRM, JEC-AF, JEC-DF, and 2PKNN with end-to-end deep features.
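The annotation pipeline described above (building a prototype vector per vocabulary word from the mean of its positive samples, then ranking words by similarity to a test image's feature vector) can be sketched as follows. This is a minimal illustration under stated assumptions, not the paper's implementation: it assumes image features have already been extracted from an intermediate CNN layer and sparse-coded, and the function names and cosine-similarity choice are hypothetical.

```python
import numpy as np

def positive_mean_vectors(features, labels, vocab_size):
    """Build one visual prototype per vocabulary word by averaging the
    feature vectors of the training images labeled with that word.

    features : (n_images, dim) array of precomputed visual features
    labels   : list of n_images sets, each holding the word indices of an image
    """
    dim = features.shape[1]
    prototypes = np.zeros((vocab_size, dim))
    for w in range(vocab_size):
        mask = np.array([w in lab for lab in labels])
        if mask.any():
            # mean vector of the positive samples for word w
            prototypes[w] = features[mask].mean(axis=0)
    return prototypes

def annotate(feature, prototypes, k=5):
    """Return the indices of the k vocabulary words whose prototype is most
    similar (cosine similarity, an assumed choice) to the test feature."""
    sims = prototypes @ feature / (
        np.linalg.norm(prototypes, axis=1) * np.linalg.norm(feature) + 1e-12
    )
    return np.argsort(-sims)[:k]

# Toy usage with 3 images, 2-D features, and a 2-word vocabulary.
feats = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
labs = [{0}, {1}, {0, 1}]
protos = positive_mean_vectors(feats, labs, vocab_size=2)
words = annotate(np.array([1.0, 0.0]), protos, k=1)  # nearest word for a test image
```

The prototype database needs to be built only once from the training set, which is what keeps annotation cheap at test time: labeling an image reduces to one matrix-vector similarity computation against the vocabulary.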