Managing large-scale image data has become an important research issue due to the considerable growth of digital images in recent years. To retrieve images effectively by semantic keywords, the corresponding images must be annotated with appropriate concept labels in advance. Many image annotation approaches and models have been proposed in recent years. However, most models analyze only one of the relationships between image visual features and concept texts. In this paper, all possible cross image-text relationships, including image-to-text, text-to-text, and image-to-image, are considered and discussed. A set of hybrid learning models based on the proposed cross image-text annotation framework is developed and implemented by means of image classifiers, similarity-based image matching, and association mining of image labels. The goal of the experiments is to investigate the performance of the cross image-text framework by evaluating the effectiveness of different annotation models, including individual models, bi-hybrid models, and the all-hybrid model. The results show that not all of the hybrid models improve the accuracy of image annotation. In general, the hybrid models that combine relationships over both images and text boost the effectiveness of annotation.
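As a minimal sketch of how the three relationship types might be combined in an all-hybrid annotation model, the following Python example mixes an image-to-text classifier, image-to-image nearest-neighbor matching, and text-to-text label co-occurrence propagation. The label vocabulary, function names, mixing weights, and the co-occurrence matrix used as a stand-in for association mining are all illustrative assumptions; the abstract does not specify the paper's actual models or parameters.

```python
import numpy as np

# Hypothetical label vocabulary; the paper's actual concept set is not given.
LABELS = ["sky", "beach", "tree", "car", "person"]

def image_to_text_scores(features, weights):
    """Image-to-text: a linear classifier mapping visual features to
    per-label confidences (softmax); stands in for the image classifiers."""
    logits = features @ weights
    exp = np.exp(logits - logits.max())
    return exp / exp.sum()

def image_to_image_scores(features, train_features, train_labels, k=3):
    """Image-to-image: propagate labels from the k most similar training
    images (cosine similarity); stands in for similarity image matching."""
    sims = train_features @ features / (
        np.linalg.norm(train_features, axis=1) * np.linalg.norm(features) + 1e-9)
    top = np.argsort(sims)[-k:]
    return train_labels[top].mean(axis=0)

def text_to_text_scores(seed_scores, cooccurrence):
    """Text-to-text: expand label confidences through a label co-occurrence
    matrix; a crude stand-in for association mining of image labels."""
    propagated = cooccurrence @ seed_scores
    return propagated / (propagated.max() + 1e-9)

def hybrid_annotate(features, train_features, train_labels, weights,
                    cooccurrence, alpha=0.4, beta=0.3, gamma=0.3, top_n=2):
    """All-hybrid model: a weighted mix of the three relationship scores.
    The mixing weights here are assumed, not taken from the paper."""
    s1 = image_to_text_scores(features, weights)
    s2 = image_to_image_scores(features, train_features, train_labels)
    s3 = text_to_text_scores(s1, cooccurrence)
    combined = alpha * s1 + beta * s2 + gamma * s3
    return [LABELS[i] for i in np.argsort(combined)[-top_n:][::-1]]

# Toy usage with random vectors in place of real visual features.
rng = np.random.default_rng(0)
train_features = rng.normal(size=(20, 8))
train_labels = rng.integers(0, 2, size=(20, len(LABELS))).astype(float)
weights = rng.normal(size=(8, len(LABELS)))
cooccurrence = train_labels.T @ train_labels  # label co-occurrence counts
query = rng.normal(size=8)
print(hybrid_annotate(query, train_features, train_labels, weights, cooccurrence))
```

Dropping one of the three scoring functions from the weighted sum yields a bi-hybrid variant, and using a single function alone corresponds to an individual model, mirroring the comparison described in the experiments.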