Abstract

Automatic image annotation systems in the literature typically concatenate color, texture, and/or shape features into a single feature vector and learn a set of high-level semantic categories with a single learning machine. This approach is rather naive for mapping visual features to high-level semantic information about the categories. Concatenating many features with different visual properties and wide dynamic ranges may lead to the curse of dimensionality and to redundancy. It also usually requires normalization, which may introduce an undesirable distortion of the feature space. An elegant way to reduce these effects is to design a dedicated feature space for each image category, depending on its content, and to learn a range of visual properties of the whole image from a variety of feature sets. To this end, a two-layer ensemble learning system, called Supervised Annotation by Descriptor Ensemble (SADE), is proposed. SADE first extracts a variety of low-level visual descriptors from the image. Each descriptor is then fed to a separate learning machine in the first layer. Finally, a meta-layer classifier is trained on the outputs of the first-layer classifiers, and images are annotated according to the decision of the meta-layer classifier. This approach not only avoids normalization but also mitigates the effects of the curse of dimensionality and redundancy. The proposed system outperforms a state-of-the-art automatic image annotation system in an equivalent experimental setup.
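The two-layer scheme described above can be sketched as a stacked ensemble: one base classifier per descriptor, with a meta-classifier trained on the base classifiers' class-probability outputs. The sketch below is illustrative only, not the paper's implementation; the descriptor names, dimensionalities, synthetic data, and choice of SVM/logistic-regression learners are all assumptions.

```python
# Illustrative SADE-style stacking sketch (NOT the paper's code).
# Each "descriptor" (color, texture, shape -- hypothetical names) gets its
# own dedicated first-layer classifier, so no cross-descriptor
# concatenation or normalization is needed.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

rng = np.random.default_rng(0)
n = 300
labels = rng.integers(0, 3, n)  # 3 synthetic semantic categories

# Synthetic descriptors with different dimensionalities and dynamic ranges,
# mimicking why naive concatenation would need normalization.
descriptors = {
    "color":   rng.normal(labels[:, None], 1.0, (n, 8)),
    "texture": rng.normal(labels[:, None] * 10, 5.0, (n, 16)),
    "shape":   rng.normal(labels[:, None] * 0.1, 0.05, (n, 4)),
}

# Same sample partition for every descriptor: one half trains the first
# layer, the other half trains the meta-layer.
idx_base, idx_meta = train_test_split(np.arange(n), test_size=0.5,
                                      random_state=0)

# Layer 1: a separate learning machine per descriptor.
base = {name: SVC(probability=True).fit(X[idx_base], labels[idx_base])
        for name, X in descriptors.items()}

# Meta-layer input: stacked class probabilities from all base classifiers.
meta_X = np.hstack([base[name].predict_proba(X[idx_meta])
                    for name, X in descriptors.items()])
meta_clf = LogisticRegression(max_iter=1000).fit(meta_X, labels[idx_meta])

print("meta-layer accuracy:", meta_clf.score(meta_X, labels[idx_meta]))
```

An image is then annotated by running all descriptors through their base classifiers, stacking the probabilities, and taking the meta-classifier's decision.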
