Abstract

Exploring how image data can be represented and improving the efficiency with which representational knowledge is extracted are essential to advancing Internet of Things technology. Real-world images usually contain multiple objects that are closely interdependent, which poses great challenges for robust representation learning of multilabel images. Researchers typically model the relationships between objects based on class activation maps and use graph convolution to mine the dependencies between them. However, graph-structured data often contain noise: edges between nodes are not always reliable, and neighbors differ in their relative importance. Our goal is therefore to reduce noisy and false connections between objects, eliminate representation bias in multilabel images, and learn robust representations. To this end, we propose a robust representation learning method for multilabel images driven by a graph attention network (RRL-GAT). Specifically, to reduce accidental false connections between objects in an image, we propose a class attention graph convolution module (C-GAT) that mines the strong association structure between categories. In addition, to model the dynamic correlations between objects, we propose an adaptive graph attention convolution module (A-GAT) that captures subtle dynamic dependencies within the image. Results on two authoritative benchmark datasets show that our method significantly outperforms current state-of-the-art methods. Moreover, visualization results show that RRL-GAT captures the semantic relationships of a specific input image and is sufficiently discriminative.
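The abstract does not give implementation details, but the core idea of attention-weighted message passing over a label graph can be illustrated with a minimal sketch. The code below is an assumption-laden illustration, not the paper's C-GAT/A-GAT implementation: the class name `LabelGATLayer`, the use of word-vector label embeddings, and the random co-occurrence adjacency are all hypothetical, chosen only to show how attention coefficients can down-weight unreliable edges between category nodes.

```python
# Minimal sketch (PyTorch) of a single graph-attention layer over label
# embeddings. Hypothetical illustration only; not the authors' code.
import torch
import torch.nn as nn
import torch.nn.functional as F


class LabelGATLayer(nn.Module):
    def __init__(self, in_dim: int, out_dim: int):
        super().__init__()
        self.proj = nn.Linear(in_dim, out_dim, bias=False)   # shared projection W
        self.attn = nn.Linear(2 * out_dim, 1, bias=False)    # attention vector a
        self.leaky_relu = nn.LeakyReLU(0.2)

    def forward(self, x: torch.Tensor, adj: torch.Tensor) -> torch.Tensor:
        # x:   (C, in_dim)  label embeddings (e.g., one word vector per class)
        # adj: (C, C)       binary co-occurrence adjacency (1 = edge present)
        h = self.proj(x)                                      # (C, out_dim)
        C = h.size(0)
        # Pairwise attention logits e_ij = LeakyReLU(a^T [Wh_i || Wh_j])
        hi = h.unsqueeze(1).expand(C, C, -1)
        hj = h.unsqueeze(0).expand(C, C, -1)
        e = self.leaky_relu(self.attn(torch.cat([hi, hj], dim=-1))).squeeze(-1)
        # Mask non-edges so absent or unreliable connections get ~zero weight
        e = e.masked_fill(adj == 0, float("-inf"))
        alpha = F.softmax(e, dim=-1)                          # per-node neighbor weights
        return F.elu(alpha @ h)                               # aggregated label features


# Usage sketch: 20 classes, 300-d label embeddings -> 512-d relation-aware features
layer = LabelGATLayer(300, 512)
x = torch.randn(20, 300)                                      # placeholder embeddings
adj = (torch.rand(20, 20) > 0.7).float()                      # placeholder adjacency
adj.fill_diagonal_(1.0)                                       # keep self-loops
out = layer(x, adj)                                           # (20, 512)
```

The key design point the sketch highlights is that the learned softmax weights `alpha` let each category node emphasize strongly associated neighbors and suppress noisy or accidental edges, which is the behavior the abstract attributes to attention-driven graph convolution.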
