Abstract

Learning the correlation among labels is a long-standing problem in multi-label image recognition. Label correlation is the key to multi-label classification, but it is too abstract to model directly. Most solutions try to learn image-label dependencies to improve classification performance. However, they ignore two more practical problems: object scale inconsistency and label tail (category imbalance), both of which degrade the classification model. To tackle these two problems and learn the label correlations, we propose the Feature Attention Network (FAN), which consists of a feature refinement network and a correlation learning network. FAN builds a top-down feature fusion mechanism to refine the more important features and learns the correlations among the convolutional features it produces, thereby learning label dependencies indirectly. With the proposed solution, we achieve improved classification accuracy on the MS-COCO 2014 and VOC 2007 datasets.

Highlights

  • Multi-label image classification aims to recognize the different objects or attributes in images

  • To address the multi-label image classification task, we propose the Feature Attention Network, which mines more representative features and learns label correlations based on a self-attention mechanism

  • Correlation Learning Network integrates the multi-scale features produced by the Feature Refinement Network (a minimal sketch of this correlation step follows this list)
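A minimal sketch of the self-attention-style correlation learning named in the highlights, written in NumPy. The tensor shapes, the single-head formulation, and the helper names are illustrative assumptions, not the paper's exact implementation.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax along the given axis.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def correlation_learning(features, d_attn=64, seed=0):
    """Single-head self-attention over a set of multi-scale feature vectors.

    features: (N, C) array, one C-dim descriptor per scale/region
              (assumed to come from the Feature Refinement Network).
    Returns an (N, C) array in which each descriptor is a correlation-weighted
    mixture of all descriptors, so co-occurring cues can reinforce each other.
    """
    rng = np.random.default_rng(seed)
    n, c = features.shape
    # Learned projections in the real model; random placeholders here.
    w_q = rng.standard_normal((c, d_attn)) / np.sqrt(c)
    w_k = rng.standard_normal((c, d_attn)) / np.sqrt(c)
    w_v = rng.standard_normal((c, c)) / np.sqrt(c)

    q, k, v = features @ w_q, features @ w_k, features @ w_v
    attn = softmax(q @ k.T / np.sqrt(d_attn), axis=-1)  # (N, N) correlation map
    return attn @ v                                      # correlation-aware features

# Toy usage: 3 scales, 256-dim descriptors.
refined = np.random.rand(3, 256).astype(np.float32)
print(correlation_learning(refined).shape)  # (3, 256)
```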


Summary

INTRODUCTION

Multi-label image classification aims to recognize the different objects or attributes present in an image. The Correlation Learning Network integrates the multi-scale features produced by the Feature Refinement Network. It explicitly exploits feature intensity and spatial information to produce a new feature representation that accounts for label correlation and further alleviates the object scale inconsistency and label tail problems. The top-level features of a deep neural network carry rich semantic information and, despite their small spatial size, have a large receptive field, which is useful for recognizing larger objects. We show how the Feature Refinement Network addresses object scale inconsistency and label tail: a weighted vector serves as an attention vector that recalibrates the features, highlighting those that are useful for small-object and tail-label recognition. Global max pooling is used in the Feature Refinement Block to capture context information, which also benefits the recognition of small objects and tail labels.
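To make the refinement step above concrete, here is a minimal NumPy sketch under stated assumptions: global max pooling produces a per-channel context vector, a (placeholder) linear layer plus sigmoid turns it into an attention vector, and the feature map obtained from a simple top-down fusion is recalibrated channel-wise. The shapes, the nearest-neighbour upsampling, and the placeholder projection are assumptions for illustration, not the paper's exact design.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def refine(low_feat, high_feat, seed=0):
    """Top-down fusion + channel recalibration (illustrative only).

    low_feat:  (C, H, W) shallow feature map (fine spatial detail).
    high_feat: (C, h, w) deep feature map (rich semantics, small size).
    """
    rng = np.random.default_rng(seed)
    c, H, W = low_feat.shape

    # Top-down path: upsample the semantic map to the low-level resolution
    # (nearest-neighbour stands in for the learned upsampling).
    scale_h, scale_w = H // high_feat.shape[1], W // high_feat.shape[2]
    up = np.repeat(np.repeat(high_feat, scale_h, axis=1), scale_w, axis=2)
    fused = low_feat + up

    # Global max pooling captures context per channel; a placeholder linear
    # layer plus sigmoid yields the attention vector used for recalibration.
    gmp = fused.max(axis=(1, 2))                     # (C,)
    w_fc = rng.standard_normal((c, c)) / np.sqrt(c)  # learned in the real model
    attn = sigmoid(w_fc @ gmp)                       # (C,) attention vector

    # Recalibrate: emphasize channels useful for small objects / tail labels.
    return fused * attn[:, None, None]

# Toy usage: 64-channel maps, 32x32 low-level and 8x8 high-level.
low = np.random.rand(64, 32, 32).astype(np.float32)
high = np.random.rand(64, 8, 8).astype(np.float32)
print(refine(low, high).shape)  # (64, 32, 32)
```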

LOSS FUNCTION
EXPERIMENTS
DATASET MSCOCO2014
IMPLEMENTATION DETAILS
Our deep neural model contains two parts: the Feature Refinement Network and the Correlation Learning Network.
EVALUATION