Abstract

Holistic scene understanding is a challenging problem in computer vision. Most recent researches in this field were focusing on the object detection, the semantic segmentation and the relationship detection tasks. The attribute can provide meaningful information for the object instance, thus the object instance can be expressed more detail in the scene understanding. However, most researches in this field have been limited to several special conditions. Such as, several researches were just focusing on the attribute of special object class, because their solutions were aimed at a limited-scenarios, their methods are hardly to generalize in other scenarios. We also find that most of the research for multi-attribute detection task were only regarding each attribute as binary class and simply use the multi-binary-classifier method for the attribute detection. But these strategies above not consider the relation between each pair of the attributes, they will fall into trouble in the “imperfect” attribute dataset (which is labeled with the missing and incomplete annotations), and they will have low performance in the long-tail attribute class (which has lower rank of annotation and more missing labels). In this paper, we focus on the multi-attribute detection for a variant of object classes and take the relation between attributes into consideration. We propose a GRU-based model to detect a variable-length attribute sequence with a customized loss compute method to solve the “imperfect” attribute dataset problem. Furthermore, we perform ablative studies to prove the effectiveness of each part of our method. Finally, we compare our model with several existed multi-attribute detection methods on VG (Visual Genome) and CUB200 bird datasets to prove the superior performance of the proposed model.

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call

Disclaimer: All third-party content on this website/platform is and will remain the property of their respective owners and is provided on "as is" basis without any warranties, express or implied. Use of third-party content does not indicate any affiliation, sponsorship with or endorsement by them. Any references to third-party content is to identify the corresponding services and shall be considered fair use under The CopyrightLaw.