Abstract

Given an image, a scene graph provides a structured representation for downstream tasks. To generate fine-grained scene graphs, many researchers have attempted to alleviate the long-tailed dataset bias through energy-based learning, causal reasoning, and other mechanisms. Nevertheless, these methods remain limited by the biased dataset: head information is redundant while tail information is scarce. In this work, we propose a comprehensive model, the Balanced Award-Punishment Model (BAPM), to tackle this problem. The BAPM consists of a stochastic strategy module (SSM), a knowledge transfer module (KTM), and a lateral inhibition loss (LIL). Concretely, the SSM applies dropout to form two distinct domain spaces for the KTM, enhancing object-level tail information through continual learning from the other domain. The KTM acquires rich knowledge of tail predicates by transferring fine-grained information from one domain to the other. Together, the SSM and KTM act as a knowledge award (KA), since they incentivize tail data. The LIL mimics the competitive mechanism among neurons, smoothly adjusting the weight of each object and predicate via a focal strategy. We regard the LIL as a redundancy punishment (RP) because it restrains head data. Under this joint award-punishment scheme, our approach achieves state-of-the-art performance on two complementary metrics. Quantitative and qualitative results on the Visual Genome dataset show that BAPM further reduces dataset bias and generates more diverse scene graphs.
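The abstract's exact LIL formulation is not given here, but the "focal strategy" it invokes is the standard focal-loss idea: down-weight the loss of confidently classified (typically head-class) examples by a modulating factor so that hard tail examples dominate the gradient. A minimal sketch of that factor, with hypothetical function names and a scalar true-class probability `p_true` standing in for a full softmax output:

```python
import math

def focal_weight(p_true, gamma=2.0):
    """Focal modulating factor (1 - p)^gamma: near 0 for confident
    (head-class) predictions, near 1 for hard (tail-class) examples."""
    return (1.0 - p_true) ** gamma

def focal_ce(p_true, gamma=2.0):
    """Focal-style cross-entropy on the true-class probability p_true.
    With gamma = 0 this reduces to plain cross-entropy -log(p)."""
    return -focal_weight(p_true, gamma) * math.log(p_true)
```

For example, an easy prediction (`p_true = 0.9`) contributes roughly a hundredth of its plain cross-entropy, while a hard one (`p_true = 0.1`) keeps most of it; per-predicate weighting of this kind is how a focal strategy can smoothly restrain redundant head data.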
