Abstract

Traditional zero-shot object-detection algorithms detect objects of classes not seen during training with the help of semantic embeddings. However, these approaches may perform poorly because the semantic embeddings are fixed. Because fixed semantic attributes limit the model's ability to generalize, a semantic enhancement mechanism is proposed that updates the semantic embedding to serve the needs of the visual space. Specifically, considering that the original semantic space is insufficient for constructing a visual-semantic mapping, an augmented semantic embedding (ASE) approach is designed to supplement the semantic attribute information. Then, a semantic channel attention mechanism is used to adjust the ASE; this adjustment strategy retains the attribute information that is highly relevant to visual features. Finally, to alleviate the domain shift problem, a clustering association strategy is introduced to establish an inferred relationship, which ensures that the predictor generalizes to the unseen domain during training. The superiority of the proposed method is demonstrated on the MS-COCO and PASCAL VOC datasets.
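
The abstract does not specify how the semantic channel attention is implemented. As a rough illustration of the general idea only, the sketch below applies a squeeze-and-excitation style gate to a single embedding vector: a small bottleneck produces one weight in (0, 1) per channel, and each channel is scaled by its weight. The function names, the bottleneck layout, the dimensions, and the random weights are all assumptions made for this example; they are not taken from the paper.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def matvec(matrix, vector):
    # Plain matrix-vector product on nested lists.
    return [sum(w * v for w, v in zip(row, vector)) for row in matrix]


def channel_attention(embedding, w_down, w_up):
    """Scale each channel of an embedding by a learned-style gate in (0, 1).

    Hypothetical helper: w_down and w_up stand in for trained weights.
    """
    # Bottleneck projection followed by ReLU.
    hidden = [max(0.0, h) for h in matvec(w_down, embedding)]
    # One sigmoid gate per embedding channel.
    gates = [sigmoid(g) for g in matvec(w_up, hidden)]
    adjusted = [g * e for g, e in zip(gates, embedding)]
    return adjusted, gates


def random_matrix(rows, cols, rng):
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]


# Toy setup: an 8-channel embedding and a 2-unit bottleneck (assumed sizes).
rng = random.Random(0)
dim, bottleneck = 8, 2
embedding = [rng.uniform(-1.0, 1.0) for _ in range(dim)]
w_down = random_matrix(bottleneck, dim, rng)
w_up = random_matrix(dim, bottleneck, rng)

adjusted, gates = channel_attention(embedding, w_down, w_up)
```

In a trained model the two weight matrices would be learned jointly with the detector, so that channels relevant to the visual features receive gates near 1 and less relevant channels are suppressed; here they are random and serve only to show the data flow.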
