Abstract

Human parsing aims to segment the human body into distinct semantic regions. Despite recent progress, current methods still suffer from two problems: indistinct boundaries and inconsistent parsing. In this paper, we investigate how to use structural information and auxiliary information jointly to address both problems. Drawing inspiration from the Transformer architecture, we propose a Boundary-guided Part Reasoning Network (BPRNet) that combines edge information with the associated semantics of body parts for human parsing. Specifically, we design a part representation module that represents human body parts as part features. Based on the Transformer decoder, multi-head self-attention captures the semantic correlations among body parts. Moreover, we propose a boundary-guided module consisting of absolute boundary attention and reinforced boundary attention, which use edge information and multi-scale image features to jointly constrain cross-attention when extracting global features. Experiments on three public datasets show that the proposed method performs favorably against state-of-the-art methods.
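
The abstract does not say how edge information constrains cross-attention, so the following is only an illustrative sketch of the general idea, not BPRNet's actual formulation. It assumes that part features act as queries over flattened pixel features and that a predicted edge probability per pixel is added as a bias to the attention logits. The function name, the additive bias, and the `bias_weight` parameter are all hypothetical.

```python
import math


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def edge_biased_cross_attention(part_queries, pixel_feats, edge_prob, bias_weight=1.0):
    """Hypothetical sketch: cross-attention with an additive edge bias.

    part_queries: list of P query vectors (one per body part), each of length D
    pixel_feats:  list of N pixel feature vectors, each of length D
    edge_prob:    list of N edge probabilities in [0, 1] (assumed to be given)
    bias_weight:  assumed scalar controlling how strongly edges steer attention
    """
    d = len(pixel_feats[0])
    scale = 1.0 / math.sqrt(d)
    outputs, all_weights = [], []
    for q in part_queries:
        # Scaled dot-product logits, shifted toward pixels that lie on edges.
        logits = [
            scale * sum(qi * ki for qi, ki in zip(q, k)) + bias_weight * e
            for k, e in zip(pixel_feats, edge_prob)
        ]
        weights = softmax(logits)
        # Attention output: weighted sum of pixel features (used as values).
        out = [sum(w * k[i] for w, k in zip(weights, pixel_feats)) for i in range(d)]
        outputs.append(out)
        all_weights.append(weights)
    return outputs, all_weights


# Two identical pixels; only the second lies on an edge.
_, weights = edge_biased_cross_attention(
    part_queries=[[1.0, 0.0]],
    pixel_feats=[[1.0, 0.0], [1.0, 0.0]],
    edge_prob=[0.0, 1.0],
)
```

In this toy case the two pixels have identical features, so any difference in attention weight comes solely from the edge bias. Setting `bias_weight=0.0` recovers plain scaled dot-product cross-attention.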
