Abstract

Clothing parsing has made tremendous progress in computer vision in recent years. Most state-of-the-art methods are built on the encoder-decoder architecture, yet they largely neglect two problems: uncalibrated features within blocks and semantic dilution between blocks. In this work, we propose an unabridged adjacent modulation network (UAM-Net) that aggregates multi-level features for clothing parsing. We first build an unabridged channel attention (UCA) mechanism that recalibrates the feature maps within each block. We further design a top-down adjacent modulation (TAM) module for the decoder blocks; through TAM, high-level semantic information and visual context are gradually transferred to lower-level layers without loss. The joint use of UCA and TAM gives the encoder an enhanced feature-representation ability and ensures that the low-level decoder features carry rich semantic context. Quantitative and qualitative results on two challenging benchmarks (i.e., Colorful Fashion Parsing and Modified Fashion Clothing) show that our UAM-Net achieves accuracy competitive with state-of-the-art methods. The source code is available at: https://github.com/ctzuo/UAM-Net.
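The abstract does not spell out the internals of UCA and TAM; for intuition, below is a minimal PyTorch sketch of the two general mechanisms it names: squeeze-and-excitation-style channel attention for feature recalibration, and top-down fusion that injects a higher-level (more semantic) decoder feature into the adjacent lower-level one. The class names, the reduction ratio, and the fusion-by-addition choice are illustrative assumptions, not the authors' exact modules.

```python
import torch
import torch.nn as nn

class ChannelAttention(nn.Module):
    """Illustrative channel attention (SE-style): recalibrates per-channel
    responses using globally pooled statistics. A stand-in for the general
    idea behind UCA, not the paper's exact module."""
    def __init__(self, channels, reduction=4):
        super().__init__()
        self.pool = nn.AdaptiveAvgPool2d(1)
        self.fc = nn.Sequential(
            nn.Linear(channels, channels // reduction),
            nn.ReLU(inplace=True),
            nn.Linear(channels // reduction, channels),
            nn.Sigmoid(),
        )

    def forward(self, x):
        b, c, _, _ = x.shape
        # Squeeze to per-channel statistics, excite to per-channel weights.
        w = self.fc(self.pool(x).view(b, c)).view(b, c, 1, 1)
        return x * w  # reweight (recalibrate) each channel

class TopDownModulation(nn.Module):
    """Illustrative top-down fusion: projects the adjacent higher-level
    decoder feature to the lower level's channel width, upsamples it, and
    adds it in, so semantics propagate toward low-level layers."""
    def __init__(self, high_channels, low_channels):
        super().__init__()
        self.proj = nn.Conv2d(high_channels, low_channels, kernel_size=1)

    def forward(self, low, high):
        high = self.proj(high)
        high = nn.functional.interpolate(
            high, size=low.shape[-2:], mode="bilinear", align_corners=False)
        return low + high  # inject high-level context into the low-level map

# Usage sketch with made-up feature shapes:
low = torch.randn(1, 64, 56, 56)    # lower-level decoder feature
high = torch.randn(1, 128, 28, 28)  # adjacent higher-level feature
fused = TopDownModulation(128, 64)(ChannelAttention(64)(low), high)
```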
