Abstract

Unsupervised domain adaptation for crowd counting (UDACC) has attracted practical research interest because it removes the burden of labeling large-scale dense crowds in the target domain. Current methods generalize well across a specific domain gap by directly aligning domain distributions or by translating synthetic data into realistic images. However, domain gaps are difficult to define among complex real-world datasets, where images vary greatly in style, density level, and content. To tackle this problem, we propose a Cross-Domain Attention Network (CDANet) that effectively generalizes to the unlabeled domain in both unsupervised synthetic-to-realistic and realistic-to-realistic crowd counting. Specifically, we propose a Cross-Domain Attention Module (CDAM) that learns domain-related information between the source and target domains by extracting relations from cross-domain attentive information, thereby enhancing crowd-informative features. Moreover, to make the CDAM invariant to domain shifts, we introduce a consistency penalty that keeps the attention maps consistent before and after the domain shift. CDANet can thus attend to the counting information shared across domains while retaining this invariance during domain adaptation. Extensive experiments on several common UDACC benchmarks demonstrate that CDANet achieves competitive results on both unsupervised synthetic-to-realistic and realistic-to-realistic UDACC tasks.
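To make the two components of the abstract concrete, the sketch below shows one plausible form of a cross-domain attention block (queries from one domain attending to keys/values from the other) together with an attention-map consistency penalty. This is a minimal illustration assuming a PyTorch-style implementation; the names `CrossDomainAttention` and `attention_consistency`, and all design details, are assumptions for exposition, not the paper's actual code.

```python
# Illustrative sketch (assumed PyTorch), not the authors' implementation.
import torch
import torch.nn as nn
import torch.nn.functional as F

class CrossDomainAttention(nn.Module):
    """Attend features from one domain (queries) with keys/values computed
    from the other domain, so the output emphasizes crowd cues shared by
    both domains."""

    def __init__(self, channels):
        super().__init__()
        # 1x1 convolutions project features into query/key/value spaces.
        self.query = nn.Conv2d(channels, channels // 8, kernel_size=1)
        self.key = nn.Conv2d(channels, channels // 8, kernel_size=1)
        self.value = nn.Conv2d(channels, channels, kernel_size=1)

    def forward(self, feat_q, feat_kv):
        b, c, h, w = feat_q.shape
        q = self.query(feat_q).flatten(2).transpose(1, 2)   # (B, HW, C')
        k = self.key(feat_kv).flatten(2)                    # (B, C', HW)
        v = self.value(feat_kv).flatten(2).transpose(1, 2)  # (B, HW, C)
        # Scaled dot-product attention over all spatial positions.
        attn = F.softmax(q @ k / (q.shape[-1] ** 0.5), dim=-1)  # (B, HW, HW)
        out = (attn @ v).transpose(1, 2).reshape(b, c, h, w)
        # Residual connection; also return the attention map so a
        # consistency penalty can be applied to it.
        return out + feat_q, attn


def attention_consistency(attn_before, attn_after):
    """Penalize differences between attention maps computed before and
    after a domain shift (e.g., on an image and its domain-translated
    copy), encouraging the attention to be domain-invariant."""
    return F.mse_loss(attn_after, attn_before.detach())
```

In a training loop under these assumptions, one would compute attention maps for a source image and for its shifted counterpart, then add `attention_consistency` to the counting loss as a regularizer.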
