Abstract

Semi-supervised crowd counting (SSCC) aims to learn a crowd counting model from a small number of labeled images and a large number of unlabeled images. Previous works exploit unlabeled images through pseudo-labeling and spatial consistency regularization, usually within a teacher-student framework. However, their performance degrades readily in complex crowd scenes, where the pseudo density maps are inconsistent and unreliable. Here, we argue that SSCC performance can be significantly improved by reducing over-fitting to incorrect pseudo labels, and we propose a novel spatial-temporal consistency framework, named STC-Crowd. Spatial consistency encourages the counting model to output consistent predictions for the same crowd image under different spatial perturbations. Temporal consistency encourages similar feature embeddings across adjacent training stages, alleviating the inconsistency of pseudo density maps generated for the same image over time. To store the temporal feature embeddings of different density levels for temporal consistency, we design a dynamic temporal knowledge memory (DTKM), which considerably reduces the storage cost. In addition, a pseudo-label filter (PLF) mechanism alleviates the negative impact of incorrect pseudo density maps by reducing the supervision weights of unreliable, high-uncertainty pseudo labels. Extensive experiments on four benchmark datasets show that our method achieves competitive performance against leading SSCC methods and is especially effective when labeled images are limited.
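
To make the idea of down-weighting unreliable pseudo labels concrete, the following is a minimal sketch and not the PLF as defined in the paper. The abstract states only that supervision weights are reduced for pseudo labels with high uncertainty. Everything else here is an assumption made for illustration: uncertainty is taken as the per-pixel variance across several teacher predictions of the same image, the weight is `exp(-uncertainty / temperature)`, density maps are flattened lists of floats, and the names `pixel_uncertainty`, `mean_map`, `filtered_pseudo_loss` and `temperature` are hypothetical.

```python
import math
from statistics import pvariance


def pixel_uncertainty(teacher_maps):
    """Per-pixel variance across several teacher predictions (assumed uncertainty proxy)."""
    n_pixels = len(teacher_maps[0])
    return [pvariance([m[i] for m in teacher_maps]) for i in range(n_pixels)]


def mean_map(teacher_maps):
    """Average the teacher predictions to form the pseudo density map."""
    n_maps = len(teacher_maps)
    n_pixels = len(teacher_maps[0])
    return [sum(m[i] for m in teacher_maps) / n_maps for i in range(n_pixels)]


def filtered_pseudo_loss(student_map, teacher_maps, temperature=1.0):
    """Weighted squared error; high-uncertainty pixels receive smaller weights.

    The exponential weighting is an illustrative choice, not taken from the paper.
    """
    pseudo = mean_map(teacher_maps)
    uncertainty = pixel_uncertainty(teacher_maps)
    weights = [math.exp(-u / temperature) for u in uncertainty]
    weighted_sq_err = sum(
        w * (s - p) ** 2 for w, s, p in zip(weights, student_map, pseudo)
    )
    # Guard against a zero denominator if every weight underflows.
    return weighted_sq_err / max(sum(weights), 1e-12)


# Two teacher predictions agree on pixel 0 and disagree on pixel 1.
teacher_maps = [[1.0, 0.0], [1.0, 4.0]]
loss_error_on_reliable_pixel = filtered_pseudo_loss([0.0, 2.0], teacher_maps)
loss_error_on_unreliable_pixel = filtered_pseudo_loss([1.0, 0.0], teacher_maps)
```

In this toy case, a student error on the pixel where the teacher predictions disagree contributes less to the loss than a smaller error on the pixel where they agree, which is the qualitative behavior the abstract attributes to the filter.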
