Abstract
The key procedure is to accurately identify pedestrians in complex scenes and effectively embed features from multiple vision cues. However, it is still a limitation to coordinate two tasks in the unified framework, thus leading to high computational overhead and unsatisfactory search performance. Furthermore, most methods do not take significant clues and key features of pedestrians into consideration. To remedy these issues, we introduce a novel method named Multi-Attention-Guided Cascading Network (MGCN) in this paper. Specifically, we obtain the trusted bounding box through the detection header as the label information for post-process. Based on the end-to-end network, we demonstrate the advantages of jointly learning to construct the bounding box and attention module by maximizing the complementary information from different attention modules, which can achieve optimized person search performance. Meanwhile, by imposing an aligning module on re-id feature extracted network to locate visual clues with semantic information, which can restrain redundant background information. Extensive experimental results for the two benchmark person search datasets are provided to demonstrate that the proposed MGCN markedly outperforms the state-of-the-art baselines.
Talk to us
Join us for a 30 min session where you can share your feedback and ask us any queries you have
Disclaimer: All third-party content on this website/platform is and will remain the property of their respective owners and is provided on "as is" basis without any warranties, express or implied. Use of third-party content does not indicate any affiliation, sponsorship with or endorsement by them. Any references to third-party content is to identify the corresponding services and shall be considered fair use under The CopyrightLaw.