Abstract
This paper focuses on detecting vehicles in different target scenes with the same pre-trained detector, which is very challenging due to view variations. To address this problem, we propose a novel approach for detection adaptation based on scene transformation, which contributes both a view transformation and automatic parameter estimation. Instead of modifying the pre-trained detector, we transform scenes into the frontal/rear view, handling pitch and yaw view variations. Without human interaction, using only some general prior knowledge, the transformation parameters are automatically initialized and then optimized online with spatial–temporal voting, which guarantees that the transformation matches the pre-trained detector. Since there is no need to label new samples or manually calibrate cameras, our approach considerably reduces manual interaction. Experiments on challenging real-world videos demonstrate that our approach achieves significant improvements over the pre-trained detector, and it is even comparable to a detector trained on fully labeled sequences.
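To make the view transformation concrete, the following is a minimal sketch of warping a scene toward a frontal view with a rotation-induced homography H = K R K^-1, compensating pitch and yaw. The intrinsics K, the angle values, and the input file name are illustrative assumptions; the paper's automatic parameter initialization and spatial–temporal voting optimization are not reproduced here.

```python
import numpy as np
import cv2

def rotation_homography(K, pitch_deg, yaw_deg):
    """Homography induced by a pure camera rotation (pitch then yaw):
    H = K R K^-1, mapping the original view toward a frontal view."""
    p, y = np.radians(pitch_deg), np.radians(yaw_deg)
    Rx = np.array([[1.0, 0.0, 0.0],
                   [0.0, np.cos(p), -np.sin(p)],
                   [0.0, np.sin(p),  np.cos(p)]])   # pitch about x-axis
    Ry = np.array([[ np.cos(y), 0.0, np.sin(y)],
                   [0.0, 1.0, 0.0],
                   [-np.sin(y), 0.0, np.cos(y)]])   # yaw about y-axis
    R = Ry @ Rx
    return K @ R @ np.linalg.inv(K)

# Hypothetical values: in the paper these would come from the automatic
# initialization and online spatial-temporal voting, not be hand-set.
K = np.array([[800.0,   0.0, 640.0],
              [  0.0, 800.0, 360.0],
              [  0.0,   0.0,   1.0]])               # assumed intrinsics
frame = cv2.imread("scene.jpg")                     # placeholder input frame
H = rotation_homography(K, pitch_deg=-15.0, yaw_deg=20.0)
frontal = cv2.warpPerspective(frame, H, (frame.shape[1], frame.shape[0]))
# 'frontal' can now be fed to the unmodified pre-trained frontal/rear-view detector.
```

The key design point is that adaptation happens in image space: the scene is rectified to match the detector's training view, so the detector itself never needs retraining or relabeled samples.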