Abstract

Effective video-based detection methods are of great importance to intelligent transportation systems (ITS). Here we propose a method to localize and label objects that is able to detect pedestrians and bicycle riders in a complex scene. The method is inspired by the common fate principle, a mechanism of human visual perception which states that tokens moving or functioning in a similar manner tend to be perceived as one unit. We embed this principle in an Implicit Shape Model (ISM). Keypoint-based object parts are first detected and then grouped by their motion patterns. Based on the grouping results, when the object parts vote for object centers and labels, each vote cast by an object part is assigned a weight according to its consistency with the votes of the other object parts in the same motion group. The Hough image is formed by summing all weighted votes, and its peaks, which correspond to detection hypotheses, become easier to find. As a result, our method performs better in both position and label estimation. Experiments show the effectiveness of our method in terms of detection accuracy.

Highlights

  • In intelligent transportation systems (ITS) areas, detection methods using cameras can be used for navigation, safe driving, surveillance, and sustaining results from other sensors

  • Our work is related to object detection methods which use trajectories [3, 4], methods using the weighting of features [25], methods dealing with codebook noise [17], and methods which integrate temporal information [24]

  • It is very possible that this detection ability benefits from multiple perceptual mechanisms

Summary

Introduction

In ITS areas, detection methods using cameras can be used for navigation, safe driving, surveillance, and sustaining results from other sensors. Besides the attractive performance and the extendibility of combining various kernels, these methods are favorable because they consider each object as a whole during detection. They share limited aspects with visual perception in human beings, and their efficiency heavily relies on the size of the test images. The common fate principle is one of the visual perception principles theorized by gestalt psychologists, and it states that, for human beings, tokens moving coherently are perceptually grouped. This provides an intuition to group the object parts by their motion patterns and let them vote afterwards. Due to the combination of motion analysis results and the Hough transform framework, and by assigning different weights to each object part's votes, the proposed method has several appealing properties.
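The weighted voting idea can be illustrated with a minimal sketch. It is not the paper's formulation: the consistency measure (the number of other parts in the same motion group that cast a vote within `radius` pixels), the per-part normalization, the `eps` floor, and the function names `common_fate_weights` and `hough_image` are all assumptions made for illustration. Label votes are omitted and only object centers are accumulated.

```python
import numpy as np

def common_fate_weights(votes, groups, radius=5.0, eps=1e-3):
    """Assumed weighting scheme, for illustration only.

    votes  : sequence of (part_id, x, y) center votes
    groups : dict mapping part_id -> motion group id
    Returns one weight per vote; the weights of each part's votes sum to 1.
    """
    votes = np.asarray(votes, dtype=float)
    part = votes[:, 0].astype(int)
    xy = votes[:, 1:3]
    group = np.array([groups[p] for p in part])

    # Pairwise distances between all vote positions.
    dist = np.linalg.norm(xy[:, None, :] - xy[None, :, :], axis=2)

    # Vote j supports vote i if it is close, comes from a different part,
    # and that part belongs to the same motion group.
    support = (
        (dist <= radius)
        & (group[:, None] == group[None, :])
        & (part[:, None] != part[None, :])
    )

    # Raw consistency: number of distinct supporting parts, plus a small
    # floor so that a part without any support still casts its votes.
    raw = np.array([len(set(part[row])) for row in support], dtype=float) + eps

    # Normalize so that each part distributes a total weight of 1.
    weights = np.empty_like(raw)
    for p in np.unique(part):
        mask = part == p
        weights[mask] = raw[mask] / raw[mask].sum()
    return weights

def hough_image(votes, weights, shape):
    """Sum the weighted votes into a 2-D accumulator indexed as [y, x]."""
    img = np.zeros(shape)
    for (_, x, y), w in zip(np.asarray(votes, dtype=float), weights):
        img[int(round(y)), int(round(x))] += w
    return img
```

With three parts in one motion group that each cast one vote near a shared center and one stray vote, the stray votes receive weights close to zero under this scheme, so the accumulator peak around the shared center stands out more clearly than it would with uniform weights.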

Related Work
Common Fate Hough Transform
Common Fate Weights
Motion Grouping
Results
Codebook
Detection
Campus-scene Detection
Wild-scene Detection
Conclusion