Abstract

Moving object detection and tracking from image sequences has been extensively studied in a variety of fields. Nevertheless, observing the geometric attributes of detected objects and identifying them for further investigation of moving behavior has drawn less attention. The focus of this study is to determine moving trajectories and object heights and to recognize object classes using a monocular camera configuration. This paper presents a scheme for moving object recognition with three-dimensional (3D) observation using a faster region-based convolutional neural network (Faster R-CNN) with a stationary and rotating Pan Tilt Zoom (PTZ) camera and close-range photogrammetry. The camera motion effects are first eliminated to detect objects that actually move, and a moving object recognition process is then employed to recognize the object classes and to facilitate the estimation of their geometric attributes. This information can further contribute to the investigation of object moving behavior. To evaluate the effectiveness of the proposed scheme quantitatively, an experiment with an indoor synthetic configuration is conducted first; outdoor real-life data are then used to verify feasibility based on recall, precision, and F1 index. The experiments verified the effectiveness of the proposed method in both laboratory and real environments. The proposed approach estimates the height and speed of the recognized moving objects, including pedestrians and vehicles, with acceptable errors, and shows application potential with existing PTZ camera images at a very low cost.

Highlights

  • In the field of computer vision, detecting and tracking moving objects has been widely studied for decades

  • A detector-agnostic procedure was developed by integrating both unsupervised and supervised (deep learning convolutional neural networks (CNN)) techniques to extract the detected and verified targets through the fusion and data association steps [2]

  • This study focuses on the spatial information processing of object geometry estimation in Pan Tilt Zoom (PTZ)

Summary

Introduction

In the field of computer vision, detecting and tracking moving objects has been widely studied for decades. Segmentation-based methods, such as mean shift clustering, graph cuts, and active contours, divide the images into perceptually similar regions. Supervised classification methods, such as support vector machines, neural networks, and adaptive boosting techniques, are trained to detect the features of the objects [21]. A more intuitive approach is background subtraction, whose algorithms can be categorized into recursive and non-recursive methods [24]. These algorithms can provide more comprehensive object information by finding variations relative to the image background model, provided that an accurate background is known [25,26,27]. 3D scene flow has been introduced to form a dense 3D motion field for object detection, but stereo or multiple camera configurations are typically required to obtain depth information of the scene [28,29].
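
To make the recursive category of background subtraction concrete, the following is a minimal sketch using a running-average background model on grayscale frames. It is an illustration only, not the method used in this study: the function name `running_average_foreground`, the learning rate `alpha`, and the `threshold` value are all assumptions introduced here, and the demo frames are synthetic.

```python
import numpy as np


def running_average_foreground(frames, alpha=0.05, threshold=25.0):
    """Yield a boolean foreground mask for every frame after the first.

    Hypothetical helper for illustration. The first frame initializes the
    background; each later frame is compared with the current background
    and then blended into it (a recursive update, so no frame buffer is kept).
    """
    frame_iter = iter(frames)
    background = next(frame_iter).astype(np.float64)
    for frame in frame_iter:
        current = frame.astype(np.float64)
        # Pixels that differ strongly from the background count as foreground.
        mask = np.abs(current - background) > threshold
        # Recursive update: exponentially weighted running average.
        background = (1.0 - alpha) * background + alpha * current
        yield mask


# Synthetic demo: a static dark scene in which one pixel brightens in the last frame.
demo_frames = [np.zeros((4, 4), dtype=np.uint8) for _ in range(3)]
demo_frames[2][1, 1] = 200
masks = list(running_average_foreground(demo_frames))
```

A non-recursive method would instead keep a buffer of recent frames and estimate the background from that buffer (for example, with a per-pixel median), trading memory for faster adaptation to scene changes.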
