Active Tracking of Moving Objects with Speed Variations Using a Novel PTZ Camera-based Machine Vision Technique
Most existing active-tracking methods focus mainly on slow-moving objects, which limits their adaptability to objects with variable speeds. To bridge this research gap, a novel spherical coordinate guided adaptive active tracking (SCAAT) approach based on a pan–tilt–zoom (PTZ) camera machine vision system is proposed in this study. For object detection and tracking, YOLOv5 and DeepSORT are employed in the PTZ vision system. The spherical coordinates and angular speeds of the moving object are acquired in the spherical coordinate system. For practical application, the start-time and start-angle delays of the PTZ camera are calibrated, and a speed-control equation is formulated in the spherical coordinate system to reduce rapid location deviation between the PTZ camera and the moving object. To adapt to different object speeds and avoid camera shaking at different zoom levels, an adaptive tracking window is designed to keep the object within the camera field of view. Experimental testing has been performed to evaluate the proposed SCAAT method. The results indicate that SCAAT not only expands the effective following distance and zoom range of the PTZ camera but also effectively improves the accuracy and stability of active tracking for moving objects with large speed variations.
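To make the control idea concrete, the minimal sketch below shows one plausible way to turn a detected bounding-box centre into pan/tilt angular errors and a clamped proportional speed command. The field-of-view values, gains, and speed limits are illustrative assumptions; the paper's actual speed-control equation, delay calibration, and adaptive tracking window are not reproduced here.

```python
import numpy as np

def pixel_to_angles(cx, cy, width, height, hfov_deg, vfov_deg):
    """Convert a bounding-box centre (cx, cy) into pan/tilt angular errors
    relative to the optical axis, assuming a simple pinhole-style model."""
    pan_err = (cx - width / 2) / (width / 2) * (hfov_deg / 2)
    tilt_err = (cy - height / 2) / (height / 2) * (vfov_deg / 2)
    return pan_err, tilt_err

def ptz_speed_command(pan_err, tilt_err, kp=0.8, max_speed=60.0):
    """Proportional angular-speed command, clamped to the unit's speed limit
    (gain and limit are placeholder values)."""
    pan_speed = float(np.clip(kp * pan_err, -max_speed, max_speed))
    tilt_speed = float(np.clip(kp * tilt_err, -max_speed, max_speed))
    return pan_speed, tilt_speed

# Example: a target detected at (1500, 400) in a 1920x1080 frame, 60x34 deg FOV
pan_e, tilt_e = pixel_to_angles(1500, 400, 1920, 1080, 60.0, 34.0)
print(ptz_speed_command(pan_e, tilt_e))
```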
- Conference Article
- 10.1109/iccst53801.2021.00129
- Nov 1, 2021
A total of 1072 Chinese academic documents were obtained by applying a document retrieval rule model to fuzzy retrieval on the China National Knowledge Infrastructure (CNKI) with “Pan-Tilt Zoom” as the keyword, and by combining and organizing the retrieval results. Based on a statistical model of academic documents, 9 types of statistical data and 19 findings were obtained. The samples cover five research concentrations and 29 topics. Based on the seed academic documents, the “Pan-Tilt Zoom” academic results and the overall state of research were then summarized. Finally, the paper points out that China should strengthen studies on machine vision and “Pan-Tilt Zoom” structure while continuing basic theoretical studies on “Pan-Tilt Zoom”.
- Research Article
16
- 10.1007/s11042-018-6104-4
- Jul 6, 2018
- Multimedia Tools and Applications
Awareness of terrorism is greater today than in the past, particularly since the events of September 11. The ongoing fight against terrorism has spurred efforts to find improved approaches using high-end cameras. A Pan Tilt Zoom (PTZ) camera, a multi-functional high-end camera, can be used to identify such potential threats. Consequently, background modeling has increasing significance in computer vision for segmenting foreground objects for further analysis in video surveillance applications. A PTZ camera offers many benefits over ordinary fixed cameras: it is easy to install, covers a 360° plane, and provides greater flexibility. Although numerous surveys on static-camera background modeling methods have been published, such methods do not achieve the large-scale scene coverage or frame quality needed to recognize specific targets that a PTZ camera provides. This motivates the present survey to address the issues and techniques of PTZ background modeling, since no survey exists on this emerging area. The objective of this paper is to present a brief survey of PTZ camera-based foreground segmentation methods, which are indispensable for high-level analysis. It also provides an overview of techniques from the literature that address the challenges, solutions, and key aspects of PTZ camera-based foreground segmentation, a categorization of different approaches, the datasets available for experimentation, and important directions for future work along with remaining challenges for computer vision researchers and applications.
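As a point of reference for the foreground-segmentation task this survey covers, the sketch below runs a generic mixture-of-Gaussians background subtractor over a video. It is a textbook OpenCV example under an assumed input path, not any specific method reviewed in the paper.

```python
import cv2

# Generic background-subtraction loop; "video.mp4" is a placeholder path.
cap = cv2.VideoCapture("video.mp4")
subtractor = cv2.createBackgroundSubtractorMOG2(history=500, varThreshold=16,
                                                detectShadows=True)
while True:
    ok, frame = cap.read()
    if not ok:
        break
    mask = subtractor.apply(frame)          # per-pixel foreground mask
    mask = cv2.medianBlur(mask, 5)          # suppress speckle noise
    contours, _ = cv2.findContours(mask, cv2.RETR_EXTERNAL,
                                   cv2.CHAIN_APPROX_SIMPLE)
    # contours now approximate the moving foreground objects
cap.release()
```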
- Conference Article
4
- 10.1109/iccsit.2010.5565067
- Jul 1, 2010
To enlarge the surveillance area, more and more visual surveillance systems exploit Pan Tilt Zoom (PTZ) cameras. This paper proposes a framework for a surveillance system that uses a single PTZ camera. The framework is divided into two phases: an offline phase and an online phase. During the offline phase, camera parameters for every image are computed using SIFT features and a bundle adjustment algorithm, and the mosaic and the background model of the whole area are then generated from the camera parameters. During the online phase, each real-time frame is projected to the correct location on the mosaic using SIFT features and bundle adjustment, and the moving object is detected by a background subtraction technique. Experiments show that the PTZ camera's parameters can be computed in time and the moving object can be detected reliably even under large zoom changes.
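A minimal sketch of the online phase might look as follows, assuming a precomputed grayscale background mosaic. It substitutes a single RANSAC homography for the paper's bundle adjustment and uses illustrative thresholds.

```python
import cv2
import numpy as np

def register_to_mosaic(frame_gray, mosaic_gray):
    """Estimate a homography mapping the live frame onto the mosaic using
    SIFT matches and RANSAC (a simplification of full bundle adjustment)."""
    sift = cv2.SIFT_create()
    kf, df = sift.detectAndCompute(frame_gray, None)
    km, dm = sift.detectAndCompute(mosaic_gray, None)
    matcher = cv2.BFMatcher(cv2.NORM_L2)
    matches = matcher.knnMatch(df, dm, k=2)
    good = [m for m, n in matches if m.distance < 0.7 * n.distance]  # Lowe ratio test
    src = np.float32([kf[m.queryIdx].pt for m in good]).reshape(-1, 1, 2)
    dst = np.float32([km[m.trainIdx].pt for m in good]).reshape(-1, 1, 2)
    H, _ = cv2.findHomography(src, dst, cv2.RANSAC, 5.0)
    return H

def detect_moving(frame_gray, mosaic_bg, H, thresh=30):
    """Project the frame onto the background mosaic and subtract."""
    h, w = mosaic_bg.shape[:2]
    warped = cv2.warpPerspective(frame_gray, H, (w, h))
    diff = cv2.absdiff(warped, mosaic_bg)
    _, mask = cv2.threshold(diff, thresh, 255, cv2.THRESH_BINARY)
    return mask
```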
- Conference Article
3
- 10.1109/ccece.2010.5575246
- May 1, 2010
This paper outlines the control algorithms used in a pan / tilt / zoom (PTZ) tracking system for an Unmanned Ground Vehicle (UGV), implemented as part of a computer-vision based autonomous convoying system. The system relies upon a Linear Quadratic Gaussian controller to keep the target centered in the camera's field of view using the pan and tilt degrees of freedom. A novel zoom controller, based upon the current target size and various measures of system noise, controls the camera focal length to maintain an appropriate image size for visual tracking, despite changing distances between the leader and follower vehicles.
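The zoom logic can be illustrated with a much simpler proportional law than the paper's noise-aware controller: scale the focal length so the target occupies a desired fraction of the frame. All parameter values below are assumptions for illustration, not taken from the paper.

```python
def zoom_command(target_px_height, frame_height, focal_mm,
                 desired_fraction=0.3, gain=0.5,
                 focal_min=4.7, focal_max=94.0):
    """Simplified proportional zoom law: adjust the focal length so the
    target fills roughly `desired_fraction` of the frame height.
    Focal range and gain are placeholder values."""
    current_fraction = target_px_height / frame_height
    if current_fraction <= 0:
        return focal_mm                      # no detection: hold current zoom
    error = desired_fraction / current_fraction
    new_focal = focal_mm * (1.0 + gain * (error - 1.0))
    return max(focal_min, min(focal_max, new_focal))

print(zoom_command(target_px_height=120, frame_height=1080, focal_mm=12.0))
```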
- Conference Article
2
- 10.1117/12.818378
- Apr 13, 2009
We describe a novel scalable approach for the management of a large number of Pan-Tilt-Zoom (PTZ) cameras deployed outdoors for persistent tracking of humans and vehicles, without resorting to the large fields of view of associated static cameras. Our system, Active Collaborative Tracking - Vision (ACT-Vision), is essentially a real-time operating system that can control hundreds of PTZ cameras to ensure uninterrupted tracking of target objects while maintaining image quality and coverage of all targets using a minimal number of sensors. The system ensures the visibility of targets between PTZ cameras by using criteria such as distance from sensor and occlusion.
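A toy hand-off rule in the spirit of the stated criteria (distance from sensor and occlusion) is sketched below; the data structures, range threshold, and greedy strategy are hypothetical and do not reflect ACT-Vision's actual scheduler.

```python
import math

def assign_cameras(cameras, targets, max_range=150.0):
    """Greedy camera-to-target assignment: each target gets the nearest free
    camera that is within range and not flagged as occluded."""
    free = set(cameras)
    assignment = {}
    for tid, (tx, ty) in targets.items():
        best, best_d = None, float("inf")
        for cam, (cx, cy, occluded) in cameras.items():
            if cam not in free or occluded:
                continue
            d = math.hypot(tx - cx, ty - cy)
            if d < best_d and d <= max_range:
                best, best_d = cam, d
        if best is not None:
            assignment[tid] = best
            free.discard(best)
    return assignment

cams = {"ptz1": (0.0, 0.0, False), "ptz2": (100.0, 0.0, False)}
tgts = {"person1": (90.0, 10.0), "person2": (5.0, 5.0)}
print(assign_cameras(cams, tgts))
```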
- Conference Article
6
- 10.1109/robio.2010.5723599
- Dec 1, 2010
The concept of active tracking is presented to simulate the characteristics of human vision in intelligent visual surveillance. The Pan/Tilt/Zoom (PTZ) camera is generally used for active tracking. In this paper, we present a novel and effective approach for active object tracking with a PTZ camera and construct a near real-time system for indoor and outdoor scenes. The tracking algorithm of our system is based on feature matching, with PID control to drive the camera. The feature extracted from moving people is described as a region covariance matrix that combines the spatial and statistical properties of the targets (e.g., coordinates, color, and gradient). Results from indoor and outdoor experiments demonstrate the effectiveness and accuracy of our approach.
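The region covariance descriptor itself is well defined in the literature; a minimal sketch using only spatial coordinates, intensity, and gradient magnitudes (a subset of the features the paper combines, omitting color) is given below.

```python
import numpy as np

def region_covariance(gray_patch):
    """Region covariance descriptor of an image patch: covariance of
    per-pixel features (x, y, intensity, |Ix|, |Iy|). The feature set here
    is a minimal illustrative subset."""
    patch = gray_patch.astype(np.float64)
    h, w = patch.shape
    ys, xs = np.mgrid[0:h, 0:w]
    iy, ix = np.gradient(patch)
    feats = np.stack([xs.ravel(), ys.ravel(), patch.ravel(),
                      np.abs(ix).ravel(), np.abs(iy).ravel()])  # 5 x N
    return np.cov(feats)                                        # 5 x 5 matrix

patch = (np.random.rand(32, 32) * 255).astype(np.uint8)
print(region_covariance(patch).shape)   # (5, 5)
```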
- Conference Article
1
- 10.1117/12.946940
- Oct 24, 2012
Tracking and surveillance systems frequently require identification of targets in the field of view of a camera. For example, in the acquisition phase of an FSO system an optical beacon is often used. Pan-tilt-zoom (PTZ) camera networks are also increasingly finding their way into surveillance systems. With advances in image processing and computational efficiency, such systems can track the 3D coordinates of a target in real time as it traverses the field of view (FOV) of the master camera. Master-slave relationships between static-dynamic camera pairs are well-studied problems. These systems initially calibrate a linear mapping between pixels of a wide-FOV camera and the PT settings of a dynamic narrow-FOV camera. As the target travels through the FOV of the master camera, the slave camera's PT settings are then adjusted to keep the target centered within its FOV. In this paper, we describe a system that allows both cameras to move and extracts the 3D coordinates of the target. This is done with only a single initial calibration between pairs of cameras and high-resolution PTZ platforms to keep track of the master camera movement. The mapping between the PT settings of the slave and master cameras is then adjusted based on the movement of the dynamic camera. This results in a larger coverage area, allowing the master camera to keep track of the target over longer periods of time. Using the information from the PT settings of the PTZ platform as well as the precalibrated settings from a preset zoom lens, the 3D coordinates of the target are extracted and compared against those of a laser range finder and against the accuracy of a static-dynamic camera pair.
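The initial pixel-to-PT calibration step can be illustrated with a least-squares fit of an affine mapping; the calibration samples below are hypothetical, and the paper's subsequent re-adjustment of the mapping as the master camera moves is not reproduced.

```python
import numpy as np

def fit_pixel_to_pt(pixels, pt_settings):
    """Fit an affine mapping [pan, tilt] ~ [u, v, 1] @ A from calibration
    samples (master-camera pixels vs. recorded slave PT settings).
    A purely linear model is an idealisation of the calibration step."""
    uv1 = np.hstack([np.asarray(pixels, float), np.ones((len(pixels), 1))])
    pt = np.asarray(pt_settings, float)
    A, *_ = np.linalg.lstsq(uv1, pt, rcond=None)   # least-squares solution
    return A                                       # 3 x 2 matrix

def pixel_to_pt(A, u, v):
    return np.array([u, v, 1.0]) @ A

# Hypothetical samples: (u, v) pixels and corresponding (pan, tilt) in degrees
pixels = [(100, 200), (800, 220), (450, 600), (1200, 650)]
pts = [(-20.0, 5.0), (12.0, 4.5), (-2.0, -8.0), (30.0, -9.5)]
A = fit_pixel_to_pt(pixels, pts)
print(pixel_to_pt(A, 640, 360))
```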
- Conference Article
7
- 10.1109/isie.2013.6563833
- May 1, 2013
Camera positioning units for surveillance applications are often mounted on mobile supports or vehicles. In such circumstances, the motion of the supporting base affects the camera field of view, making the task of pointing at and tracking a specific target problematic, especially when using low-cost devices that are usually not equipped with rapid actuators and fast video processing units. Visual tracking capabilities can be improved if the camera field of view is first stabilized against the movements of the base. Although some cameras available on the market are already equipped with an optical image stabilization (OIS) system, implemented either in the camera lenses or in the image sensor, these are usually too expensive to be installed on low-end positioning devices. A cheaper approach to image stabilization consists of stabilizing the camera motion using the motors of the positioning unit and the inertial measurements provided by a low-cost MEMS Inertial Measurement Unit (IMU). This paper explores the feasibility of applying such an image stabilization system to a low-cost pan-tilt-zoom (PTZ) camera positioning unit driven by hybrid stepper motors (HSMs), in order to aid the task of pointing at and tracking a specific target on the camera image plane. In the proposed solution, a two-level cascaded control structure, consisting of an inner inertial stabilization control loop and an outer visual servoing control loop, is used to control the PTZ unit. Several tests are carried out on a real device mounted on a moving table actuated by a 6 degrees-of-freedom pneumatic hexapod. Realistic motions are recreated using data recordings taken aboard a patrolling ship.
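The structure of such a cascade can be sketched in a few lines for the pan axis: the outer loop converts the pixel error into a desired pan rate, and the inner loop uses the gyro measurement to reject base motion. Gains and units below are illustrative assumptions, not the paper's tuned controller.

```python
def cascaded_pan_rate(pixel_err_x, gyro_yaw_rate,
                      kp_visual=0.02, kp_inertial=1.0):
    """Two-level cascade with illustrative gains: the outer visual-servo loop
    turns horizontal pixel error into a desired pan rate, and the inner loop
    subtracts the measured base yaw rate so the stepper motor compensates
    for platform motion."""
    desired_rate = kp_visual * pixel_err_x                  # outer loop
    command = kp_inertial * (desired_rate - gyro_yaw_rate)  # inner loop
    return command

# Target 150 px right of centre while the base yaws at 0.1 rad/s
print(cascaded_pan_rate(pixel_err_x=150.0, gyro_yaw_rate=0.1))
```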
- Research Article
4
- 10.2312/pe/eurovisshort/eurovisshort2012/043-047
- Jan 1, 2012
Parallel coordinates is one of the most popular and widely used visualization techniques for large, high-dimensional data. Often, data attributes are visualized on individual axes with polylines joining them. However, some data attributes are more naturally represented in a spherical coordinate system. We present a novel coupling of parallel coordinates with spherical coordinates, enabling the visualization of vector and multi-dimensional data. The spherical plot is integrated as if it were an axis in the parallel coordinate visualization. This hybrid visualization benefits from enhanced visual perception, representing vector data in a more natural spatial domain and also reducing the number of parallel axes within the parallel coordinates plot. This raises several challenges, which we discuss and provide solutions to, such as the visual clutter caused by overplotting and the computational complexity of visualizing large, abstract, time-dependent data. We demonstrate the results of our work-in-progress visualization technique using biological animal tracking data of a large, multi-dimensional, time-dependent nature, consisting of tri-axial accelerometry samples as well as several additional attributes. To understand marine wildlife behavior, the acceleration vector is reconstructed in spherical coordinates and visualized alongside the other data attributes to enable exploration, analysis, and presentation of marine wildlife behavior.
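The underlying coordinate conversion is standard: the sketch below maps a tri-axial acceleration sample to magnitude, polar angle, and azimuth, with axis conventions assumed rather than taken from the paper.

```python
import numpy as np

def accel_to_spherical(ax, ay, az):
    """Convert a tri-axial acceleration sample to spherical coordinates
    (r, theta, phi): magnitude, polar angle from the z-axis, and azimuth."""
    r = np.sqrt(ax**2 + ay**2 + az**2)
    theta = np.arccos(az / r) if r > 0 else 0.0     # polar angle
    phi = np.arctan2(ay, ax)                        # azimuth
    return r, theta, phi

print(accel_to_spherical(0.2, -0.1, 0.97))   # a near-1 g sample at rest
```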
- Research Article
5
- 10.1016/j.cviu.2018.09.005
- Sep 25, 2018
- Computer Vision and Image Understanding
Active target tracking: A simplified view aligning method for binocular camera model
- Research Article
26
- 10.1109/tip.2019.2894940
- Jan 24, 2019
- IEEE Transactions on Image Processing
Being able to cover a wide range of views, pan-tilt-zoom (PTZ) cameras have been widely deployed in visual surveillance systems. To achieve global-view perception of a surveillance scene, it is necessary to generate its panoramic background image, which can be used for subsequent applications such as road segmentation, active tracking, and so on. However, few works have been reported on this problem, partially due to the lack of a benchmark dataset and the high complexity of panoramic image generation for PTZ cameras. In this paper, we build, to the best of our knowledge for the first time, a benchmark PTZ camera dataset with multiple views, and derive a complete set of panoramic transformation formulas for PTZ cameras. We further propose a fast multi-band blending method to address the efficiency issue in panoramic image fusion and mosaicing. Some related panoramic transformations are also developed, such as cylindrical and overlooking transformations. Our proposed approach exhibits impressive accuracy and efficiency in PTZ panorama generation as well as panoramic image mosaicing.
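As a flavour of the transformations involved, the sketch below applies a generic inverse cylindrical warp given a focal length in pixels; it is a textbook projection, not the paper's complete PTZ formula set or its multi-band blending.

```python
import cv2
import numpy as np

def cylindrical_warp(img, f_px):
    """Project an image onto a cylinder of radius f_px. Each output pixel
    (x, y) is mapped back to the image plane via x' = f*tan(theta) and
    y' = f*h/cos(theta), with theta and h measured from the image centre."""
    h, w = img.shape[:2]
    cx, cy = w / 2.0, h / 2.0
    xs, ys = np.meshgrid(np.arange(w), np.arange(h))
    theta = (xs - cx) / f_px
    hh = (ys - cy) / f_px
    map_x = (f_px * np.tan(theta) + cx).astype(np.float32)
    map_y = (f_px * hh / np.cos(theta) + cy).astype(np.float32)
    return cv2.remap(img, map_x, map_y, cv2.INTER_LINEAR,
                     borderMode=cv2.BORDER_CONSTANT)

img = np.zeros((480, 640, 3), np.uint8)        # placeholder view
pano_tile = cylindrical_warp(img, f_px=500.0)  # one tile of a mosaic
```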
- Research Article
- 10.1088/1361-6501/adaf4a
- Feb 17, 2025
- Measurement Science and Technology
Among the various defects that inevitably occur on expressways, cracks are the most common and significant indicators of highway pavement damage. Timely and accurate crack recognition is urgently required for highway maintenance. Highway pavement crack recognition currently depends primarily on human visual inspection and expressway maintenance vehicles; these approaches are time-consuming, labor-intensive, and difficult to implement in civil engineering practice. Cost-effective crack recognition based on a fixed pan/tilt/zoom (PTZ) camera was investigated in our previous work. However, for pavement cracks captured at long camera distances, the limited resolution of the PTZ vision system produces low-resolution crack images. In addition, such cracks exhibit weak features, and the crack pixel density and distribution are significantly affected by background noise, making these cracks challenging to recognize. To solve these problems, a high-order kernel-based modified bicubic interpolation is proposed to reveal and characterize discrete pixel variations, obtain high-quality super-resolution crack images, and improve crack recognition performance. Extensive experiments on crack datasets captured by PTZ cameras on G4 highways in China are conducted to verify the performance of the proposed method. Two measurement parameters, the Just Noticeable Blur (JNB) and the Structural Similarity Index, confirm the high quality of the super-resolution crack images. Experimental comparisons demonstrate that crack recognition based on the super-resolution crack images achieves superior performance, with the mAP, precision (P), recall (R), and F1-score increased to 95.3%, 97.3%, 96.1%, and 97.4%, respectively. This method demonstrates the feasibility of high-efficiency crack recognition using modified bicubic interpolation for fixed PTZ vision-based expressway maintenance engineering.
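For orientation only, the sketch below performs plain bicubic upscaling of a low-resolution crack patch with OpenCV as a baseline; the paper's high-order modified kernel is not reproduced, and the file path is a placeholder.

```python
import cv2

def upscale_crack_image(img, scale=4):
    """Plain bicubic upscaling used as a baseline for super-resolution."""
    h, w = img.shape[:2]
    return cv2.resize(img, (w * scale, h * scale), interpolation=cv2.INTER_CUBIC)

# Hypothetical usage on a low-resolution crack patch captured at long range
patch = cv2.imread("crack_patch.png", cv2.IMREAD_GRAYSCALE)
if patch is not None:
    sr = upscale_crack_image(patch, scale=4)
```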
- Conference Article
1
- 10.1109/avss.2006.19
- Nov 1, 2006
We present a system for automatic head tracking with a single pan-tilt-zoom (PTZ) camera. In distance education the PTZ tracking system developed can be used to follow a teacher actively when s/he moves in the classroom. In other videoconferencing applications the system can be utilized to provide a close-up view of the person all the time. Since the color features used in tracking are selected and updated online, the system can adapt to changes rapidly. The information received from the tracking module is used to actively control the PTZ camera in order to keep the person in the camera view. In addition, the system implemented is able to recover from erroneous situations. Preliminary experiments indicate that the PTZ system can perform well under different lighting conditions and large scale changes.
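A generic colour-histogram tracker gives a feel for this kind of adaptive colour-based tracking; the CamShift loop below assumes a video source and an initial head box, and does not implement the paper's online feature selection or its error-recovery mechanism.

```python
import cv2

def track_head(cap, init_box):
    """Generic colour-histogram tracker (CamShift) as a stand-in for the
    paper's method; `init_box` = (x, y, w, h) is the initial head region."""
    ok, frame = cap.read()
    x, y, w, h = init_box
    hsv_roi = cv2.cvtColor(frame[y:y + h, x:x + w], cv2.COLOR_BGR2HSV)
    hist = cv2.calcHist([hsv_roi], [0], None, [32], [0, 180])  # hue histogram
    cv2.normalize(hist, hist, 0, 255, cv2.NORM_MINMAX)
    term = (cv2.TERM_CRITERIA_EPS | cv2.TERM_CRITERIA_COUNT, 10, 1)
    box = init_box
    while True:
        ok, frame = cap.read()
        if not ok:
            break
        hsv = cv2.cvtColor(frame, cv2.COLOR_BGR2HSV)
        backproj = cv2.calcBackProject([hsv], [0], hist, [0, 180], 1)
        _, box = cv2.CamShift(backproj, box, term)
        cx, cy = box[0] + box[2] / 2, box[1] + box[3] / 2
        # (cx, cy) would feed the pan/tilt command that re-centres the person
```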
- Conference Article
1
- 10.1109/avss.2011.6027370
- Aug 1, 2011
During the last few years, the need for security-oriented surveillance systems has grown considerably. Nowadays many public environments, such as airports and train stations, are monitored by some sort of video-surveillance system in order to detect or prevent security issues. The technology involved ranges from plain closed-circuit television (CCTV) cameras to sophisticated computer-based video processing systems. The CCTV approach was the only feasible choice in the past and is still widely used, but its limits are increasingly evident: the growing number of sensors (modern surveillance systems can use hundreds of cameras) is often not matched by an adequate number of human operators, whose attention is spread across many different tasks and quickly decreases over time. Modern computer-based systems try to face these problems using automatic video analysis and understanding techniques, covering wide areas while highlighting only the potential security issues, and thus requiring the attention of a human operator in only a limited number of cases (e.g. [6, 5]). Research in this field has been very active and has produced many techniques for video analysis and interpretation, but many works are limited to the use of static cameras. Only recently has the research community started focusing on more sophisticated sensors such as Pan-Tilt-Zoom (PTZ) cameras, and research on dynamic, active networks of PTZ cameras is still limited (for an example of some recent works in this field, see [1]). Many of these works focus on exploiting the dynamic features of a network of PTZ cameras to improve tracking performance [3, 4, 13, 10, 12], while relatively few works address the problem of optimizing the camera coverage of the monitored area according to specific criteria. Angella et al. [2] propose a method to maximize the area coverage by using a 3D model of the observed zone, but their work only aims at finding a good initial camera placement, which cannot be dynamically modified according to the observed data. Mittal and Davis [8, 7] also consider the presence of dynamic occluding objects in order to evaluate the visibility of the scene. Piciarelli et al. [11] propose a method to automatically and dynamically reconfigure the camera orientations and zoom levels using an Expectation-Maximization-based approach.
- Research Article
121
- 10.1109/tip.2012.2188806
- Feb 23, 2012
- IEEE Transactions on Image Processing
The performance of dynamic scene algorithms often suffers because of the inability to effectively acquire features on the targets, particularly when they are distributed over a wide field of view. In this paper, we propose an integrated analysis and control framework for a pan, tilt, zoom (PTZ) camera network in order to maximize various scene understanding performance criteria (e.g., tracking accuracy, best shot, and image resolution) through dynamic camera-to-target assignment and efficient feature acquisition. Moreover, we consider the situation where processing is distributed across the network since it is often unrealistic to have all the image data at a central location. In such situations, the cameras, although autonomous, must collaborate among themselves because each camera's PTZ parameter entails constraints on the others. Motivated by recent work in cooperative control of sensor networks, we propose a distributed optimization strategy, which can be modeled as a game involving the cameras and targets. The cameras gain by reducing the error covariance of the tracked targets or through higher resolution feature acquisition, which, however, comes at the risk of losing the dynamic target. Through the optimization of this reward-versus-risk tradeoff, we are able to control the PTZ parameters of the cameras and assign them to targets dynamically. The tracks, upon which the control algorithm is dependent, are obtained through a consensus estimation algorithm whereby cameras can arrive at a consensus on the state of each target through a negotiation strategy. We analyze the performance of this collaborative sensing strategy in active camera networks in a simulation environment, as well as a real-life camera network.
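A toy version of the reward-versus-risk trade-off can be written as a simple utility over candidate zoom levels; the coefficients and functional forms below are illustrative assumptions, not the paper's game-theoretic formulation or consensus estimator.

```python
import numpy as np

def choose_zoom(track_cov, zoom_levels, alpha=1.0, beta=0.5):
    """Pick the zoom level maximizing a toy utility: higher zoom gives better
    resolution (reward) but shrinks the field of view, so the risk of losing
    the target grows with both zoom and track uncertainty."""
    sigma = float(np.sqrt(np.trace(track_cov)))     # positional uncertainty
    utilities = [alpha * np.log(z) - beta * sigma * z for z in zoom_levels]
    return zoom_levels[int(np.argmax(utilities))]

cov = np.diag([0.04, 0.04])                         # a well-localised target
print(choose_zoom(cov, zoom_levels=[1, 2, 4, 8, 16]))  # a less certain track
                                                       # would pick a lower zoom
```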