Abstract

To address the low tracking accuracy caused by sudden changes in target scale, this paper proposes an adaptive tracking algorithm for scale mutation that first detects the target with a deep learning network and then tracks it with kernel correlation filtering, and verifies the effectiveness of the model through experiments. The contribution is twofold: the traditional kernel correlation filtering algorithm is modified so that detection and tracking run together, and deep learning is combined with traditional kernel correlation filtering in the tracking process. The deep learning network not only learns a more accurate feature representation but also copes more effectively with low-resolution video sequences, so the algorithm tracks the target more accurately under scale mutation. Effectiveness under scale mutation is evaluated with four criteria: average accuracy, intersection-over-union (IoU) accuracy, temporal robustness, and spatial robustness. The experimental results show that the joint detection strategy corrects the tracking drift caused by subsequent abrupt changes in target scale, and they confirm the effectiveness of the adaptive template update strategy. Adaptively changing the number of frames between neural network redetections improves tracking performance; tracking speed improves after correlation filtering is fused with the neural network, which promotes the combined use of the two in target tracking tasks.

Highlights

  • More than 90% of the information that humans use to understand the world comes from vision, and the main goal of computer vision is to enable computers to see the world as humans do. The theory of visual computing has made great progress over the years, and in recent years, with advances in computer hardware and software, computer vision technology has been widely used in various areas of life [1].

  • Computers have to acquire external information through auxiliary devices such as smart cameras, frame receivers, and visual interfaces. There are two main categories of image information that computers acquire. One is static images, mainly pictures; face recognition, for example, is a very popular application in the field of computer vision [2]. The other is dynamic content, including video and three-dimensional capture; video analysis, for example, analyzes, retrieves, and extracts useful information from video content for further processing, and is widely used in fields such as security, transportation, and even retail.

  • Target tracking technology mainly applies operations such as standardization, feature extraction, and model building to the information collected by the device in order to obtain the target's size and location, and feeds this information back to the system model for application in actual scenes.

Summary

Introduction

More than 90% of the information that humans use to understand the world comes from vision, and the main goal of computer vision is to enable computers to see the world as humans do. The theory of visual computing has made great progress over the years, and in recent years, with advances in computer hardware and software, computer vision technology has been widely used in various areas of life [1]. Because visual target tracking has more outstanding advantages and wider applications than detection and recognition in the field of computer vision, it has attracted a wave of research from industry and academia in recent years. The main work of this paper is to adopt the detect-before-track idea: based on the kernel correlation filtering algorithm, it analyzes both template update and scale adaptation for the problem of sudden changes in target scale and proposes improvements that enhance the tracking accuracy and robustness of the algorithm. Wang et al. proposed MDNet (Multidomain Network), which uses neural networks to extract motion features from multiple videos based on the idea of transfer learning and transfers features from the target classification problem to the tracking field. The model learning rate is divided into two parts, a fixed learning rate and an adaptive learning rate; which one is used depends on whether the ratio between the confidence of the current frame's tracking result and the maximum confidence over the historical frames is higher than the preset update threshold.
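The learning-rate selection described above can be illustrated with a minimal Python sketch. The summary does not state which branch uses which rate, nor how the adaptive rate is computed, so the following are assumptions made only for illustration: the fixed rate is used when the confidence ratio reaches the threshold, the adaptive rate is the fixed rate scaled by the ratio, and the template is updated by the linear interpolation commonly used in correlation filter trackers. The names `select_learning_rate` and `update_template` and the default values are hypothetical.

```python
def select_learning_rate(conf, max_hist_conf, threshold=0.5, fixed_lr=0.02):
    """Pick the template learning rate for the current frame.

    conf          : confidence of the current frame's tracking result
    max_hist_conf : maximum confidence observed over the historical frames
    threshold     : preset update threshold on the ratio (assumed value)
    fixed_lr      : fixed learning rate (assumed value)
    """
    if max_hist_conf <= 0:
        # No usable history yet: fall back to the fixed rate.
        return fixed_lr
    ratio = max(0.0, conf / max_hist_conf)
    if ratio >= threshold:
        # Assumption: a reliable frame is learned at the fixed rate.
        return fixed_lr
    # Assumption: an unreliable frame is learned more slowly,
    # in proportion to how far its confidence has dropped.
    return fixed_lr * ratio


def update_template(template, sample, lr):
    """Linear-interpolation update: (1 - lr) * old + lr * new, element-wise."""
    return [(1.0 - lr) * t + lr * s for t, s in zip(template, sample)]


if __name__ == "__main__":
    lr = select_learning_rate(conf=0.2, max_hist_conf=1.0)
    print(lr, update_template([0.0, 2.0], [1.0, 0.0], lr))
```

Scaling the rate down on low-confidence frames is one plausible way to keep a drifting or occluded frame from contaminating the template; the paper's full text should be consulted for the rule the authors actually use.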

Study of Target Tracking Algorithm with Adaptive Scale Detection Learning
Evaluation Indices: APE, IOU, TRE, SRE
Analysis of Results
Conclusion