Abstract
Real-time object tracking is an important step in many modern image processing applications. An efficient hardware design of a real-time object tracker must achieve the desired accuracy while satisfying the frame rate requirements for a variety of image sizes. Existing visual tracking methods employ sophisticated algorithms that challenge the capabilities of most embedded architectures. Discriminative scale space tracking (DSST) is one algorithm that demonstrates good performance with affordable complexity. It has a high degree of parallelism, which can be exploited for efficient implementation on reconfigurable hardware architectures. This paper proposes a real-time FPGA implementation of the major blocks of the discriminative scale space tracker. A careful design exploration of the core mathematical operations of the tracking algorithm is performed to improve their hardware utilization and timing performance. Among the core functional units optimized in this work, the discrete Fourier transform achieves a computational time improvement of 92% relative to existing works, QR factorization achieves a 2.3× reduction in resource utilization, and singular value decomposition yields a 3.8× improvement in processing time. The proposed data path architecture is designed using the Vivado HLS tool set and implemented on the Zynq ZedBoard (xc7z020clg484-1). For an input image size of 320 × 240, the proposed architecture achieves a mean frame rate of 25.38 fps.
Highlights
Visual object tracking has attracted significant research interest in recent years
The implementation is described in terms of the major mathematical operations involved, including singular value decomposition (SVD), QR factorization, the two-dimensional DFT (DFT2) and the histogram of oriented gradients (HOG) extractor
The discrete Fourier transform (DFT) is implemented with an 8-parallel architecture, which forms the basis of the DFT2 unit
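One common way to build a 2D DFT from a 1D DFT unit is row-column decomposition: transform every row, then every column of the result. The sketch below illustrates that idea only. It assumes, without confirmation from the paper, that the DFT2 unit is composed this way, and it does not model the 8-parallel hardware structure. The names `dft1d` and `dft2d` are hypothetical.

```python
import cmath
import math


def dft1d(x):
    """Naive O(N^2) 1D DFT; stands in for the hardware DFT core."""
    n = len(x)
    return [
        sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]


def dft2d(image):
    """2D DFT by row-column decomposition (assumed structure, see lead-in)."""
    # Pass 1: 1D DFT along each row.
    rows = [dft1d(row) for row in image]
    # Pass 2: 1D DFT along each column of the row-transformed data.
    cols = [dft1d(list(col)) for col in zip(*rows)]
    # cols is indexed [column][row]; transpose back to [row][column].
    return [list(r) for r in zip(*cols)]


spectrum = dft2d([[1, 2], [3, 4]])
print(spectrum[0][0].real)  # DC term: the sum of all pixels
```

Because the two passes use the same 1D transform, a single 1D unit can in principle be reused for both, with a transpose buffer between them.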
Summary
Visual object tracking has attracted significant research interest in recent years. The purpose of visual tracking is to identify the updated location of the target object in the incoming video sequence, given an initial target location in one frame. The DSST algorithm achieves better performance by using separate filters for translation and scale estimation [9, 10]. The updated location is fed to the scale filter to estimate the target size. Both filters are then updated for the image frame. This section describes the details of the DSST algorithm [9] and discusses state-of-the-art implementations of the mathematical operations involved. An image patch f centered around an initial target location I is extracted using a HOG extractor. These image features are used to learn the discriminative correlation filter (DCF) for target translation. The scale estimate is obtained by repeating the above steps for the one-dimensional scale filter, using the updated target location from the translation estimation. The algorithm performs the scaling, translation, filter estimation and update process, as described in (1), (2), (3) and (4).
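The learn, detect and update steps of a discriminative correlation filter can be sketched in one dimension, which is also the shape of the scale filter. This is a minimal single-channel sketch under stated assumptions: it uses the commonly cited DCF form with a numerator and denominator kept in the Fourier domain, and it is not claimed to match equations (1) to (4) of the paper term by term. All function names, the regularization `lam` and the learning rate `lr` are hypothetical, and a naive DFT stands in for the hardware unit.

```python
import cmath
import math


def dft(x, inverse=False):
    """Naive O(N^2) DFT or inverse DFT of a sequence."""
    n = len(x)
    sign = 1 if inverse else -1
    out = []
    for k in range(n):
        acc = sum(x[t] * cmath.exp(sign * 2j * math.pi * k * t / n) for t in range(n))
        out.append(acc / n if inverse else acc)
    return out


def gaussian_target(n, center, sigma=1.0):
    """Desired correlation output: a Gaussian peak at the target position."""
    return [math.exp(-0.5 * ((i - center) / sigma) ** 2) for i in range(n)]


def train(f, g):
    """Return (numerator, denominator) of the filter for features f, target g."""
    F, G = dft(f), dft(g)
    num = [Gk.conjugate() * Fk for Gk, Fk in zip(G, F)]
    den = [(Fk.conjugate() * Fk).real for Fk in F]
    return num, den


def update(model, f, g, lr=0.025):
    """Blend the old model with one learned from the new frame."""
    new_num, new_den = train(f, g)
    num = [(1 - lr) * a + lr * b for a, b in zip(model[0], new_num)]
    den = [(1 - lr) * a + lr * b for a, b in zip(model[1], new_den)]
    return num, den


def detect(model, z, lam=1e-2):
    """Correlate the filter with new features z; return (peak index, response)."""
    num, den = model
    Z = dft(z)
    Y = [a.conjugate() * zk / (b + lam) for a, zk, b in zip(num, Z, den)]
    y = [v.real for v in dft(Y, inverse=True)]
    return max(range(len(y)), key=y.__getitem__), y


n, center, shift = 16, 8, 3
f = [0.5 ** i for i in range(n)]   # toy feature vector, not real HOG data
g = gaussian_target(n, center)
model = train(f, g)
z = f[-shift:] + f[:-shift]        # same features, circularly shifted by 3
peak, _ = detect(model, z)
print(peak - center)               # prints 3, the estimated displacement
```

In a tracker the same pattern would run twice per frame: once on 2D translation features and once on the 1D scale features, with `update` applied after each estimate.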