Abstract
Track-before-detect (TBD) algorithms are used for tracking systems, where the object's signal is below the noise floor (low-SNR objects). A lot of computations and memory transfers for real-time signal processing are necessary. GPGPU in parallel processing devices for TBD algorithms is well suited. Finding optimal or suboptimal code, due to lack of documentation for low-level programming of GPGPUs is not possible. High-level code optimization is necessary and the evolutionary approach, based on the single parent and single child is considered, that is local search approach. Brute force search technique is not feasible, because there are N! code variants, where N is the number of motion vectors components. The proposed evolutionary operator--LREI (local random extraction and insertion) allows source code reordering for the reduction of computation time due to better organization of memory transfer and the texture cache content. The starting point, based on the sorting and the minimal execution time metric is proposed. The unbiased random and biased sorting techniques are compared using experimental approach. Tests shows significant improvements of the computation speed, about 8 % over the conventional code for CUDA code. The time period of optimization for the sample code is about 1 h (1,000 iterations) for the considered recursive spatio-temporal TBD algorithm.
Talk to us
Join us for a 30 min session where you can share your feedback and ask us any queries you have
Disclaimer: All third-party content on this website/platform is and will remain the property of their respective owners and is provided on "as is" basis without any warranties, express or implied. Use of third-party content does not indicate any affiliation, sponsorship with or endorsement by them. Any references to third-party content is to identify the corresponding services and shall be considered fair use under The CopyrightLaw.