Abstract

Robust local image features have become crucial components of many state-of-the-art computer vision algorithms. Due to limited hardware resources, computing local features on embedded system is not an easy task. In this paper, we propose an efficient parallel computing framework for speeded-up robust features with an orientation towards multi-DSP based embedded system. We optimize modules in SURF to better utilize the capability of DSP chips. We also design a compact data layout to adapt to the limited memory resource and to increase data access bandwidth. A data-driven barrier and workload balance schemes are presented to synchronize parallel working chips and reduce overall cost. The experiment shows our implementation achieves competitive time efficiency compared with related works.

Full Text
Published version (Free)

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call