Abstract

Multiresolution Gabor filter banks are used for feature extraction in a variety of applications, as Gabor filters have been shown to be exceptional feature extractors with a close correspondence to the simple cells in the primary visual cortex (V1) of the brain. Yet applying the Gabor filter is a computationally intensive task. Most applications that use the Gabor feature space require real-time results; however, the large number of computations involved has kept systems from achieving real-time performance. The natural solution for such compute-intensive tasks is parallelization. FPGAs have emerged as attractive platforms for compute-intensive signal processing applications due to their massively parallel computation resources as well as their low power consumption and affordability. We present a configurable architecture for Gabor feature extraction on FPGA that improves the resource utilization of the FPGA hardware fabric while maintaining a streaming data flow to yield exceptional performance. The increased resource utilization resulting from configurability, optimizations, and resource sharing allows for higher levels of parallelism to achieve real-time feature extraction of high-resolution images. Two architectures are introduced. The first is an architecture for multiresolution feature extraction with extensive resource sharing for improved resource utilization. The second is an architecture for many-orientation applications that uses a coarse-to-fine grain method to improve resource utilization by reducing the number of filters applied at different orientations.
Our results show that our multiresolution implementation achieves real-time performance on 2048 × 1526 images and exhibits a 6× speedup over a GPU implementation, with an energy efficiency of 0.4 fps/W compared to 0.036 fps/W for the GPU.[1] The implementation for many-orientation applications using the coarse-to-fine grain method exhibits resource savings of at most $2\sqrt{O}$ for O number of orientations and higher, compared to a fully parallel architecture, and a 25× speedup compared to a GPU implementation for 16 orientations.
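
To make the term "multiresolution Gabor filter bank" concrete, the following is a minimal software sketch of how such a bank can be generated: one real-valued Gabor kernel per (wavelength, orientation) pair. It is not the FPGA architecture described above. The function names `gabor_kernel` and `gabor_bank`, the kernel size, the wavelengths, the aspect ratio `gamma`, and the choice `sigma = 0.56 * wavelength` are all assumptions made for illustration and do not come from the paper.

```python
import math


def gabor_kernel(size, wavelength, theta, sigma, gamma=0.5, psi=0.0):
    """Return a size x size real Gabor kernel as a list of rows (size must be odd)."""
    half = size // 2
    kernel = []
    for y in range(-half, half + 1):
        row = []
        for x in range(-half, half + 1):
            # Rotate the sampling grid by the filter orientation theta.
            xr = x * math.cos(theta) + y * math.sin(theta)
            yr = -x * math.sin(theta) + y * math.cos(theta)
            # Gaussian envelope, elongated along yr by the aspect ratio gamma.
            envelope = math.exp(-(xr * xr + gamma * gamma * yr * yr) / (2.0 * sigma * sigma))
            # Sinusoidal carrier along the rotated x axis.
            carrier = math.cos(2.0 * math.pi * xr / wavelength + psi)
            row.append(envelope * carrier)
        kernel.append(row)
    return kernel


def gabor_bank(size, wavelengths, num_orientations):
    """Build kernels for every (wavelength, orientation index) pair."""
    bank = {}
    for wavelength in wavelengths:
        for k in range(num_orientations):
            theta = k * math.pi / num_orientations  # evenly spaced over [0, pi)
            # Assumption: sigma tied to wavelength (illustrative ratio, not from the paper).
            bank[(wavelength, k)] = gabor_kernel(size, wavelength, theta, sigma=0.56 * wavelength)
    return bank


# Hypothetical configuration: 2 scales x 4 orientations = 8 filters.
bank = gabor_bank(size=9, wavelengths=[4.0, 8.0], num_orientations=4)
```

Each image would then be convolved with every kernel in `bank`, so the filter count, and with it the work per pixel, grows with the product of scales and orientations. That growth is the cost that the resource sharing and the reduced orientation count in the two architectures aim to limit.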
