Image segmentation is essential to image processing: it separates the objects in an image from the background and from each other, an important step in object recognition, tracking, and other high-level image-processing applications. By partitioning the input image into smaller regions, segmentation balances extracting the main areas of interest (objects and important features) that help interpret the image against remaining robust to irrelevant noise and unimportant background. Applications of image segmentation span a wide range of domains, from decision-making in computer vision to medical imaging and quality control, to name just a few. The mean-shift algorithm provides a unique unsupervised clustering solution to image segmentation and has an established record of good performance on a wide variety of input images. However, mean-shift segmentation exhibits an unfavorable computational complexity of $$O(kN^2)$$, where $$N$$ is the number of pixels and $$k$$ the number of iterations. Because of this complexity, unsupervised image segmentation has had limited impact in autonomous applications, which require a low-power, real-time solution. We propose a novel hardware architecture that exploits the customizable computing power of FPGAs, reducing execution time by clustering pixels in parallel while meeting the low-power demands of embedded applications. We compare the architecture's performance with existing CPU and GPU implementations to demonstrate its advantages in both execution time and energy.
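To make the $$O(kN^2)$$ cost concrete, the following is a minimal, illustrative sketch of a naive mean-shift iteration (not the paper's FPGA implementation): each of the $$N$$ points is shifted toward the kernel-weighted mean of all $$N$$ points, and this pass is repeated for up to $$k$$ iterations. The function name, bandwidth, and tolerance parameters are assumptions for illustration.

```python
import numpy as np

def mean_shift(points, bandwidth=1.0, max_iters=10, tol=1e-3):
    """Naive mean-shift clustering sketch (illustrative only).

    Each of the N points is shifted toward the Gaussian-weighted mean
    of all N points within the kernel's reach; every iteration therefore
    touches all N points for each of the N centers, which is the source
    of the O(k*N^2) complexity noted in the abstract.
    """
    shifted = points.astype(float).copy()
    for _ in range(max_iters):                        # up to k iterations
        new = np.empty_like(shifted)
        for i, p in enumerate(shifted):               # N centers ...
            d2 = np.sum((points - p) ** 2, axis=1)    # ... x N distances
            w = np.exp(-d2 / (2.0 * bandwidth ** 2))  # Gaussian kernel weights
            new[i] = (w[:, None] * points).sum(axis=0) / w.sum()
        converged = np.abs(new - shifted).max() < tol
        shifted = new
        if converged:
            break
    return shifted  # points collapsed toward their density modes
```

In a segmentation setting, `points` would hold per-pixel feature vectors (e.g., color and spatial coordinates), and pixels whose shifted positions converge to the same mode form one segment. The inner loop over all centers is exactly the work the proposed architecture parallelizes in hardware.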