Abstract

Connected component labeling is one of the most important processes for image analysis, image understanding, pattern recognition, and computer vision. It performs inherently sequential operations to scan a binary input image and to assign a unique label to all pixels of each object. This paper presents a novel hardware-oriented labeling approach able to process input pixels in parallel, thus speeding up the labeling task with respect to state-of-the-art competitors. For purposes of comparison with existing designs, several hardware implementations are characterized for different image sizes and realization platforms. The results show frame rates and resource efficiency significantly higher than those of existing counterparts. The proposed hardware architecture is purposely designed to comply with the fourth generation of the advanced extensible interface (AXI4) protocol and to store intermediate and final outputs within an off-chip memory. Therefore, it can be directly integrated as a custom accelerator in virtually any modern heterogeneous embedded system-on-chip (SoC). As an example, when integrated within the Xilinx Zynq-7000 XC7Z020 SoC, the novel design processes more than 1.9 pixels per clock cycle, thus delivering more than 30 labeled 2k × 2k frames per second while using 3688 Look-Up Tables (LUTs), 1415 Flip-Flops (FFs), and 10 kb of on-chip memory.
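
The quoted throughput figures can be related by simple arithmetic. The sketch below is only a back-of-the-envelope check, not a figure from the paper: it assumes that "2k × 2k" means 2048 × 2048 pixels and that frame time is dominated by pixel processing, with no memory or protocol overhead. The function names are hypothetical.

```python
def min_clock_hz(width, height, frames_per_second, pixels_per_cycle):
    """Lowest clock frequency that sustains the given frame rate,
    assuming every cycle labels `pixels_per_cycle` pixels (no stalls)."""
    return width * height * frames_per_second / pixels_per_cycle


def frame_rate(width, height, clock_hz, pixels_per_cycle):
    """Frames per second reachable at `clock_hz` under the same assumption."""
    return clock_hz * pixels_per_cycle / (width * height)


if __name__ == "__main__":
    # Assumed frame size: 2048 x 2048 (the abstract only says "2k x 2k").
    f_min = min_clock_hz(2048, 2048, 30, 1.9)
    print(f"implied minimum clock: {f_min / 1e6:.1f} MHz")
```

Under these assumptions, 30 frames per second at 1.9 pixels per cycle implies a clock of roughly 66 MHz or more; the actual operating frequency is reported in the paper's results, not here.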

Highlights

  • Machine vision, image processing, and pattern recognition algorithms often require segmented visual objects and/or regions of interest to be identified and analyzed

  • Several implementations were characterized on various devices in terms of running frequency, number of pixels labeled per clock cycle, frame rate, and resource requirements

Summary

Introduction

Machine vision, image processing, and pattern recognition algorithms often require segmented visual objects and/or regions of interest to be identified and analyzed. To this aim, the well-known connected component labeling (CCL) and connected component analysis (CCA) operations are typically performed [1,2]. While the generic input binary image undergoes its first raster scan to be provisionally labeled, the previously processed frame is scanned for the final step, and its output is transferred to the appropriate destination, maximizing the achievable throughput. Both input and output data bandwidths are kept limited, and both read and write accesses to the image memory are kept regular. The results demonstrate that the proposed parallel CCL architecture can process at least 1.88 pixels per clock cycle, a throughput that none of the existing counterparts reaches. This advantage is obtained with an on-chip memory requirement ranging from 0.375 to 104 kb, which is significantly lower than that of the referenced competitors.
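
For readers unfamiliar with the two-scan scheme mentioned above (a first raster scan that assigns provisional labels, then a final scan that resolves them), here is a minimal software sketch of the classical sequential algorithm. It is not the paper's parallel hardware architecture: the 8-connectivity choice, the union-find equivalence table, and the name `label_two_pass` are all assumptions made for illustration.

```python
def label_two_pass(image):
    """Classical two-scan labeling of a binary image (list of lists of 0/1).

    Assumes 8-connectivity. Returns a same-sized grid where background is 0
    and each connected object carries one label, numbered 1, 2, ... in
    raster order of first appearance.
    """
    rows, cols = len(image), len(image[0])
    labels = [[0] * cols for _ in range(rows)]
    parent = [0]  # parent[i]: union-find parent of provisional label i

    def find(x):
        # Follow parents to the representative, halving the path on the way.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    def union(a, b):
        ra, rb = find(a), find(b)
        if ra != rb:
            parent[max(ra, rb)] = min(ra, rb)

    # First scan: provisional labels from already-visited neighbours
    # (left, upper-left, upper, upper-right), recording equivalences.
    for r in range(rows):
        for c in range(cols):
            if not image[r][c]:
                continue
            seen = []
            for dr, dc in ((0, -1), (-1, -1), (-1, 0), (-1, 1)):
                rr, cc = r + dr, c + dc
                if rr >= 0 and 0 <= cc < cols and labels[rr][cc]:
                    seen.append(labels[rr][cc])
            if not seen:
                parent.append(len(parent))  # new provisional label
                labels[r][c] = len(parent) - 1
            else:
                smallest = min(seen)
                labels[r][c] = smallest
                for other in seen:
                    union(smallest, other)

    # Second scan: replace each provisional label with a final one.
    final = {}
    for r in range(rows):
        for c in range(cols):
            if labels[r][c]:
                root = find(labels[r][c])
                labels[r][c] = final.setdefault(root, len(final) + 1)
    return labels
```

In the U-shaped image `[[1,0,1],[1,0,1],[1,1,1]]`, the two arms receive different provisional labels during the first scan and are merged only when the bottom row is reached, which is why the second scan is needed.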

Background and Related Works
Introducing
The Hardware Architecture of the Novel Parallel Labeling Approach
Results
Conclusions