Abstract

Superpixel segmentation algorithms are widely used in image processing. Because image data has grown rapidly in recent years, a large-scale image often exceeds the memory of a single machine. This makes sequential superpixel segmentation methods hard to apply, even though the algorithms themselves scale well. Segmenting large-scale images over a distributed cluster is a feasible solution. Nevertheless, it is challenging to port sequential superpixel algorithms directly to a distributed environment, because objects that cross tile borders are usually left incomplete. To overcome this incomplete-object problem, we propose a distributed strategy based on the sequential SLIC superpixel segmentation algorithm, running on a cluster managed by Apache Spark. In our research, the decomposed image tiles were divided into two categories: even tiles and odd tiles. The even tiles were segmented first by the SLIC algorithm, and their cluster centers and buffer sizes were then extracted and passed to the odd tiles. During the shuffle stage, the odd tiles acquired pixels from adjacent even tiles according to the buffer sizes, and the buffered odd tiles were then segmented by the SLIC algorithm with the help of the shared cluster centers. The superpixels with shared cluster centers, which were generated in the even tiles, were retained and used to enlarge the odd tiles, so that incomplete superpixels are corrected without redundantly recomputing specific areas. The strategy uses shared variables to transmit intermediate results, and the shuffle operations involve only about half of the image tiles, which further reduces communication.
The distributed strategy was evaluated in terms of accuracy and execution efficiency. The evaluation showed that the proposed strategy not only achieves better F-measure values but also runs faster than the repeat-calculation strategy, especially when computing resources are limited. Therefore, the proposed strategy is more suitable for superpixel segmentation algorithms. In addition, this research offers experience in extending the many existing sequential algorithms to distributed environments and provides further options for large-scale image processing.
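
The even/odd tile decomposition described above can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not say how tiles are assigned to the two categories, so the checkerboard rule used here (parity of row index plus column index) is an assumption, as are the hypothetical helper names `split_tiles` and `buffer_tile` and the use of a single fixed buffer width. In the paper the buffer sizes are derived from the even-tile segmentation. The sketch uses plain Python and leaves out Spark and SLIC entirely.

```python
from typing import List, Tuple

# A tile is (row0, col0, row1, col1) with end-exclusive bounds.
Tile = Tuple[int, int, int, int]


def split_tiles(height: int, width: int, tile: int) -> Tuple[List[Tile], List[Tile]]:
    """Cut an image extent into square tiles and label them even or odd.

    Assumption: tiles are labeled by checkerboard parity, so every
    4-neighbor of an odd tile is an even tile.
    """
    even: List[Tile] = []
    odd: List[Tile] = []
    for i, r0 in enumerate(range(0, height, tile)):
        for j, c0 in enumerate(range(0, width, tile)):
            box = (r0, c0, min(r0 + tile, height), min(c0 + tile, width))
            if (i + j) % 2 == 0:
                even.append(box)
            else:
                odd.append(box)
    return even, odd


def buffer_tile(box: Tile, buf: int, height: int, width: int) -> Tile:
    """Grow a tile by `buf` pixels on every side, clipped to the image.

    Assumption: one uniform buffer width stands in for the per-tile
    buffer sizes that the paper extracts from the even tiles.
    """
    r0, c0, r1, c1 = box
    return (max(r0 - buf, 0), max(c0 - buf, 0),
            min(r1 + buf, height), min(c1 + buf, width))


if __name__ == "__main__":
    even, odd = split_tiles(height=4, width=4, tile=2)
    print("even tiles:", even)
    print("odd tiles:", odd)
    print("buffered odd tile:", buffer_tile(odd[0], buf=1, height=4, width=4))
```

Under this assumed layout, only the odd tiles need pixels from their neighbors, which matches the abstract's statement that shuffle operations involve roughly half of the tiles.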

