Abstract

Data stream partitioning is a fundamental mechanism in distributed systems. However, an inappropriate partition scheme can produce data skew, which degrades the execution efficiency of many application tasks. Skewed partitions usually take longer to process, need more computational resources to complete the task, and can even become a performance bottleneck. To solve such data skew issues, this paper proposes a novel partition method that divides image tiles uniformly into partitions on demand. The partitioning problem is transformed into a uniform and compact clustering problem in which the image tiles are regarded as image pixels without spectrum and texture information. First, the equal area conversion principle is used to select the seed points of the partitions, and the image tiles are then aggregated in an image layout to obtain an initial partition scheme. Second, the image tiles of the initial partition are finely adjusted in the vertical and horizontal directions, in separate steps, to achieve a uniform distribution among the partitions. The proposed method was evaluated against two traditional partition methods in image segmentation, data shuffle, and image clipping tests. The results demonstrated that the proposed partition method solved the data skew problem observed in the hash partition method. In addition, the method is designed specifically for image tiles and makes the related processing operations for large-scale images faster and more efficient.
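The skew that motivates the method can be illustrated with a small sketch. The Python code below is not the paper's algorithm: it compares a hypothetical hash partitioner (a 31-multiplier hash of the tile's row and column, chosen here only for illustration) with a simplified stand-in for uniform partitioning that walks the tile grid in serpentine order and cuts it into runs whose sizes differ by at most one tile. The function names, the hash, and the serpentine ordering are all assumptions; the seed selection by equal area conversion and the vertical and horizontal fine adjustment described above are not reproduced.

```python
from collections import Counter


def tile_grid(rows, cols):
    """Return the (row, col) indices of every tile in a rows x cols layout."""
    return [(r, c) for r in range(rows) for c in range(cols)]


def hash_partition(tiles, num_partitions):
    """Assign each tile by a hypothetical hash of its key (assumption, not the paper's)."""
    return {t: (31 * t[0] + t[1]) % num_partitions for t in tiles}


def uniform_partition(tiles, num_partitions):
    """Simplified stand-in: cut a serpentine walk of the grid into near-equal runs."""
    # Even rows are walked left to right, odd rows right to left, so that
    # consecutive tiles in the ordering are neighbours in the layout.
    ordered = sorted(tiles, key=lambda t: (t[0], t[1] if t[0] % 2 == 0 else -t[1]))
    base, extra = divmod(len(ordered), num_partitions)
    assignment = {}
    start = 0
    for p in range(num_partitions):
        size = base + (1 if p < extra else 0)  # first `extra` partitions get one more tile
        for t in ordered[start:start + size]:
            assignment[t] = p
        start += size
    return assignment


def partition_sizes(assignment, num_partitions):
    """Count the tiles assigned to each partition, including empty partitions."""
    counts = Counter(assignment.values())
    return [counts.get(p, 0) for p in range(num_partitions)]


if __name__ == "__main__":
    tiles = tile_grid(4, 4)
    print("hash:   ", partition_sizes(hash_partition(tiles, 6), 6))
    print("uniform:", partition_sizes(uniform_partition(tiles, 6), 6))
```

In this toy 4 x 4 layout with six partitions, the largest hash partition holds twice as many tiles as the smallest, while the serpentine split keeps every partition within one tile of the others. Whether a real hash partitioner skews depends on the key distribution and the hash function, so the numbers here only illustrate the effect.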
