Abstract

We present efficient sequential and parallel algorithms for the maximum sum (MS) problem, which is to find a shape in a data array that maximizes the sum of the values it covers. We deal with two MS problems: the maximum subarray (MSA) problem and the maximum convex sum (MCS) problem. In the MSA problem, we find a rectangular part of the given data array that maximizes the sum of its elements. The MCS problem is to find a convex shape, rather than a rectangular one, that maximizes the sum. Thus, MCS is a generalization of MSA. For the MSA problem, O(n) time parallel algorithms are already known on an (n, n) 2D array of processors. We improve the number of communication steps from 2n − 1 to n, which is optimal. For the MCS problem, we achieve the asymptotic time bound of O(n) on an (n, n) 2D array of processors. We provide rigorous proofs of the correctness of our parallel algorithm based on Hoare logic, and we report experimental results for our algorithm gathered on the Blue Gene/P supercomputer. Furthermore, we briefly describe how to compute the actual shape of the maximum convex sum.

Highlights

  • We face the challenge of processing large amounts of data in the age of information explosion and big data [1]

  • We provide rigorous proofs of the correctness of our parallel algorithm based on Hoare logic and report experimental results for our algorithm gathered on the Blue Gene/P supercomputer

  • We show how the programming techniques lead to efficient parallel algorithms for the maximum convex sum (MCS) problem in the later sections


Summary

Introduction

We face the challenge of processing large amounts of data in the age of information explosion and big data [1]. We implement algorithms for the MSA and MCS problems based on the column-wise prefix sum on the 2D mesh architecture, as shown in Figure 2. This architecture is known as a systolic array, where each processing unit has a constant number of registers and is permitted to communicate only with directly connected neighbours. This seemingly inflexible architecture is well suited to implementation on ASICs or FPGAs. Our most efficient parallel algorithms complete the computation in n − 1 communication steps for the MSA problem and 4n communication steps for the MCS problem, respectively.
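To make the role of the column-wise prefix sum concrete, the following is a minimal sequential sketch of the MSA problem in Python. It is not the systolic-array algorithm of the paper: it assumes a plain list-of-lists input, and the function names `column_prefix_sums` and `max_subarray` are hypothetical. It fixes a pair of rows, reads each column's strip sum from the prefix table, and applies the standard one-dimensional maximum-sum scan across the columns, which takes O(n^3) time on an (n, n) array.

```python
from typing import List, Tuple


def column_prefix_sums(a: List[List[int]]) -> List[List[int]]:
    """Return col where col[i][j] = a[0][j] + ... + a[i-1][j] (row 0 is all zeros)."""
    m, n = len(a), len(a[0])
    col = [[0] * n for _ in range(m + 1)]
    for i in range(m):
        for j in range(n):
            col[i + 1][j] = col[i][j] + a[i][j]
    return col


def max_subarray(a: List[List[int]]) -> Tuple[int, Tuple[int, int, int, int]]:
    """Return (best_sum, (top, left, bottom, right)) with inclusive indices.

    Sequential illustration only; the rectangle must be non-empty.
    """
    m, n = len(a), len(a[0])
    col = column_prefix_sums(a)
    best = a[0][0]
    best_rect = (0, 0, 0, 0)
    for top in range(m):
        for bottom in range(top, m):
            running = 0  # best sum of a rectangle ending at the current column
            left = 0     # left edge of that rectangle
            for right in range(n):
                # Sum of column `right` restricted to rows top..bottom.
                strip = col[bottom + 1][right] - col[top][right]
                if running <= 0:
                    # A non-positive prefix cannot help; restart here.
                    running = strip
                    left = right
                else:
                    running += strip
                if running > best:
                    best = running
                    best_rect = (top, left, bottom, right)
    return best, best_rect


if __name__ == "__main__":
    data = [[1, -2, 3],
            [-4, 5, 6],
            [7, -8, 9]]
    print(max_subarray(data))
```

For the sample array, the best rectangle is the rightmost column (3 + 6 + 9 = 18). The prefix table lets each strip sum be read in constant time, which is the same quantity the mesh algorithms propagate between neighbouring processing units.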

Parallel Algorithms for the MSA Problem
Sequential Algorithm
Parallel Algorithm 1
Parallel Algorithm 2
Parallel Algorithm 3
Review of Sequential Algorithm for the MCS Problem
Improved Sequential Algorithm
Parallel Algorithm for MCS
Computation of the Boundary
Implementation
Lower Bound
Findings
Concluding Remarks
