Abstract
We present efficient sequential and parallel algorithms for the maximum sum (MS) problem, which is to maximize the sum of the data values covered by some shape in the data array. We deal with two MS problems: the maximum subarray (MSA) problem and the maximum convex sum (MCS) problem. In the MSA problem, we find a rectangular part of the given data array that maximizes the sum of its elements. The MCS problem is to find a convex shape, rather than a rectangular one, that maximizes the sum; thus, MCS is a generalization of MSA. For the MSA problem, O(n)-time parallel algorithms are already known on an (n, n) 2D array of processors. We improve the number of communication steps from 2n − 1 to n, which is optimal. For the MCS problem, we achieve the asymptotic time bound of O(n) on an (n, n) 2D array of processors. We provide rigorous proofs of the correctness of our parallel algorithm based on Hoare logic and also provide some experimental results of our algorithm gathered from the Blue Gene/P supercomputer. Furthermore, we briefly describe how to compute the actual shape of the maximum convex sum.
Highlights
We face the challenge of processing huge amounts of data in the age of information explosion and big data [1]
We provide rigorous proofs of the correctness of our parallel algorithm based on Hoare logic and provide some experimental results of our algorithm gathered from the Blue Gene/P supercomputer
In the later sections, we show how these programming techniques lead to efficient parallel algorithms for the maximum convex sum (MCS) problem
Summary
We face the challenge of processing huge amounts of data in the age of information explosion and big data [1]. We implement algorithms for the MSA and MCS problems based on the column-wise prefix sum on the 2D mesh architecture, as shown in Figure 2. This architecture is known as a systolic array, where each processing unit has a constant number of registers and is permitted to communicate only with directly-connected neighbours. This seemingly inflexible architecture is well suited to implementation on ASICs or FPGAs. Our most efficient parallel algorithms complete the computation in n − 1 communication steps for the MSA problem and 4n communication steps for the MCS problem, respectively.
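To illustrate the role of the column-wise prefix sum, the following is a minimal sequential sketch of the MSA computation, not the paper's parallel implementation: prefix sums over columns let each row strip be reduced to a 1D maximum-sum problem, which is then solved by Kadane's scan. The function name `max_subarray_sum` is our own choice for illustration.

```python
def max_subarray_sum(a):
    """Return the maximum sum over all rectangular subarrays of a.

    a: m x n array given as a list of lists of numbers.
    Runs in O(m^2 * n) time: for each pair of row boundaries,
    column-wise prefix sums give the column totals of the strip
    in O(1) each, and a 1D Kadane scan finds the best columns.
    """
    m, n = len(a), len(a[0])
    # prefix[i][j] = sum of column j over rows 0 .. i-1
    prefix = [[0] * n for _ in range(m + 1)]
    for i in range(m):
        for j in range(n):
            prefix[i + 1][j] = prefix[i][j] + a[i][j]

    best = a[0][0]
    for top in range(m):
        for bottom in range(top + 1, m + 1):
            # strip[j] = sum of column j restricted to rows top .. bottom-1
            strip = [prefix[bottom][j] - prefix[top][j] for j in range(n)]
            # 1D Kadane scan over the strip
            cur = strip[0]
            best = max(best, cur)
            for j in range(1, n):
                cur = max(strip[j], cur + strip[j])
                best = max(best, cur)
    return best
```

The parallel version described in the paper distributes this work over the (n, n) mesh, with each column of processors maintaining its prefix sums locally so that only neighbour-to-neighbour communication is needed.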