Abstract
Sorting is an important component of many applications, and parallel sorting algorithms have been studied extensively in the last three decades. One of the earliest parallel sorting algorithms is bitonic sort, which is represented by a sorting network consisting of multiple butterfly stages. The paper studies bitonic sort on modern parallel machines which are relatively coarse grained and consist of only a modest number of nodes, thus requiring the mapping of many data elements to each processor. Under such a setting optimizing the bitonic sort algorithm becomes a question of mapping the data elements to processing nodes (data layout) such that communication is minimized. The authors developed a bitonic sort algorithm which minimizes the number of communication steps and optimizes the local computation. The resulting algorithm is faster than previous implementations, as experimental results collected on a 64 node Meiko CS-2 show.
Talk to us
Join us for a 30 min session where you can share your feedback and ask us any queries you have
Disclaimer: All third-party content on this website/platform is and will remain the property of their respective owners and is provided on "as is" basis without any warranties, express or implied. Use of third-party content does not indicate any affiliation, sponsorship with or endorsement by them. Any references to third-party content is to identify the corresponding services and shall be considered fair use under The CopyrightLaw.