Quicksort is a well-known sorting algorithm used in solving scientific problems. It divides an input array into smaller subarrays and sorts them recursively. In its conventional form, however, the algorithm sorts the input data using only one thread, so it can be reimplemented to execute in parallel and improve sorting performance. This study develops a parallel sorting algorithm that uses a block-based concept to partition data, and optimizes it by tuning the OpenMP scheduling type. The proposed algorithm, termed the parallel deque-free partition dual-deque merge sorting algorithm (DFPDMSort), is executed on a shared memory architecture. Deque-free partitioning (DFPar) is a parallel block-based partitioning algorithm that partitions an array in blocks without addition operations, and dual-deque merge is a merging algorithm that merges blocks of data into the correct position and sorting steps. To improve performance, the algorithm is implemented with OpenMP, an application programming interface that supports parallel programming on shared memory architectures. The experiments were run on an Intel Core i7 machine running the Ubuntu Linux operating system. The results show that DFPDMSort is faster than benchmarks such as the multi-way merge sort algorithm on large data sizes. The performance of the deque-free partitioning step is optimized through the OpenMP scheduling type and its chunk size. Moreover, the parameters of this algorithm, such as block size and cutoff size, can serve as guidelines for improving the performance of other parallel block-based algorithms, because these parameters affect the algorithm's cache misses and branch load misses.
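To make the idea of block-based partitioning under an OpenMP scheduling type more concrete, the following is a minimal C++ sketch. It is not the DFPar algorithm itself: it is a simplified, out-of-place stand-in that counts and then scatters elements block by block. The function name `block_partition` and the parameters `block_size` and `chunk` are hypothetical and chosen only for illustration. The `schedule(dynamic, chunk)` clause shows where a scheduling type and chunk size would be selected.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper (illustrative only, not the paper's DFPar):
// partitions `in` around `pivot` into `out`, working one block at a time.
// Returns the split index s such that out[0..s) < pivot and out[s..n) >= pivot.
std::size_t block_partition(const std::vector<int>& in, std::vector<int>& out,
                            int pivot, std::size_t block_size, int chunk) {
    assert(block_size > 0 && chunk > 0);
    const std::size_t n = in.size();
    const std::size_t num_blocks = (n + block_size - 1) / block_size;
    const std::ptrdiff_t nb = static_cast<std::ptrdiff_t>(num_blocks);
    out.resize(n);

    // Pass 1 (parallel): count the elements smaller than the pivot in each block.
    // The scheduling type and chunk size are the tunable part of this loop.
    std::vector<std::size_t> less(num_blocks, 0);
    #pragma omp parallel for schedule(dynamic, chunk)
    for (std::ptrdiff_t b = 0; b < nb; ++b) {
        const std::size_t begin = static_cast<std::size_t>(b) * block_size;
        const std::size_t end = (begin + block_size < n) ? begin + block_size : n;
        std::size_t count = 0;
        for (std::size_t i = begin; i < end; ++i) {
            if (in[i] < pivot) ++count;
        }
        less[static_cast<std::size_t>(b)] = count;
    }

    // Pass 2 (sequential): exclusive prefix sums give each block its own
    // disjoint output ranges on the low side and on the high side.
    std::size_t total_less = 0;
    for (std::size_t b = 0; b < num_blocks; ++b) total_less += less[b];
    std::vector<std::size_t> lo_start(num_blocks), hi_start(num_blocks);
    std::size_t lo = 0, hi = total_less;
    for (std::size_t b = 0; b < num_blocks; ++b) {
        const std::size_t begin = b * block_size;
        const std::size_t end = (begin + block_size < n) ? begin + block_size : n;
        lo_start[b] = lo;
        hi_start[b] = hi;
        lo += less[b];
        hi += (end - begin) - less[b];
    }

    // Pass 3 (parallel): each block scatters its elements into its own ranges,
    // so no two threads write to the same output position.
    #pragma omp parallel for schedule(dynamic, chunk)
    for (std::ptrdiff_t b = 0; b < nb; ++b) {
        const std::size_t ub = static_cast<std::size_t>(b);
        const std::size_t begin = ub * block_size;
        const std::size_t end = (begin + block_size < n) ? begin + block_size : n;
        std::size_t l = lo_start[ub], h = hi_start[ub];
        for (std::size_t i = begin; i < end; ++i) {
            if (in[i] < pivot) out[l++] = in[i];
            else out[h++] = in[i];
        }
    }
    return total_less;
}
```

Compiled with OpenMP enabled (for example `-fopenmp` with GCC), the two annotated loops are distributed across threads; without it, the pragmas are ignored and the same code runs sequentially. In this sketch, `block_size` determines how much contiguous data each loop iteration touches, while `chunk` determines how many blocks a thread claims at a time.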