Abstract

External sorting is a frequent operation in relational database systems, sometimes as a step in important operations such as joins. External sorting on a parallel system is therefore a key indicator of system performance for database applications. However, external sorting on multicomputers is not as well understood as parallel internal sorting, in which the keys reside in main memory. In many cases, analysis is performed under assumptions such as unlimited resources (number of processors, amount of memory, network bandwidth) and fully overlapped use of resources, which limits its applicability in practice. External sorting typically involves the creation of multiple sorted runs (Step 1) followed by merging of the sorted runs (Step 2). In this paper, we present an analytical model for Step 1 using pipelined sort on a message-passing multicomputer with shared disks. The model includes parameters representing system configuration, performance of system components, software-related choices, and problem size. The execution time predicted by the model is compared with experimental results on a transputer-based system recently reported by the authors [5]. Based on the model, the impact of system scale-up and of faster components is investigated. The model is general enough for use in benchmarking other message-based machines.
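
The two-step structure named above (run creation, then merging) can be illustrated with a minimal single-process sketch. It is not the pipelined parallel sort modeled in the paper; it assumes integer keys, one key per line in each run file, and a hypothetical `run_size` parameter standing in for the amount of main memory available. All function names are illustrative.

```python
import heapq
import os
import tempfile
from typing import Iterable, Iterator, List


def create_runs(keys: Iterable[int], run_size: int, workdir: str) -> List[str]:
    """Step 1: cut the input into chunks of at most run_size keys,
    sort each chunk in memory, and write it to disk as a sorted run."""
    paths: List[str] = []
    chunk: List[int] = []

    def flush() -> None:
        chunk.sort()
        fd, path = tempfile.mkstemp(dir=workdir, suffix=".run")
        with os.fdopen(fd, "w") as f:
            f.writelines(f"{k}\n" for k in chunk)
        paths.append(path)
        chunk.clear()

    for key in keys:
        chunk.append(key)
        if len(chunk) == run_size:
            flush()
    if chunk:  # final, possibly shorter, run
        flush()
    return paths


def read_run(path: str) -> Iterator[int]:
    """Stream the keys of one sorted run back from disk."""
    with open(path) as f:
        for line in f:
            yield int(line)


def merge_runs(paths: List[str]) -> Iterator[int]:
    """Step 2: k-way merge of the sorted runs."""
    return heapq.merge(*(read_run(p) for p in paths))


def external_sort(keys: Iterable[int], run_size: int) -> List[int]:
    """Run both steps; the result is materialized only for demonstration."""
    with tempfile.TemporaryDirectory() as workdir:
        paths = create_runs(keys, run_size, workdir)
        return list(merge_runs(paths))
```

For example, `external_sort([5, 3, 9, 1, 7, 2, 8], run_size=3)` writes three runs in Step 1 and merges them in Step 2. In the setting of the abstract, Step 1 would instead be distributed across processors that exchange messages and share disks, which is the part the analytical model covers.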
