Abstract

A multicomputer is a loosely coupled multiprocessor architecture, also called a cluster computer or simply a cluster. It may be assembled from commercially available PCs and networking hardware as a low-cost alternative to rack-mounted blades interconnected through a high-speed backplane, or to a high-performance, tightly coupled, shared-memory multiprocessor. Parallel software applications exploit parallelism in a cluster by distributing chunks of data among its remote nodes for simultaneous processing. Each remote node processes its chunk of data in parallel with the others and returns its partial result to the master node for combination into a final result. When the chunk size, or system granularity, is just right, the network-transit time and the remote-processing time of the data are in balance; the distributed application runs at maximum efficiency and the system granularity is said to be optimal. That said, one wonders why there are no tools explicitly designed to help parallel software designers and performance analysts determine the optimal granularity of a given cluster.

We present a back-to-basics, application-level program named McGAT, for Multi-computer Granularity Assessment Tool. The user specifies a range of chunk sizes to analyze, along with an estimate of a remote node's processing time for a chunk of a particular size. McGAT is not a simulation, but a parallel application that runs on the user's hardware, distributing chunks of data to each of the cluster's remote nodes. Each node responds after the amount of time needed to process that data, according to the user's processing-time estimates. McGAT reports system throughput for each chunk size over the user-specified range. The chunk size with the maximum throughput indicates the optimal granularity of the system on which it is run, which is a key component of parallel software design. The user may then choose a chunk size that provides maximum throughput within the requirements of his or her specific application.

Our presentation includes graphs comparing the results from runs on a 3-, 4-, and 5-node Linux-based PC cluster, interconnected through both 10 and 100 Mbps Ethernet. We plan to expand our knowledge base to include larger numbers of nodes, and a virtual ring to complement our virtual star communications protocol.

McGAT may be used to establish a performance baseline, and again later to quantify the effect of any subsequent changes to the cluster. McGAT is offered as open source software, and a beta release for Linux-based platforms is available for download from the DSRLab website under the GNU General Public License.
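The idea of sweeping chunk sizes and picking the one with the highest throughput can be illustrated with a small sketch. This is not McGAT, which measures throughput on real hardware; it is a toy analytical model written for illustration only. The function names, the single shared link from the master, the fixed per-message latency, the linear processing-time estimate, and the omission of result-return traffic are all assumptions that do not come from the paper.

```python
import heapq
import math


def estimate_throughput(chunk_bytes, total_bytes, nodes,
                        bandwidth_bps, latency_s, proc_s_per_byte):
    """Toy estimate of bytes per second for one chunk size.

    Assumptions (hypothetical, not McGAT's method): the master sends one
    chunk at a time over a shared link, every chunk is treated as full
    size, and the return of partial results is ignored.
    """
    n_chunks = math.ceil(total_bytes / chunk_bytes)
    send_time = latency_s + chunk_bytes * 8 / bandwidth_bps  # network transit
    proc_time = chunk_bytes * proc_s_per_byte                # remote processing

    node_idle_at = [0.0] * nodes  # min-heap: when each remote node is idle
    link_free_at = 0.0            # when the master's link is free again
    finished_at = 0.0

    for _ in range(n_chunks):
        idle_at = heapq.heappop(node_idle_at)  # next node to become idle
        # A chunk is sent once both the link and the target node are free.
        arrives_at = max(link_free_at, idle_at) + send_time
        link_free_at = arrives_at
        done_at = arrives_at + proc_time
        heapq.heappush(node_idle_at, done_at)
        finished_at = max(finished_at, done_at)

    return total_bytes / finished_at


def sweep_chunk_sizes(chunk_sizes, **cluster):
    """Map each chunk size to its estimated throughput."""
    return {c: estimate_throughput(c, **cluster) for c in chunk_sizes}


if __name__ == "__main__":
    sizes = [2 ** k for k in range(10, 25)]  # 1 KiB to 16 MiB
    results = sweep_chunk_sizes(
        sizes,
        total_bytes=2 ** 24,     # 16 MiB of work (made-up figure)
        nodes=4,
        bandwidth_bps=100e6,     # 100 Mbps Ethernet
        latency_s=1e-3,          # assumed per-message overhead
        proc_s_per_byte=1e-7,    # assumed processing-time estimate
    )
    best = max(results, key=results.get)
    for size in sizes:
        marker = "  <- best" if size == best else ""
        print(f"{size:>10d} B  {results[size] / 1e6:8.2f} MB/s{marker}")
```

In this model, very small chunks are dominated by per-message overhead and very large chunks leave nodes idle, so the estimated throughput peaks somewhere in between. A measured sweep on an actual cluster, as McGAT performs, may look quite different.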

