Abstract

The Communication Machine brings to the multicomputer what vectorization brought to the uniprocessor. It provides the same tools to speed communication that have traditionally been used to speed computation; namely, the capability to program optimal communication algorithms on an architecture that can, to the extent possible, replicate their performance in terms of wall-clock time. In addition to the usual complement of logic and arithmetic units, each module contains a programmable communication unit that orchestrates traffic between the network and registers that communicate directly with comparable registers in neighboring modules. Communication tasks are performed out of these registers like computational tasks on a vector uniprocessor. The architecture is balanced in the sense that, on average, the speed of local and global memory is comparable. Theoretical performance is tabulated for both hypercube and mesh interconnection networks. The Communication Machine returns to the somewhat beleaguered, yet intuitive concept that the performance we ultimately seek must come from a truly massive number of processors.
