Abstract

Applications on MPPs often require a high aggregate bandwidth of low-latency I/O to secondary storage. This requirement can be met by internal parallel I/O subsystems that comprise dedicated nodes, each with processor, memory, and disks.

Massively parallel processors (MPPs), encompassing from tens to thousands of processors, are emerging as a major architecture for high-performance computers. Most major computer vendors offer computers with some degree of parallelism, and many smaller vendors specialize in producing MPPs. These machines are targeted at both grand-challenge problems and general-purpose computing.

As with any computer, MPP architectural design must balance computation, memory bandwidth and capacity, communication capabilities, and I/O. In the past, most design research focused on the basic compute and communications hardware and software. This led to unbalanced computers that had relatively poor performance. Recently, researchers have focused on designing hardware and software for I/O subsystems in MPPs. Consequently, most current MPPs have an architecture based on an internal parallel I/O subsystem (the "Architectures with parallel I/O" sidebar describes some examples). In these computers, this subsystem encompasses a collection of nodes, each managing and providing access to a set of disks. These nodes connect to the other nodes in the system by the same switching network that connects the compute nodes.

In this article we'll examine why many MPPs use parallel I/O subsystems, what architecture is best for such a subsystem, and how to implement the subsystem. We'll also discuss how parallel file systems and their user interfaces can exploit the parallel I/O to provide enhanced services to applications.

The systems discussed in this article are mostly tightly coupled distributed-memory MIMD (multiple-instruction, multiple-data) MPPs. In some cases, we also discuss shared-memory and SIMD (single-instruction, multiple-data) machines. We'll discuss three node types. Compute nodes are optimized to perform floating-point and numeric calculations, and have no local disk except perhaps for paging, booting, and operating-system software. I/O nodes contain the system's secondary storage and provide the parallel file-system services. Gateway nodes provide connectivity to external data servers and mass-storage systems. In some cases, individual nodes can serve as more than one type. For example, the same nodes often handle both I/O and gateway functions. The "Terminology" sidebar defines some other terms used in this article.
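
The point that one node can serve as more than one type can be illustrated with a small sketch. This is only an illustration under stated assumptions: the `NodeRole` type, the `nodes_with` helper, the node names, and the eight-node layout are all hypothetical and do not describe any machine from the article.

```python
from enum import Flag, auto


class NodeRole(Flag):
    """Roles a node may play; members combine with | because one node can hold several."""
    COMPUTE = auto()
    IO = auto()
    GATEWAY = auto()


# Hypothetical eight-node machine (an assumption for illustration only):
# six compute nodes, plus two nodes that each act as both I/O and gateway.
nodes = {f"node{i}": NodeRole.COMPUTE for i in range(6)}
nodes["node6"] = NodeRole.IO | NodeRole.GATEWAY
nodes["node7"] = NodeRole.IO | NodeRole.GATEWAY


def nodes_with(role, machine):
    """Return the sorted names of all nodes whose role set includes `role`."""
    return sorted(name for name, roles in machine.items() if role in roles)
```

Here, asking for the I/O nodes and asking for the gateway nodes returns the same two names, which mirrors the case where the same nodes handle both functions.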
