Interconnection networks form the backbone of scalable computer systems, including high-performance computers and datacenters. High-performance networks must provide high throughput and bandwidth while delivering messages with minimal latency. As the processors attached to the network inject more messages, the network load rises and can cause two harmful conditions: network deadlock, where messages are blocked in the network and cannot progress, and an exponential increase in end-to-end latency, as messages must wait in the network for prior messages to be delivered. The network architect’s job is to deliver high bandwidth and low latency for the largest network load possible, but with limited cost and power.
Before the 1990s, scalable computing systems were typically built with two separate networks to avoid deadlock. These networks were often assigned to “request” and “reply” traffic classes, and messages in separate classes did not interfere as they progressed through the network. Each network had its own wires to ensure that reply messages did not get stuck behind request messages. Unfortunately, this approach increases network cost by doubling the number of wires, and can leave links idle as congestion rises.
In “Virtual Channel Flow Control” (IEEE Trans. Parallel and Distributed Systems, vol. 3, no. 2, 1992, pp. 194–205), William J. Dally proposed an alternate approach to alleviate deadlock and provide better network performance. Virtual channels provide two or more virtualized networks that share the physical wire links and routers, but have their own buffer storage for in-flight messages. In modern interconnection networks, messages are decomposed into flow-control digits (flits) that transit the network like train cars. When a message reaches a congested point, its flits reside in the virtual channel buffers, leaving the links free for messages belonging to other virtual channels.
The decoupling of buffering from links in virtual channels enables deadlock-free routing algorithms and the passing of blocked messages in the network, using otherwise idle link resources. With virtual channels, a system doesn’t incur the link cost of multiple networks and can more easily multiplex traffic from different flows over a single set of physical links.
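The passing behavior described above can be illustrated with a toy model. The sketch below is not Dally's router design. It assumes a single physical link that carries one flit per cycle, a small per-channel buffer on the receiving side, and a message that is marked as blocked downstream. The function name `simulate` and its parameters (`num_vcs`, `blocked`, `buf_depth`) are hypothetical names chosen for this illustration.

```python
from collections import deque


def simulate(num_vcs, messages, blocked, cycles=20, buf_depth=2):
    """Toy model of one physical link shared by `num_vcs` virtual channels.

    messages: list of (name, num_flits); message i is assigned to VC i % num_vcs.
    blocked:  names of messages whose flits cannot leave the downstream buffer.
    Returns the flits, as (name, index) pairs, delivered within `cycles` cycles.
    """
    # Each virtual channel has its own upstream queue and downstream buffer,
    # but all channels share the same physical link.
    upstream = [deque() for _ in range(num_vcs)]
    downstream = [deque() for _ in range(num_vcs)]
    for i, (name, num_flits) in enumerate(messages):
        upstream[i % num_vcs].extend((name, k) for k in range(num_flits))

    delivered = []
    next_vc = 0  # round-robin pointer for link arbitration
    for _ in range(cycles):
        # Drain step: a downstream buffer ejects its head flit unless that
        # flit belongs to a blocked message.
        for buf in downstream:
            if buf and buf[0][0] not in blocked:
                delivered.append(buf.popleft())

        # Link step: at most one flit crosses the physical link per cycle.
        # A channel may send only if its own downstream buffer has space.
        for offset in range(num_vcs):
            vc = (next_vc + offset) % num_vcs
            if upstream[vc] and len(downstream[vc]) < buf_depth:
                downstream[vc].append(upstream[vc].popleft())
                next_vc = (vc + 1) % num_vcs
                break
    return delivered


if __name__ == "__main__":
    msgs = [("A", 3), ("B", 3)]  # message A is blocked downstream
    print("1 channel :", simulate(1, msgs, blocked={"A"}))
    print("2 channels:", simulate(2, msgs, blocked={"A"}))
```

With one channel, message B queues behind the blocked message A and the link goes idle once the shared buffer fills. With two channels, A's flits wait in their own buffer while B's flits use the otherwise idle link. The model shows only the passing of a blocked message; it does not model deadlock or a deadlock-free routing algorithm.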