Abstract

This article discusses an architecture built on the paradigm of exploiting all the parallelism available in a computation. The user specifies only which calculations can run as parallel threads over shared memory, guided solely by the chosen algorithm. This makes it possible to generate the maximum flow of memory accesses inherent in the algorithm. When a thread must read a shared memory cell and only then write a new value to it, the user can assume that conflicts are resolved by the hardware memory access control. The proposed architecture targets the same class of problems as the EMU and PIUMA architectures, but uses "smart" controllers of shared memory blocks to synchronize threads and implement atomic operations. A large flow of accesses to distributed shared memory requires energy-efficient routing. This paper proposes arithmetic routing, which is applicable to any communication fabric, including Dragonfly graphs and graphs with the minimum possible average path length for a given number of vertices N and vertex degree v. An addressing and routing algorithm is proposed that provides energy-efficient access to distributed shared memory. The routing also supports fault-tolerant operation by selecting alternative routes.
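
The read-then-write case mentioned in the abstract can be illustrated with a rough software analogy. The Python sketch below is not the hardware design from the paper: `MemoryBlockController`, `atomic_update`, and `run_demo` are hypothetical names, and a lock stands in for a controller that serializes requests to its memory block. Under those assumptions, it shows threads issuing read-modify-write requests without doing any conflict resolution themselves.

```python
import threading


class MemoryBlockController:
    """Hypothetical software stand-in for a "smart" shared-memory block controller."""

    def __init__(self, size):
        self._cells = [0] * size
        # Assumption: the lock models the controller handling one request at a time.
        self._lock = threading.Lock()

    def read(self, addr):
        with self._lock:
            return self._cells[addr]

    def atomic_update(self, addr, fn):
        """Read the cell, apply fn, and write the result back as one indivisible step."""
        with self._lock:
            old = self._cells[addr]
            self._cells[addr] = fn(old)
            return old


def run_demo(num_threads=8, increments=1000):
    """Have several threads increment the same cell through the controller."""
    ctrl = MemoryBlockController(size=1)

    def worker():
        for _ in range(increments):
            # The thread only states the operation; the controller resolves conflicts.
            ctrl.atomic_update(0, lambda v: v + 1)

    threads = [threading.Thread(target=worker) for _ in range(num_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return ctrl.read(0)
```

Because every update passes through the controller, the final cell value equals the total number of increments issued, regardless of how the threads interleave.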
