Abstract

Reconfigurable units such as Field Programmable Gate Arrays (FPGAs) are widely used as high-performance hardware accelerators for a variety of applications and offer a promising solution to the power bottleneck of current multi-core processors. One emerging trend in hardware acceleration is to build a specialized network that supports inter-accelerator communication, with the goal of increasing acceleration throughput by sharing functions and avoiding frequent reconfiguration. On accelerator networks, data packets are routed in a content-based rather than an address-based manner: the destinations are determined by the acceleration task, so any node that supports the current acceleration step can be a receiver. Designing an effective and efficient routing algorithm that supports content-based routing therefore becomes an important problem. Adopting shortest path routing for each acceleration task is a standard approach. While the Breadth-First Search (BFS) algorithm can be applied, a naive implementation is computationally unaffordable. We propose a new routing algorithm, called the Shortest Cycle Routing Algorithm, that addresses the computational inefficiency of standard BFS. We design a branch-and-bound method that effectively prunes the search tree without compromising path optimality. The time and space complexities of searching for the shortest cycles of an acceleration task of k steps improve from O(n^k) to O(1), where n is the number of accelerators in the network. We analyze the locality and path diversity properties of shortest cycle routing and show how they can be used to develop adaptive routing strategies, restrict global flooding to a local neighborhood, and reduce the number of path cycles.
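
To make the content-based routing problem concrete, the following is a minimal Python sketch, not the paper's branch-and-bound algorithm. It assumes an undirected network given as an adjacency dictionary, hop count as the path cost, and a hypothetical `supports` map from each node to the functions it offers. It computes the cost of a shortest cycle (source, one supporting node per step in order, back to source) with a simple layered dynamic program over BFS hop distances, which avoids enumerating every candidate sequence of nodes. All names here (`hop_distances`, `shortest_cycle_cost`, `supports`, the function labels) are illustrative assumptions.

```python
from collections import deque


def hop_distances(adj, start):
    """Plain BFS: hop count from start to every reachable node."""
    dist = {start: 0}
    queue = deque([start])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist


def shortest_cycle_cost(adj, supports, source, task):
    """Cost of a shortest cycle that leaves source, visits one node
    supporting each step of task in order, and returns to source.

    Illustrative layered dynamic program (an assumption of this sketch,
    not the branch-and-bound method described in the abstract).
    Returns float('inf') if no such cycle exists.
    """
    inf = float("inf")
    # All-pairs hop distances, one BFS per node.
    dist = {u: hop_distances(adj, u) for u in adj}
    # best[v] = cheapest cost of finishing the steps so far at node v.
    best = {source: 0}
    for step in task:
        nxt = {}
        for v in adj:
            if step not in supports.get(v, set()):
                continue  # v cannot execute this acceleration step
            cost = min(
                (c + dist[u].get(v, inf) for u, c in best.items()),
                default=inf,
            )
            if cost < inf:
                nxt[v] = cost
        best = nxt
    # Close the cycle by returning to the source.
    return min(
        (c + dist[u].get(source, inf) for u, c in best.items()),
        default=inf,
    )


# Hypothetical 4-node ring: 0 - 1 - 2 - 3 - 0.
adj = {0: [1, 3], 1: [0, 2], 2: [1, 3], 3: [2, 0]}
supports = {1: {"fft"}, 2: {"aes"}, 3: {"fft"}}
print(shortest_cycle_cost(adj, supports, 0, ["fft", "aes"]))  # prints 4
```

In this toy ring, either node 1 or node 3 can serve the first step at the same total cost, a small instance of the path diversity mentioned in the abstract.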
