Abstract
Network-on-Chip (NoC) interconnect fabrics are categorized according to trade-offs among latency, throughput, speed, and silicon area, and the correctness and performance of these fabrics in Field-Programmable Gate Array (FPGA) applications are assessed through experimentation and simulation. In this paper, we propose a consistent parametric method for evaluating the FPGA performance of three common on-chip interconnect architectures namely, the Mesh, Torus and Fat-tree architectures. We also investigate how NoC architectures are affected by interconnect and routing parameters, and demonstrate their flexibility and performance through FPGA synthesis and testing of 392 different NoC configurations. In this process, we found that the Flit Data Width (FDW) and Flit Buffer Depth (FBD) parameters have the heaviest impact on FPGA resources, and that these parameters, along with the number of Virtual Channels (VCs), significantly affect reassembly buffering and routing and logic requirements at NoC endpoints. Applying our evaluation technique to a detailed and flexible cycle accurate simulation, we drive the three NoC architectures using benign (Nearest Neighbor and Uniform) and adversarial (Tornado and Random Permutation) traffic patterns with different numbers of VCs, producing a set of load–delay curves. The results show that by strategically tuning the router and interconnect parameters, the Fat-tree network produces the best utilization of FPGA resources in terms of silicon area, clock frequency, critical path delays, network cost, saturation throughput, and latency, whereas the Mesh and Torus networks showed comparatively high resource costs and poor performance under adversarial traffic patterns. From our findings it is clear that the Fat-tree network proved to be more efficient in terms of FPGA resource utilization and is compliant with the current Xilinx FPGA devices. This approach will assist engineers and architects in establishing an early decision in the choice of right interconnects and router parameters for large and complex NoCs. We demonstrate that our approach substantially improves performance under a large variety of experimentation and simulation which confirm its suitability for real systems.
Published Version
Talk to us
Join us for a 30 min session where you can share your feedback and ask us any queries you have
Disclaimer: All third-party content on this website/platform is and will remain the property of their respective owners and is provided on "as is" basis without any warranties, express or implied. Use of third-party content does not indicate any affiliation, sponsorship with or endorsement by them. Any references to third-party content is to identify the corresponding services and shall be considered fair use under The CopyrightLaw.