Abstract

Specialized accelerator architectures have achieved great success in processor design and are a major trend in computer architecture development. However, because the memory access patterns of accelerators are relatively complicated, their memory access performance is relatively poor, limiting the overall performance improvement of hardware accelerators. Moreover, memory controllers for hardware accelerators have been scarcely researched. We consider a specialized accelerator memory controller essential for improving memory access performance. To this end, we propose a dynamic random access memory (DRAM) controller called NNAMC for neural network accelerators, which monitors the accelerator's memory access stream and steers it to the bank with the optimal address mapping scheme based on the access characteristics. NNAMC includes a stream access prediction unit (SAPU) that analyzes in hardware the type of data stream accessed by the accelerator, and designs the address mapping for different banks using a bank partitioning model (BPM). The image mapping method and hardware architecture were analyzed on a practical neural network accelerator. In the experiments, NNAMC achieved significantly lower access latency than the competing address mapping schemes: it increased the row buffer hit ratio by 13.68% on average (up to 26.17%), reduced the system access latency by 26.3% on average (up to 37.68%), and lowered the hardware cost. In addition, we confirmed that NNAMC adapts efficiently to different network parameters.

Highlights

  • In modern computer architectures, the main memory for hardware accelerators is dynamic random-access memory (DRAM), which offers high density and low cost

  • Each calculation result Dif_addr is stored in the signed address difference memory Dif_ram. This memory is composed of multiple address storage units, and its depth depends on the local kernel parameter of the neural network, which is found in the parameter reference table (PRT) entry
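The highlight above can be sketched in software. The following is a minimal, hypothetical model of Dif_ram: it stores the signed differences between consecutive access addresses, with a depth derived from the kernel parameter (the function and variable names are illustrative, not from the paper).

```python
def address_differences(addresses, kernel_size):
    """Model of Dif_ram contents: signed differences between
    consecutive memory access addresses.

    The depth of the modeled memory equals the local kernel
    parameter (e.g. a 3x3 kernel gives a depth of 9), as the
    PRT entry would specify in hardware.
    """
    depth = kernel_size * kernel_size
    # Signed difference of each access address from its predecessor
    diffs = [b - a for a, b in zip(addresses, addresses[1:])]
    # Dif_ram holds at most `depth` entries
    return diffs[:depth]


# A strided access stream produces a regular difference pattern,
# which is what a stream prediction unit could recognize.
print(address_differences([0, 4, 8, 100], 3))  # → [4, 4, 92]
```

Regular, repeating differences in this list are exactly the kind of signature a hardware stream predictor can match against known convolution access patterns.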

  • In the 27 test cases, NNAMC was compared with five other address mapping schemes (BRC, row–bank–column (RBC), BPBI, bit reversal, and minimalist open-page (MinOP)); the row cache hit rate increased by 43%, 12.46%, 17.90%, 11.32%, and 11.36%, respectively


Summary

Introduction

The main memory for hardware acceleration is dynamic random-access memory (DRAM), which offers high density and low cost. Memory performance is usually optimized through memory address mapping [1,2,3,4], memory access scheduling strategies [5,6,7], rearrangement of access data [8,9], and other methods that reduce row conflicts. NNAMC operates at the hardware level to improve memory system performance. During execution, it partitions the memory banks and applies different address mapping schemes to different banks, isolating the different access stream patterns. The memory access address sequence is predicted by the stream access prediction unit, and the optimal address mapping scheme is designed for each partitioned bank.
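The per-bank address mapping idea can be illustrated with a small sketch. The code below models two classic DRAM address decompositions, bank–row–column (BRC) and row–bank–column (RBC), and selects between them per stream, as NNAMC does per partitioned bank. The field widths and the selection rule are illustrative assumptions, not the paper's actual configuration.

```python
# Illustrative DRAM geometry (assumed, not from the paper):
# 2^14 rows, 2^2 banks, 2^10 columns per row.
ROW_BITS, BANK_BITS, COL_BITS = 14, 2, 10


def map_brc(addr):
    """Bank | row | column bit ordering."""
    col = addr & ((1 << COL_BITS) - 1)
    row = (addr >> COL_BITS) & ((1 << ROW_BITS) - 1)
    bank = (addr >> (COL_BITS + ROW_BITS)) & ((1 << BANK_BITS) - 1)
    return bank, row, col


def map_rbc(addr):
    """Row | bank | column bit ordering: consecutive column-sized
    blocks rotate across banks, spreading a streaming pattern."""
    col = addr & ((1 << COL_BITS) - 1)
    bank = (addr >> COL_BITS) & ((1 << BANK_BITS) - 1)
    row = (addr >> (COL_BITS + BANK_BITS)) & ((1 << ROW_BITS) - 1)
    return bank, row, col


def map_address(addr, stream_is_sequential):
    """Hypothetical per-stream scheme selection: sequential streams
    get RBC (bank-level parallelism), others get BRC (row locality)."""
    return map_rbc(addr) if stream_is_sequential else map_brc(addr)
```

For example, address `1 << 10` (the start of the second column block) lands in bank 1 under RBC but stays in bank 0, row 1 under BRC, showing how the two schemes trade bank parallelism against row locality.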

Background
Special Memory Controller
Bank Partitioning
Motivation
Address and Pixel Transaction
Motivation—Memory Access of CNN Accelerator
Experimental Platform
Hardware Resource Utilization
Conclusions