Abstract
Convolutional and deep neural networks are prevalent machine learning algorithms in real-world applications. Because neural networks require a large amount of computation, many artificial intelligence (AI) chips have been designed to accelerate it. AI chips achieve better energy efficiency and higher computational capacity when implementing neural networks. The communication network in an AI chip influences both data transfer and hardware efficiency. The network-on-chip (NoC) is one feasible solution for meeting the data communication requirements of AI chips. This paper introduces the communication network in AI chips and the strategy for mapping neural networks onto chips with an extensible hierarchical architecture. We also summarize the opportunities for communication optimization in the design of AI chips. We then propose our processor architecture and optimize the performance and energy of on-chip communication from three aspects: data reuse, topology, and router architecture. The experimental results show that, in total, our optimizations achieve a $25.31\times$ latency reduction and $79.92\times$ lower energy than the baseline. Compared with the state-of-the-art design DaDianNao, our design reduces latency by $5.47\times$ and communication energy by $7.5\times$. Compared with another design, Eyeriss, it reduces latency by $7.57\times$ and communication energy by $3.03\times$.