This paper proposes a high-performance vector processor, named TS800, built on a high-performance embedded core. The TS800 is a 4-core processor based on the RISC-V architecture; it implements the IMAFDV instruction set and supports an L2 cache, branch prediction, a sequential pipeline, and a dual-issue structure. Traditional CPUs mainly support scalar computation, or support only vector computation. Applications such as image and signal processing, however, involve a large number of data-parallel operations. To address the low performance of data-parallel computation in industrial power applications, we propose adding a hardware vector processing unit (VPU) to the TS800. With it, the TS800 can support the FFT algorithm and meet the underlying algorithmic requirements of adaptive controllers, reinforcement learning, and learning-based methods. This paper presents the hardware implementation of the VPU module, including its submodules and the data flow between each processing unit and the control circuit. Large-area units such as floating-point arithmetic, multiplication, and division are shared with the scalar operators in the CPU, while the control circuit, whose area is small, is placed in the VPU-ALU. Arithmetic and logic instructions, shift instructions, comparison instructions, and permutation instructions are implemented in the VPU-ALU, which reduces the overall design area and improves performance. FIR, FFT, convolution, matrix, Signal Converge, and variance tests show that, for the same program, the running time of the CPU with only the scalar unit is 1.44 to 9.55 times that of the CPU with the vector module, so the design can support the underlying algorithms of the adaptive controller.