Abstract

To address the low hardware-resource utilization and limited data processing capability of vector processors, this paper applies data compression and vectorization to implement matrix multiplication on the HXDSP platform, using the DCT algorithm in HEVC as the test case. The method makes full use of the DSP's hardware resources. Experimental results show that it reaches 32 GMACS, the peak multiply-accumulate capability of the HXDSP, and a data processing rate of 2 Gpixel/s, which meets the performance requirements of the HEVC coding standard and provides a reference for hardware implementations of HEVC.

Highlights

  • In the field of digital signal processing, matrix multiplication is an important operation

  • To achieve the highest computing performance on a given hardware platform, a matrix multiplication implementation must be tailored to the platform's architecture and its hardware resources

  • Because the vector processing unit in the HXDSP1041 contains a set of dedicated multiply-accumulate registers that can replace adders, matrix multiplication can be implemented using only multipliers


Summary

INTRODUCTION

In the field of digital signal processing, matrix multiplication is an important operation. To achieve the highest computing performance on a given hardware platform, a matrix multiplication implementation must be tailored to the platform's architecture and its hardware resources. To improve the matrix computing ability of DSPs, the 38th Institute of China Electronics Technology Group designed a vector processor, the HXDSP1041, by adjusting the architecture of the BWDSP100. Drawing on the software and hardware features of the HXDSP1041, this paper adopts a matrix multiplication method based on data compression and vectorization. The method makes full use of the computing and data storage resources of the HXDSP1041, increasing both the amount of data processed and the processing speed. Using the DCT algorithm from HEVC, the latest video coding standard, as the test case, this paper verifies the effectiveness of the proposed method on the HXDSP1041.
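As a rough illustration of why multiply-accumulate throughput dominates this workload, the sketch below multiplies the HEVC 4x4 core transform (integer DCT) matrix by a 4x4 block in plain scalar C. Each output element is built up in a single accumulator, which stands in for a multiply-accumulate register. The function name `matmul4_mac` and the flat row-major layout are assumptions made for this example; it does not use HXDSP1041 instructions and omits the paper's data compression and vectorization steps, as well as HEVC's intermediate scaling and rounding.

```c
#include <stdint.h>

/* HEVC 4x4 core transform (integer DCT) matrix, stored row-major. */
static const int16_t DCT4[16] = {
    64,  64,  64,  64,
    83,  36, -36, -83,
    64, -64, -64,  64,
    36, -83,  83, -36,
};

/*
 * out = a * b for 4x4 row-major matrices.
 * Hypothetical helper: every output element is produced by four
 * multiply-accumulate steps into one accumulator, so no separate
 * adder stage is needed beyond the accumulation itself.
 */
static void matmul4_mac(const int16_t *a, const int16_t *b, int32_t *out)
{
    for (int i = 0; i < 4; i++) {
        for (int j = 0; j < 4; j++) {
            int32_t acc = 0;  /* scalar stand-in for a MAC register */
            for (int k = 0; k < 4; k++) {
                acc += (int32_t)a[i * 4 + k] * (int32_t)b[k * 4 + j];
            }
            out[i * 4 + j] = acc;
        }
    }
}
```

A 4x4 product costs 64 multiply-accumulate operations in this form, and a two-dimensional transform applies two such products per block, so the multiply-accumulate rate of the processor sets the ceiling on pixel throughput.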

HXDSP AND ITS FEATURES
Instruction parallelism
Loop Unrolling
Software pipelining
Modulo-16 memory access
MATRIX MULTIPLICATION
Computational complexity analysis of DCT in HEVC
Realization of Data Compression and Vector Matrix Multiplication
EXPERIMENTAL RESULTS AND PERFORMANCE ANALYSIS
CONCLUSION