Abstract

This paper proposes new processor architecture for accelerating data-parallel applications based on the combination of VLIW and vector processing paradigms. It uses VLIW architecture for processing multiple independent scalar instructions concurrently on parallel execution units. Data parallelism is expressed by vector ISA and processed on the same parallel execution units of the VLIW architecture. The proposed processor, which is called VLIW, has unified register file of 64x32-bit registers in the decode stage for storing scalar/vector data. VLIW can issue up to four scalar/vector operations in each cycle for parallel processing a set of operands and producing up to four results. However, it cannot issue more than one memory operation at a time, which loads/stores 128-bit scalar/vector data from/to data cache.. The complete design of our proposed VLIW processor is implemented using Verilog. our proposed VLIW processor is implemented using Verilog targeting the Xilinx FPGA Virtex-5, XC5VLX110T-3FF1136 device. The required numbers of slice registers and LUTs are 20292 and 24214 out of 28800 respectively.

Full Text
Published version (Free)

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call