Abstract

This letter presents a graphics processing unit (GPU)-based multi-codeword decoder for non-binary low-density parity-check (LDPC) codes, with optimizations for both kernel execution and data transfer. A novel multi-codeword data structure and its corresponding parallelism are proposed to speed up Compute Unified Device Architecture (CUDA) kernel execution. Moreover, practical methods of hiding data transfer latency are presented to improve data transfer efficiency. Experimental results demonstrate that the proposed decoder achieves throughput speedups ranging from $3.12\times$ to $185\times$ over existing GPU-based works across various Galois fields.

