Abstract
Turbo codes comprising a parallel concatenation of upper and lower convolutional codes are widely employed in state-of-the-art wireless communication standards, since they facilitate transmission throughputs that closely approach the channel capacity. However, this necessitates high processing throughputs in order for the turbo code to support real-time communications. In state-of-the-art turbo code implementations, the processing throughput is typically limited by the data dependencies that occur within the forward and backward recursions of the Log-BCJR algorithm, which is employed during turbo decoding. In contrast to the highly serial Log-BCJR turbo decoder, we have recently proposed a novel fully parallel turbo decoder (FPTD) algorithm, which can eliminate the data dependencies and perform fully parallel processing. In this paper, we propose an optimized FPTD algorithm, which reformulates the operation of the FPTD algorithm so that the upper and lower decoders have identical operation, in order to support single instruction multiple data (SIMD) operation. This allows us to develop a novel general purpose graphics processing unit (GPGPU) implementation of the FPTD, which has application in software-defined radios and virtualized cloud-radio access networks. As a benefit of its higher degree of parallelism, we show that our FPTD achieves a processing throughput that is between 2.3 and 9.2 times higher than that of the Log-BCJR turbo decoder, when employing a high-specification GPGPU. However, this is achieved at the cost of a moderate increase in the overall complexity, by a factor of between 1.7 and 3.3.
Highlights
Channel coding has become an essential component in wireless communications, since it is capable of correcting the transmission errors that occur when communicating over noisy channels.
1) We propose a beneficial enhancement of the Fully Parallel Turbo Decoder (FPTD) algorithm of [22], so that it supports Single Instruction Multiple Data (SIMD) operation and becomes better suited for implementation on a General Purpose Graphics Processing Unit (GPGPU).
4) We show that when used for implementing the LTE turbo decoder, the proposed SIMD FPTD achieves a degree of parallelism that is between 4 and 24 times higher, representing a processing throughput improvement of between 2.3 and 9.2 times, as well as a latency reduction of between 2 and 8.2 times.
Summary
Channel coding has become an essential component in wireless communications, since it is capable of correcting the transmission errors that occur when communicating over noisy channels. We previously proposed a Fully Parallel Turbo Decoder (FPTD) algorithm [22], which dispenses with the serial data dependencies of the conventional Log-BCJR turbo decoder algorithm. This enables every bit in a frame to be processed concurrently, achieving a much higher degree of parallelism than previously demonstrated in the literature. We show that when used for implementing the LTE turbo decoder, the proposed SIMD FPTD achieves a degree of parallelism that is between 4 and 24 times higher, representing a processing throughput improvement of between 2.3 and 9.2 times, as well as a latency reduction of between 2 and 8.2 times. This is achieved at the cost of increasing the overall complexity by a factor of between 1.7 and 3.3.
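To make the serial data dependency mentioned above concrete, the following is a minimal sketch in Python. It is not the FPTD algorithm of [22] and not the authors' GPGPU code. It assumes a toy trellis with arbitrary branch metrics and uses the max-log approximation for the forward recursion. The function names `serial_forward` and `parallel_forward`, and the iteration schedule in the second function, are hypothetical illustrations of how updating every trellis stage at once from the previous iteration's values removes the stage-to-stage dependency.

```python
import random

NEG_INF = float("-inf")


def serial_forward(gamma, num_states):
    """Max-log forward recursion: stage k needs the finished result of stage k-1."""
    n = len(gamma)
    alpha = [[NEG_INF] * num_states for _ in range(n + 1)]
    alpha[0][0] = 0.0  # assumption: the trellis starts in state 0
    for k in range(1, n + 1):  # inherently serial loop over trellis stages
        for s_next in range(num_states):
            alpha[k][s_next] = max(
                alpha[k - 1][s] + gamma[k - 1][s][s_next]
                for s in range(num_states)
            )
    return alpha


def parallel_forward(gamma, num_states, iterations):
    """Hypothetical parallel schedule: every stage is updated in each iteration,
    reading only the values its neighbour held in the previous iteration."""
    n = len(gamma)
    alpha = [[0.0] * num_states for _ in range(n + 1)]  # uninformative start
    alpha[0] = [0.0] + [NEG_INF] * (num_states - 1)     # known initial state
    for _ in range(iterations):
        prev = [row[:] for row in alpha]  # snapshot of the previous iteration
        # The updates below do not depend on each other, so on a GPGPU each
        # stage k could be handled by its own thread.
        for k in range(1, n + 1):
            for s_next in range(num_states):
                alpha[k][s_next] = max(
                    prev[k - 1][s] + gamma[k - 1][s][s_next]
                    for s in range(num_states)
                )
    return alpha


if __name__ == "__main__":
    rng = random.Random(0)
    states, stages = 4, 6
    # Arbitrary branch metrics gamma[k][s][s_next], for illustration only.
    gamma = [[[rng.uniform(-1, 1) for _ in range(states)]
              for _ in range(states)] for _ in range(stages)]
    print(parallel_forward(gamma, states, stages) == serial_forward(gamma, states))
```

In this toy setting the parallel schedule needs as many iterations as there are stages before it reproduces the serial result, so it performs more total work while each iteration is fully parallel. This loosely echoes the trade-off reported above between a higher degree of parallelism and a higher overall complexity, although the actual figures depend on the real algorithm and hardware.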