Abstract

The cyclic redundancy check (CRC) is one of the most important error-detecting codes used in digital communication and storage systems. A number of algorithms based on parallel table lookups, which can process 32 or 64 bits at a time, have been proposed to accelerate CRC generation. In recent years, parallel CRC algorithms for multi-processor architectures have become popular, further increasing processing speed. However, there is still considerable room to reduce the synchronization and recombination overheads when processing large messages. In this paper, we propose a coarse-grained parallel CRC algorithm for efficient implementation on an n-core processor. Our algorithm can be easily combined with existing fine-grained CRC methods to obtain high performance. Synchronization and recombination in the proposed algorithm are deferred and performed only once, which minimizes the overhead cost. The evaluation results demonstrate that the proposed algorithm can achieve a speedup of almost a factor of n.
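
To make the idea of coarse-grained splitting with a single deferred recombination concrete, the following is a minimal Python sketch. It is not the algorithm proposed in the paper, whose details the abstract does not give. It assumes the standard CRC-32 computed by `zlib.crc32` and uses the well-known combination identity (the same one behind zlib's `crc32_combine`): the CRC of a concatenation A||B equals the CRC of A multiplied by x^(8·|B|) modulo the generator polynomial, XORed with the CRC of B. The names `mulmod`, `x_pow_mod`, `crc32_combine`, and `parallel_crc32` are illustrative choices, not names taken from the paper.

```python
import zlib
from concurrent.futures import ThreadPoolExecutor

POLY = 0xEDB88320  # CRC-32 generator polynomial, bit-reflected form


def mulmod(a: int, b: int) -> int:
    """Multiply two GF(2) polynomials modulo POLY.

    Reflected representation: bit 31 holds the x^0 coefficient.
    """
    product = 0
    mask = 1 << 31
    while mask:
        if a & mask:
            product ^= b
        mask >>= 1
        # Multiply b by x, reducing modulo POLY when a term overflows.
        b = (b >> 1) ^ POLY if b & 1 else b >> 1
    return product


def x_pow_mod(nbits: int) -> int:
    """Return x^nbits modulo POLY using square-and-multiply."""
    result = 1 << 31  # x^0
    base = 1 << 30    # x^1
    while nbits:
        if nbits & 1:
            result = mulmod(result, base)
        base = mulmod(base, base)
        nbits >>= 1
    return result


def crc32_combine(crc1: int, crc2: int, len2: int) -> int:
    """CRC-32 of A||B from crc32(A), crc32(B), and len(B) in bytes."""
    return mulmod(x_pow_mod(8 * len2), crc1) ^ crc2


def parallel_crc32(data: bytes, n: int) -> int:
    """Split data into at most n chunks, CRC each chunk independently,
    then recombine the partial results once at the end."""
    size = max(1, -(-len(data) // n))  # ceiling division, at least 1
    chunks = [data[i:i + size] for i in range(0, len(data), size)]
    with ThreadPoolExecutor(max_workers=n) as pool:
        partials = list(pool.map(zlib.crc32, chunks))
    crc = 0  # CRC-32 of the empty message
    for chunk, partial in zip(chunks, partials):
        crc = crc32_combine(crc, partial, len(chunk))
    return crc
```

In this sketch the workers never communicate while computing their partial CRCs; the only synchronization point is the final fold, whose cost grows with the number of chunks and the logarithm of the chunk length rather than with the message size. Each per-chunk call could be replaced by any faster fine-grained CRC routine without changing the recombination step.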
