Abstract

In this paper, we propose an implementation of large integer multiplication using Single Instruction Multiple Data (SIMD) instructions. We evaluated the implementation on an Intel Xeon Phi processor. The second generation Intel Xeon Phi processor, Knights Landing, has a set of Advanced Vector Extensions-512 (AVX-512) instructions. Using AVX-512, the processor can handle 512 bits at the same time and has the potential to multiply faster than a processor using Streaming SIMD Extensions (SSE) and AVX. Therefore, we applied AVX-512F (foundation) instructions to the program. In the multiplication of large integers, as the number of digits increases, various processing costs also become larger. One of these costs is carry processing. Therefore, we implemented a multiplication function using a reduced-radix representation and compared the execution time and the number of instructions against the GNU Multiple Precision Arithmetic Library (GMP). Furthermore, we used some optimization techniques for this kernel. We successfully achieved an execution time that was approximately 2.5x faster than GMP on the Knights Landing architecture.

Full Text
Published version (Free)

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call