Outperforming Sequential Full-Word Long Addition With Parallelization and Vectorization

Andrey Chusov

doi:10.1109/tpds.2022.3211937

Abstract

This article presents algorithms for parallel and vectorized full-word addition of big unsigned integers with carry propagation. Because carries create data dependencies between the digits of the operands, software parallelization and vectorization of non-polynomial addition of big integers have long been considered impractical. The presented algorithms are based on parallel and vectorized detection of carry origins within elements of vector operands, masking of the bits that correspond to those elements, and subsequent scalar addition of the resulting integers. The acquired bits can then be used to adjust the sum using the proposed generalization of the Kogge-Stone method. Essentially, the article formalizes and experimentally verifies parallel and vectorized implementations of carry-lookahead adders applied at arbitrary granularity of data. This approach is noticeably beneficial for manycore, CUDA, and vectorized implementations using AVX-512 with masked instructions. Experiments show that the parallel and vectorized implementations of the proposed algorithms can be several times faster than a sequential ripple-carry adder or adders based on redundant number systems, such as the one used in the GNU Multiple Precision library.
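
The mask-and-add idea described in the abstract can be illustrated with a minimal sketch. This is not the author's implementation and does not reproduce the proposed Kogge-Stone generalization. It is a scalar Python emulation under stated assumptions: 64-bit limbs stored in little-endian order, and hypothetical names (`add_limbs`, `to_limbs`, `from_limbs`, `generate`, `propagate`) chosen for illustration only. Each limb is added independently, two bit masks record which limbs originate a carry and which would pass one along, and a single scalar addition on those masks resolves the whole carry chain.

```python
WORD_BITS = 64                      # assumed limb width
WORD_MASK = (1 << WORD_BITS) - 1


def to_limbs(value, n):
    """Split a non-negative int into n little-endian 64-bit limbs."""
    return [(value >> (WORD_BITS * i)) & WORD_MASK for i in range(n)]


def from_limbs(limbs):
    """Inverse of to_limbs."""
    return sum(limb << (WORD_BITS * i) for i, limb in enumerate(limbs))


def add_limbs(a, b, carry_in=0):
    """Add two equal-length limb lists; return (sum_limbs, carry_out)."""
    n = len(a)
    assert len(b) == n and carry_in in (0, 1)

    # Step 1: independent per-limb sums (the part that can run in
    # parallel lanes or threads, since no limb looks at its neighbour).
    sums = [(x + y) & WORD_MASK for x, y in zip(a, b)]

    # Step 2: build one bit per limb.
    #   generate:  the limb sum overflowed, so it originates a carry.
    #   propagate: the limb sum is all ones, so an incoming carry
    #              would pass through it.
    generate = 0
    propagate = 0
    for i, (s, x) in enumerate(zip(sums, a)):
        if s < x:                   # wrapped sum is smaller => overflow
            generate |= 1 << i
        if s == WORD_MASK:
            propagate |= 1 << i

    # Step 3: one scalar addition on the masks ripples every carry
    # through runs of propagating limbs at once.
    incoming = (generate << 1) | carry_in
    t = incoming + propagate
    carry_into = t ^ propagate      # bit i set => limb i receives a carry

    # Step 4: adjust each limb by its carry bit (again independent).
    result = [(s + ((carry_into >> i) & 1)) & WORD_MASK
              for i, s in enumerate(sums)]
    carry_out = (carry_into >> n) & 1
    return result, carry_out
```

For example, `add_limbs(to_limbs(2**256 - 1, 4), to_limbs(1, 4))` yields four zero limbs and a carry-out of 1, because the single carry entering limb 0 passes through all four propagating limbs in one mask addition. In a vectorized setting the masks would be as wide as the number of lanes, so they fit in an ordinary register; Python's unbounded integers hide that limit here.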

Journal: IEEE Transactions on Parallel and Distributed Systems
Publication Date: Dec 1, 2022
License type: CC BY
