Multiple Implementations Research Articles

Zero-knowledge proof (ZKP) is an attractive cryptographic paradigm that allows a party to prove the correctness of a given statement without revealing any additional information. It offers both computational integrity and privacy and has seen many celebrated deployments, such as computation outsourcing and cryptocurrencies. Recent general-purpose ZKP schemes, e.g., the zero-knowledge succinct non-interactive argument of knowledge (zk-SNARK), suffer from time-consuming proof generation, which is mainly bottlenecked by the large-scale number theoretic transform (NTT) and multi-scalar point multiplication (MSM). To broaden their application, great interest has been shown in expediting proof generation on various platforms such as GPU, FPGA and ASIC. As far as we know, current hardware designs for ZKP employ two separate data paths for NTT and MSM, overlooking the potential for resource reuse. In this work, we explore the feasibility and benefit of implementing both NTT and MSM with a unified, high-performance hardware architecture. For the crucial operator design, we propose a dual-precision, load-balanced and fully-pipelined Montgomery multiplier (LBFP MM) by introducing a new mixed-radix technique and improving the prior quotient-decoupled strategy. We also integrate orthogonal ideas to further enhance the performance of LBFP MM, including customized constant multiplication, truncated LSB/MSB multiplication/addition and the Karatsuba technique. On top of that, we present a unified, scalable and high-performance hardware architecture that conducts both NTT and MSM in a versatile pipelined execution mechanism, intensively sharing the common computation and memory resources. The proposed accelerator overlaps on-chip computation with off-chip memory access, considerably reducing the overall cycle counts for NTT and MSM. We showcase the implementation of the modular multiplier and the overall architecture on the BLS12-381 elliptic curve for zk-SNARK. Extensive experiments are carried out under TSMC 28nm synthesis and similar simulation settings, and demonstrate impressive improvements: (1) the proposed LBFP MM obtains a 1.8x speed-up and 1.3x lower area cost versus the state-of-the-art design; (2) the unified accelerator achieves 12.1x and 5.8x acceleration for NTT and MSM, respectively, while also incurring 4.3x lower overall on-chip area overhead, compared with the most closely related and advanced work, PipeZK.
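For readers unfamiliar with the operator this abstract centers on, the following is a minimal software sketch of textbook Montgomery multiplication. It is not the LBFP MM design described above: it omits the mixed-radix, quotient-decoupled, truncated-multiplication and Karatsuba techniques entirely. The function names and the toy modulus 2**61 - 1 are assumptions chosen for illustration, not values from the paper.

```python
# Textbook Montgomery multiplication (illustrative sketch only).
# Assumes Python 3.8+ for pow(n, -1, r).

def montgomery_setup(n: int, k: int):
    """Precompute constants for an odd modulus n and R = 2**k > n."""
    r = 1 << k
    assert n % 2 == 1 and n < r
    n_prime = (-pow(n, -1, r)) % r  # satisfies n * n_prime == -1 (mod R)
    r2 = (r * r) % n                # R^2 mod n, used to enter Montgomery form
    return n_prime, r2

def redc(t: int, n: int, k: int, n_prime: int) -> int:
    """Montgomery reduction: returns t * R^-1 mod n for 0 <= t < n * R."""
    mask = (1 << k) - 1
    m = ((t & mask) * n_prime) & mask  # quotient digit, computed mod R
    u = (t + m * n) >> k               # exact division by R
    return u - n if u >= n else u      # single conditional subtraction

def mont_mul(a: int, b: int, n: int, k: int) -> int:
    """Compute a * b mod n via Montgomery form (a, b in [0, n))."""
    n_prime, r2 = montgomery_setup(n, k)
    a_m = redc(a * r2, n, k, n_prime)   # a * R mod n
    b_m = redc(b * r2, n, k, n_prime)   # b * R mod n
    c_m = redc(a_m * b_m, n, k, n_prime)  # a * b * R mod n
    return redc(c_m, n, k, n_prime)     # leave Montgomery form

if __name__ == "__main__":
    n = (1 << 61) - 1  # hypothetical toy modulus (a Mersenne prime)
    print(mont_mul(123456789, 987654321, n, 64) == (123456789 * 987654321) % n)
```

The point of the construction is that the reduction step uses only multiplications, a mask and a shift, with no division by the modulus, which is what makes it attractive for pipelined hardware.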

Digital signal processing (DSP) is an engineering field concerned with increasing the precision and dependability of digital communications and mathematical processes, including equalization, modulation, demodulation, compression, and decompression, which can be used to produce a signal of the highest quality. Among the circuits that execute vital DSP tasks, the multiplier plays an important role, continually multiplying two binary numbers. The multiplier is a crucial component in a wide range of DSP tasks, including convolution, Fourier transforms, discrete wavelet transforms (DWT), filtering and dithering, multimedia information processing, and more. A multiplier device includes clock and reset inputs for more flexible operational control. Each digital signal processor contains a multiplier unit, which functions entirely autonomously from the central processing unit (CPU); consequently, the CPU is burdened with significantly less work. Since DSP algorithms must constantly carry out multiplications, a high-speed multiplier is vital for fast filtering. Previous multipliers had many weaknesses, such as high energy consumption, low speed, and large area, because they were implemented in traditional technologies such as complementary metal-oxide semiconductor (CMOS) and very large-scale integration (VLSI). Nanotechnology directly affects the performance of the multiplier and can overcome these drawbacks. One alternative nanotechnology for designing digital circuits is quantum-dot cellular automata (QCA), which offers high speed, low area, and low power. Therefore, this manuscript suggests a quantum technology-based multiplier for DSP applications. In addition, some vital circuits, such as a half adder, a full adder, and a ripple carry adder (RCA), are suggested for designing the multiplier. Moreover, a systolic array, an accumulator, and a multiply-and-accumulate (MAC) unit are proposed based on the quantum technology-based multiplier. Each of the suggested structures has a coplanar configuration without rotated cells. The suggested structures are developed and verified using the QCADesigner 2.0.3 tool. The findings show that none of the circuits requires a complicated configuration, such as a higher number of quantum cells or higher latency, and that each achieves an optimal area.
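The way a multiplier and a MAC unit can be composed from a half adder, a full adder, and a ripple carry adder can be sketched at the logic level as follows. This is only a bit-level illustration under assumed names (`multiply`, `mac`, and the helpers are hypothetical); it uses a simple shift-and-add arrangement and says nothing about the QCA cell layout, clocking, or the specific multiplier structure used in the article.

```python
# Logic-level sketch: half adder -> full adder -> ripple carry adder
# -> shift-and-add multiplier -> multiply-and-accumulate (MAC).
# Bit lists are little-endian (index 0 is the least significant bit).

def half_adder(a, b):
    return a ^ b, a & b  # (sum, carry)

def full_adder(a, b, cin):
    s1, c1 = half_adder(a, b)
    s2, c2 = half_adder(s1, cin)
    return s2, c1 | c2   # (sum, carry out)

def ripple_carry_add(x_bits, y_bits):
    """Add two equal-length bit lists; returns len + 1 bits."""
    carry, out = 0, []
    for a, b in zip(x_bits, y_bits):
        s, carry = full_adder(a, b, carry)
        out.append(s)
    return out + [carry]

def to_bits(value, width):
    return [(value >> i) & 1 for i in range(width)]

def from_bits(bits):
    return sum(bit << i for i, bit in enumerate(bits))

def multiply(x, y, n):
    """Multiply two n-bit unsigned integers into a 2n-bit product."""
    width = 2 * n
    acc = [0] * width
    x_bits = to_bits(x, n)
    for i in range(n):
        y_i = (y >> i) & 1
        # Partial product: x AND-ed with bit i of y, shifted left by i.
        partial = [0] * i + [b & y_i for b in x_bits] + [0] * (width - n - i)
        acc = ripple_carry_add(acc, partial)[:width]  # carry out is always 0
    return from_bits(acc)

def mac(acc, x, y, n, acc_width):
    """One multiply-and-accumulate step, wrapping at acc_width bits."""
    product = to_bits(multiply(x, y, n), acc_width)
    total = ripple_carry_add(to_bits(acc, acc_width), product)
    return from_bits(total[:acc_width])

if __name__ == "__main__":
    print(multiply(13, 11, 4))  # 4-bit operands
    print(mac(10, 13, 11, 4, 16))
```

A hardware array or systolic multiplier would evaluate the partial products in parallel rather than in a loop; the loop here only makes the data dependency between the adders explicit.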

Articles published on Multiple Implementations

FPGA‐Based Resource‐Optimal Approximate Multiplier for Error‐Resilient Applications

A High-performance NTT/MSM Accelerator for Zero-knowledge Proof Using Load-balanced Fully-pipelined Montgomery Multiplier

Adaptive prairie dog optimization based variable length conditional counter for designing multiplier

A high speed pipelined radix-16 Booth multiplier architecture for FPGA implementation

Design and implementation of a nano-scale high-speed multiplier for signal processing applications

Design and Implementation of a Wide-Swing CMOS Multiplier for AC Source Signal Tracking and Modulation

16-Bit multiplier optimization based on Wallace tree and Booth algorithm

The Economic Essence of Multiple-Voting Shares and the Possibility of Their Implementation in the Russian Financial Market

Design and implementation of wallace tree multiplier and its applications in FIR filter

High-Performance Implementation of a 1024-bit Full-Word Montgomery Modular Multiplier for High-Speed Encryption Hardware Circuit

Speed, Power and Area Optimized Monotonic Asynchronous Array Multipliers

A novel reversible gate and optimised implementation of half adder, subtractor and 2-bit multiplier

High-performance unified modular multiplication algorithm and hardware architecture over G(2^m)

Design and Implementation of Hybrid Multiplier for DSP Applications

Design of integrated voltage multipliers using standard CMOS technologies

Design and Implementation of Area Efficient Low Latency Radix-8 Multiplier on FPGA

Implementation of Efficient Vedic Multiplier and Its Performance Evaluation

High-Throughput Polynomial Multiplier for Accelerating Saber on FPGA

Synthesis of Modular Multipliers

Design of Modified Booth’s Encoder Using SPST technique
