The authors propose a hardware accelerator for the proof-of-work (PoW) operation in the IOTA cryptocurrency. IOTA enables secure, authenticated, quantum-resistant channels between IoT devices for communication and fee-free micropayments. Hardware acceleration reduces transaction time and increases throughput. The article covers the basic theory of IOTA operations, including generating transactions and signing them with Winternitz one-time signatures, and describes the operating principle of Curl, a new ternary hash function. Winternitz one-time signatures and the ternary hash function make IOTA quantum resistant. The core of IOTA is called the Tangle and, unlike a blockchain, has a directed acyclic graph (DAG) structure. There are no miners in the Tangle; the IoT devices themselves maintain the network, which leads to unlimited scalability and the absence of fees. To add a new transaction to the Tangle, an IoT device must perform the PoW operation, which protects against spam and Sybil attacks: it iteratively computes the Curl hash of the IOTA transaction and changes the transaction's nonce field until the result satisfies the given criterion, namely a required number of consecutive zero ternary values (trits) at the end of the transaction hash. The software implementation of the Curl hash function is very slow, and the PoW operation on embedded devices can take up to 50 minutes, so hardware acceleration of PoW is a relevant task. In this work the authors created a hardware accelerator for the IOTA PoW operation and describe its structure and operating principle. The proof-of-concept implementation was run on a DE10-nano board, which is based on an Intel programmable logic chip. The proposed PoW hardware accelerator has a parameterizable structure: the number of PoW computing units can be set manually by changing a parameter value. In this parameterizable system, one PoW computing unit is the master and all remaining PoW units are slaves. 
The master PoW unit absorbs the IOTA transaction, except for the nonce part, into a midstate register using a sponge-like approach. Then all PoW computing units (master and slaves) preload their own state registers from the midstate, randomly change their individual nonces, and start an iterative search for a valid nonce. When one of the PoW computing units finds a valid nonce, the PoW operation ends, the nonce is stored in a destination buffer in SDRAM, and an interrupt is generated for the ARM CPU. The final implementation of the IOTA PoW hardware accelerator for the DE10-nano board contains 11 PoW computing units, delivers a hash rate of 13.2 MH/s, and gives a 1000x speedup over the software implementation from the IOTA developers, while using only 30% of the resources of the 5CSEBA6U23I7 programmable logic chip at a 100 MHz clock frequency. The average PoW computation time in this implementation is 0.8 seconds. Ref. 14, img. 6.
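
The nonce search described in the abstract can be illustrated with a minimal software sketch. It is not the authors' accelerator and does not implement Curl: as a labeled assumption, `toy_hash` substitutes SHA-256 mapped to 81 balanced trits, only so that the trailing-zero-trit criterion can be shown. All function names (`to_trits`, `toy_hash`, `trailing_zero_trits`, `find_nonce`, `cycles_per_hash`) are hypothetical.

```python
import hashlib


def to_trits(data: bytes, count: int) -> list[int]:
    """Map a byte string to `count` balanced trits (-1, 0, 1)."""
    n = int.from_bytes(data, "big")
    trits = []
    for _ in range(count):
        n, r = divmod(n, 3)
        trits.append(r - 1)  # shift {0, 1, 2} to {-1, 0, 1}
    return trits


def toy_hash(payload: bytes, nonce: int) -> list[int]:
    """Stand-in for Curl (assumption): SHA-256 reduced to 81 trits."""
    digest = hashlib.sha256(payload + nonce.to_bytes(8, "big")).digest()
    return to_trits(digest, 81)


def trailing_zero_trits(trits: list[int]) -> int:
    """Count consecutive zero trits at the end of the hash."""
    count = 0
    for t in reversed(trits):
        if t != 0:
            break
        count += 1
    return count


def find_nonce(payload: bytes, min_zero_trits: int, start: int = 0) -> int:
    """Change the nonce until the hash ends in enough zero trits."""
    nonce = start
    while trailing_zero_trits(toy_hash(payload, nonce)) < min_zero_trits:
        nonce += 1
    return nonce


def cycles_per_hash(clock_hz: float, total_hash_rate: float, units: int) -> float:
    """Clock cycles per hash per unit, assuming the rate splits evenly."""
    return clock_hz / (total_hash_rate / units)


if __name__ == "__main__":
    nonce = find_nonce(b"example transaction", min_zero_trits=4)
    print("valid nonce:", nonce)
    print("cycles per hash:", cycles_per_hash(100e6, 13.2e6, 11))
```

Each additional required zero trit multiplies the expected number of attempts by three, which is why the search is a natural target for parallel units that start from different nonces. Using the figures quoted above, and assuming the 13.2 MH/s is shared evenly by the 11 units, each unit produces about 1.2 MH/s, or roughly 83 clock cycles per hash at 100 MHz; an average of 0.8 seconds then corresponds to about 10.6 million hashes per PoW.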