The Montgomery ladder and Joye ladder are well-known algorithms for elliptic curve scalar multiplication with a regular structure. The Montgomery ladder is best known for its implementation on Montgomery curves, which requires 5M+4S+1m+8A per scalar bit, and 6 field registers. Here (M, S,m,A) represent respectively field Multiplications, Squarings, multiplications by a curve constant, and Additions or subtractions. This ladder is also complete, meaning that it works on all input points and all scalars. Many protocols do not use Montgomery curves, but instead use prime-order curves in short Weierstrass form. These have historically been much slower, with ladders costing at least 14 multiplications or squarings per bit: 8M + 6S + 27A for the Montgomery ladder and 8M+ 6S + 30A for the Joye ladder. In 2017, Kim et al. improved the Montgomery ladder to 8M+ 4S + 12A + 1H per bit using 9 registers, where the H represents a halving. Hamburg simplified Kim et al.’s formulas to 8M+ 4S + 8A + 1H per bit using 6 registers. Here we present improved formulas which compute the Montgomery ladder on short Weierstrass curves using 8M+ 3S + 7A per bit, and requiring 6 registers. We also give formulas for the Joye ladder that use 9M+3S+7A per bit, requiring 5 registers. One of our new formulas supports very efficient 4-way vectorization. We also discuss curve invariants, exceptional points, side-channel protection and how to set up and finish these ladder operations. Finally, we show a novel technique to make these ladders complete when the curve order is not divisible by 2 or 3, at a modest increase in cost. A sample implementation of these techniques is given in the supplementary material, also posted at https://github.com/bitwiseshiftleft/ladder_formulas
Read full abstract