Montgomery Ladder in ECC
This is a technique to efficiently calculate the x coordinate $\bold{x}(nP)$ given the x coordinate of a point on a Montgomery curve $\bold{x}(P)$. This method is also easily implemented in a constant-time way.
This works for curves:
$$ By^2 = x^3 + Ax^2 + x $$
Satisfying $B(A^2 - 4) \neq 0$.
In Summary
The ladder itself
func ScalarMul(n Scalar, x Field) Field {
nP := Projective{1, 0}
n1P := Projective{x, 1}
swap := 0
for b := range n.msb .. n.lsb {
swap ^= b
condSwap(swap, nP, n1P)
swap = b
nP, n1P = step0(x, nP, n1P)
}
condSwap(swap, nP, n1P)
return nP.X / nP.Z
}
Implementing the individual step:
// In field elements of course
const A24 = (A - 2) / 4
func step0(x Field, nP, n1P Projective) (Projective, Projective) {
double_X := nP.X + nP.Z
double_Z := nP.X - nP.Z
// Addition
add_X := n1P.X + n1P.Z
add_Z := nP.X - nP.Z
add_X *= double_Z
add_Z *= double_X
tmp := add_X - add_Z
add_X += add_Z
add_Z = tmp
add_X.square()
add_Z.square()
add_Z *= x
// Now we can do doubling
double_X.square()
double_Z.square()
// Store for later
Xn_plus_Zn_squared := double_X
double_X *= double_Z
tmp = Xn_plus_Zn_squared - double_Z
double_Z = A24 * tmp
double_Z += Xn_plus_Zn_squared
double_Z *= tmp
return (
Projective{double_X, double_Z},
Projective{add_X, add_Z}
)
}
This could be optimized to reuse nP and n1P’s buffers better as well
Doubling Formula
Given a point $P = (x, y)$, the formula for $2P$ is given by:
$$ \begin{aligned} &x_2 = B \lambda^2 - A - 2x \cr &y_2 = \lambda (x - x_2) - y \end{aligned} $$
where:
$$ \lambda = \frac{3x^2 + 2Ax + 1}{2By} $$
(This formula is derived geometrically. $\lambda$ is the slope of the tangent line to the curve at the point $P$.)
Because $\lambda^2$ features $By^2$ in the denominator, we can replace that with $x^3 + Ax^2 + x$, and then calculate out the following identity:
$$ x_2 = \frac{(x^2 - 1)^2}{4(x^3 + Ax^2 + x)} $$
This formula calculates $\bold{x}(2P)$ given only $\bold{x}(P)$, and can be shown to be valid in other edge cases. See: [1].
Projective Form
If we use projective coordinates, then we have $x = X / Z$. Plugging this into the formula we had above, we get:
$$ x_2 = \frac{(X^2 - Z^2)^2}{4ZX(X + AXZ + Z^2)} $$
In other terms, this means we have:
$$ \begin{aligned} &X_2 = (X^2 - Z^2)^2 \cr &Z_2 = 4XZ(X + AXZ + Z^2) \end{aligned} $$
If we calculate this directly, a point doubling takes:
$$ 2 \bold{M} + 3 \bold{S} + 2 \bold{C} $$
We can re-arrange things to allow more sharing:
$$ \begin{aligned} &X_2 = (X + Z)^2(X - Z)^2 \cr &Z_2 = ((X + Z)^2 - (X - Z)^2)\left( (X + Z)^2 + \frac{A - 2}{4}((X + Z)^2 - (X - Z)^2) \right) \end{aligned} $$
If you note that:
$$ (X + Z)^2 - (X - Z)^2 = 4XZ $$
The correctness of this formula becomes relatively clear.
This adjusted formula takes:
$$ 2 \bold{M} + 2 \bold{S} + \bold{C} $$
Addition Formula
In can be shown that given $P$ and $Q$, we have:
$$ \bold{x}(P + Q)\bold{x}(Q - P) = \frac{ (\bold{x}(P)\bold{x}(Q) - 1)^2 }{ (\bold{x}(Q) - \bold{x}(P))^2 } $$
(This is a monstrous PITA, see [1].)
In projective coordinates $[X_P, Z_P],\ [X_Q, Z_Q]$, we get:
$$ \bold{x}(P + Q) \frac{X_{Q - P}}{Z_{Q - P}} = \frac{(X_P X_Q - Z_P Z_Q)^2}{(X_Q Z_P - X_P Z_Q)^2} $$
This gives us the following formulas for addition:
$$ \begin{aligned} &X_{P + Q} = Z_{Q - P}(X_P X_Q - Z_P Z_Q)^2 \cr &Z_{P + Q} = X_{Q - P}(X_Q Z_P - X_P Z_Q)^2 \end{aligned} $$
Unlike with doubling, we need to have calculated $P - Q$ prior. In practice, this constraint is easily accomodated.
This formula requires:
$$ 6 \bold{M} + 2 \bold{S} $$
To simplify it, first note that because of the properties of projective coordinates, multiplying both $X$ and $Z$ by $4$ gives us the same point:
$$ \begin{aligned} &X_{P + Q} = 4Z_{Q - P}(X_P X_Q - Z_P Z_Q)^2 \cr &Z_{P + Q} = 4X_{Q - P}(X_Q Z_P - X_P Z_Q)^2 \end{aligned} $$
A bit of toil gives us the following formula:
$$ \begin{aligned} &X_{P + Q} = Z_{Q - P}( (X_P - Z_P)(X_Q + Z_Q) + (X_P + Z_P)(X_Q - Z_Q) )^2 \cr &Z_{P + Q} = X_{Q - P}( (X_P - Z_P)(X_Q + Z_Q) - (X_P + Z_P)(X_Q - Z_Q) )^2 \cr \end{aligned} $$
This gives us an improved:
$$ 4 \bold{M} + 2 \bold{S} $$
In practice, we’ll always have $Z_{Q - P}$ as $1$, and $X_{Q - P}$ as a small constant, giving us:
$$ 2 \bold{M} + 2 \bold{S} + \bold{C} $$
Combined Formula
In practice, you have $(X_2, Z_2)$ and $(X_3, Z_3)$, and calculate $(X_4, Z_4)$ as well as $(X_5, Z_5)$ together, giving you the following combined formula:
(See: [1])
Ladder
One simple way to calculate $kP$ is as follows. Iterate over the bits of $k$ from most to least significant. If a bit $b = 0$, set:
$$ P \leftarrow 2P $$
If $b = 1$, then set:
$$ P \leftarrow 2P + P $$
Our addition formula works well when we have $P_n$ and $P_{n + 1}$, and want to calculate $P_{2n + 1}$, since the difference is $P_1 = P$, a point we already know.
At each iteration, as we rise up to $P_k$, our final value, we want to keep this property, and keep the successive values around.
This, when $b = 0$, we do:
$$ (P_n, P_{n + 1}) \mapsto (P_{2n}, P_{2n + 1}) $$
The first value comes from point doubling $P_n$, and the second from differential point addition, using the fact that the difference is $P_1$.
When $b = 1$, we want to do:
$$ (P_n, P_{n + 1}) \mapsto (P_{2n + 1}, P_{2n + 2}) $$
We can calculate the first value through point addition, and the second through point doubling.
In fact, if we define these two functions as $\text{step}_0$, and $\text{step}_1$ respectively, we notice that $\text{step}_1$ comes from swapping the input, and then the output. With:
$$ \text{swap}(P, Q) := (Q, P) $$
We have:
$$ \text{step}_1 = \text{swap};\text{step}_0;\text{swap} $$
Thus, an algorithm for doing our ladder would be:
for b in bits:
if b == 1:
swap(P, Q)
(P, Q) = step0(P, Q)
if b == 1:
swap(P, Q)
return P
We can make this “constant-time”, by replacing our branching with a conditional swap:
for b in bits:
cswap(b, P, Q)
(P, Q) = step0(P, Q)
cswap(b, P, Q)
return P
If we have two conditional swaps chained together
cswap(bits[i], P, Q)
cswap(bits[i - 1], P, Q)
Which happens between iterations, notice that this is effectively:
cswap(bits[i] ^ bits[i - 1], P, Q)
because of this, we can effectively do:
cswap(bits[N - 1], P, Q)
step0(P, Q)
cswap(bits[N - 1] ^ bits[N - 1], P, Q)
step0(P, Q)
...
cswap(bits[1] ^ bits[0], P, Q)
step0(P, Q)
cswap(bits[0], P, Q)
Noting that 0 ^ bits[N - 1] = bits[N - 1]
, we can effectively do:
swap := 0
for b in bits:
swap ^= b
cswap(swap, P, Q)
swap = b
step0(P, Q)
cswap(swap, P, Q)
return P
This allows us to efficiently compute whether or not we need to swap.
This is the gist of things, see the summary up top for more useable pseudo-code.
References
[1] Bernstein, Daniel J., Lange, Tanja - Montgomery Curves and the Montgomery Ladder (2017)