Henry de Valence

2017-03-25 19:43:52 UTC

Ed25519 performs the following operations on private scalars immediately

scalar[0] &= 248;

scalar[31] &= 63;

scalar[31] |= 64;

I've heard this referred to as "clamping" although that may not be the best

term.

These operations are not applied to the canonical scalar, i.e. the one

which is serialized and persisted as part of the keypair. Instead Ed25519

implementations generally flip these bits immediately prior to use, either

for signing or deriving the public key.

[...]

In these schemes, it's not possible to "clamp" the derived scalar

immediately prior to signing ("post-clamping" I guess?), as this would

result in a different public key (i.e. the math simply does not work out as

the groups are no longer commutative). Instead, if any clamping is to be

performed it must happen immediately to the parent scalar, and/or to any

scalars derived by both the public and private key holders in such a scheme.

Hi all,scalar[0] &= 248;

scalar[31] &= 63;

scalar[31] |= 64;

I've heard this referred to as "clamping" although that may not be the best

term.

These operations are not applied to the canonical scalar, i.e. the one

which is serialized and persisted as part of the keypair. Instead Ed25519

implementations generally flip these bits immediately prior to use, either

for signing or deriving the public key.

[...]

In these schemes, it's not possible to "clamp" the derived scalar

immediately prior to signing ("post-clamping" I guess?), as this would

result in a different public key (i.e. the math simply does not work out as

the groups are no longer commutative). Instead, if any clamping is to be

performed it must happen immediately to the parent scalar, and/or to any

scalars derived by both the public and private key holders in such a scheme.

this subject came up today at the Tor meeting in a discussion with Ian

Goldberg, George Kadianakis, Isis Lovecruft, and myself.

As Tony noted, using bit-twiddling to mask away the low bits of a scalar is

problematic because the bit-twiddling is not well-defined `(mod l)`. (Here `l`

is the order of the basepoint, so the full group has order `8*l`). This means

that the "clamping" is not compatible with any arithmetic operations on

scalars.

We (Ian, George, Isis, and myself) have the following suggestion.

Define a "torsion-safe representative" of `a \in Z` to be an integer `a' \in

Z` such that `a ≡ a' (mod l)` and `a' ≡ 0 (mod 8)`.

This means that `a B = a' B` for the basepoint `B`, but `a' T = id` for `T` a

torsion point, so accidentally multiplying by a torsion point can't leak

information. However, unlike "clamping", this operation is actually

well-defined, and leaves the scalar unchanged, in the sense that scalar

multiplication by `a` and by `a'` are the same on the prime-order subgroup.

How do we compute such a representative? Since (using the CRT)

k := 3l + 1 ≡ 1 (mod l)

≡ 0 (mod 8),

`k*a mod 8l` is a torsion-safe representative of `a`. Computing `k*a mod 8l`

directly is a problem because an implementation may only have arithmetic modulo

`l`, not `8l`. However,

k*a ≡ (3l+1)*a ≡ 3al + a (mod 8l),

so it's sufficient to compute `3al (mod 8l)` and add it to `a`. Since

3a mod 8 = 3a + 8n,

(3a mod 8)*l = 3al + 8ln ≡ 3al (mod 8l),

so it's sufficient to compute `3a mod 8` and use a lookup table of

[0, 1l, 2l, 3l, 4l, 5l, 6l, 7l]

to get `3al (mod 8l)`. Arranging the lookup table as

[0, 3l, 6l, 1l, 4l, 7l, 2l, 5l]

means that `a mod 8` indexes `3al (mod 8l)`. Therefore, computing a

torsion-safe representative for a scalar `a` just amounts to computing

a + permuted_lookup_table[a & 7]

in constant time. If the input `a` is reduced, so that `a < l`, then the

torsion-safe representative is bounded by `8l < 2^256` and therefore fits in 32

bytes.

This ends up being slightly more work than just bit-twiddling, but not by much,

and it's certainly insignificant compared to the cost of a scalar

multiplication. There's a prototype implementation here [0] in case anyone is

curious to see what it looks like.

Cheers,

Henry de Valence

[0]: https://github.com/hdevalence/curve25519-dalek/commit/2ae0bdb6df26a74ef46d4332b635c9f6290126c7

(subject to rebasing...)

The permuted lookup table is, explicitly:

sage: l = 2^252 + 27742317777372353535851937790883648493

sage: lookup = [((3 * i) % 8)*l for i in range(8)]

sage: lookup

[0,

21711016731996786641919559689128982722571349078139722818005852814856362752967,

43422033463993573283839119378257965445142698156279445636011705629712725505934,

7237005577332262213973186563042994240857116359379907606001950938285454250989,

28948022309329048855892746252171976963428465437519630424007803753141817003956,

50659039041325835497812305941300959685999814515659353242013656567998179756923,

14474011154664524427946373126085988481714232718759815212003901876570908501978,

36185027886661311069865932815214971204285581796899538030009754691427271254945]