Discussion:
Torsion-safe representatives (was: Ed25519 "clamping" and its effect on hierarchical key derivation)
Henry de Valence
2017-03-25 19:43:52 UTC
Ed25519 performs the following operations on private scalars immediately
scalar[0] &= 248;
scalar[31] &= 63;
scalar[31] |= 64;
I've heard this referred to as "clamping" although that may not be the best
term.
These operations are not applied to the canonical scalar, i.e. the one
which is serialized and persisted as part of the keypair. Instead Ed25519
implementations generally flip these bits immediately prior to use, either
for signing or deriving the public key.
[...]
In these schemes, it's not possible to "clamp" the derived scalar
immediately prior to signing ("post-clamping" I guess?), as this would
result in a different public key (i.e. the math simply does not work out as
the groups are no longer commutative). Instead, if any clamping is to be
performed it must happen immediately to the parent scalar, and/or to any
scalars derived by both the public and private key holders in such a scheme.
Hi all,

this subject came up today at the Tor meeting in a discussion with Ian
Goldberg, George Kadianakis, Isis Lovecruft, and myself.

As Tony noted, using bit-twiddling to mask away the low bits of a scalar is
problematic because the bit-twiddling is not well-defined (mod l). (Here l
is the order of the basepoint, so the full group has order 8*l). This means
that the "clamping" is not compatible with any arithmetic operations on
scalars.

We (Ian, George, Isis, and myself) have the following suggestion.

Define a "torsion-safe representative" of a \in Z to be an integer a' \in
Z such that a ≡ a' (mod l) and a' ≡ 0 (mod 8).

This means that a B = a' B for the basepoint B, but a' T = id for T a
torsion point, so accidentally multiplying by a torsion point can't leak
information. However, unlike "clamping", this operation is actually
well-defined, and leaves the scalar unchanged, in the sense that scalar
multiplication by a and by a' are the same on the prime-order subgroup.

How do we compute such a representative? Since (using the CRT)

k := 3l + 1 ≡ 1 (mod l)
≡ 0 (mod 8),

k*a mod 8l is a torsion-safe representative of a. Computing k*a mod 8l
directly is a problem because an implementation may only have arithmetic modulo
l, not 8l. However,

k*a ≡ (3l+1)*a ≡ 3al + a (mod 8l),

so it's sufficient to compute 3al (mod 8l) and add it to a. Since

3a mod 8 = 3a + 8n,
(3a mod 8)*l = 3al + 8ln ≡ 3al (mod 8l),

so it's sufficient to compute 3a mod 8 and use a lookup table of

[0, 1l, 2l, 3l, 4l, 5l, 6l, 7l]

to get 3al (mod 8l). Arranging the lookup table as

[0, 3l, 6l, 1l, 4l, 7l, 2l, 5l]

means that a mod 8 indexes 3al (mod 8l). Therefore, computing a
torsion-safe representative for a scalar a just amounts to computing

a + permuted_lookup_table[a & 7]

in constant time. If the input a is reduced, so that a < l, then the
torsion-safe representative is bounded by 8l < 2^256 and therefore fits in 32
bytes.

This ends up being slightly more work than just bit-twiddling, but not by much,
and it's certainly insignificant compared to the cost of a scalar
multiplication. There's a prototype implementation here [0] in case anyone is
curious to see what it looks like.

Cheers,
Henry de Valence

[0]: https://github.com/hdevalence/curve25519-dalek/commit/2ae0bdb6df26a74ef46d4332b635c9f6290126c7
(subject to rebasing...)

The permuted lookup table is, explicitly:

sage: l = 2^252 + 27742317777372353535851937790883648493
sage: lookup = [((3 * i) % 8)*l for i in range(8)]
sage: lookup
[0,
21711016731996786641919559689128982722571349078139722818005852814856362752967,
43422033463993573283839119378257965445142698156279445636011705629712725505934,
7237005577332262213973186563042994240857116359379907606001950938285454250989,
28948022309329048855892746252171976963428465437519630424007803753141817003956,
50659039041325835497812305941300959685999814515659353242013656567998179756923,
14474011154664524427946373126085988481714232718759815212003901876570908501978,
36185027886661311069865932815214971204285581796899538030009754691427271254945]
Mike Hamburg
2017-03-26 05:56:51 UTC
More straightforwardly, let a' = h (a/h mod l) where h is the cofactor. When h is a power of 2, this is especially easy to compute either with Montgomery multiplication or by iterated halving.

This is also required for certain scalarmul techniques, like going to an isogenous twisted Edwards curve.

I bet in some contexts this is the simplest thing to do. For x25519 though it's more complex than clamping. For ed25519 it's probably good enough to just do everything mod l.

-- Mike

Sent from my phone. Please excuse brevity and typos.
Post by Henry de Valence
Ed25519 performs the following operations on private scalars immediately
scalar[0] &= 248;
scalar[31] &= 63;
scalar[31] |= 64;
I've heard this referred to as "clamping" although that may not be the best
term.
These operations are not applied to the canonical scalar, i.e. the one
which is serialized and persisted as part of the keypair. Instead Ed25519
implementations generally flip these bits immediately prior to use, either
for signing or deriving the public key.
[...]
In these schemes, it's not possible to "clamp" the derived scalar
immediately prior to signing ("post-clamping" I guess?), as this would
result in a different public key (i.e. the math simply does not work out as
the groups are no longer commutative). Instead, if any clamping is to be
performed it must happen immediately to the parent scalar, and/or to any
scalars derived by both the public and private key holders in such a scheme.
Hi all,
this subject came up today at the Tor meeting in a discussion with Ian
Goldberg, George Kadianakis, Isis Lovecruft, and myself.
As Tony noted, using bit-twiddling to mask away the low bits of a scalar is
problematic because the bit-twiddling is not well-defined (mod l). (Here l
is the order of the basepoint, so the full group has order 8*l). This means
that the "clamping" is not compatible with any arithmetic operations on
scalars.
We (Ian, George, Isis, and myself) have the following suggestion.
Define a "torsion-safe representative" of a \in Z to be an integer a' \in
Z such that a â¡ a' (mod l) and a' â¡ 0 (mod 8).
This means that a B = a' B for the basepoint B, but a' T = id for T a
torsion point, so accidentally multiplying by a torsion point can't leak
information. However, unlike "clamping", this operation is actually
well-defined, and leaves the scalar unchanged, in the sense that scalar
multiplication by a and by a' are the same on the prime-order subgroup.
How do we compute such a representative? Since (using the CRT)
k := 3l + 1 â¡ 1 (mod l)
â¡ 0 (mod 8),
k*a mod 8l is a torsion-safe representative of a. Computing k*a mod 8l
directly is a problem because an implementation may only have arithmetic modulo
l, not 8l. However,
k*a â¡ (3l+1)*a â¡ 3al + a (mod 8l),
so it's sufficient to compute 3al (mod 8l) and add it to a. Since
3a mod 8 = 3a + 8n,
(3a mod 8)*l = 3al + 8ln â¡ 3al (mod 8l),
so it's sufficient to compute 3a mod 8 and use a lookup table of
[0, 1l, 2l, 3l, 4l, 5l, 6l, 7l]
to get 3al (mod 8l). Arranging the lookup table as
[0, 3l, 6l, 1l, 4l, 7l, 2l, 5l]
means that a mod 8 indexes 3al (mod 8l). Therefore, computing a
torsion-safe representative for a scalar a just amounts to computing
a + permuted_lookup_table[a & 7]
in constant time. If the input a is reduced, so that a < l, then the
torsion-safe representative is bounded by 8l < 2^256 and therefore fits in 32
bytes.
This ends up being slightly more work than just bit-twiddling, but not by much,
and it's certainly insignificant compared to the cost of a scalar
multiplication. There's a prototype implementation here [0] in case anyone is
curious to see what it looks like.
Cheers,
Henry de Valence
[0]: https://github.com/hdevalence/curve25519-dalek/commit/2ae0bdb6df26a74ef46d4332b635c9f6290126c7
(subject to rebasing...)
sage: l = 2^252 + 27742317777372353535851937790883648493
sage: lookup = [((3 * i) % 8)*l for i in range(8)]
sage: lookup
[0,
21711016731996786641919559689128982722571349078139722818005852814856362752967,
43422033463993573283839119378257965445142698156279445636011705629712725505934,
7237005577332262213973186563042994240857116359379907606001950938285454250989,
28948022309329048855892746252171976963428465437519630424007803753141817003956,
50659039041325835497812305941300959685999814515659353242013656567998179756923,
14474011154664524427946373126085988481714232718759815212003901876570908501978,
36185027886661311069865932815214971204285581796899538030009754691427271254945]
_______________________________________________
Curves mailing list
https://moderncrypto.org/mailman/listinfo/curves
Oleg Andreev
2017-03-27 18:04:40 UTC
Henry, the technique you showed is fantastic, I love it!

I have a lame question, though. You mention that a*B = a'*B holds for the base point. But is it also true for any point in the B's subgroup? The reason I ask is that I need to have not just regular EdDSA signatures, but also DLEQs (discrete log equality proofs) with random generator points.

Thank you!
Post by Henry de Valence
Hi all,
this subject came up today at the Tor meeting in a discussion with Ian
Goldberg, George Kadianakis, Isis Lovecruft, and myself.
As Tony noted, using bit-twiddling to mask away the low bits of a scalar is
problematic because the bit-twiddling is not well-defined (mod l). (Here l
is the order of the basepoint, so the full group has order 8*l). This means
that the "clamping" is not compatible with any arithmetic operations on
scalars.
We (Ian, George, Isis, and myself) have the following suggestion.
Define a "torsion-safe representative" of a \in Z to be an integer a' \in
Z such that a ≡ a' (mod l) and a' ≡ 0 (mod 8).
This means that a B = a' B for the basepoint B, but a' T = id for T a
torsion point, so accidentally multiplying by a torsion point can't leak
information. However, unlike "clamping", this operation is actually
well-defined, and leaves the scalar unchanged, in the sense that scalar
multiplication by a and by a' are the same on the prime-order subgroup.
How do we compute such a representative? Since (using the CRT)
k := 3l + 1 ≡ 1 (mod l)
≡ 0 (mod 8),
k*a mod 8l is a torsion-safe representative of a. Computing k*a mod 8l
directly is a problem because an implementation may only have arithmetic modulo
l, not 8l. However,
k*a ≡ (3l+1)*a ≡ 3al + a (mod 8l),
so it's sufficient to compute 3al (mod 8l) and add it to a. Since
3a mod 8 = 3a + 8n,
(3a mod 8)*l = 3al + 8ln ≡ 3al (mod 8l),
so it's sufficient to compute 3a mod 8 and use a lookup table of
[0, 1l, 2l, 3l, 4l, 5l, 6l, 7l]
to get 3al (mod 8l). Arranging the lookup table as
[0, 3l, 6l, 1l, 4l, 7l, 2l, 5l]
means that a mod 8 indexes 3al (mod 8l). Therefore, computing a
torsion-safe representative for a scalar a just amounts to computing
a + permuted_lookup_table[a & 7]
in constant time. If the input a is reduced, so that a < l, then the
torsion-safe representative is bounded by 8l < 2^256 and therefore fits in 32
bytes.
This ends up being slightly more work than just bit-twiddling, but not by much,
and it's certainly insignificant compared to the cost of a scalar
multiplication. There's a prototype implementation here [0] in case anyone is
curious to see what it looks like.
Cheers,
Henry de Valence
[0]: https://github.com/hdevalence/curve25519-dalek/commit/2ae0bdb6df26a74ef46d4332b635c9f6290126c7
(subject to rebasing...)
sage: l = 2^252 + 27742317777372353535851937790883648493
sage: lookup = [((3 * i) % 8)*l for i in range(8)]
sage: lookup
[0,
21711016731996786641919559689128982722571349078139722818005852814856362752967,
43422033463993573283839119378257965445142698156279445636011705629712725505934,
7237005577332262213973186563042994240857116359379907606001950938285454250989,
28948022309329048855892746252171976963428465437519630424007803753141817003956,
50659039041325835497812305941300959685999814515659353242013656567998179756923,
14474011154664524427946373126085988481714232718759815212003901876570908501978,
36185027886661311069865932815214971204285581796899538030009754691427271254945]
_______________________________________________
Curves mailing list
https://moderncrypto.org/mailman/listinfo/curves
Mike Hamburg
2017-03-27 18:06:56 UTC
Post by Oleg Andreev
Henry, the technique you showed is fantastic, I love it!
I have a lame question, though. You mention that a*B = a'*B holds for the base point. But is it also true for any point in the B's subgroup?
Yes.
Post by Oleg Andreev
The reason I ask is that I need to have not just regular EdDSA signatures, but also DLEQs (discrete log equality proofs) with random generator points.
Thank you!
Post by Henry de Valence
Hi all,
this subject came up today at the Tor meeting in a discussion with Ian
Goldberg, George Kadianakis, Isis Lovecruft, and myself.
As Tony noted, using bit-twiddling to mask away the low bits of a scalar is
problematic because the bit-twiddling is not well-defined (mod l). (Here l
is the order of the basepoint, so the full group has order 8*l). This means
that the "clamping" is not compatible with any arithmetic operations on
scalars.
We (Ian, George, Isis, and myself) have the following suggestion.
Define a "torsion-safe representative" of a \in Z to be an integer a' \in
Z such that a â¡ a' (mod l) and a' â¡ 0 (mod 8).
This means that a B = a' B for the basepoint B, but a' T = id for T a
torsion point, so accidentally multiplying by a torsion point can't leak
information. However, unlike "clamping", this operation is actually
well-defined, and leaves the scalar unchanged, in the sense that scalar
multiplication by a and by a' are the same on the prime-order subgroup.
How do we compute such a representative? Since (using the CRT)
k := 3l + 1 â¡ 1 (mod l)
â¡ 0 (mod 8),
k*a mod 8l is a torsion-safe representative of a. Computing k*a mod 8l
directly is a problem because an implementation may only have arithmetic modulo
l, not 8l. However,
k*a â¡ (3l+1)*a â¡ 3al + a (mod 8l),
so it's sufficient to compute 3al (mod 8l) and add it to a. Since
3a mod 8 = 3a + 8n,
(3a mod 8)*l = 3al + 8ln â¡ 3al (mod 8l),
so it's sufficient to compute 3a mod 8 and use a lookup table of
[0, 1l, 2l, 3l, 4l, 5l, 6l, 7l]
to get 3al (mod 8l). Arranging the lookup table as
[0, 3l, 6l, 1l, 4l, 7l, 2l, 5l]
means that a mod 8 indexes 3al (mod 8l). Therefore, computing a
torsion-safe representative for a scalar a just amounts to computing
a + permuted_lookup_table[a & 7]
in constant time. If the input a is reduced, so that a < l, then the
torsion-safe representative is bounded by 8l < 2^256 and therefore fits in 32
bytes.
This ends up being slightly more work than just bit-twiddling, but not by much,
and it's certainly insignificant compared to the cost of a scalar
multiplication. There's a prototype implementation here [0] in case anyone is
curious to see what it looks like.
Cheers,
Henry de Valence
[0]: https://github.com/hdevalence/curve25519-dalek/commit/2ae0bdb6df26a74ef46d4332b635c9f6290126c7
(subject to rebasing...)
sage: l = 2^252 + 27742317777372353535851937790883648493
sage: lookup = [((3 * i) % 8)*l for i in range(8)]
sage: lookup
[0,
21711016731996786641919559689128982722571349078139722818005852814856362752967,
43422033463993573283839119378257965445142698156279445636011705629712725505934,
7237005577332262213973186563042994240857116359379907606001950938285454250989,
28948022309329048855892746252171976963428465437519630424007803753141817003956,
50659039041325835497812305941300959685999814515659353242013656567998179756923,
14474011154664524427946373126085988481714232718759815212003901876570908501978,
36185027886661311069865932815214971204285581796899538030009754691427271254945]
_______________________________________________
Curves mailing list
https://moderncrypto.org/mailman/listinfo/curves
_______________________________________________
Curves mailing list
https://moderncrypto.org/mailman/listinfo/curves <https://moderncrypto.org/mailman/listinfo/curves>
Oleg Andreev
2017-03-27 20:00:49 UTC
I have a lame question, though. You mention that a*B = a'*B holds