Discussion:
Ed25519 "clamping" and its effect on hierarchical key derivation
(too old to reply)
Tony Arcieri
2017-03-06 19:36:14 UTC
Permalink
Ed25519 performs the following operations on private scalars immediately
prior to use:

scalar[0] &= 248;
scalar[31] &= 63;
scalar[31] |= 64;


I've heard this referred to as "clamping" although that may not be the best
term.

These operations are not applied to the canonical scalar, i.e. the one
which is serialized and persisted as part of the keypair. Instead Ed25519
implementations generally flip these bits immediately prior to use, either
for signing or deriving the public key.

From what I've read, this bitflipping is intended to accomplish the
following:

- Prevent small subgroup attacks: flipping these bits makes the scalar a
multiple of the cofactor, which I understand is supposed to help prevent
small subgroup attacks. However, reading about these attacks, they don't
seem to help the attacker very much.
- Defense against deficient implementations: I'm not sure I really
understand the rationale here, but my understanding is there are certain
classes of implementation defects this helps guard against.

So far, it's been pretty troublesome finding a really good explanation of
what this bitflipping is actually for or anyone who feels particularly
strongly as to its importance. The general reaction I've gotten asking
about them is akin to "djb flipped these bits... he must have his reasons".

Flipping these bits is particularly troublesome for implementing
hierarchical key derivation schemes (e.g. semiprivate keys, BIP-32) which
rely on commutative groups to allow holders of master public keys to derive
child public keys by multiplying by scalar values which can be derived by
both the public and private key holders (a.k.a. "hardened" vs
"non-hardened" derivation in BIP-32 schemes).

In these schemes, it's not possible to "clamp" the derived scalar
immediately prior to signing ("post-clamping" I guess?), as this would
result in a different public key (i.e. the math simply does not work out as
the groups are no longer commutative). Instead, if any clamping is to be
performed it must happen immediately to the parent scalar, and/or to any
scalars derived by both the public and private key holders in such a scheme.

My intuition is that "clamping" more than once is detrimental in that the
clamping operation weakens the key. Not really being sure about the purpose
though, I'm not sure if "pre-clamping" is sufficient to guard against these
attacks.

I'm very curious if "clamping" can simply be omitted, as it seems to
largely be a defense-in-depth measure which guards against a handful of
low-severity theoretical attacks. It complicates HKD schemes and, done
incorrectly in the context of such a scheme, I'm worried it might actually
harm security.
--
Tony Arcieri
z***@manian.org
2017-03-06 20:04:52 UTC
Permalink
This clamping question also impacts how Tor's Next Generation Onion
services will blind keys from the Onion Services Directory authorities.
Post by Tony Arcieri
Ed25519 performs the following operations on private scalars immediately
scalar[0] &= 248;
scalar[31] &= 63;
scalar[31] |= 64;
I've heard this referred to as "clamping" although that may not be the
best term.
These operations are not applied to the canonical scalar, i.e. the one
which is serialized and persisted as part of the keypair. Instead Ed25519
implementations generally flip these bits immediately prior to use, either
for signing or deriving the public key.
From what I've read, this bitflipping is intended to accomplish the
- Prevent small subgroup attacks: flipping these bits makes the scalar
a multiple of the cofactor, which I understand is supposed to help prevent
small subgroup attacks. However, reading about these attacks, they don't
seem to help the attacker very much.
- Defense against deficient implementations: I'm not sure I really
understand the rationale here, but my understanding is there are certain
classes of implementation defects this helps guard against.
So far, it's been pretty troublesome finding a really good explanation of
what this bitflipping is actually for or anyone who feels particularly
strongly as to its importance. The general reaction I've gotten asking
about them is akin to "djb flipped these bits... he must have his reasons".
Flipping these bits is particularly troublesome for implementing
hierarchical key derivation schemes (e.g. semiprivate keys, BIP-32) which
rely on commutative groups to allow holders of master public keys to derive
child public keys by multiplying by scalar values which can be derived by
both the public and private key holders (a.k.a. "hardened" vs
"non-hardened" derivation in BIP-32 schemes).
In these schemes, it's not possible to "clamp" the derived scalar
immediately prior to signing ("post-clamping" I guess?), as this would
result in a different public key (i.e. the math simply does not work out as
the groups are no longer commutative). Instead, if any clamping is to be
performed it must happen immediately to the parent scalar, and/or to any
scalars derived by both the public and private key holders in such a scheme.
My intuition is that "clamping" more than once is detrimental in that the
clamping operation weakens the key. Not really being sure about the purpose
though, I'm not sure if "pre-clamping" is sufficient to guard against these
attacks.
I'm very curious if "clamping" can simply be omitted, as it seems to
largely be a defense-in-depth measure which guards against a handful of
low-severity theoretical attacks. It complicates HKD schemes and, done
incorrectly in the context of such a scheme, I'm worried it might actually
harm security.
--
Tony Arcieri
_______________________________________________
Curves mailing list
https://moderncrypto.org/mailman/listinfo/curves
Gregory Maxwell
2017-03-06 20:23:33 UTC
Permalink
Post by Tony Arcieri
Ed25519 performs the following operations on private scalars immediately
I assume the bytes of the scalar here is written least significant
first; otherwise I can't make sense of your message.
Post by Tony Arcieri
scalar[0] &= 248;
This is making the number a multiple of 8, presumably due to the
cofactor. You can simply make your derivation scheme multiply its
scalar by the cofactor for this... Then everything is compatible.
Hurray.
Post by Tony Arcieri
scalar[31] &= 63;
scalar[31] |= 64;
This is clamping to ~the order and making the most significant bit 1.
Your application should already be assuring that the scalar is in
range for the group size.

Setting the most significant bit is a (IMO mildly offensive)
performance hack so that the exponentiation ladder does not need to
correctly handle the point at infinity. To deal with this the point
arithmetic likely needs to have a conditional move to handle
propagating through a point at infinity when the scalar's most
significant bits are a run of zeros. The only real solution to this
one is "don't do that optimization" (or do it and use an extra
addition and a conditional swap).

Good luck with mysterious failures when someone combines your
application which relaxes this restriction with found-on-the-internets
ed25519 code which expects it to be upheld.
Mike Hamburg
2017-03-06 21:20:47 UTC
Permalink
Post by Gregory Maxwell
Post by Tony Arcieri
Ed25519 performs the following operations on private scalars immediately
I assume the bytes of the scalar here is written least significant
first; otherwise I can't make sense of your message.
Yes. DJB crypto is generally little-endian.
Post by Gregory Maxwell
Post by Tony Arcieri
scalar[0] &= 248;
This is making the number a multiple of 8, presumably due to the
cofactor. You can simply make your derivation scheme multiply its
scalar by the cofactor for this... Then everything is compatible.
Hurray.
Right. And this step doesn’t matter for security in x25519, unless you’re checking for contributory behavior. This is because an attacker who sent points with a torsion component could recover the low bits of your key. You’ve stymied that attacker by setting those bits to 0
 which is basically the same result.

However, for hierarchical key derivation you should probably multiply by the cofactor as Gregory suggests. Otherwise you might have an attack based on the hidden number problem or something (eg, Bleichenbacher).
Post by Gregory Maxwell
Post by Tony Arcieri
scalar[31] &= 63;
scalar[31] |= 64;
This is clamping to ~the order and making the most significant bit 1.
Your application should already be assuring that the scalar is in
range for the group size.
Setting the most significant bit is a (IMO mildly offensive)
performance hack so that the exponentiation ladder does not need to
correctly handle the point at infinity. To deal with this the point
arithmetic likely needs to have a conditional move to handle
propagating through a point at infinity when the scalar's most
significant bits are a run of zeros. The only real solution to this
one is "don't do that optimization" (or do it and use an extra
addition and a conditional swap).
Good luck with mysterious failures when someone combines your
application which relaxes this restriction with found-on-the-internets
ed25519 code which expects it to be upheld.
I believe it turns out that for most implementations of the Montgomery ladder found on the internet, the leading bit doesn’t have to be 1 for correctness, and you don’t need any extra CMOVs or anything.

Setting the leading bit to 1 and the low bits to 0 does ensure that the scalar is not a multiple of q, and that you don’t pass through any multiples of q on the way. I’m not sure that restriction matters in practice, though, because the scalar should be random and therefore almost certainly not a multiple of q anyway. But in principle this could cause an incorrect result
 it just happens with negligible probability that an attacker can’t really amplify.

Cheers,
— Mike
Tony Arcieri
2017-03-09 18:54:49 UTC
Permalink
Thanks for the insights Gregory and Mike!

That said, I'd be curious what you think about a paper describing an
adaptation of BIP32 to Ed25519 I've recently been pointed at (shortly after
posting this thread):

https://github.com/WebOfTrustInfo/rebooting-the-web-of-trust-fall2016/blob/master/topics-and-advance-readings/HDKeys-Ed25519.pdf

They perform the typical clamping procedure on the root scalar, but also
ensure that the *third* highest bit is zero.

When deriving a child key, they use only the first 28-bytes / 224-bits of
the hash as the child scalar.

According to the rationale in section 4.6, this ensures the same clamping
invariants discussed earlier in this thread apply to child keys.
Trevor Perrin
2017-03-29 00:25:00 UTC
Permalink
Post by Tony Arcieri
I'd be curious what you think about a paper describing an
adaptation of BIP32 to Ed25519 I've recently been pointed at (shortly after
https://github.com/WebOfTrustInfo/rebooting-the-web-of-trust-fall2016/blob/master/topics-and-advance-readings/HDKeys-Ed25519.pdf
That Evernym paper is similar to Greg's suggestion [1] ("make your
derivation scheme multiply its scalar by the cofactor"), but also
arranges things so the high scalar bits are "clamped".

Mike pointed out that making your scalar a multiple of cofactor isn't
that important for traditional X25519 DH, since a small-subgroup
attack would only leak a few low bits of the scalar. But if
hierarchically-derived keys were used for DH then a few low bits from
related-but-different scalars might be leaked, which could be a
problem [2].

In addition to "clamping" the low bits to deal with small-subgroup
attacks, the other part of clamping is setting the high bits so (a)
the scalar is smaller than the subgroup order, and (b) so the highest
set bit is constant in case the scalar is used with a
non-constant-time scalarmult algorithm that leaks based on the highest
set bit [3].

For hierarchical key derivation it's easy to keep the scalar modulo
the subgroup order, so (a) is easy to deal with. I think (b) isn't
that important, since people aren't (or at least shouldn't be) writing
non-const-time scalarmult. Anyone disagree?

Anyways, Henry suggested another way of dealing with the
small-subgroup risk: Convert the scalar to a representative equivalent
to the original scalar (mod subgroup order), but zero (mod cofactor).
I could imagine that being useful in some protocols. But for
hierarchical key derivation, where you're deriving a new scalar
anyways, I'm not sure this has advantages versus multiplying by the
cofactor?

The Tor approach seems simplest [4]: ignore the clamping! As long as
the hierarchically-derived keys are used for signatures (not DH) then
Mike's concern is avoided.

If you wanted to use hierarchically-derived keys for DH later you
could define a cofactor-DH operation that multiplies by cofactor. In
the olden days, this is how NIST specified ECDH (see Cofactor DH in
SP800-56A [5]). DJB chose to include cofactors into the private keys
instead, presumably because bit-flipping was easier than multiplying,
for a freshly-generated scalar.

So maybe the question is how much you care about spending a little
extra effort in key derivation to make the keys a little safer with
existing DH software? I.e., do you multiply by the scalar as part of
derivation, or leave that for a future DH operation?


Trevor

[1] https://moderncrypto.org/mail-archive/curves/2017/000860.html
[2] https://moderncrypto.org/mail-archive/curves/2017/000861.html
[3] https://www.ietf.org/mail-archive/web/cfrg/current/msg05004.html
[4] https://moderncrypto.org/mail-archive/curves/2017/000866.html
[5] https://gitweb.torproject.org/torspec.git/tree/proposals/224-rend-spec-ng.txt#n1979
[6] http://nvlpubs.nist.gov/nistpubs/SpecialPublications/NIST.SP.800-56Ar2.pdf
Trevor Perrin
2017-03-29 00:35:32 UTC
Permalink
Post by Trevor Perrin
So maybe the question is how much you care about spending a little
extra effort in key derivation to make the keys a little safer with
existing DH software? I.e., do you multiply by the scalar as part of
derivation, or leave that for a future DH operation?
typo in last sentence: "multiply by the *cofactor*".

Trevor
Tony Arcieri
2017-03-29 18:49:57 UTC
Permalink
Post by Trevor Perrin
So maybe the question is how much you care about spending a little
extra effort in key derivation to make the keys a little safer with
existing DH software? I.e., do you multiply by the scalar as part of
derivation, or leave that for a future DH operation?
This is what has always confused me: the clamping procedure used by Ed25519
seems "inherited" from X25519[1], ostensibly for some case where you may
want to take an Ed25519 key, convert it to an X25519 key, and use it for
D-H. Aside from libsodium providing an API for doing so, I haven't actually
seen anyone do this.

It seems like if you want to support a scheme which works for both
signatures and D-H, maybe it would be better to define the scheme in terms
of Montgomery, so it can be used directly with X25519, and then use
XEd25519 for signatures.

I think most people interested in an "Ed25519-BIP32"-style construction are
interested exclusively in signatures.

[1] See ("Computing secret keys") https://cr.yp.to/ecdh.html
--
Tony Arcieri
Trevor Perrin
2017-03-29 19:08:57 UTC
Permalink
Post by Tony Arcieri
This is what has always confused me: the clamping procedure used by Ed25519
seems "inherited" from X25519[1], ostensibly for some case where you may
want to take an Ed25519 key, convert it to an X25519 key, and use it for
D-H. Aside from libsodium providing an API for doing so, I haven't actually
seen anyone do this.
[...]
Post by Tony Arcieri
I think most people interested in an "Ed25519-BIP32"-style construction are
interested exclusively in signatures.
That's my impression as well.

Trevor
Ron Garret
2017-03-29 19:34:06 UTC
Permalink
Post by Trevor Perrin
So maybe the question is how much you care about spending a little
extra effort in key derivation to make the keys a little safer with
existing DH software? I.e., do you multiply by the scalar as part of
derivation, or leave that for a future DH operation?
This is what has always confused me: the clamping procedure used by Ed25519 seems "inherited" from X25519[1], ostensibly for some case where you may want to take an Ed25519 key, convert it to an X25519 key, and use it for D-H. Aside from libsodium providing an API for doing so, I haven't actually seen anyone do this.
SC4 does this. The idea is that it can help with identity binding. If Alice publishes a proof-of-identity signed by an Ed25519 key then Bob can send her an asynchronously encrypted message using the corresponding Curve25519 key to derive a shared secret. (Note that SC4 was designed before X3DH was published.)
Post by Trevor Perrin
It seems like if you want to support a scheme which works for both signatures and D-H, maybe it would be better to define the scheme in terms of Montgomery, so it can be used directly with X25519, and then use XEd25519 for signatures.
What would be the benefit? It seems to me that it doesn’t really matter much one way or the other, but if you’re going to convert one to the other then it seems to make more sense to derive the DH key from the DSA key because going the other way you lose the sign of the Y coordinate.

rg
Tony Arcieri
2017-03-29 21:01:24 UTC
Permalink
Post by Ron Garret
What would be the benefit?
The main benefit is that the D-H use case deals only with Montgomery-x
coordinates, and all scalar multiplications can be performed using the
Montgomery ladder.
Post by Ron Garret
It seems to me that it doesn’t really matter much one way or the other,
but if you’re going to convert one to the other then it seems to make more
sense to derive the DH key from the DSA key because going the other way you
lose the sign of the Y coordinate.
Trevor wrote a great post highlighting the respective tradeoffs here:

https://moderncrypto.org/mail-archive/curves/2015/000376.html
--
Tony Arcieri
Henry de Valence
2017-03-29 21:30:55 UTC
Permalink
Post by Trevor Perrin
Anyways, Henry suggested another way of dealing with the
small-subgroup risk: Convert the scalar to a representative equivalent
to the original scalar (mod subgroup order), but zero (mod cofactor).
I could imagine that being useful in some protocols. But for
hierarchical key derivation, where you're deriving a new scalar
anyways, I'm not sure this has advantages versus multiplying by the
cofactor?
My understanding is that in the HKD context, people want to view scalars as
elements of Z/lZ so that they can do arithmetic on them. The point of a safe
representative is that you're not really "converting" anything, you're just
choosing a different representative of the same equivalence class. From the
point of view of the basepoint and the subgroup, it's exactly the same scalar.

So the advantage is that you can view scalars as elements of Z/lZ, and
then just choose a safe representative whenever you want to use one.

This seems simpler than multiplying by the cofactor, which has to be done in Z,
not Z/lZ, and therefore requires thinking about which operations are done
modulo l and which aren't.

Henry

P.S.: To be clear, the idea wasn't just mine, it came from a discussion with
Ian, Isis, and George -- although I can be held solely responsible for breaking
the Reply headers (sorry!)

Loading...