By my measurements, curve25519-donna-c64 -O3 uses 840 bytes of stack on x86-64. With -O2, it's 1128 bytes. But the x86 version uses much more stack, so maybe that's your problem.
I have a tiny implementation of Curve25519. According to -fstack-usage, it uses as few as 372 bytes of stack on x86-64 and 336 bytes on x86, depending on compilation options.
Since it's optimized for size, it doesn't perform as well as Donna. It is 2-4x slower on x86-64, depending on compilation options, but only ~25% slower on x32 if I'm measuring correctly. The code is relatively portable, detecting the word size using __SIZEOF_INT128__. My code also has ARM asm intrinsics, so it might outperform Donna on some ARM platforms. I haven't benchmarked this.
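For illustration, word-size detection with __SIZEOF_INT128__ might look roughly like the sketch below. This is not the code from my package; the names limb_t, dlimb_t, LIMB_BITS, NLIMBS and mul_wide are made up for the example. It assumes GCC or Clang, which define __SIZEOF_INT128__ when a 128-bit integer type is available.

```c
#include <stdint.h>

/* Hypothetical names throughout; this only illustrates the detection idea. */
#if defined(__SIZEOF_INT128__)
/* 128-bit integers are available: use 64-bit limbs. */
typedef uint64_t limb_t;
typedef unsigned __int128 dlimb_t;
#define LIMB_BITS 64
#else
/* Otherwise fall back to 32-bit limbs with 64-bit products. */
typedef uint32_t limb_t;
typedef uint64_t dlimb_t;
#define LIMB_BITS 32
#endif

/* Number of limbs needed to hold a 256-bit value. */
#define NLIMBS (256 / LIMB_BITS)

/* Multiply two limbs; return the low limb and store the high limb in *hi. */
static limb_t mul_wide(limb_t a, limb_t b, limb_t *hi) {
    dlimb_t p = (dlimb_t)a * b;
    *hi = (limb_t)(p >> LIMB_BITS);
    return (limb_t)p;
}
```

The rest of the arithmetic can then be written once in terms of limb_t and dlimb_t, at the cost of a different limb count per platform.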
My code also supports nonstandard x-only signature production and verification at the cost of slightly higher stack usage.
This implementation is part of a package that I wrote at work, so I can't share it with you yet. I'm trying to get it open sourced under an MIT license, but I have to talk to legal about this. So it's portable but not common. But let me know if you want it, it might help me get it through legal.
Post by Jason A. Donenfeld
I use a curve25519-donna variant inside of WireGuard. It runs in a
kthread in kernel space, which only has 8k of stack in total. Some
circuitous paths in the kernel into code actually amount to having
much less stack available. I could allocate curve25519 variables on
the heap instead, or try to do various other traditional programming
techniques to reduce usage. But before I put too much time into that,
I was wondering if anybody else has run into this limitation with
-donna and if there are other common portable implementations of
curve25519 that use less stack while remaining performant, or if there
are various other tricks to reduce stack usage.
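As a rough sketch of the heap-allocation option mentioned above: the large temporaries can be grouped into one struct that is allocated, used, wiped, and freed per call. Everything here is hypothetical (struct scratch, wipe, compute_with_scratch, scalarmult_heap), the placeholder body does no real curve arithmetic, and userspace malloc/free stand in for whatever allocator the kernel code would actually use.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical scratch layout; a real implementation would keep its
 * field-element temporaries here instead of on the stack. */
struct scratch {
    uint64_t t[8][5];
};

/* Zero a buffer through a volatile pointer so the stores are not elided. */
static void wipe(void *p, size_t n) {
    volatile unsigned char *v = p;
    while (n--) *v++ = 0;
}

/* Placeholder for the real ladder: it only copies the input through the
 * scratch area so that the sketch compiles and runs. */
static void compute_with_scratch(struct scratch *s, uint8_t out[32],
                                 const uint8_t in[32]) {
    memcpy(s->t, in, 32);
    memcpy(out, s->t, 32);
}

/* Returns 0 on success, -1 if the scratch allocation fails. */
int scalarmult_heap(uint8_t out[32], const uint8_t in[32]) {
    struct scratch *s = malloc(sizeof *s);
    if (!s)
        return -1;
    compute_with_scratch(s, out, in);
    wipe(s, sizeof *s); /* scratch may hold secret-dependent data */
    free(s);
    return 0;
}
```

The trade-off is that the call can now fail on allocation, so callers have to handle an error path that a stack-only version does not have.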
Curves mailing list