Implementation notes: amd64, latour, crypto_sign/ed448goldilocks

Computer: latour
Architecture: amd64
CPU ID: GenuineIntel-000006fb-bfebfbff
SUPERCOP version: 20201130
Operation: crypto_sign
Primitive: ed448goldilocks
TimeObject sizeTest sizeImplementationCompilerBenchmark dateSUPERCOP version
153763270354 24 2192489426 928 23544T:amd64clang_-march=native_-O2_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2020083020200826
154026970354 24 2192489426 928 23544T:amd64clang_-march=native_-O_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2020083020200826
154169172202 24 2192492194 928 23544T:amd64clang_-march=native_-O3_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2020083020200826
154208771661 24 2192491474 928 23544T:amd64clang_-mcpu=native_-O3_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2020083020200826
158384754255 24 2192472248 920 23544T:amd64clang_-march=native_-Os_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2020083020200826
175599974635 24 2192497630 872 23576T:amd64gcc_-march=native_-mtune=native_-O3_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE2020083020200826
181187160295 24 2192480574 872 23576T:amd64gcc_-march=native_-mtune=native_-O_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE2020083020200826
183367897869 24 21924117234 928 23544T:64clang_-march=native_-O_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2020083020200826
183555997869 24 21924117234 928 23544T:64clang_-march=native_-O2_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2020083020200826
183736899672 24 21924119746 928 23544T:64clang_-mcpu=native_-O3_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2020083020200826
1837665100197 24 21924120514 928 23544T:64clang_-march=native_-O3_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2020083020200826
1867761119301 24 21924142358 872 23576T:64gcc_-march=native_-mtune=native_-O3_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE2020083020200826
189305161240 24 2192480616 920 23544T:64clang_-march=native_-Os_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2020083020200826
193617979504 24 21924100518 872 23576T:64gcc_-march=native_-mtune=native_-O2_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE2020083020200826
197535674759 24 2192495118 872 23576T:64gcc_-march=native_-mtune=native_-O_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE2020083020200826
198666042519 24 2192462070 864 23544T:64gcc_-march=native_-mtune=native_-Os_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE2020083020200826
405274566244 24 1885287158 872 20504T:32gcc_-march=native_-mtune=native_-O2_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE2020083020200826
415185388849 24 18852111830 872 20504T:32gcc_-march=native_-mtune=native_-O3_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE2020083020200826
501243383535 24 18852102450 928 20472T:32clang_-march=native_-O_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2020083020200826
517031183535 24 18852102450 928 20472T:32clang_-march=native_-O2_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2020083020200826
520884985402 24 18852105010 928 20472T:32clang_-mcpu=native_-O3_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2020083020200826
528514286007 24 18852105858 928 20472T:32clang_-march=native_-O3_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2020083020200826
538319764165 24 1885284414 872 20504T:32gcc_-march=native_-mtune=native_-O_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE2020083020200826
548820052954 24 1885272206 864 20472T:32gcc_-march=native_-mtune=native_-Os_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE2020083020200826
564286556735 24 1885274720 920 20472T:32clang_-march=native_-Os_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2020083020200826

Test failure

Implementation: T:amd64
Security model: timingleaks
Compiler: gcc -march=native -mtune=native -O2 -fomit-frame-pointer -fwrapv -fPIC -fPIE
error 111
crypto_sign is nondeterministic

Number of similar (compiler,implementation) pairs: 1, namely:
CompilerImplementations
gcc -march=native -mtune=native -O2 -fomit-frame-pointer -fwrapv -fPIC -fPIE T:amd64

Test failure

Implementation: T:amd64
Security model: timingleaks
Compiler: gcc -march=native -mtune=native -Os -fomit-frame-pointer -fwrapv -fPIC -fPIE
error 111

Number of similar (compiler,implementation) pairs: 1, namely:
CompilerImplementations
gcc -march=native -mtune=native -Os -fomit-frame-pointer -fwrapv -fPIC -fPIE T:amd64

Compiler output

Implementation: T:neon
Security model: timingleaks
Compiler: clang -march=native -O2 -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE
crandom.c: In file included from crandom.c:11:
crandom.c: In file included from ./magic.h:15:
crandom.c: ./p448.h:314:14: warning: implicit declaration of function 'vshr_n_u32' is invalid in C99 [-Wimplicit-function-declaration]
crandom.c: tmp = vshr_n_u32(aa[7],28);
crandom.c: ^
crandom.c: ./p448.h:318:17: warning: implicit declaration of function 'vsra_n_u32' is invalid in C99 [-Wimplicit-function-declaration]
crandom.c: aa[i] = vsra_n_u32(aa[i] & vmask, aa[i-1], 28);
crandom.c: ^
crandom.c: ./p448.h:320:31: warning: implicit declaration of function 'vrev64_u32' is invalid in C99 [-Wimplicit-function-declaration]
crandom.c: aa[0] = (aa[0] & vmask) + vrev64_u32(tmp) + (tmp&vm2);
crandom.c: ^
crandom.c: 3 warnings generated.
ec_point.c: In file included from ec_point.c:12:
ec_point.c: In file included from ./ec_point.h:13:
ec_point.c: ./p448.h:314:14: warning: implicit declaration of function 'vshr_n_u32' is invalid in C99 [-Wimplicit-function-declaration]
ec_point.c: tmp = vshr_n_u32(aa[7],28);
ec_point.c: ^
ec_point.c: ./p448.h:318:17: warning: implicit declaration of function 'vsra_n_u32' is invalid in C99 [-Wimplicit-function-declaration]
ec_point.c: aa[i] = vsra_n_u32(aa[i] & vmask, aa[i-1], 28);
ec_point.c: ^
ec_point.c: ./p448.h:320:31: warning: implicit declaration of function 'vrev64_u32' is invalid in C99 [-Wimplicit-function-declaration]
ec_point.c: aa[0] = (aa[0] & vmask) + vrev64_u32(tmp) + (tmp&vm2);
ec_point.c: ^
ec_point.c: 3 warnings generated.
goldilocks.c: In file included from goldilocks.c:15:
goldilocks.c: In file included from ./ec_point.h:13:
goldilocks.c: ./p448.h:314:14: warning: implicit declaration of function 'vshr_n_u32' is invalid in C99 [-Wimplicit-function-declaration]
goldilocks.c: tmp = vshr_n_u32(aa[7],28);
goldilocks.c: ^
goldilocks.c: ./p448.h:318:17: warning: implicit declaration of function 'vsra_n_u32' is invalid in C99 [-Wimplicit-function-declaration]
goldilocks.c: aa[i] = vsra_n_u32(aa[i] & vmask, aa[i-1], 28);
goldilocks.c: ^
goldilocks.c: ./p448.h:320:31: warning: implicit declaration of function 'vrev64_u32' is invalid in C99 [-Wimplicit-function-declaration]
goldilocks.c: aa[0] = (aa[0] & vmask) + vrev64_u32(tmp) + (tmp&vm2);
goldilocks.c: ^
goldilocks.c: 3 warnings generated.
magic.c: In file included from magic.c:5:
magic.c: In file included from ./field.h:11:
magic.c: In file included from ./magic.h:15:
magic.c: ./p448.h:314:14: warning: implicit declaration of function 'vshr_n_u32' is invalid in C99 [-Wimplicit-function-declaration]
magic.c: tmp = vshr_n_u32(aa[7],28);
magic.c: ^
magic.c: ./p448.h:318:17: warning: implicit declaration of function 'vsra_n_u32' is invalid in C99 [-Wimplicit-function-declaration]
magic.c: aa[i] = vsra_n_u32(aa[i] & vmask, aa[i-1], 28);
magic.c: ^
magic.c: ./p448.h:320:31: warning: implicit declaration of function 'vrev64_u32' is invalid in C99 [-Wimplicit-function-declaration]
magic.c: aa[0] = (aa[0] & vmask) + vrev64_u32(tmp) + (tmp&vm2);
magic.c: ^
magic.c: 3 warnings generated.
p448.c: In file included from p448.c:6:
p448.c: ./p448.h:314:14: warning: implicit declaration of function 'vshr_n_u32' is invalid in C99 [-Wimplicit-function-declaration]
p448.c: tmp = vshr_n_u32(aa[7],28);
p448.c: ^
p448.c: ./p448.h:318:17: warning: implicit declaration of function 'vsra_n_u32' is invalid in C99 [-Wimplicit-function-declaration]
p448.c: aa[i] = vsra_n_u32(aa[i] & vmask, aa[i-1], 28);
p448.c: ^
p448.c: ./p448.h:320:31: warning: implicit declaration of function 'vrev64_u32' is invalid in C99 [-Wimplicit-function-declaration]
p448.c: aa[0] = (aa[0] & vmask) + vrev64_u32(tmp) + (tmp&vm2);
p448.c: ^
p448.c: p448.c:19:36: error: invalid output constraint '+w' in asm
p448.c: __asm__ ("vadd.s64 %f0, %e0" : "+w"(x));
p448.c: ^
p448.c: p448.c:25:36: error: invalid output constraint '+w' in asm
p448.c: __asm__ ("vswp.s64 %e0, %f0" : "+w"(x));
p448.c: ^
p448.c: p448.c:31:36: error: invalid output constraint '+w' in asm
p448.c: __asm__ ("vswp.s64 %e0, %f0" : "+w"(x));
p448.c: ^
p448.c: p448.c:362:12: error: unknown register name 'q0' in asm
p448.c: :: "q0","q1","q2","q3",
p448.c: ^
p448.c: p448.c:564:12: error: unknown register name 'q0' in asm
p448.c: :: "q0","q1","q2","q3",
p448.c: ^
p448.c: ...

Number of similar (compiler,implementation) pairs: 5, namely:
CompilerImplementations
clang -march=native -O2 -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE T:neon
clang -march=native -O3 -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE T:neon
clang -march=native -O -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE T:neon
clang -march=native -Os -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE T:neon
clang -mcpu=native -O3 -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE T:neon

Compiler output

Implementation: T:neon
Security model: timingleaks
Compiler: gcc -march=native -mtune=native -O2 -fomit-frame-pointer -fwrapv -fPIC -fPIE
crandom.c: In file included from magic.h:15:0,
crandom.c: from crandom.c:11:
crandom.c: p448.h: In function 'p448_weak_reduce':
crandom.c: p448.h:314:14: warning: implicit declaration of function 'vshr_n_u32' [-Wimplicit-function-declaration]
crandom.c: tmp = vshr_n_u32(aa[7],28);
crandom.c: ^
crandom.c: p448.h:314:14: error: incompatible types when initializing type 'uint32x2_t {aka __vector(2) unsigned int}' using type 'int'
crandom.c: p448.h:318:17: warning: implicit declaration of function 'vsra_n_u32' [-Wimplicit-function-declaration]
crandom.c: aa[i] = vsra_n_u32(aa[i] & vmask, aa[i-1], 28);
crandom.c: ^
crandom.c: p448.h:318:15: error: incompatible types when assigning to type 'uint32x2_t {aka __vector(2) unsigned int}' from type 'int'
crandom.c: aa[i] = vsra_n_u32(aa[i] & vmask, aa[i-1], 28);
crandom.c: ^
crandom.c: p448.h:320:31: warning: implicit declaration of function 'vrev64_u32' [-Wimplicit-function-declaration]
crandom.c: aa[0] = (aa[0] & vmask) + vrev64_u32(tmp) + (tmp&vm2);
crandom.c: ^

Number of similar (compiler,implementation) pairs: 4, namely:
CompilerImplementations
gcc -march=native -mtune=native -O2 -fomit-frame-pointer -fwrapv -fPIC -fPIE T:neon
gcc -march=native -mtune=native -O3 -fomit-frame-pointer -fwrapv -fPIC -fPIE T:neon
gcc -march=native -mtune=native -O -fomit-frame-pointer -fwrapv -fPIC -fPIE T:neon
gcc -march=native -mtune=native -Os -fomit-frame-pointer -fwrapv -fPIC -fPIE T:neon

Namespace violations

Implementation: T:32
Security model: timingleaks
Compiler: clang -march=native -O2 -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE
barrett_field.o add_nr_ext_packed T
barrett_field.o barrett_deserialize T
barrett_field.o barrett_deserialize_and_reduce T
barrett_field.o barrett_mul_or_mac T
barrett_field.o barrett_negate T
barrett_field.o barrett_reduce T
barrett_field.o barrett_serialize T
barrett_field.o sub_nr_ext_packed T
crandom.o crandom_destroy T
crandom.o crandom_detect_features T
crandom.o crandom_features B
crandom.o crandom_generate T
crandom.o crandom_init_from_buffer T
crandom.o crandom_init_from_file T
ec_point.o add_tw_niels_to_tw_extensible T
ec_point.o add_tw_pniels_to_tw_extensible T
ec_point.o convert_affine_to_extensible T
ec_point.o convert_tw_affine_to_tw_extensible T
ec_point.o convert_tw_affine_to_tw_pniels T
ec_point.o convert_tw_extensible_to_tw_pniels T
ec_point.o convert_tw_niels_to_tw_extensible T
ec_point.o convert_tw_pniels_to_tw_extensible T
ec_point.o deserialize_affine T
ec_point.o deserialize_and_twist_approx T
ec_point.o deserialize_montgomery T
ec_point.o double_extensible T
ec_point.o double_tw_extensible T
ec_point.o elligator_2s_inject T
ec_point.o eq_affine T
ec_point.o eq_extensible T
ec_point.o eq_tw_extensible T
ec_point.o is_even_pt T
ec_point.o is_even_tw T
ec_point.o is_square T
ec_point.o montgomery_step T
ec_point.o p448_inverse T
ec_point.o p448_isr T
ec_point.o serialize_extensible T
ec_point.o serialize_montgomery T
ec_point.o set_identity_affine T
ec_point.o set_identity_extensible T
ec_point.o set_identity_tw_extensible T
ec_point.o sub_tw_niels_from_tw_extensible T
ec_point.o sub_tw_pniels_from_tw_extensible T
ec_point.o test_only_twist T
ec_point.o twist_and_double T
ec_point.o twist_even T
ec_point.o untwist_and_double T
ec_point.o untwist_and_double_and_serialize T
ec_point.o validate_affine T
ec_point.o validate_extensible T
ec_point.o validate_tw_extensible T
goldilocks.o goldilocks_derive_private_key T
goldilocks.o goldilocks_destroy_precomputed_public_key T
goldilocks.o goldilocks_init T
goldilocks.o goldilocks_keygen T
goldilocks.o goldilocks_precompute_public_key T
goldilocks.o goldilocks_private_to_public T
goldilocks.o goldilocks_shared_secret T
goldilocks.o goldilocks_shared_secret_precomputed T
goldilocks.o goldilocks_sign T
goldilocks.o goldilocks_underive_private_key T
goldilocks.o goldilocks_verify T
goldilocks.o goldilocks_verify_precomputed T
magic.o SCALARMUL_FIXED_WINDOW_ADJUSTMENT R
magic.o curve_prime_order D
magic.o goldilocks_base_point R
magic.o sqrt_d_minus_1 R
p448.o p448_deserialize T
p448.o p448_is_zero T
p448.o p448_mul T
p448.o p448_mulw T
p448.o p448_serialize T
p448.o p448_sqr T
p448.o p448_strong_reduce T
p448.o simultaneous_invert_p448 T
scalarmul.o destroy_fixed_base T
scalarmul.o linear_combo_combs_vt T
scalarmul.o linear_combo_var_fixed_vt T
scalarmul.o montgomery_ladder T
scalarmul.o precompute_fixed_base T
scalarmul.o precompute_fixed_base_wnaf T
scalarmul.o scalarmul T
scalarmul.o scalarmul_fixed_base T
scalarmul.o scalarmul_fixed_base_wnaf_vt T
scalarmul.o scalarmul_vlook T
scalarmul.o scalarmul_vt T
sha512.o sha512_final T
sha512.o sha512_init T
sha512.o sha512_update T

Number of similar (compiler,implementation) pairs: 25, namely:
CompilerImplementations
clang -march=native -O2 -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE T:32
clang -march=native -O3 -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE T:32
clang -march=native -O -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE T:32
clang -march=native -Os -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE T:32
clang -mcpu=native -O3 -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE T:32
gcc -march=native -mtune=native -O2 -fomit-frame-pointer -fwrapv -fPIC -fPIE T:32
gcc -march=native -mtune=native -O3 -fomit-frame-pointer -fwrapv -fPIC -fPIE T:32
gcc -march=native -mtune=native -O -fomit-frame-pointer -fwrapv -fPIC -fPIE T:32
gcc -march=native -mtune=native -Os -fomit-frame-pointer -fwrapv -fPIC -fPIE T:32
clang -march=native -O2 -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE T:64
clang -march=native -O3 -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE T:64
clang -march=native -O -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE T:64
clang -march=native -Os -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE T:64
clang -mcpu=native -O3 -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE T:64
gcc -march=native -mtune=native -O2 -fomit-frame-pointer -fwrapv -fPIC -fPIE T:64
gcc -march=native -mtune=native -O3 -fomit-frame-pointer -fwrapv -fPIC -fPIE T:64
gcc -march=native -mtune=native -O -fomit-frame-pointer -fwrapv -fPIC -fPIE T:64
gcc -march=native -mtune=native -Os -fomit-frame-pointer -fwrapv -fPIC -fPIE T:64
clang -march=native -O2 -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE T:amd64
clang -march=native -O3 -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE T:amd64
clang -march=native -O -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE T:amd64
clang -march=native -Os -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE T:amd64
clang -mcpu=native -O3 -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE T:amd64
gcc -march=native -mtune=native -O3 -fomit-frame-pointer -fwrapv -fPIC -fPIE T:amd64
gcc -march=native -mtune=native -O -fomit-frame-pointer -fwrapv -fPIC -fPIE T:amd64