Implementation notes: amd64, renoir, crypto_scalarmult/kummer

Computer: renoir
Microarchitecture: amd64; Zen 2 (860f01)
Architecture: amd64
CPU ID: AuthenticAMD-00860f01-178bfbff
SUPERCOP version: 20240625
Operation: crypto_scalarmult
Primitive: kummer
TimeObject sizeTest sizeImplementationCompilerBenchmark dateSUPERCOP version
24197518187 0 040712 820 1752avx2intclang_-march=native_-O2_-fwrapv_-Qunused-arguments_-fPIC_-fPIE_-gdwarf-4_-Wall2024062620240625
24200220677 0 043336 820 1752avx2intclang_-march=native_-O3_-fwrapv_-Qunused-arguments_-fPIC_-fPIE_-gdwarf-4_-Wall2024062620240625
2496509681 0 032560 820 1752avxclang_-march=native_-O3_-fwrapv_-Qunused-arguments_-fPIC_-fPIE_-gdwarf-4_-Wall2024062620240625
2500059528 0 033152 788 1784avxgcc_-march=native_-mtune=native_-O3_-fwrapv_-fPIC_-fPIE_-gdwarf-4_-Wall2024062620240625
2501249341 0 030038 812 1720avxclang_-march=native_-O_-fwrapv_-Qunused-arguments_-fPIC_-fPIE_-gdwarf-4_-Wall2024062620240625
2505228995 0 031888 820 1752avx2clang_-march=native_-O3_-fwrapv_-Qunused-arguments_-fPIC_-fPIE_-gdwarf-4_-Wall2024062620240625
2506928842 0 032480 788 1784avx2gcc_-march=native_-mtune=native_-O3_-fwrapv_-fPIC_-fPIE_-gdwarf-4_-Wall2024062620240625
2507468995 0 031728 820 1752avx2clang_-march=native_-O2_-fwrapv_-Qunused-arguments_-fPIC_-fPIE_-gdwarf-4_-Wall2024062620240625
2512389229 0 029056 780 1752avxgcc_-march=native_-mtune=native_-Os_-fwrapv_-fPIC_-fPIE_-gdwarf-4_-Wall2024062620240625
2513178457 0 028910 812 1720avx2clang_-march=native_-Os_-fwrapv_-Qunused-arguments_-fPIC_-fPIE_-gdwarf-4_-Wall2024062620240625
2513608655 0 029366 812 1720avx2clang_-march=native_-O_-fwrapv_-Qunused-arguments_-fPIC_-fPIE_-gdwarf-4_-Wall2024062620240625
2514049681 0 032368 820 1752avxclang_-march=native_-O2_-fwrapv_-Qunused-arguments_-fPIC_-fPIE_-gdwarf-4_-Wall2024062620240625
2514298506 0 029464 788 1784avx2gcc_-march=native_-mtune=native_-O_-fwrapv_-fPIC_-fPIE_-gdwarf-4_-Wall2024062620240625
2515459143 0 029550 812 1720avxclang_-march=native_-Os_-fwrapv_-Qunused-arguments_-fPIC_-fPIE_-gdwarf-4_-Wall2024062620240625
2516009288 0 031216 788 1784avxgcc_-march=native_-mtune=native_-O2_-fwrapv_-fPIC_-fPIE_-gdwarf-4_-Wall2024062620240625
2517508543 0 028384 780 1752avx2gcc_-march=native_-mtune=native_-Os_-fwrapv_-fPIC_-fPIE_-gdwarf-4_-Wall2024062620240625
2518878602 0 030544 788 1784avx2gcc_-march=native_-mtune=native_-O2_-fwrapv_-fPIC_-fPIE_-gdwarf-4_-Wall2024062620240625
2521268891 0 031824 820 1720avx2clang_-mcpu=native_-O3_-fwrapv_-Qunused-arguments_-fPIC_-fPIE_-gdwarf-4_-Wall2024062620240625
2521639192 0 030136 788 1784avxgcc_-march=native_-mtune=native_-O_-fwrapv_-fPIC_-fPIE_-gdwarf-4_-Wall2024062620240625
2526549577 0 032496 820 1720avxclang_-mcpu=native_-O3_-fwrapv_-Qunused-arguments_-fPIC_-fPIE_-gdwarf-4_-Wall2024062620240625
2694988883 0 029864 788 1784avx2intgcc_-march=native_-mtune=native_-O_-fwrapv_-fPIC_-fPIE_-gdwarf-4_-Wall2024062620240625
2748179391 0 031368 788 1784avx2intgcc_-march=native_-mtune=native_-O2_-fwrapv_-fPIC_-fPIE_-gdwarf-4_-Wall2024062620240625
2770738551 0 028886 812 1720avx2intclang_-march=native_-Os_-fwrapv_-Qunused-arguments_-fPIC_-fPIE_-gdwarf-4_-Wall2024062620240625
2828506332 0 026990 812 1720avx2intclang_-march=native_-O_-fwrapv_-Qunused-arguments_-fPIC_-fPIE_-gdwarf-4_-Wall2024062620240625
2828806219 0 026072 780 1752avx2intgcc_-march=native_-mtune=native_-Os_-fwrapv_-fPIC_-fPIE_-gdwarf-4_-Wall2024062620240625
81103613969 0 036744 820 1752ref5uclang_-march=native_-O3_-fwrapv_-Qunused-arguments_-fPIC_-fPIE_-gdwarf-4_-Wall2024062620240625
82429812487 0 035120 820 1752ref5uclang_-march=native_-O2_-fwrapv_-Qunused-arguments_-fPIC_-fPIE_-gdwarf-4_-Wall2024062620240625
8864034753 0 026632 788 1784ref5ugcc_-march=native_-mtune=native_-O2_-fwrapv_-fPIC_-fPIE_-gdwarf-4_-Wall2024062620240625
8901564363 0 024694 812 1720ref5uclang_-march=native_-Os_-fwrapv_-Qunused-arguments_-fPIC_-fPIE_-gdwarf-4_-Wall2024062620240625
8954167573 0 031080 788 1784ref5gcc_-march=native_-mtune=native_-O3_-fwrapv_-fPIC_-fPIE_-gdwarf-4_-Wall2024062620240625
9089324453 0 026304 788 1784ref5gcc_-march=native_-mtune=native_-O2_-fwrapv_-fPIC_-fPIE_-gdwarf-4_-Wall2024062620240625
94167414046 0 036840 820 1720ref5clang_-mcpu=native_-O3_-fwrapv_-Qunused-arguments_-fPIC_-fPIE_-gdwarf-4_-Wall2024062620240625
95237314298 0 037136 820 1720ref5uclang_-mcpu=native_-O3_-fwrapv_-Qunused-arguments_-fPIC_-fPIE_-gdwarf-4_-Wall2024062620240625
9604607937 0 031528 788 1784ref5ugcc_-march=native_-mtune=native_-O3_-fwrapv_-fPIC_-fPIE_-gdwarf-4_-Wall2024062620240625
98023614197 0 036936 820 1752ref5clang_-march=native_-O3_-fwrapv_-Qunused-arguments_-fPIC_-fPIE_-gdwarf-4_-Wall2024062620240625
9850334424 0 024184 780 1752ref5ugcc_-march=native_-mtune=native_-Os_-fwrapv_-fPIC_-fPIE_-gdwarf-4_-Wall2024062620240625
98508512533 0 035200 820 1752ref5clang_-march=native_-O2_-fwrapv_-Qunused-arguments_-fPIC_-fPIE_-gdwarf-4_-Wall2024062620240625
9914484155 0 023928 780 1752ref5gcc_-march=native_-mtune=native_-Os_-fwrapv_-fPIC_-fPIE_-gdwarf-4_-Wall2024062620240625
9914714449 0 025344 788 1784ref5gcc_-march=native_-mtune=native_-O_-fwrapv_-fPIC_-fPIE_-gdwarf-4_-Wall2024062620240625
9966494440 0 024758 812 1720ref5clang_-march=native_-Os_-fwrapv_-Qunused-arguments_-fPIC_-fPIE_-gdwarf-4_-Wall2024062620240625
10041794765 0 025672 788 1784ref5ugcc_-march=native_-mtune=native_-O_-fwrapv_-fPIC_-fPIE_-gdwarf-4_-Wall2024062620240625
10226554790 0 025414 812 1720ref5uclang_-march=native_-O_-fwrapv_-Qunused-arguments_-fPIC_-fPIE_-gdwarf-4_-Wall2024062620240625
10295974524 0 025118 812 1720ref5clang_-march=native_-O_-fwrapv_-Qunused-arguments_-fPIC_-fPIE_-gdwarf-4_-Wall2024062620240625

Test failure


error 111
crypto_scalarmult not associative

Number of similar (implementation,compiler) pairs: 1, namely:
ImplementationCompiler
avx2intgcc -march=native -mtune=native -O3 -fwrapv -fPIC -fPIE -gdwarf-4 -Wall (10.2.1_20210110)

Compiler output


gfe.c: gfe.c: In function 'fromdouble':
gfe.c: gfe.c:71:11: warning: dereferencing type-punned pointer will break strict-aliasing rules [-Wstrict-aliasing]
gfe.c:    71 |   return *(unsigned long long *) &d - 0x4338000000000000;
gfe.c:       |           ^~~~~~~~~~~~~~~~~~~~~~~~~
gfe.c: gfe.c: In function 'todouble':
gfe.c: gfe.c:77:11: warning: dereferencing type-punned pointer will break strict-aliasing rules [-Wstrict-aliasing]
gfe.c:    77 |   return *(double *) &l - 6755399441055744.0;
gfe.c:       |           ^~~~~~~~~~~~~

Number of similar (implementation,compiler) pairs: 6, namely:
ImplementationCompiler
avxgcc -march=native -mtune=native -O2 -fwrapv -fPIC -fPIE -gdwarf-4 -Wall (10.2.1_20210110)
avxgcc -march=native -mtune=native -O3 -fwrapv -fPIC -fPIE -gdwarf-4 -Wall (10.2.1_20210110)
avxgcc -march=native -mtune=native -Os -fwrapv -fPIC -fPIE -gdwarf-4 -Wall (10.2.1_20210110)
avx2gcc -march=native -mtune=native -O2 -fwrapv -fPIC -fPIE -gdwarf-4 -Wall (10.2.1_20210110)
avx2gcc -march=native -mtune=native -O3 -fwrapv -fPIC -fPIE -gdwarf-4 -Wall (10.2.1_20210110)
avx2gcc -march=native -mtune=native -Os -fwrapv -fPIC -fPIE -gdwarf-4 -Wall (10.2.1_20210110)

Compiler output


smult.c: smult.c:36:8: error: always_inline function '_mm256_mul_epi32' requires target feature 'avx2', but would be inlined into function 'crypto_scalarmult_kummer_avx2int_constbranchindex_gfe4x_mulconst' that is compiled without support for 'avx2'
smult.c:   t0 = _mm256_mul_epi32(a->v[0],*b);
smult.c:        ^
smult.c: smult.c:36:8: error: AVX vector argument of type '__m256i' (vector of 4 'long long' values) without 'avx' enabled changes the ABI
smult.c: smult.c:37:8: error: always_inline function '_mm256_mul_epi32' requires target feature 'avx2', but would be inlined into function 'crypto_scalarmult_kummer_avx2int_constbranchindex_gfe4x_mulconst' that is compiled without support for 'avx2'
smult.c:   t1 = _mm256_mul_epi32(a->v[1],*b);
smult.c:        ^
smult.c: smult.c:37:8: error: AVX vector argument of type '__m256i' (vector of 4 'long long' values) without 'avx' enabled changes the ABI
smult.c: smult.c:38:30: error: always_inline function '_mm256_srli_epi64' requires target feature 'avx2', but would be inlined into function 'crypto_scalarmult_kummer_avx2int_constbranchindex_gfe4x_mulconst' that is compiled without support for 'avx2'
smult.c:     t1 = _mm256_add_epi64(t1,_mm256_srli_epi64(t0,26)); t0 &= mask26;
smult.c:                              ^
smult.c: smult.c:38:30: error: AVX vector argument of type '__m256i' (vector of 4 'long long' values) without 'avx' enabled changes the ABI
smult.c: smult.c:38:10: error: always_inline function '_mm256_add_epi64' requires target feature 'avx2', but would be inlined into function 'crypto_scalarmult_kummer_avx2int_constbranchindex_gfe4x_mulconst' that is compiled without support for 'avx2'
smult.c:     t1 = _mm256_add_epi64(t1,_mm256_srli_epi64(t0,26)); t0 &= mask26;
smult.c:          ^
smult.c: smult.c:38:10: error: AVX vector argument of type '__m256i' (vector of 4 'long long' values) without 'avx' enabled changes the ABI
smult.c: smult.c:39:8: error: always_inline function '_mm256_mul_epi32' requires target feature 'avx2', but would be inlined into function 'crypto_scalarmult_kummer_avx2int_constbranchindex_gfe4x_mulconst' that is compiled without support for 'avx2'
smult.c:   t2 = _mm256_mul_epi32(a->v[2],*b);
smult.c:        ^
smult.c: smult.c:39:8: error: AVX vector argument of type '__m256i' (vector of 4 'long long' values) without 'avx' enabled changes the ABI
smult.c: smult.c:40:8: error: always_inline function '_mm256_mul_epi32' requires target feature 'avx2', but would be inlined into function 'crypto_scalarmult_kummer_avx2int_constbranchindex_gfe4x_mulconst' that is compiled without support for 'avx2'
smult.c:   t3 = _mm256_mul_epi32(a->v[3],*b);
smult.c:        ^
smult.c: smult.c:40:8: error: AVX vector argument of type '__m256i' (vector of 4 'long long' values) without 'avx' enabled changes the ABI
smult.c: smult.c:41:8: error: always_inline function '_mm256_mul_epi32' requires target feature 'avx2', but would be inlined into function 'crypto_scalarmult_kummer_avx2int_constbranchindex_gfe4x_mulconst' that is compiled without support for 'avx2'
smult.c: ...

Number of similar (implementation,compiler) pairs: 1, namely:
ImplementationCompiler
avx2intclang -mcpu=native -O3 -fwrapv -Qunused-arguments -fPIC -fPIE -gdwarf-4 -Wall (Debian_Clang_11.0.1)

Compiler output


smult.c: smult.c: In function 'crypto_scalarmult_kummer_avx2int_constbranchindex_gfe4x_from_gfe':
smult.c: smult.c:14:6: warning: dereferencing type-punned pointer will break strict-aliasing rules [-Wstrict-aliasing]
smult.c:    14 |     0[(crypto_uint64 *) &y->v[i]] = x[0].v[i];
smult.c:       |      ^
smult.c: smult.c: In function 'crypto_scalarmult_kummer_avx2int_constbranchindex_gfe4x_to_gfe':
smult.c: smult.c:26:18: warning: dereferencing type-punned pointer will break strict-aliasing rules [-Wstrict-aliasing]
smult.c:    26 |     x[0].v[i] = 0[(crypto_uint64 *) &y->v[i]];
smult.c:       |                  ^
smult.c: smult.c: In function 'crypto_scalarmult_kummer_avx2int_constbranchindex':
smult.c: smult.c:162:6: warning: 'input.v[0]' is used uninitialized in this function [-Wuninitialized]
smult.c:   162 |   a0 = a->v[0];
smult.c:       |   ~~~^~~~~~~~~
smult.c: smult.c:174:6: warning: 'input.v[1]' is used uninitialized in this function [-Wuninitialized]
smult.c:   174 |   a1 = a->v[1];
smult.c:       |   ~~~^~~~~~~~~
smult.c: smult.c:187:6: warning: 'input.v[2]' is used uninitialized in this function [-Wuninitialized]
smult.c:   187 |   a2 = a->v[2];
smult.c:       |   ~~~^~~~~~~~~
smult.c: smult.c:200:6: warning: 'input.v[3]' is used uninitialized in this function [-Wuninitialized]
smult.c:   200 |   a3 = a->v[3];
smult.c:       |   ~~~^~~~~~~~~
smult.c: smult.c:213:6: warning: 'input.v[4]' is used uninitialized in this function [-Wuninitialized]
smult.c:   213 |   a4 = a->v[4];
smult.c:       |   ~~~^~~~~~~~~

Number of similar (implementation,compiler) pairs: 1, namely:
ImplementationCompiler
avx2intgcc -march=native -mtune=native -O2 -fwrapv -fPIC -fPIE -gdwarf-4 -Wall (10.2.1_20210110)

Compiler output


smult.c: smult.c: In function 'crypto_scalarmult_kummer_avx2int_constbranchindex_gfe4x_from_gfe':
smult.c: smult.c:14:6: warning: dereferencing type-punned pointer will break strict-aliasing rules [-Wstrict-aliasing]
smult.c:    14 |     0[(crypto_uint64 *) &y->v[i]] = x[0].v[i];
smult.c:       |      ^
smult.c: smult.c: In function 'crypto_scalarmult_kummer_avx2int_constbranchindex_gfe4x_to_gfe':
smult.c: smult.c:26:18: warning: dereferencing type-punned pointer will break strict-aliasing rules [-Wstrict-aliasing]
smult.c:    26 |     x[0].v[i] = 0[(crypto_uint64 *) &y->v[i]];
smult.c:       |                  ^

Number of similar (implementation,compiler) pairs: 2, namely:
ImplementationCompiler
avx2intgcc -march=native -mtune=native -O3 -fwrapv -fPIC -fPIE -gdwarf-4 -Wall (10.2.1_20210110)
avx2intgcc -march=native -mtune=native -Os -fwrapv -fPIC -fPIE -gdwarf-4 -Wall (10.2.1_20210110)

Passed TIMECOP


TIMECOP iterations: 10

Number of similar (implementation,compiler) pairs: 43, namely:
ImplementationCompiler
avxclang -march=native -O2 -fwrapv -Qunused-arguments -fPIC -fPIE -gdwarf-4 -Wall (Debian_Clang_11.0.1)
avxclang -march=native -O3 -fwrapv -Qunused-arguments -fPIC -fPIE -gdwarf-4 -Wall (Debian_Clang_11.0.1)
avxclang -march=native -O -fwrapv -Qunused-arguments -fPIC -fPIE -gdwarf-4 -Wall (Debian_Clang_11.0.1)
avxclang -march=native -Os -fwrapv -Qunused-arguments -fPIC -fPIE -gdwarf-4 -Wall (Debian_Clang_11.0.1)
avxclang -mcpu=native -O3 -fwrapv -Qunused-arguments -fPIC -fPIE -gdwarf-4 -Wall (Debian_Clang_11.0.1)
avxgcc -march=native -mtune=native -O2 -fwrapv -fPIC -fPIE -gdwarf-4 -Wall (10.2.1_20210110)
avxgcc -march=native -mtune=native -O3 -fwrapv -fPIC -fPIE -gdwarf-4 -Wall (10.2.1_20210110)
avxgcc -march=native -mtune=native -O -fwrapv -fPIC -fPIE -gdwarf-4 -Wall (10.2.1_20210110)
avxgcc -march=native -mtune=native -Os -fwrapv -fPIC -fPIE -gdwarf-4 -Wall (10.2.1_20210110)
avx2clang -march=native -O2 -fwrapv -Qunused-arguments -fPIC -fPIE -gdwarf-4 -Wall (Debian_Clang_11.0.1)
avx2clang -march=native -O3 -fwrapv -Qunused-arguments -fPIC -fPIE -gdwarf-4 -Wall (Debian_Clang_11.0.1)
avx2clang -march=native -O -fwrapv -Qunused-arguments -fPIC -fPIE -gdwarf-4 -Wall (Debian_Clang_11.0.1)
avx2clang -march=native -Os -fwrapv -Qunused-arguments -fPIC -fPIE -gdwarf-4 -Wall (Debian_Clang_11.0.1)
avx2clang -mcpu=native -O3 -fwrapv -Qunused-arguments -fPIC -fPIE -gdwarf-4 -Wall (Debian_Clang_11.0.1)
avx2gcc -march=native -mtune=native -O2 -fwrapv -fPIC -fPIE -gdwarf-4 -Wall (10.2.1_20210110)
avx2gcc -march=native -mtune=native -O3 -fwrapv -fPIC -fPIE -gdwarf-4 -Wall (10.2.1_20210110)
avx2gcc -march=native -mtune=native -O -fwrapv -fPIC -fPIE -gdwarf-4 -Wall (10.2.1_20210110)
avx2gcc -march=native -mtune=native -Os -fwrapv -fPIC -fPIE -gdwarf-4 -Wall (10.2.1_20210110)
avx2intclang -march=native -O2 -fwrapv -Qunused-arguments -fPIC -fPIE -gdwarf-4 -Wall (Debian_Clang_11.0.1)
avx2intclang -march=native -O3 -fwrapv -Qunused-arguments -fPIC -fPIE -gdwarf-4 -Wall (Debian_Clang_11.0.1)
avx2intclang -march=native -O -fwrapv -Qunused-arguments -fPIC -fPIE -gdwarf-4 -Wall (Debian_Clang_11.0.1)
avx2intclang -march=native -Os -fwrapv -Qunused-arguments -fPIC -fPIE -gdwarf-4 -Wall (Debian_Clang_11.0.1)
avx2intgcc -march=native -mtune=native -O2 -fwrapv -fPIC -fPIE -gdwarf-4 -Wall (10.2.1_20210110)
avx2intgcc -march=native -mtune=native -O -fwrapv -fPIC -fPIE -gdwarf-4 -Wall (10.2.1_20210110)
avx2intgcc -march=native -mtune=native -Os -fwrapv -fPIC -fPIE -gdwarf-4 -Wall (10.2.1_20210110)
ref5clang -march=native -O2 -fwrapv -Qunused-arguments -fPIC -fPIE -gdwarf-4 -Wall (Debian_Clang_11.0.1)
ref5clang -march=native -O3 -fwrapv -Qunused-arguments -fPIC -fPIE -gdwarf-4 -Wall (Debian_Clang_11.0.1)
ref5clang -march=native -O -fwrapv -Qunused-arguments -fPIC -fPIE -gdwarf-4 -Wall (Debian_Clang_11.0.1)
ref5clang -march=native -Os -fwrapv -Qunused-arguments -fPIC -fPIE -gdwarf-4 -Wall (Debian_Clang_11.0.1)
ref5clang -mcpu=native -O3 -fwrapv -Qunused-arguments -fPIC -fPIE -gdwarf-4 -Wall (Debian_Clang_11.0.1)
ref5gcc -march=native -mtune=native -O2 -fwrapv -fPIC -fPIE -gdwarf-4 -Wall (10.2.1_20210110)
ref5gcc -march=native -mtune=native -O3 -fwrapv -fPIC -fPIE -gdwarf-4 -Wall (10.2.1_20210110)
ref5gcc -march=native -mtune=native -O -fwrapv -fPIC -fPIE -gdwarf-4 -Wall (10.2.1_20210110)
ref5gcc -march=native -mtune=native -Os -fwrapv -fPIC -fPIE -gdwarf-4 -Wall (10.2.1_20210110)
ref5uclang -march=native -O2 -fwrapv -Qunused-arguments -fPIC -fPIE -gdwarf-4 -Wall (Debian_Clang_11.0.1)
ref5uclang -march=native -O3 -fwrapv -Qunused-arguments -fPIC -fPIE -gdwarf-4 -Wall (Debian_Clang_11.0.1)
ref5uclang -march=native -O -fwrapv -Qunused-arguments -fPIC -fPIE -gdwarf-4 -Wall (Debian_Clang_11.0.1)
ref5uclang -march=native -Os -fwrapv -Qunused-arguments -fPIC -fPIE -gdwarf-4 -Wall (Debian_Clang_11.0.1)
ref5uclang -mcpu=native -O3 -fwrapv -Qunused-arguments -fPIC -fPIE -gdwarf-4 -Wall (Debian_Clang_11.0.1)
ref5ugcc -march=native -mtune=native -O2 -fwrapv -fPIC -fPIE -gdwarf-4 -Wall (10.2.1_20210110)
ref5ugcc -march=native -mtune=native -O3 -fwrapv -fPIC -fPIE -gdwarf-4 -Wall (10.2.1_20210110)
ref5ugcc -march=native -mtune=native -O -fwrapv -fPIC -fPIE -gdwarf-4 -Wall (10.2.1_20210110)
ref5ugcc -march=native -mtune=native -Os -fwrapv -fPIC -fPIE -gdwarf-4 -Wall (10.2.1_20210110)