Implementation notes: amd64, shoe, crypto_scalarmult/kummer

Computer: shoe
Microarchitecture: amd64; Broadwell+AES (306d4)
Architecture: amd64
CPU ID: GenuineIntel-000306d4-bfebfbff
SUPERCOP version: 20240716
Operation: crypto_scalarmult
Primitive: kummer
TimeObject sizeTest sizeImplementationCompilerBenchmark dateSUPERCOP version
2673448436 0 029486 812 1752avx2intclang_-march=native_-Os_-fwrapv_-Qunused-arguments_-fPIC_-fPIE_-gdwarf-4_-Wall2024072920240716
26948820403 0 044760 820 1784avx2intclang_-march=native_-O3_-fwrapv_-Qunused-arguments_-fPIC_-fPIE_-gdwarf-4_-Wall2024072920240716
26968918136 0 042376 820 1784avx2intclang_-march=native_-O2_-fwrapv_-Qunused-arguments_-fPIC_-fPIE_-gdwarf-4_-Wall2024072920240716
2838289360 0 031840 788 1816avx2intgcc_-march=native_-mtune=native_-O2_-fwrapv_-fPIC_-fPIE_-gdwarf-4_-Wall2024072920240716
2910156216 0 026800 780 1784avx2intgcc_-march=native_-mtune=native_-Os_-fwrapv_-fPIC_-fPIE_-gdwarf-4_-Wall2024072920240716
2915786350 0 027774 812 1752avx2intclang_-march=native_-O_-fwrapv_-Qunused-arguments_-fPIC_-fPIE_-gdwarf-4_-Wall2024072920240716
2926708885 0 030624 788 1816avx2intgcc_-march=native_-mtune=native_-O_-fwrapv_-fPIC_-fPIE_-gdwarf-4_-Wall2024072920240716
2991958995 0 033424 820 1784avx2clang_-march=native_-O2_-fwrapv_-Qunused-arguments_-fPIC_-fPIE_-gdwarf-4_-Wall2024063020240625
2997208995 0 033520 820 1784avx2clang_-march=native_-O3_-fwrapv_-Qunused-arguments_-fPIC_-fPIE_-gdwarf-4_-Wall2024063020240625
2998308823 0 033080 788 1816avx2gcc_-march=native_-mtune=native_-O3_-fwrapv_-fPIC_-fPIE_-gdwarf-4_-Wall2024063020240625
2999998891 0 032592 820 1752avx2clang_-mcpu=native_-O3_-fwrapv_-Qunused-arguments_-fPIC_-fPIE_-gdwarf-4_-Wall2024063020240625
3004008503 0 030192 788 1816avx2gcc_-march=native_-mtune=native_-O_-fwrapv_-fPIC_-fPIE_-gdwarf-4_-Wall2024063020240625
3004908599 0 031080 788 1816avx2gcc_-march=native_-mtune=native_-O2_-fwrapv_-fPIC_-fPIE_-gdwarf-4_-Wall2024063020240625
3006018457 0 029606 812 1752avx2clang_-march=native_-Os_-fwrapv_-Qunused-arguments_-fPIC_-fPIE_-gdwarf-4_-Wall2024063020240625
3008128526 0 029080 780 1784avx2gcc_-march=native_-mtune=native_-Os_-fwrapv_-fPIC_-fPIE_-gdwarf-4_-Wall2024063020240625
3009468655 0 030102 812 1752avx2clang_-march=native_-O_-fwrapv_-Qunused-arguments_-fPIC_-fPIE_-gdwarf-4_-Wall2024063020240625
3677139681 0 034064 820 1784avxclang_-march=native_-O2_-fwrapv_-Qunused-arguments_-fPIC_-fPIE_-gdwarf-4_-Wall2024063020240625
3683989681 0 034192 820 1784avxclang_-march=native_-O3_-fwrapv_-Qunused-arguments_-fPIC_-fPIE_-gdwarf-4_-Wall2024063020240625
3687629509 0 033752 788 1816avxgcc_-march=native_-mtune=native_-O3_-fwrapv_-fPIC_-fPIE_-gdwarf-4_-Wall2024063020240625
3692029143 0 030246 812 1752avxclang_-march=native_-Os_-fwrapv_-Qunused-arguments_-fPIC_-fPIE_-gdwarf-4_-Wall2024063020240625
3692889577 0 033264 820 1752avxclang_-mcpu=native_-O3_-fwrapv_-Qunused-arguments_-fPIC_-fPIE_-gdwarf-4_-Wall2024063020240625
3694279189 0 030864 788 1816avxgcc_-march=native_-mtune=native_-O_-fwrapv_-fPIC_-fPIE_-gdwarf-4_-Wall2024063020240625
3696009285 0 031752 788 1816avxgcc_-march=native_-mtune=native_-O2_-fwrapv_-fPIC_-fPIE_-gdwarf-4_-Wall2024063020240625
3698609212 0 029752 780 1784avxgcc_-march=native_-mtune=native_-Os_-fwrapv_-fPIC_-fPIE_-gdwarf-4_-Wall2024063020240625
3701749341 0 030774 812 1752avxclang_-march=native_-O_-fwrapv_-Qunused-arguments_-fPIC_-fPIE_-gdwarf-4_-Wall2024063020240625
86752613612 0 038064 820 1784ref5uclang_-march=native_-O3_-fwrapv_-Qunused-arguments_-fPIC_-fPIE_-gdwarf-4_-Wall2024072920240716
91338612287 0 036632 820 1784ref5uclang_-march=native_-O2_-fwrapv_-Qunused-arguments_-fPIC_-fPIE_-gdwarf-4_-Wall2024072920240716
9144366584 0 030768 788 1816ref5gcc_-march=native_-mtune=native_-O3_-fwrapv_-fPIC_-fPIE_-gdwarf-4_-Wall2024072920240716
9283714318 0 026712 788 1816ref5gcc_-march=native_-mtune=native_-O2_-fwrapv_-fPIC_-fPIE_-gdwarf-4_-Wall2024072920240716
94843813784 0 038208 820 1784ref5clang_-march=native_-O3_-fwrapv_-Qunused-arguments_-fPIC_-fPIE_-gdwarf-4_-Wall2024072920240716
94870414046 0 037616 820 1752ref5clang_-mcpu=native_-O3_-fwrapv_-Qunused-arguments_-fPIC_-fPIE_-gdwarf-4_-Wall2024072920240716
9718634245 0 025326 812 1752ref5uclang_-march=native_-Os_-fwrapv_-Qunused-arguments_-fPIC_-fPIE_-gdwarf-4_-Wall2024072920240716
97943214298 0 037872 820 1752ref5uclang_-mcpu=native_-O3_-fwrapv_-Qunused-arguments_-fPIC_-fPIE_-gdwarf-4_-Wall2024072920240716
98757312349 0 036696 820 1784ref5clang_-march=native_-O2_-fwrapv_-Qunused-arguments_-fPIC_-fPIE_-gdwarf-4_-Wall2024072920240716
10044246908 0 031104 788 1816ref5ugcc_-march=native_-mtune=native_-O3_-fwrapv_-fPIC_-fPIE_-gdwarf-4_-Wall2024072920240716
10115584345 0 025438 812 1752ref5clang_-march=native_-Os_-fwrapv_-Qunused-arguments_-fPIC_-fPIE_-gdwarf-4_-Wall2024072920240716
10162704615 0 027032 788 1816ref5ugcc_-march=native_-mtune=native_-O2_-fwrapv_-fPIC_-fPIE_-gdwarf-4_-Wall2024072920240716
10530644147 0 024624 780 1784ref5gcc_-march=native_-mtune=native_-Os_-fwrapv_-fPIC_-fPIE_-gdwarf-4_-Wall2024072920240716
10551494402 0 026040 788 1816ref5gcc_-march=native_-mtune=native_-O_-fwrapv_-fPIC_-fPIE_-gdwarf-4_-Wall2024072920240716
10750544571 0 025942 812 1752ref5clang_-march=native_-O_-fwrapv_-Qunused-arguments_-fPIC_-fPIE_-gdwarf-4_-Wall2024072920240716
10992914413 0 024920 780 1784ref5ugcc_-march=native_-mtune=native_-Os_-fwrapv_-fPIC_-fPIE_-gdwarf-4_-Wall2024072920240716
11189904732 0 026392 788 1816ref5ugcc_-march=native_-mtune=native_-O_-fwrapv_-fPIC_-fPIE_-gdwarf-4_-Wall2024072920240716
11470404821 0 026198 812 1752ref5uclang_-march=native_-O_-fwrapv_-Qunused-arguments_-fPIC_-fPIE_-gdwarf-4_-Wall2024072920240716

Test failure


error 111
crypto_scalarmult not associative

Number of similar (implementation,compiler) pairs: 1, namely:
ImplementationCompiler
avx2intgcc -march=native -mtune=native -O3 -fwrapv -fPIC -fPIE -gdwarf-4 -Wall (10.2.1_20210110)

Compiler output


gfe.c: gfe.c: In function 'fromdouble':
gfe.c: gfe.c:71:11: warning: dereferencing type-punned pointer will break strict-aliasing rules [-Wstrict-aliasing]
gfe.c:    71 |   return *(unsigned long long *) &d - 0x4338000000000000;
gfe.c:       |           ^~~~~~~~~~~~~~~~~~~~~~~~~
gfe.c: gfe.c: In function 'todouble':
gfe.c: gfe.c:77:11: warning: dereferencing type-punned pointer will break strict-aliasing rules [-Wstrict-aliasing]
gfe.c:    77 |   return *(double *) &l - 6755399441055744.0;
gfe.c:       |           ^~~~~~~~~~~~~

Number of similar (implementation,compiler) pairs: 6, namely:
ImplementationCompiler
avxgcc -march=native -mtune=native -O2 -fwrapv -fPIC -fPIE -gdwarf-4 -Wall (10.2.1_20210110)
avxgcc -march=native -mtune=native -O3 -fwrapv -fPIC -fPIE -gdwarf-4 -Wall (10.2.1_20210110)
avxgcc -march=native -mtune=native -Os -fwrapv -fPIC -fPIE -gdwarf-4 -Wall (10.2.1_20210110)
avx2gcc -march=native -mtune=native -O2 -fwrapv -fPIC -fPIE -gdwarf-4 -Wall (10.2.1_20210110)
avx2gcc -march=native -mtune=native -O3 -fwrapv -fPIC -fPIE -gdwarf-4 -Wall (10.2.1_20210110)
avx2gcc -march=native -mtune=native -Os -fwrapv -fPIC -fPIE -gdwarf-4 -Wall (10.2.1_20210110)

Compiler output


smult.c: smult.c:36:8: error: always_inline function '_mm256_mul_epi32' requires target feature 'avx2', but would be inlined into function 'crypto_scalarmult_kummer_avx2int_constbranchindex_gfe4x_mulconst' that is compiled without support for 'avx2'
smult.c:   t0 = _mm256_mul_epi32(a->v[0],*b);
smult.c:        ^
smult.c: smult.c:36:8: error: AVX vector argument of type '__m256i' (vector of 4 'long long' values) without 'avx' enabled changes the ABI
smult.c: smult.c:37:8: error: always_inline function '_mm256_mul_epi32' requires target feature 'avx2', but would be inlined into function 'crypto_scalarmult_kummer_avx2int_constbranchindex_gfe4x_mulconst' that is compiled without support for 'avx2'
smult.c:   t1 = _mm256_mul_epi32(a->v[1],*b);
smult.c:        ^
smult.c: smult.c:37:8: error: AVX vector argument of type '__m256i' (vector of 4 'long long' values) without 'avx' enabled changes the ABI
smult.c: smult.c:38:30: error: always_inline function '_mm256_srli_epi64' requires target feature 'avx2', but would be inlined into function 'crypto_scalarmult_kummer_avx2int_constbranchindex_gfe4x_mulconst' that is compiled without support for 'avx2'
smult.c:     t1 = _mm256_add_epi64(t1,_mm256_srli_epi64(t0,26)); t0 &= mask26;
smult.c:                              ^
smult.c: smult.c:38:30: error: AVX vector argument of type '__m256i' (vector of 4 'long long' values) without 'avx' enabled changes the ABI
smult.c: smult.c:38:10: error: always_inline function '_mm256_add_epi64' requires target feature 'avx2', but would be inlined into function 'crypto_scalarmult_kummer_avx2int_constbranchindex_gfe4x_mulconst' that is compiled without support for 'avx2'
smult.c:     t1 = _mm256_add_epi64(t1,_mm256_srli_epi64(t0,26)); t0 &= mask26;
smult.c:          ^
smult.c: smult.c:38:10: error: AVX vector argument of type '__m256i' (vector of 4 'long long' values) without 'avx' enabled changes the ABI
smult.c: smult.c:39:8: error: always_inline function '_mm256_mul_epi32' requires target feature 'avx2', but would be inlined into function 'crypto_scalarmult_kummer_avx2int_constbranchindex_gfe4x_mulconst' that is compiled without support for 'avx2'
smult.c:   t2 = _mm256_mul_epi32(a->v[2],*b);
smult.c:        ^
smult.c: smult.c:39:8: error: AVX vector argument of type '__m256i' (vector of 4 'long long' values) without 'avx' enabled changes the ABI
smult.c: smult.c:40:8: error: always_inline function '_mm256_mul_epi32' requires target feature 'avx2', but would be inlined into function 'crypto_scalarmult_kummer_avx2int_constbranchindex_gfe4x_mulconst' that is compiled without support for 'avx2'
smult.c:   t3 = _mm256_mul_epi32(a->v[3],*b);
smult.c:        ^
smult.c: smult.c:40:8: error: AVX vector argument of type '__m256i' (vector of 4 'long long' values) without 'avx' enabled changes the ABI
smult.c: smult.c:41:8: error: always_inline function '_mm256_mul_epi32' requires target feature 'avx2', but would be inlined into function 'crypto_scalarmult_kummer_avx2int_constbranchindex_gfe4x_mulconst' that is compiled without support for 'avx2'
smult.c: ...

Number of similar (implementation,compiler) pairs: 1, namely:
ImplementationCompiler
avx2intclang -mcpu=native -O3 -fwrapv -Qunused-arguments -fPIC -fPIE -gdwarf-4 -Wall (Debian_Clang_11.0.1)

Compiler output


smult.c: smult.c: In function 'crypto_scalarmult_kummer_avx2int_constbranchindex_gfe4x_from_gfe':
smult.c: smult.c:14:6: warning: dereferencing type-punned pointer will break strict-aliasing rules [-Wstrict-aliasing]
smult.c:    14 |     0[(crypto_uint64 *) &y->v[i]] = x[0].v[i];
smult.c:       |      ^
smult.c: smult.c: In function 'crypto_scalarmult_kummer_avx2int_constbranchindex_gfe4x_to_gfe':
smult.c: smult.c:26:18: warning: dereferencing type-punned pointer will break strict-aliasing rules [-Wstrict-aliasing]
smult.c:    26 |     x[0].v[i] = 0[(crypto_uint64 *) &y->v[i]];
smult.c:       |                  ^
smult.c: smult.c: In function 'crypto_scalarmult_kummer_avx2int_constbranchindex':
smult.c: smult.c:162:6: warning: 'input.v[0]' is used uninitialized in this function [-Wuninitialized]
smult.c:   162 |   a0 = a->v[0];
smult.c:       |   ~~~^~~~~~~~~
smult.c: smult.c:174:6: warning: 'input.v[1]' is used uninitialized in this function [-Wuninitialized]
smult.c:   174 |   a1 = a->v[1];
smult.c:       |   ~~~^~~~~~~~~
smult.c: smult.c:187:6: warning: 'input.v[2]' is used uninitialized in this function [-Wuninitialized]
smult.c:   187 |   a2 = a->v[2];
smult.c:       |   ~~~^~~~~~~~~
smult.c: smult.c:200:6: warning: 'input.v[3]' is used uninitialized in this function [-Wuninitialized]
smult.c:   200 |   a3 = a->v[3];
smult.c:       |   ~~~^~~~~~~~~
smult.c: smult.c:213:6: warning: 'input.v[4]' is used uninitialized in this function [-Wuninitialized]
smult.c:   213 |   a4 = a->v[4];
smult.c:       |   ~~~^~~~~~~~~

Number of similar (implementation,compiler) pairs: 1, namely:
ImplementationCompiler
avx2intgcc -march=native -mtune=native -O2 -fwrapv -fPIC -fPIE -gdwarf-4 -Wall (10.2.1_20210110)

Compiler output


smult.c: smult.c: In function 'crypto_scalarmult_kummer_avx2int_constbranchindex_gfe4x_from_gfe':
smult.c: smult.c:14:6: warning: dereferencing type-punned pointer will break strict-aliasing rules [-Wstrict-aliasing]
smult.c:    14 |     0[(crypto_uint64 *) &y->v[i]] = x[0].v[i];
smult.c:       |      ^
smult.c: smult.c: In function 'crypto_scalarmult_kummer_avx2int_constbranchindex_gfe4x_to_gfe':
smult.c: smult.c:26:18: warning: dereferencing type-punned pointer will break strict-aliasing rules [-Wstrict-aliasing]
smult.c:    26 |     x[0].v[i] = 0[(crypto_uint64 *) &y->v[i]];
smult.c:       |                  ^

Number of similar (implementation,compiler) pairs: 2, namely:
ImplementationCompiler
avx2intgcc -march=native -mtune=native -O3 -fwrapv -fPIC -fPIE -gdwarf-4 -Wall (10.2.1_20210110)
avx2intgcc -march=native -mtune=native -Os -fwrapv -fPIC -fPIE -gdwarf-4 -Wall (10.2.1_20210110)

Passed TIMECOP


TIMECOP iterations: 10

Number of similar (implementation,compiler) pairs: 43, namely:
ImplementationCompiler
avxclang -march=native -O2 -fwrapv -Qunused-arguments -fPIC -fPIE -gdwarf-4 -Wall (Debian_Clang_11.0.1)
avxclang -march=native -O3 -fwrapv -Qunused-arguments -fPIC -fPIE -gdwarf-4 -Wall (Debian_Clang_11.0.1)
avxclang -march=native -O -fwrapv -Qunused-arguments -fPIC -fPIE -gdwarf-4 -Wall (Debian_Clang_11.0.1)
avxclang -march=native -Os -fwrapv -Qunused-arguments -fPIC -fPIE -gdwarf-4 -Wall (Debian_Clang_11.0.1)
avxclang -mcpu=native -O3 -fwrapv -Qunused-arguments -fPIC -fPIE -gdwarf-4 -Wall (Debian_Clang_11.0.1)
avxgcc -march=native -mtune=native -O2 -fwrapv -fPIC -fPIE -gdwarf-4 -Wall (10.2.1_20210110)
avxgcc -march=native -mtune=native -O3 -fwrapv -fPIC -fPIE -gdwarf-4 -Wall (10.2.1_20210110)
avxgcc -march=native -mtune=native -O -fwrapv -fPIC -fPIE -gdwarf-4 -Wall (10.2.1_20210110)
avxgcc -march=native -mtune=native -Os -fwrapv -fPIC -fPIE -gdwarf-4 -Wall (10.2.1_20210110)
avx2clang -march=native -O2 -fwrapv -Qunused-arguments -fPIC -fPIE -gdwarf-4 -Wall (Debian_Clang_11.0.1)
avx2clang -march=native -O3 -fwrapv -Qunused-arguments -fPIC -fPIE -gdwarf-4 -Wall (Debian_Clang_11.0.1)
avx2clang -march=native -O -fwrapv -Qunused-arguments -fPIC -fPIE -gdwarf-4 -Wall (Debian_Clang_11.0.1)
avx2clang -march=native -Os -fwrapv -Qunused-arguments -fPIC -fPIE -gdwarf-4 -Wall (Debian_Clang_11.0.1)
avx2clang -mcpu=native -O3 -fwrapv -Qunused-arguments -fPIC -fPIE -gdwarf-4 -Wall (Debian_Clang_11.0.1)
avx2gcc -march=native -mtune=native -O2 -fwrapv -fPIC -fPIE -gdwarf-4 -Wall (10.2.1_20210110)
avx2gcc -march=native -mtune=native -O3 -fwrapv -fPIC -fPIE -gdwarf-4 -Wall (10.2.1_20210110)
avx2gcc -march=native -mtune=native -O -fwrapv -fPIC -fPIE -gdwarf-4 -Wall (10.2.1_20210110)
avx2gcc -march=native -mtune=native -Os -fwrapv -fPIC -fPIE -gdwarf-4 -Wall (10.2.1_20210110)
avx2intclang -march=native -O2 -fwrapv -Qunused-arguments -fPIC -fPIE -gdwarf-4 -Wall (Debian_Clang_11.0.1)
avx2intclang -march=native -O3 -fwrapv -Qunused-arguments -fPIC -fPIE -gdwarf-4 -Wall (Debian_Clang_11.0.1)
avx2intclang -march=native -O -fwrapv -Qunused-arguments -fPIC -fPIE -gdwarf-4 -Wall (Debian_Clang_11.0.1)
avx2intclang -march=native -Os -fwrapv -Qunused-arguments -fPIC -fPIE -gdwarf-4 -Wall (Debian_Clang_11.0.1)
avx2intgcc -march=native -mtune=native -O2 -fwrapv -fPIC -fPIE -gdwarf-4 -Wall (10.2.1_20210110)
avx2intgcc -march=native -mtune=native -O -fwrapv -fPIC -fPIE -gdwarf-4 -Wall (10.2.1_20210110)
avx2intgcc -march=native -mtune=native -Os -fwrapv -fPIC -fPIE -gdwarf-4 -Wall (10.2.1_20210110)
ref5clang -march=native -O2 -fwrapv -Qunused-arguments -fPIC -fPIE -gdwarf-4 -Wall (Debian_Clang_11.0.1)
ref5clang -march=native -O3 -fwrapv -Qunused-arguments -fPIC -fPIE -gdwarf-4 -Wall (Debian_Clang_11.0.1)
ref5clang -march=native -O -fwrapv -Qunused-arguments -fPIC -fPIE -gdwarf-4 -Wall (Debian_Clang_11.0.1)
ref5clang -march=native -Os -fwrapv -Qunused-arguments -fPIC -fPIE -gdwarf-4 -Wall (Debian_Clang_11.0.1)
ref5clang -mcpu=native -O3 -fwrapv -Qunused-arguments -fPIC -fPIE -gdwarf-4 -Wall (Debian_Clang_11.0.1)
ref5gcc -march=native -mtune=native -O2 -fwrapv -fPIC -fPIE -gdwarf-4 -Wall (10.2.1_20210110)
ref5gcc -march=native -mtune=native -O3 -fwrapv -fPIC -fPIE -gdwarf-4 -Wall (10.2.1_20210110)
ref5gcc -march=native -mtune=native -O -fwrapv -fPIC -fPIE -gdwarf-4 -Wall (10.2.1_20210110)
ref5gcc -march=native -mtune=native -Os -fwrapv -fPIC -fPIE -gdwarf-4 -Wall (10.2.1_20210110)
ref5uclang -march=native -O2 -fwrapv -Qunused-arguments -fPIC -fPIE -gdwarf-4 -Wall (Debian_Clang_11.0.1)
ref5uclang -march=native -O3 -fwrapv -Qunused-arguments -fPIC -fPIE -gdwarf-4 -Wall (Debian_Clang_11.0.1)
ref5uclang -march=native -O -fwrapv -Qunused-arguments -fPIC -fPIE -gdwarf-4 -Wall (Debian_Clang_11.0.1)
ref5uclang -march=native -Os -fwrapv -Qunused-arguments -fPIC -fPIE -gdwarf-4 -Wall (Debian_Clang_11.0.1)
ref5uclang -mcpu=native -O3 -fwrapv -Qunused-arguments -fPIC -fPIE -gdwarf-4 -Wall (Debian_Clang_11.0.1)
ref5ugcc -march=native -mtune=native -O2 -fwrapv -fPIC -fPIE -gdwarf-4 -Wall (10.2.1_20210110)
ref5ugcc -march=native -mtune=native -O3 -fwrapv -fPIC -fPIE -gdwarf-4 -Wall (10.2.1_20210110)
ref5ugcc -march=native -mtune=native -O -fwrapv -fPIC -fPIE -gdwarf-4 -Wall (10.2.1_20210110)
ref5ugcc -march=native -mtune=native -Os -fwrapv -fPIC -fPIE -gdwarf-4 -Wall (10.2.1_20210110)