Implementation notes: amd64, comet, crypto_scalarmult/kummer

Computer: comet
Microarchitecture: amd64; Comet Lake (806ec)
Architecture: amd64
CPU ID: GenuineIntel-000806ec-bfebfbff
SUPERCOP version: 20240107
Operation: crypto_scalarmult
Primitive: kummer
TimeObject sizeTest sizeImplementationCompilerBenchmark dateSUPERCOP version
21627721269 0 045857 860 1792avx2intclang_-march=native_-O3_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2024010620231222
21635815058 0 038620 788 1792avx2intgcc_-march=native_-mtune=native_-O3_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE2024010620231222
22633318821 0 043113 860 1760avx2intclang_-march=native_-O2_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2024010620231222
22846917423 0 038073 860 1728avx2intclang_-march=native_-O_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2024010620231222
2304198493 0 029975 852 1792avx2intclang_-march=native_-Os_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2024010620231222
2448118791 0 029540 788 1792avx2intgcc_-march=native_-mtune=native_-O_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE2024010620231222
2473719403 0 031084 788 1792avx2intgcc_-march=native_-mtune=native_-O2_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE2024010620231222
2601286160 0 025692 780 1760avx2intgcc_-march=native_-mtune=native_-Os_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE2024010620231222
2620338780 0 032340 788 1792avx2gcc_-march=native_-mtune=native_-O3_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE2024010620231222
2628628589 0 029364 788 1792avx2gcc_-march=native_-mtune=native_-O_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE2024010620231222
2631128428 0 027940 780 1760avx2gcc_-march=native_-mtune=native_-Os_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE2024010620231222
2631208556 0 030196 788 1792avx2gcc_-march=native_-mtune=native_-O2_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE2024010620231222
2636668891 0 031905 860 1728avx2clang_-mcpu=native_-O3_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2024010620231222
2645039083 0 033881 860 1792avx2clang_-march=native_-O3_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2024010620231222
2645609083 0 033569 860 1760avx2clang_-march=native_-O2_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2024010620231222
2650768449 0 030023 852 1792avx2clang_-march=native_-Os_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2024010620231222
2651078587 0 029337 860 1728avx2clang_-march=native_-O_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2024010620231222
3523269466 0 033012 788 1792avxgcc_-march=native_-mtune=native_-O3_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE2024010620231222
3530109242 0 030868 788 1792avxgcc_-march=native_-mtune=native_-O2_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE2024010620231222
3533899275 0 030036 788 1792avxgcc_-march=native_-mtune=native_-O_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE2024010620231222
3533959577 0 032545 860 1728avxclang_-mcpu=native_-O3_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2024010620231222
3536119114 0 028612 780 1760avxgcc_-march=native_-mtune=native_-Os_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE2024010620231222
3540579769 0 034241 860 1760avxclang_-march=native_-O2_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2024010620231222
3544529769 0 034553 860 1792avxclang_-march=native_-O3_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2024010620231222
3546589273 0 030009 860 1728avxclang_-march=native_-O_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2024010620231222
3546789135 0 030695 852 1792avxclang_-march=native_-Os_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2024010620231222
82833213746 0 038457 860 1792ref5uclang_-march=native_-O3_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2024010620231222
8337606166 0 029628 788 1792ref5gcc_-march=native_-mtune=native_-O3_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE2024010620231222
8610104291 0 025783 852 1792ref5uclang_-march=native_-Os_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2024010620231222
86590812215 0 036633 860 1760ref5uclang_-march=native_-O2_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2024010620231222
8692674461 0 026036 788 1792ref5gcc_-march=native_-mtune=native_-O2_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE2024010620231222
87165313943 0 036857 860 1728ref5clang_-mcpu=native_-O3_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2024010620231222
87447413716 0 038425 860 1792ref5clang_-march=native_-O3_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2024010620231222
89675214185 0 037073 860 1728ref5uclang_-mcpu=native_-O3_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2024010620231222
90482612045 0 036409 860 1760ref5clang_-march=native_-O2_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2024010620231222
91732712243 0 032881 860 1728ref5clang_-march=native_-O_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2024010620231222
9251104712 0 026340 788 1792ref5ugcc_-march=native_-mtune=native_-O2_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE2024010620231222
9291314341 0 025839 852 1792ref5clang_-march=native_-Os_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2024010620231222
9727016601 0 030132 788 1792ref5ugcc_-march=native_-mtune=native_-O3_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE2024010620231222
98221412477 0 033113 860 1728ref5uclang_-march=native_-O_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2024010620231222
9931944113 0 023540 780 1760ref5gcc_-march=native_-mtune=native_-Os_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE2024010620231222
10156174422 0 025116 788 1792ref5gcc_-march=native_-mtune=native_-O_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE2024010620231222
10541494726 0 025380 788 1792ref5ugcc_-march=native_-mtune=native_-O_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE2024010620231222
10587124417 0 023868 780 1760ref5ugcc_-march=native_-mtune=native_-Os_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE2024010620231222

Compiler output

Implementation: avx
Security model: constbranchindex
Compiler: clang -march=native -O2 -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE
try.c: /usr/bin/ld: warning: crypto_stream_chacha20_moon_avx2_64_constbranchindex-chacha.o: missing .note.GNU-stack section implies executable stack
try.c: /usr/bin/ld: NOTE: This behaviour is deprecated and will be removed in a future version of the linker
try.c: /usr/bin/ld: warning: crypto_stream_chacha20_moon_avx2_64_constbranchindex-chacha.o: missing .note.GNU-stack section implies executable stack
try.c: /usr/bin/ld: NOTE: This behaviour is deprecated and will be removed in a future version of the linker
measure.c: /usr/bin/ld: warning: ladder.o: missing .note.GNU-stack section implies executable stack
measure.c: /usr/bin/ld: NOTE: This behaviour is deprecated and will be removed in a future version of the linker

Number of similar (compiler,implementation) pairs: 18, namely:
CompilerImplementations
clang -march=native -O2 -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE avx
clang -march=native -O3 -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE avx
clang -march=native -O -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE avx
clang -march=native -Os -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE avx
clang -mcpu=native -O3 -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE avx
gcc -march=native -mtune=native -O2 -fomit-frame-pointer -fwrapv -fPIC -fPIE avx
gcc -march=native -mtune=native -O3 -fomit-frame-pointer -fwrapv -fPIC -fPIE avx
gcc -march=native -mtune=native -O -fomit-frame-pointer -fwrapv -fPIC -fPIE avx
gcc -march=native -mtune=native -Os -fomit-frame-pointer -fwrapv -fPIC -fPIE avx
clang -march=native -O2 -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE avx2
clang -march=native -O3 -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE avx2
clang -march=native -O -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE avx2
clang -march=native -Os -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE avx2
clang -mcpu=native -O3 -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE avx2
gcc -march=native -mtune=native -O2 -fomit-frame-pointer -fwrapv -fPIC -fPIE avx2
gcc -march=native -mtune=native -O3 -fomit-frame-pointer -fwrapv -fPIC -fPIE avx2
gcc -march=native -mtune=native -O -fomit-frame-pointer -fwrapv -fPIC -fPIE avx2
gcc -march=native -mtune=native -Os -fomit-frame-pointer -fwrapv -fPIC -fPIE avx2

Compiler output

Implementation: avx2int
Security model: constbranchindex
Compiler: clang -march=native -O2 -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE
try.c: /usr/bin/ld: warning: crypto_stream_chacha20_moon_avx2_64_constbranchindex-chacha.o: missing .note.GNU-stack section implies executable stack
try.c: /usr/bin/ld: NOTE: This behaviour is deprecated and will be removed in a future version of the linker
try.c: /usr/bin/ld: warning: crypto_stream_chacha20_moon_avx2_64_constbranchindex-chacha.o: missing .note.GNU-stack section implies executable stack
try.c: /usr/bin/ld: NOTE: This behaviour is deprecated and will be removed in a future version of the linker

Number of similar (compiler,implementation) pairs: 26, namely:
CompilerImplementations
clang -march=native -O2 -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE avx2int
clang -march=native -O3 -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE avx2int
clang -march=native -O -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE avx2int
clang -march=native -Os -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE avx2int
gcc -march=native -mtune=native -O2 -fomit-frame-pointer -fwrapv -fPIC -fPIE avx2int
gcc -march=native -mtune=native -O3 -fomit-frame-pointer -fwrapv -fPIC -fPIE avx2int
gcc -march=native -mtune=native -O -fomit-frame-pointer -fwrapv -fPIC -fPIE avx2int
gcc -march=native -mtune=native -Os -fomit-frame-pointer -fwrapv -fPIC -fPIE avx2int
clang -march=native -O2 -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE ref5
clang -march=native -O3 -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE ref5
clang -march=native -O -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE ref5
clang -march=native -Os -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE ref5
clang -mcpu=native -O3 -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE ref5
gcc -march=native -mtune=native -O2 -fomit-frame-pointer -fwrapv -fPIC -fPIE ref5
gcc -march=native -mtune=native -O3 -fomit-frame-pointer -fwrapv -fPIC -fPIE ref5
gcc -march=native -mtune=native -O -fomit-frame-pointer -fwrapv -fPIC -fPIE ref5
gcc -march=native -mtune=native -Os -fomit-frame-pointer -fwrapv -fPIC -fPIE ref5
clang -march=native -O2 -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE ref5u
clang -march=native -O3 -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE ref5u
clang -march=native -O -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE ref5u
clang -march=native -Os -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE ref5u
clang -mcpu=native -O3 -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE ref5u
gcc -march=native -mtune=native -O2 -fomit-frame-pointer -fwrapv -fPIC -fPIE ref5u
gcc -march=native -mtune=native -O3 -fomit-frame-pointer -fwrapv -fPIC -fPIE ref5u
gcc -march=native -mtune=native -O -fomit-frame-pointer -fwrapv -fPIC -fPIE ref5u
gcc -march=native -mtune=native -Os -fomit-frame-pointer -fwrapv -fPIC -fPIE ref5u

Compiler output

Implementation: avx2int
Security model: constbranchindex
Compiler: clang -mcpu=native -O3 -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE
smult.c: smult.c:36:8: error: always_inline function '_mm256_mul_epi32' requires target feature 'avx2', but would be inlined into function 'crypto_scalarmult_kummer_avx2int_constbranchindex_gfe4x_mulconst' that is compiled without support for 'avx2'
smult.c: t0 = _mm256_mul_epi32(a->v[0],*b);
smult.c: ^
smult.c: smult.c:36:8: error: AVX vector argument of type '__m256i' (vector of 4 'long long' values) without 'avx' enabled changes the ABI
smult.c: smult.c:37:8: error: always_inline function '_mm256_mul_epi32' requires target feature 'avx2', but would be inlined into function 'crypto_scalarmult_kummer_avx2int_constbranchindex_gfe4x_mulconst' that is compiled without support for 'avx2'
smult.c: t1 = _mm256_mul_epi32(a->v[1],*b);
smult.c: ^
smult.c: smult.c:37:8: error: AVX vector argument of type '__m256i' (vector of 4 'long long' values) without 'avx' enabled changes the ABI
smult.c: smult.c:38:30: error: always_inline function '_mm256_srli_epi64' requires target feature 'avx2', but would be inlined into function 'crypto_scalarmult_kummer_avx2int_constbranchindex_gfe4x_mulconst' that is compiled without support for 'avx2'
smult.c: t1 = _mm256_add_epi64(t1,_mm256_srli_epi64(t0,26)); t0 &= mask26;
smult.c: ^
smult.c: smult.c:38:30: error: AVX vector argument of type '__m256i' (vector of 4 'long long' values) without 'avx' enabled changes the ABI
smult.c: smult.c:38:10: error: always_inline function '_mm256_add_epi64' requires target feature 'avx2', but would be inlined into function 'crypto_scalarmult_kummer_avx2int_constbranchindex_gfe4x_mulconst' that is compiled without support for 'avx2'
smult.c: t1 = _mm256_add_epi64(t1,_mm256_srli_epi64(t0,26)); t0 &= mask26;
smult.c: ^
smult.c: smult.c:38:10: error: AVX vector argument of type '__m256i' (vector of 4 'long long' values) without 'avx' enabled changes the ABI
smult.c: smult.c:39:8: error: always_inline function '_mm256_mul_epi32' requires target feature 'avx2', but would be inlined into function 'crypto_scalarmult_kummer_avx2int_constbranchindex_gfe4x_mulconst' that is compiled without support for 'avx2'
smult.c: t2 = _mm256_mul_epi32(a->v[2],*b);
smult.c: ^
smult.c: smult.c:39:8: error: AVX vector argument of type '__m256i' (vector of 4 'long long' values) without 'avx' enabled changes the ABI
smult.c: smult.c:40:8: error: always_inline function '_mm256_mul_epi32' requires target feature 'avx2', but would be inlined into function 'crypto_scalarmult_kummer_avx2int_constbranchindex_gfe4x_mulconst' that is compiled without support for 'avx2'
smult.c: t3 = _mm256_mul_epi32(a->v[3],*b);
smult.c: ^
smult.c: smult.c:40:8: error: AVX vector argument of type '__m256i' (vector of 4 'long long' values) without 'avx' enabled changes the ABI
smult.c: smult.c:41:8: error: always_inline function '_mm256_mul_epi32' requires target feature 'avx2', but would be inlined into function 'crypto_scalarmult_kummer_avx2int_constbranchindex_gfe4x_mulconst' that is compiled without support for 'avx2'
smult.c: ...

Number of similar (compiler,implementation) pairs: 1, namely:
CompilerImplementations
clang -mcpu=native -O3 -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE avx2int