Implementation notes: amd64, rumba7, crypto_scalarmult/kummer

Computer: rumba7
Microarchitecture: amd64; Zen (800f11)
Architecture: amd64
CPU ID: AuthenticAMD-00800f11-178bfbff
SUPERCOP version: 20240107
Operation: crypto_scalarmult
Primitive: kummer
TimeObject sizeTest sizeImplementationCompilerBenchmark dateSUPERCOP version
40726715620 0 039237 812 1792avx2intgcc_-march=native_-mtune=native_-O3_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE2023121320231212
40739919078 0 042715 852 1760avx2intclang_-march=native_-O2_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2023121320231212
40886621605 0 045587 852 1760avx2intclang_-march=native_-O3_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2023121320231212
4415518891 0 031939 852 1728avx2clang_-mcpu=native_-O3_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2023121320231212
4418379155 0 032693 812 1792avx2gcc_-march=native_-mtune=native_-O3_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE2023121320231212
442529? ? ?? ? ?avx2gcc_-march=native_-mtune=native_-O2_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE2023121320231212
4431168792 0 030045 812 1792avx2gcc_-march=native_-mtune=native_-O_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE2023121320231212
4431348763 0 028813 804 1760avx2gcc_-march=native_-mtune=native_-Os_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE2023121320231212
4634368596 0 029877 812 1792avx2intgcc_-march=native_-mtune=native_-O_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE2023121320231212
4641698946 0 031309 812 1792avx2intgcc_-march=native_-mtune=native_-O2_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE2023121320231212
4682329577 0 032579 852 1728avxclang_-mcpu=native_-O3_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2023121320231212
4683729705 0 033883 852 1760avxclang_-march=native_-O3_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2023121320231212
4683799705 0 033507 852 1760avxclang_-march=native_-O2_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2023121320231212
4685569841 0 033365 812 1792avxgcc_-march=native_-mtune=native_-O3_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE2023121320231212
4691409273 0 030043 852 1728avxclang_-march=native_-O_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2023121320231212
4693099545 0 031869 812 1792avxgcc_-march=native_-mtune=native_-O2_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE2023121320231212
4693759135 0 030613 844 1792avxclang_-march=native_-Os_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2023121320231212
4697769478 0 030717 812 1792avxgcc_-march=native_-mtune=native_-O_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE2023121320231212
4697939449 0 029485 804 1760avxgcc_-march=native_-mtune=native_-Os_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE2023121320231212
4737318601 0 030013 844 1792avx2intclang_-march=native_-Os_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2023121320231212
4870176527 0 026629 804 1760avx2intgcc_-march=native_-mtune=native_-Os_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE2023121320231212
49107517503 0 038155 852 1728avx2intclang_-march=native_-O_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2023121320231212
9213647106 0 030613 812 1792ref5ugcc_-march=native_-mtune=native_-O3_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE2023121320231212
97531514074 0 038179 852 1760ref5uclang_-march=native_-O3_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2023121320231212
9830746751 0 030197 812 1792ref5gcc_-march=native_-mtune=native_-O3_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE2023121320231212
98604412447 0 036163 852 1760ref5uclang_-march=native_-O2_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2023121320231212
10136954368 0 025821 844 1792ref5uclang_-march=native_-Os_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2023121320231212
102681413884 0 037971 852 1760ref5clang_-march=native_-O3_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2023121320231212
103690212069 0 035827 852 1760ref5clang_-march=native_-O2_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2023121320231212
10394324691 0 026949 812 1792ref5gcc_-march=native_-mtune=native_-O2_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE2023121320231212
10454814982 0 027277 812 1792ref5ugcc_-march=native_-mtune=native_-O2_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE2023121320231212
104949613943 0 036891 852 1728ref5clang_-mcpu=native_-O3_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2023121320231212
105376814185 0 037107 852 1728ref5uclang_-mcpu=native_-O3_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2023121320231212
106604812443 0 033083 852 1728ref5clang_-march=native_-O_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2023121320231212
10814764221 0 025645 844 1792ref5clang_-march=native_-Os_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2023121320231212
108400012581 0 033251 852 1728ref5uclang_-march=native_-O_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2023121320231212
11597534779 0 025925 812 1792ref5gcc_-march=native_-mtune=native_-O_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE2023121320231212
11669905089 0 026253 812 1792ref5ugcc_-march=native_-mtune=native_-O_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE2023121320231212
12338944734 0 024749 804 1760ref5ugcc_-march=native_-mtune=native_-Os_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE2023121320231212
12489704448 0 024429 804 1760ref5gcc_-march=native_-mtune=native_-Os_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE2023121320231212

Compiler output

Implementation: avx2
Security model: constbranchindex
Compiler: clang -march=native -O2 -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE
try.c: /usr/bin/ld: /home/supercop/benchmarking/supercop-20231212/supercop-data/rumba7/amd64/lib/libsupercop.a(crypto_stream_chacha20_moon_avx2_64_constbranchindex-crypto_stream.o): bad reloc symbol index (0xdbfeddc5 >= 0x6) for offset 0x602464ef75c5ffef in section `.text'
try.c: /usr/bin/ld: failed to set dynamic section sizes: file truncated
try.c: clang: error: linker command failed with exit code 1 (use -v to see invocation)

Number of similar (compiler,implementation) pairs: 2, namely:
CompilerImplementations
clang -march=native -O2 -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE avx2
clang -march=native -O -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE avx2

Compiler output

Implementation: avx2
Security model: constbranchindex
Compiler: clang -march=native -O3 -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE
try.c: /usr/bin/ld: /home/supercop/benchmarking/supercop-20231212/supercop-data/rumba7/amd64/lib/libsupercop.a(crypto_stream_chacha20_moon_avx2_64_constbranchindex-crypto_stream.o): bad reloc symbol index (0x65c1c4d2 >= 0x6) for offset 0xc9fe75c1c4c0fe7d in section `.text'
try.c: /usr/bin/ld: failed to set dynamic section sizes: file truncated
try.c: clang: error: linker command failed with exit code 1 (use -v to see invocation)

Number of similar (compiler,implementation) pairs: 1, namely:
CompilerImplementations
clang -march=native -O3 -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE avx2

Compiler output

Implementation: avx2
Security model: constbranchindex
Compiler: clang -march=native -Os -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE
try.c: /usr/bin/ld: /home/supercop/benchmarking/supercop-20231212/supercop-data/rumba7/amd64/lib/libsupercop.a(crypto_stream_chacha20_moon_avx2_64_constbranchindex-crypto_stream.o): bad reloc symbol index (0x9dc514d4 >= 0x6) for offset 0xf4729dc5edefb5c5 in section `.text'
try.c: /usr/bin/ld: failed to set dynamic section sizes: file truncated
try.c: clang: error: linker command failed with exit code 1 (use -v to see invocation)

Number of similar (compiler,implementation) pairs: 1, namely:
CompilerImplementations
clang -march=native -Os -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE avx2

Compiler output

Implementation: avx2
Security model: constbranchindex
Compiler: gcc -march=native -mtune=native -O2 -fomit-frame-pointer -fwrapv -fPIC -fPIE
measure.c: /usr/bin/ld: can not read symbols: file truncated
measure.c: /usr/bin/ld: .eh_frame/.stab edit: file truncated
measure.c: /usr/bin/ld: measure: warning: allocated section `.interp' not in segment
measure.c: /usr/bin/ld: measure: warning: allocated section `.note.gnu.property' not in segment
measure.c: /usr/bin/ld: measure: warning: allocated section `.note.gnu.build-id' not in segment
measure.c: /usr/bin/ld: measure: warning: allocated section `.note.ABI-tag' not in segment
measure.c: /usr/bin/ld: measure: warning: allocated section `.gnu.hash' not in segment
measure.c: /usr/bin/ld: measure: warning: allocated section `.dynsym' not in segment
measure.c: /usr/bin/ld: measure: warning: allocated section `.dynstr' not in segment
measure.c: /usr/bin/ld: measure: warning: allocated section `.gnu.version' not in segment
measure.c: /usr/bin/ld: measure: warning: allocated section `.gnu.version_r' not in segment
measure.c: /usr/bin/ld: measure: warning: allocated section `.rela.dyn' not in segment
measure.c: /usr/bin/ld: measure: warning: allocated section `.rela.plt' not in segment
measure.c: /usr/bin/ld: measure: warning: allocated section `.init' not in segment
measure.c: /usr/bin/ld: measure: warning: allocated section `.plt' not in segment
measure.c: /usr/bin/ld: measure: warning: allocated section `.plt.got' not in segment
measure.c: /usr/bin/ld: measure: warning: allocated section `.text' not in segment
measure.c: /usr/bin/ld: measure: warning: allocated section `.fini' not in segment
measure.c: /usr/bin/ld: measure: warning: allocated section `.rodata' not in segment
measure.c: /usr/bin/ld: measure: warning: allocated section `.eh_frame' not in segment
measure.c: /usr/bin/ld: measure: warning: allocated section `.init_array' not in segment
measure.c: /usr/bin/ld: measure: warning: allocated section `.fini_array' not in segment
measure.c: /usr/bin/ld: measure: warning: allocated section `.dynamic' not in segment
measure.c: /usr/bin/ld: measure: warning: allocated section `.got' not in segment
measure.c: /usr/bin/ld: measure: warning: allocated section `.data' not in segment
measure.c: ...

Number of similar (compiler,implementation) pairs: 1, namely:
CompilerImplementations
gcc -march=native -mtune=native -O2 -fomit-frame-pointer -fwrapv -fPIC -fPIE avx2

Compiler output

Implementation: avx2int
Security model: constbranchindex
Compiler: clang -mcpu=native -O3 -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE
smult.c: smult.c:36:8: error: always_inline function '_mm256_mul_epi32' requires target feature 'avx2', but would be inlined into function 'crypto_scalarmult_kummer_avx2int_constbranchindex_gfe4x_mulconst' that is compiled without support for 'avx2'
smult.c: t0 = _mm256_mul_epi32(a->v[0],*b);
smult.c: ^
smult.c: smult.c:36:8: error: AVX vector argument of type '__m256i' (vector of 4 'long long' values) without 'avx' enabled changes the ABI
smult.c: smult.c:37:8: error: always_inline function '_mm256_mul_epi32' requires target feature 'avx2', but would be inlined into function 'crypto_scalarmult_kummer_avx2int_constbranchindex_gfe4x_mulconst' that is compiled without support for 'avx2'
smult.c: t1 = _mm256_mul_epi32(a->v[1],*b);
smult.c: ^
smult.c: smult.c:37:8: error: AVX vector argument of type '__m256i' (vector of 4 'long long' values) without 'avx' enabled changes the ABI
smult.c: smult.c:38:30: error: always_inline function '_mm256_srli_epi64' requires target feature 'avx2', but would be inlined into function 'crypto_scalarmult_kummer_avx2int_constbranchindex_gfe4x_mulconst' that is compiled without support for 'avx2'
smult.c: t1 = _mm256_add_epi64(t1,_mm256_srli_epi64(t0,26)); t0 &= mask26;
smult.c: ^
smult.c: smult.c:38:30: error: AVX vector argument of type '__m256i' (vector of 4 'long long' values) without 'avx' enabled changes the ABI
smult.c: smult.c:38:10: error: always_inline function '_mm256_add_epi64' requires target feature 'avx2', but would be inlined into function 'crypto_scalarmult_kummer_avx2int_constbranchindex_gfe4x_mulconst' that is compiled without support for 'avx2'
smult.c: t1 = _mm256_add_epi64(t1,_mm256_srli_epi64(t0,26)); t0 &= mask26;
smult.c: ^
smult.c: smult.c:38:10: error: AVX vector argument of type '__m256i' (vector of 4 'long long' values) without 'avx' enabled changes the ABI
smult.c: smult.c:39:8: error: always_inline function '_mm256_mul_epi32' requires target feature 'avx2', but would be inlined into function 'crypto_scalarmult_kummer_avx2int_constbranchindex_gfe4x_mulconst' that is compiled without support for 'avx2'
smult.c: t2 = _mm256_mul_epi32(a->v[2],*b);
smult.c: ^
smult.c: smult.c:39:8: error: AVX vector argument of type '__m256i' (vector of 4 'long long' values) without 'avx' enabled changes the ABI
smult.c: smult.c:40:8: error: always_inline function '_mm256_mul_epi32' requires target feature 'avx2', but would be inlined into function 'crypto_scalarmult_kummer_avx2int_constbranchindex_gfe4x_mulconst' that is compiled without support for 'avx2'
smult.c: t3 = _mm256_mul_epi32(a->v[3],*b);
smult.c: ^
smult.c: smult.c:40:8: error: AVX vector argument of type '__m256i' (vector of 4 'long long' values) without 'avx' enabled changes the ABI
smult.c: smult.c:41:8: error: always_inline function '_mm256_mul_epi32' requires target feature 'avx2', but would be inlined into function 'crypto_scalarmult_kummer_avx2int_constbranchindex_gfe4x_mulconst' that is compiled without support for 'avx2'
smult.c: ...

Number of similar (compiler,implementation) pairs: 1, namely:
CompilerImplementations
clang -mcpu=native -O3 -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE avx2int