Implementation notes: amd64, rumba7, crypto_kem/bikel1

Computer: rumba7
Microarchitecture: amd64; Zen (800f11)
Architecture: amd64
CPU ID: AuthenticAMD-00800f11-178bfbff
SUPERCOP version: 20240107
Operation: crypto_kem
Primitive: bikel1
TimeObject sizeTest sizeImplementationCompilerBenchmark dateSUPERCOP version
299640893624 72 4114725 892 1764T:ches2021gcc_-march=native_-mtune=native_-O3_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE2023121420231212
3012899172893 72 4195363 932 1732T:ches2021clang_-march=native_-O3_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2023121420231212
302837867352 72 489211 932 1732T:ches2021clang_-march=native_-O2_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2023121420231212
326129462642 72 482565 892 1764T:ches2021gcc_-march=native_-mtune=native_-O2_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE2023121420231212
332226459759 72 478989 892 1764T:ches2021gcc_-march=native_-mtune=native_-O_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE2023121420231212
344251954815 72 473739 932 1732T:ches2021clang_-march=native_-O_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2023121420231212
350353847774 72 467077 924 1796T:ches2021clang_-march=native_-Os_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2023121420231212
369539944558 72 462757 884 1732T:ches2021gcc_-march=native_-mtune=native_-Os_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE2023121420231212
406963023696 64 445619 924 1732T:avx2clang_-march=native_-O2_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2023121420231212
408154429990 64 452531 924 1732T:avx2clang_-march=native_-O3_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2023121420231212
417717641122 64 462245 884 1764T:avx2gcc_-march=native_-mtune=native_-O3_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE2023121420231212
502288443999 56 466419 916 1732T:aes-ni-and-pclmulclang_-march=native_-O3_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2023121420231212
575839718964 64 437987 924 1732T:avx2clang_-march=native_-O_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2023121420231212
586694526657 64 445845 884 1764T:avx2gcc_-march=native_-mtune=native_-O_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE2023121420231212
624074927358 64 447237 884 1764T:avx2gcc_-march=native_-mtune=native_-O2_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE2023121420231212
681849723829 64 442005 876 1732T:avx2gcc_-march=native_-mtune=native_-Os_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE2023121420231212
741502316441 64 435933 916 1796T:avx2clang_-march=native_-Os_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2023121420231212
758716766281 56 487301 876 1764T:aes-ni-and-pclmulgcc_-march=native_-mtune=native_-O3_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE2023121420231212
770848124726 56 446763 916 1732T:aes-ni-and-pclmulclang_-march=native_-O2_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2023121420231212
935596717652 56 436619 916 1732T:aes-ni-and-pclmulclang_-march=native_-O_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2023121420231212
971982933208 56 452269 876 1764T:aes-ni-and-pclmulgcc_-march=native_-mtune=native_-O_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE2023121420231212
980158122724 56 440813 868 1732T:aes-ni-and-pclmulgcc_-march=native_-mtune=native_-Os_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE2023121420231212
983264934290 56 454013 876 1764T:aes-ni-and-pclmulgcc_-march=native_-mtune=native_-O2_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE2023121420231212
1148208115592 56 434997 908 1796T:aes-ni-and-pclmulclang_-march=native_-Os_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2023121420231212
1223678640987 48 463997 964 1732T:portableclang_-march=native_-O3_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2023121420231212
1225190744057 56 466467 916 1732T:aes-ni-onlyclang_-march=native_-O3_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2023121420231212
1299849530734 48 452781 964 1732T:portableclang_-mcpu=native_-O3_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2023121420231212
1490194624784 56 446875 916 1732T:aes-ni-onlyclang_-march=native_-O2_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2023121420231212
1491896963537 56 484581 876 1764T:aes-ni-onlygcc_-march=native_-mtune=native_-O3_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE2023121420231212
1494374059807 48 481510 940 1764T:portablegcc_-march=native_-mtune=native_-O3_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE2023121420231212
1554469122293 48 444949 964 1732T:portableclang_-march=native_-O2_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2023121420231212
1811504122404 56 440381 868 1732T:aes-ni-onlygcc_-march=native_-mtune=native_-Os_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE2023121420231212
1818529020061 48 438750 932 1732T:portablegcc_-march=native_-mtune=native_-Os_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE2023121420231212
1824894414771 48 434341 964 1732T:portableclang_-march=native_-O_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2023121420231212
1838363817118 56 436123 916 1732T:aes-ni-onlyclang_-march=native_-O_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2023121420231212
1866751835170 56 454885 876 1764T:aes-ni-onlygcc_-march=native_-mtune=native_-O2_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE2023121420231212
1866783832357 48 452766 940 1764T:portablegcc_-march=native_-mtune=native_-O2_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE2023121420231212
1966709831682 48 451422 940 1764T:portablegcc_-march=native_-mtune=native_-O_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE2023121420231212
1970440434068 56 453141 876 1764T:aes-ni-onlygcc_-march=native_-mtune=native_-O_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE2023121420231212
1988124515195 56 434581 908 1796T:aes-ni-onlyclang_-march=native_-Os_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2023121420231212
2005396213055 48 433023 956 1796T:portableclang_-march=native_-Os_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2023121420231212

Compiler output

Implementation: T:aes-ni-and-pclmul
Security model: timingleaks
Compiler: clang -mcpu=native -O3 -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE
aes.c: aes.c:9:4: error: "This code requries support for AES_NI and SSSE3"
aes.c: # error "This code requries support for AES_NI and SSSE3"
aes.c: ^
aes.c: 1 error generated.

Number of similar (compiler,implementation) pairs: 6, namely:
CompilerImplementations
clang -mcpu=native -O3 -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE T:aes-ni-and-pclmul T:aes-ni-only T:avx2 T:avx512 T:avx512-vpclmul T:ches2021

Compiler output

Implementation: T:avx512
Security model: timingleaks
Compiler: clang -march=native -O2 -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE
decode.c: In file included from decode.c:39:
decode.c: ./gf2x.h:22:10: error: always_inline function '_mm512_loadu_si512' requires target feature 'avx512f', but would be inlined into function 'gf2x_mod_add' that is compiled without support for 'avx512f'
decode.c: va = LOAD(&a_qwords[i]);
decode.c: ^
decode.c: ./x86_64_intrinsic.h:40:27: note: expanded from macro 'LOAD'
decode.c: # define LOAD(mem) _mm512_loadu_si512((mem))
decode.c: ^
decode.c: In file included from decode.c:39:
decode.c: ./gf2x.h:22:10: error: AVX vector return of type '__m512i' (vector of 8 'long long' values) without 'avx512f' enabled changes the ABI
decode.c: ./x86_64_intrinsic.h:40:27: note: expanded from macro 'LOAD'
decode.c: # define LOAD(mem) _mm512_loadu_si512((mem))
decode.c: ^
decode.c: In file included from decode.c:39:
decode.c: ./gf2x.h:23:10: error: always_inline function '_mm512_loadu_si512' requires target feature 'avx512f', but would be inlined into function 'gf2x_mod_add' that is compiled without support for 'avx512f'
decode.c: vb = LOAD(&b_qwords[i]);
decode.c: ^
decode.c: ./x86_64_intrinsic.h:40:27: note: expanded from macro 'LOAD'
decode.c: # define LOAD(mem) _mm512_loadu_si512((mem))
decode.c: ^
decode.c: In file included from decode.c:39:
decode.c: ./gf2x.h:23:10: error: AVX vector return of type '__m512i' (vector of 8 'long long' values) without 'avx512f' enabled changes the ABI
decode.c: ./x86_64_intrinsic.h:40:27: note: expanded from macro 'LOAD'
decode.c: # define LOAD(mem) _mm512_loadu_si512((mem))
decode.c: ^
decode.c: In file included from decode.c:39:
decode.c: ...

Number of similar (compiler,implementation) pairs: 8, namely:
CompilerImplementations
clang -march=native -O2 -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE T:avx512
clang -march=native -O3 -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE T:avx512
clang -march=native -O -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE T:avx512
clang -march=native -Os -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE T:avx512
clang -march=native -O2 -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE T:avx512-vpclmul
clang -march=native -O3 -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE T:avx512-vpclmul
clang -march=native -O -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE T:avx512-vpclmul
clang -march=native -Os -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE T:avx512-vpclmul

Compiler output

Implementation: T:avx512
Security model: timingleaks
Compiler: gcc -march=native -mtune=native -O2 -fomit-frame-pointer -fwrapv -fPIC -fPIE
decode.c: In file included from decode.c:39:
decode.c: gf2x.h: In function 'gf2x_mod_add':
decode.c: gf2x.h:22:8: warning: AVX512F vector return without AVX512F enabled changes the ABI [-Wpsabi]
decode.c: 22 | va = LOAD(&a_qwords[i]);
decode.c: In file included from /usr/lib/gcc/x86_64-linux-gnu/11/include/immintrin.h:49,
decode.c: from x86_64_intrinsic.h:20,
decode.c: from defs.h:103,
decode.c: from bike_defs.h:10,
decode.c: from types.h:13,
decode.c: from decode.h:10,
decode.c: from decode.c:37:
decode.c: /usr/lib/gcc/x86_64-linux-gnu/11/include/avx512fintrin.h:6481:1: error: inlining failed in call to 'always_inline' '_mm512_storeu_si512': target specific option mismatch
decode.c: 6481 | _mm512_storeu_si512 (void *__P, __m512i __A)
decode.c: | ^~~~~~~~~~~~~~~~~~~
decode.c: In file included from defs.h:103,
decode.c: from bike_defs.h:10,
decode.c: from types.h:13,
decode.c: from decode.h:10,
decode.c: from decode.c:37:
decode.c: x86_64_intrinsic.h:41:27: note: called from here
decode.c: 41 | # define STORE(mem, reg) _mm512_storeu_si512((mem), (reg))
decode.c: | ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
decode.c: gf2x.h:25:5: note: in expansion of macro 'STORE'
decode.c: 25 | STORE(&c_qwords[i], va ^ vb);
decode.c: | ^~~~~
decode.c: ...

Number of similar (compiler,implementation) pairs: 8, namely:
CompilerImplementations
gcc -march=native -mtune=native -O2 -fomit-frame-pointer -fwrapv -fPIC -fPIE T:avx512
gcc -march=native -mtune=native -O3 -fomit-frame-pointer -fwrapv -fPIC -fPIE T:avx512
gcc -march=native -mtune=native -O -fomit-frame-pointer -fwrapv -fPIC -fPIE T:avx512
gcc -march=native -mtune=native -Os -fomit-frame-pointer -fwrapv -fPIC -fPIE T:avx512
gcc -march=native -mtune=native -O2 -fomit-frame-pointer -fwrapv -fPIC -fPIE T:avx512-vpclmul
gcc -march=native -mtune=native -O3 -fomit-frame-pointer -fwrapv -fPIC -fPIE T:avx512-vpclmul
gcc -march=native -mtune=native -O -fomit-frame-pointer -fwrapv -fPIC -fPIE T:avx512-vpclmul
gcc -march=native -mtune=native -Os -fomit-frame-pointer -fwrapv -fPIC -fPIE T:avx512-vpclmul