Implementation notes: amd64, speed2supercop, crypto_kem/bikel1

Computer: speed2supercop
Microarchitecture: amd64; Haswell+AES (306c3)
Architecture: amd64
CPU ID: GenuineIntel-000306c3-1fc9cbf5
SUPERCOP version: 20240625
Operation: crypto_kem
Primitive: bikel1
    Time  Object size  Test size        Implementation       Compiler                                                                        Benchmark date  SUPERCOP version
 1914760  192589 72 4  214853 880 1572  T:ches2021           clang_-march=native_-O3_-fwrapv_-Qunused-arguments_-fPIC_-fPIE_-gdwarf-4_-Wall  20240711        20240625
 1935096  88924 72 4   110821 880 1572  T:ches2021           clang_-march=native_-O2_-fwrapv_-Qunused-arguments_-fPIC_-fPIE_-gdwarf-4_-Wall  20240711        20240625
 2133664  84287 72 4   105244 840 1604  T:ches2021           gcc_-march=native_-mtune=native_-O3_-fwrapv_-fPIC_-fPIE_-gdwarf-4_-Wall         20240711        20240625
 2372032  48717 72 4   67415 872 1636   T:ches2021           clang_-march=native_-Os_-fwrapv_-Qunused-arguments_-fPIC_-fPIE_-gdwarf-4_-Wall  20240711        20240625
 2425908  54985 72 4   73125 880 1572   T:ches2021           clang_-march=native_-O_-fwrapv_-Qunused-arguments_-fPIC_-fPIE_-gdwarf-4_-Wall   20240711        20240625
 2441120  57646 72 4   76628 840 1604   T:ches2021           gcc_-march=native_-mtune=native_-O2_-fwrapv_-fPIC_-fPIE_-gdwarf-4_-Wall         20240711        20240625
 2515112  54680 72 4   73372 840 1604   T:ches2021           gcc_-march=native_-mtune=native_-O_-fwrapv_-fPIC_-fPIE_-gdwarf-4_-Wall          20240711        20240625
 2632112  38417 64 4   60853 872 1572   T:avx2               clang_-march=native_-O3_-fwrapv_-Qunused-arguments_-fPIC_-fPIE_-gdwarf-4_-Wall  20240711        20240625
 2660928  29760 64 4   51789 872 1572   T:avx2               clang_-march=native_-O2_-fwrapv_-Qunused-arguments_-fPIC_-fPIE_-gdwarf-4_-Wall  20240711        20240625
 2790220  43152 72 4   60868 832 1572   T:ches2021           gcc_-march=native_-mtune=native_-Os_-fwrapv_-fPIC_-fPIE_-gdwarf-4_-Wall         20240711        20240625
 2928556  45337 64 4   66117 832 1604   T:avx2               gcc_-march=native_-mtune=native_-O3_-fwrapv_-fPIC_-fPIE_-gdwarf-4_-Wall         20240711        20240625
 3329000  50842 56 4   73197 864 1572   T:aes-ni-and-pclmul  clang_-march=native_-O3_-fwrapv_-Qunused-arguments_-fPIC_-fPIE_-gdwarf-4_-Wall  20240711        20240625
 4308392  16731 64 4   35663 864 1636   T:avx2               clang_-march=native_-Os_-fwrapv_-Qunused-arguments_-fPIC_-fPIE_-gdwarf-4_-Wall  20240711        20240625
 5512604  19071 64 4   37373 872 1572   T:avx2               clang_-march=native_-O_-fwrapv_-Qunused-arguments_-fPIC_-fPIE_-gdwarf-4_-Wall   20240711        20240625
 5577760  24476 64 4   42957 832 1604   T:avx2               gcc_-march=native_-mtune=native_-O_-fwrapv_-fPIC_-fPIE_-gdwarf-4_-Wall          20240711        20240625
 5704140  24714 64 4   43548 832 1604   T:avx2               gcc_-march=native_-mtune=native_-O2_-fwrapv_-fPIC_-fPIE_-gdwarf-4_-Wall         20240711        20240625
 5803840  21991 64 4   39460 824 1572   T:avx2               gcc_-march=native_-mtune=native_-Os_-fwrapv_-fPIC_-fPIE_-gdwarf-4_-Wall         20240711        20240625
 6043592  29931 56 4   52141 864 1572   T:aes-ni-and-pclmul  clang_-march=native_-O2_-fwrapv_-Qunused-arguments_-fPIC_-fPIE_-gdwarf-4_-Wall  20240711        20240625
 7117124  62195 56 4   82933 824 1604   T:aes-ni-and-pclmul  gcc_-march=native_-mtune=native_-O3_-fwrapv_-fPIC_-fPIE_-gdwarf-4_-Wall         20240711        20240625
 8446068  15785 56 4   34631 856 1636   T:aes-ni-and-pclmul  clang_-march=native_-Os_-fwrapv_-Qunused-arguments_-fPIC_-fPIE_-gdwarf-4_-Wall  20240711        20240625
 9759904  32490 56 4   51285 824 1604   T:aes-ni-and-pclmul  gcc_-march=native_-mtune=native_-O2_-fwrapv_-fPIC_-fPIE_-gdwarf-4_-Wall         20240711        20240625
 9833996  17683 56 4   36005 864 1572   T:aes-ni-and-pclmul  clang_-march=native_-O_-fwrapv_-Qunused-arguments_-fPIC_-fPIE_-gdwarf-4_-Wall   20240711        20240625
 9851732  21041 56 4   38468 816 1572   T:aes-ni-and-pclmul  gcc_-march=native_-mtune=native_-Os_-fwrapv_-fPIC_-fPIE_-gdwarf-4_-Wall         20240711        20240625
10081576  31427 56 4   49877 824 1604   T:aes-ni-and-pclmul  gcc_-march=native_-mtune=native_-O_-fwrapv_-fPIC_-fPIE_-gdwarf-4_-Wall          20240711        20240625
11425440  47689 48 4   70639 912 1572   T:portable           clang_-march=native_-O3_-fwrapv_-Qunused-arguments_-fPIC_-fPIE_-gdwarf-4_-Wall  20240711        20240625
11470512  50759 56 4   73109 864 1572   T:aes-ni-only        clang_-march=native_-O3_-fwrapv_-Qunused-arguments_-fPIC_-fPIE_-gdwarf-4_-Wall  20240711        20240625
13184796  28276 48 4   49863 912 1572   T:portable           clang_-mcpu=native_-O3_-fwrapv_-Qunused-arguments_-fPIC_-fPIE_-gdwarf-4_-Wall   20240711        20240625
14172176  26786 48 4   49575 912 1572   T:portable           clang_-march=native_-O2_-fwrapv_-Qunused-arguments_-fPIC_-fPIE_-gdwarf-4_-Wall  20240711        20240625
14320140  29848 56 4   52053 864 1572   T:aes-ni-only        clang_-march=native_-O2_-fwrapv_-Qunused-arguments_-fPIC_-fPIE_-gdwarf-4_-Wall  20240711        20240625
14917180  61455 48 4   82846 888 1604   T:portable           gcc_-march=native_-mtune=native_-O3_-fwrapv_-fPIC_-fPIE_-gdwarf-4_-Wall         20240711        20240625
14944972  65106 56 4   85893 824 1604   T:aes-ni-only        gcc_-march=native_-mtune=native_-O3_-fwrapv_-fPIC_-fPIE_-gdwarf-4_-Wall         20240711        20240625
16884944  15351 56 4   34183 856 1636   T:aes-ni-only        clang_-march=native_-Os_-fwrapv_-Qunused-arguments_-fPIC_-fPIE_-gdwarf-4_-Wall  20240711        20240625
16900316  13023 48 4   32465 904 1636   T:portable           clang_-march=native_-Os_-fwrapv_-Qunused-arguments_-fPIC_-fPIE_-gdwarf-4_-Wall  20240711        20240625
18704800  17101 56 4   35445 864 1572   T:aes-ni-only        clang_-march=native_-O_-fwrapv_-Qunused-arguments_-fPIC_-fPIE_-gdwarf-4_-Wall   20240711        20240625
18837552  30754 48 4   50158 888 1604   T:portable           gcc_-march=native_-mtune=native_-O2_-fwrapv_-fPIC_-fPIE_-gdwarf-4_-Wall         20240711        20240625
18852480  14786 48 4   33727 912 1572   T:portable           clang_-march=native_-O_-fwrapv_-Qunused-arguments_-fPIC_-fPIE_-gdwarf-4_-Wall   20240711        20240625
18983616  20685 56 4   38076 816 1572   T:aes-ni-only        gcc_-march=native_-mtune=native_-Os_-fwrapv_-fPIC_-fPIE_-gdwarf-4_-Wall         20240711        20240625
19176048  33146 56 4   51957 824 1604   T:aes-ni-only        gcc_-march=native_-mtune=native_-O2_-fwrapv_-fPIC_-fPIE_-gdwarf-4_-Wall         20240711        20240625
19530184  18514 48 4   36541 880 1572   T:portable           gcc_-march=native_-mtune=native_-Os_-fwrapv_-fPIC_-fPIE_-gdwarf-4_-Wall         20240711        20240625
19624804  32323 56 4   50749 824 1604   T:aes-ni-only        gcc_-march=native_-mtune=native_-O_-fwrapv_-fPIC_-fPIE_-gdwarf-4_-Wall          20240711        20240625
19782912  30000 48 4   49070 888 1604   T:portable           gcc_-march=native_-mtune=native_-O_-fwrapv_-fPIC_-fPIE_-gdwarf-4_-Wall          20240711        20240625

Compiler output


aes.c: aes.c:9:4: error: "This code requries support for AES_NI and SSSE3"
aes.c: #  error "This code requries support for AES_NI and SSSE3"
aes.c:    ^
aes.c: 1 error generated.

Number of similar (implementation,compiler) pairs: 6, namely:
Implementation       Compiler
T:aes-ni-and-pclmul  clang -mcpu=native -O3 -fwrapv -Qunused-arguments -fPIC -fPIE -gdwarf-4 -Wall (Debian_Clang_16.0.6_(27+b1))
T:aes-ni-only        clang -mcpu=native -O3 -fwrapv -Qunused-arguments -fPIC -fPIE -gdwarf-4 -Wall (Debian_Clang_16.0.6_(27+b1))
T:avx2               clang -mcpu=native -O3 -fwrapv -Qunused-arguments -fPIC -fPIE -gdwarf-4 -Wall (Debian_Clang_16.0.6_(27+b1))
T:avx512             clang -mcpu=native -O3 -fwrapv -Qunused-arguments -fPIC -fPIE -gdwarf-4 -Wall (Debian_Clang_16.0.6_(27+b1))
T:avx512-vpclmul     clang -mcpu=native -O3 -fwrapv -Qunused-arguments -fPIC -fPIE -gdwarf-4 -Wall (Debian_Clang_16.0.6_(27+b1))
T:ches2021           clang -mcpu=native -O3 -fwrapv -Qunused-arguments -fPIC -fPIE -gdwarf-4 -Wall (Debian_Clang_16.0.6_(27+b1))

Compiler output


decode.c: In file included from decode.c:39:
decode.c: ./gf2x.h:22:10: error: always_inline function '_mm512_loadu_si512' requires target feature 'avx512f', but would be inlined into function 'gf2x_mod_add' that is compiled without support for 'avx512f'
decode.c:     va = LOAD(&a_qwords[i]);
decode.c:          ^
decode.c: ./x86_64_intrinsic.h:40:27: note: expanded from macro 'LOAD'
decode.c: #  define LOAD(mem)       _mm512_loadu_si512((mem))
decode.c:                           ^
decode.c: In file included from decode.c:39:
decode.c: ./gf2x.h:22:10: error: AVX vector return of type '__m512i' (vector of 8 'long long' values) without 'avx512f' enabled changes the ABI
decode.c: ./x86_64_intrinsic.h:40:27: note: expanded from macro 'LOAD'
decode.c: #  define LOAD(mem)       _mm512_loadu_si512((mem))
decode.c:                           ^
decode.c: In file included from decode.c:39:
decode.c: ./gf2x.h:23:10: error: always_inline function '_mm512_loadu_si512' requires target feature 'avx512f', but would be inlined into function 'gf2x_mod_add' that is compiled without support for 'avx512f'
decode.c:     vb = LOAD(&b_qwords[i]);
decode.c:          ^
decode.c: ./x86_64_intrinsic.h:40:27: note: expanded from macro 'LOAD'
decode.c: #  define LOAD(mem)       _mm512_loadu_si512((mem))
decode.c:                           ^
decode.c: In file included from decode.c:39:
decode.c: ./gf2x.h:23:10: error: AVX vector return of type '__m512i' (vector of 8 'long long' values) without 'avx512f' enabled changes the ABI
decode.c: ./x86_64_intrinsic.h:40:27: note: expanded from macro 'LOAD'
decode.c: #  define LOAD(mem)       _mm512_loadu_si512((mem))
decode.c:                           ^
decode.c: In file included from decode.c:39:
decode.c: ...

Number of similar (implementation,compiler) pairs: 8, namely:
Implementation    Compiler
T:avx512          clang -march=native -O2 -fwrapv -Qunused-arguments -fPIC -fPIE -gdwarf-4 -Wall (Debian_Clang_16.0.6_(27+b1))
T:avx512          clang -march=native -O3 -fwrapv -Qunused-arguments -fPIC -fPIE -gdwarf-4 -Wall (Debian_Clang_16.0.6_(27+b1))
T:avx512          clang -march=native -O -fwrapv -Qunused-arguments -fPIC -fPIE -gdwarf-4 -Wall (Debian_Clang_16.0.6_(27+b1))
T:avx512          clang -march=native -Os -fwrapv -Qunused-arguments -fPIC -fPIE -gdwarf-4 -Wall (Debian_Clang_16.0.6_(27+b1))
T:avx512-vpclmul  clang -march=native -O2 -fwrapv -Qunused-arguments -fPIC -fPIE -gdwarf-4 -Wall (Debian_Clang_16.0.6_(27+b1))
T:avx512-vpclmul  clang -march=native -O3 -fwrapv -Qunused-arguments -fPIC -fPIE -gdwarf-4 -Wall (Debian_Clang_16.0.6_(27+b1))
T:avx512-vpclmul  clang -march=native -O -fwrapv -Qunused-arguments -fPIC -fPIE -gdwarf-4 -Wall (Debian_Clang_16.0.6_(27+b1))
T:avx512-vpclmul  clang -march=native -Os -fwrapv -Qunused-arguments -fPIC -fPIE -gdwarf-4 -Wall (Debian_Clang_16.0.6_(27+b1))

Compiler output


decode.c: In file included from decode.c:39:
decode.c: gf2x.h: In function 'gf2x_mod_add':
decode.c: gf2x.h:22:8: warning: AVX512F vector return without AVX512F enabled changes the ABI [-Wpsabi]
decode.c:    22 |     va = LOAD(&a_qwords[i]);
decode.c: In file included from /usr/lib/gcc/x86_64-linux-gnu/13/include/immintrin.h:53,
decode.c:                  from x86_64_intrinsic.h:20,
decode.c:                  from defs.h:103,
decode.c:                  from bike_defs.h:10,
decode.c:                  from types.h:13,
decode.c:                  from decode.h:10,
decode.c:                  from decode.c:37:
decode.c: /usr/lib/gcc/x86_64-linux-gnu/13/include/avx512fintrin.h:6532:1: error: inlining failed in call to 'always_inline' '_mm512_storeu_si512': target specific option mismatch
decode.c:  6532 | _mm512_storeu_si512 (void *__P, __m512i __A)
decode.c:       | ^~~~~~~~~~~~~~~~~~~
decode.c: x86_64_intrinsic.h:41:27: note: called from here
decode.c:    41 | #  define STORE(mem, reg) _mm512_storeu_si512((mem), (reg))
decode.c:       |                           ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
decode.c: gf2x.h:25:5: note: in expansion of macro 'STORE'
decode.c:    25 |     STORE(&c_qwords[i], va ^ vb);
decode.c:       |     ^~~~~
decode.c: /usr/lib/gcc/x86_64-linux-gnu/13/include/avx512fintrin.h:6499:1: error: inlining failed in call to 'always_inline' '_mm512_loadu_si512': target specific option mismatch
decode.c:  6499 | _mm512_loadu_si512 (void const *__P)
decode.c:       | ^~~~~~~~~~~~~~~~~~
decode.c: x86_64_intrinsic.h:40:27: note: called from here
decode.c:    40 | #  define LOAD(mem)       _mm512_loadu_si512((mem))
decode.c: ...

Number of similar (implementation,compiler) pairs: 8, namely:
Implementation    Compiler
T:avx512          gcc -march=native -mtune=native -O2 -fwrapv -fPIC -fPIE -gdwarf-4 -Wall (13.3.0)
T:avx512          gcc -march=native -mtune=native -O3 -fwrapv -fPIC -fPIE -gdwarf-4 -Wall (13.3.0)
T:avx512          gcc -march=native -mtune=native -O -fwrapv -fPIC -fPIE -gdwarf-4 -Wall (13.3.0)
T:avx512          gcc -march=native -mtune=native -Os -fwrapv -fPIC -fPIE -gdwarf-4 -Wall (13.3.0)
T:avx512-vpclmul  gcc -march=native -mtune=native -O2 -fwrapv -fPIC -fPIE -gdwarf-4 -Wall (13.3.0)
T:avx512-vpclmul  gcc -march=native -mtune=native -O3 -fwrapv -fPIC -fPIE -gdwarf-4 -Wall (13.3.0)
T:avx512-vpclmul  gcc -march=native -mtune=native -O -fwrapv -fPIC -fPIE -gdwarf-4 -Wall (13.3.0)
T:avx512-vpclmul  gcc -march=native -mtune=native -Os -fwrapv -fPIC -fPIE -gdwarf-4 -Wall (13.3.0)
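
Note on the avx512 and avx512-vpclmul failures above: both clang and gcc reject the 512-bit intrinsics because the functions are compiled without avx512f, which this Haswell machine does not provide. The following is a minimal sketch, not taken from the BIKE sources, of one common way to keep such a load/XOR/store loop compilable on older targets: guard the 512-bit path with the compiler-defined __AVX512F__ macro and fall back to scalar code otherwise. The name xor_qwords and its signature are hypothetical and only mirror the pattern visible in the error messages.

```c
#include <stddef.h>
#include <stdint.h>

#if defined(__AVX512F__)
#  include <immintrin.h>
#endif

/* Hypothetical helper (not the BIKE gf2x_mod_add): c[i] = a[i] ^ b[i]
 * for n 64-bit words. */
void xor_qwords(uint64_t *c, const uint64_t *a, const uint64_t *b, size_t n)
{
    size_t i = 0;

#if defined(__AVX512F__)
    /* This loop is only compiled when the target enables AVX-512F,
     * so the intrinsics never reach a compiler that cannot inline them. */
    for (; i + 8 <= n; i += 8) {
        __m512i va = _mm512_loadu_si512(&a[i]);
        __m512i vb = _mm512_loadu_si512(&b[i]);
        _mm512_storeu_si512(&c[i], _mm512_xor_si512(va, vb));
    }
#endif

    /* Scalar path: handles everything without AVX-512F, and the
     * remaining tail words when the vector loop ran. */
    for (; i < n; i++) {
        c[i] = a[i] ^ b[i];
    }
}
```

With the compiler options listed above, only the scalar loop would be compiled on this machine; adding a flag such as -mavx512f would enable the vector loop, but the resulting binary could not run on a CPU without AVX-512F.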

Compiler output


decode.c: decode.c:211:15: warning: unused function 'adder_size_53' [-Wunused-function]
decode.c: _INLINE_ void adder_size_53(OUT upc_t *upc,
decode.c:               ^
decode.c: decode.c:246:15: warning: unused function 'bit_sliced_adder_test' [-Wunused-function]
decode.c: _INLINE_ void bit_sliced_adder_test(OUT upc_t *upc,
decode.c:               ^
decode.c: 2 warnings generated.
gf2x_mul.c: gf2x_mul.c:116:15: warning: function 'karatzuba' is not needed and will not be emitted [-Wunneeded-internal-declaration]
gf2x_mul.c: _INLINE_ void karatzuba(OUT uint64_t *c,
gf2x_mul.c:               ^
gf2x_mul.c: 1 warning generated.
rkara3_mul_avx2.c: rkara3_mul_avx2.c:11:9: warning: unused function 'msbyte' [-Wunused-function]
rkara3_mul_avx2.c: __m256i msbyte( __m256i a ) { return _mm256_permute4x64_epi64(_mm256_srli_si256(a,15),0xfe); } // 11,11,11,10
rkara3_mul_avx2.c:         ^
rkara3_mul_avx2.c: rkara3_mul_avx2.c:169:6: warning: unused function 'mul_2bits_test' [-Wunused-function]
rkara3_mul_avx2.c: void mul_2bits_test( uint8_t *c , const uint8_t *a , uint8_t b , int len )
rkara3_mul_avx2.c:      ^
rkara3_mul_avx2.c: 2 warnings generated.

Number of similar (implementation,compiler) pairs: 4, namely:
Implementation  Compiler
T:ches2021      clang -march=native -O2 -fwrapv -Qunused-arguments -fPIC -fPIE -gdwarf-4 -Wall (Debian_Clang_16.0.6_(27+b1))
T:ches2021      clang -march=native -O3 -fwrapv -Qunused-arguments -fPIC -fPIE -gdwarf-4 -Wall (Debian_Clang_16.0.6_(27+b1))
T:ches2021      clang -march=native -O -fwrapv -Qunused-arguments -fPIC -fPIE -gdwarf-4 -Wall (Debian_Clang_16.0.6_(27+b1))
T:ches2021      clang -march=native -Os -fwrapv -Qunused-arguments -fPIC -fPIE -gdwarf-4 -Wall (Debian_Clang_16.0.6_(27+b1))

Compiler output


rkara3_mul_avx2.c: rkara3_mul_avx2.c:169:6: warning: 'mul_2bits_test' defined but not used [-Wunused-function]
rkara3_mul_avx2.c:   169 | void mul_2bits_test( uint8_t *c , const uint8_t *a , uint8_t b , int len )
rkara3_mul_avx2.c:       |      ^~~~~~~~~~~~~~

Number of similar (implementation,compiler) pairs: 4, namely:
Implementation  Compiler
T:ches2021      gcc -march=native -mtune=native -O2 -fwrapv -fPIC -fPIE -gdwarf-4 -Wall (13.3.0)
T:ches2021      gcc -march=native -mtune=native -O3 -fwrapv -fPIC -fPIE -gdwarf-4 -Wall (13.3.0)
T:ches2021      gcc -march=native -mtune=native -O -fwrapv -fPIC -fPIE -gdwarf-4 -Wall (13.3.0)
T:ches2021      gcc -march=native -mtune=native -Os -fwrapv -fPIC -fPIE -gdwarf-4 -Wall (13.3.0)