Implementation notes: amd64, hydra7, crypto_kem/bikel3

Computer: hydra7
Microarchitecture: amd64; Sandy Bridge+AES (206a7)
Architecture: amd64
CPU ID: GenuineIntel-000206a7-bfebfbff
SUPERCOP version: 20240625
Operation: crypto_kem
Primitive: bikel3
TimeObject sizeTest sizeImplementationCompilerBenchmark dateSUPERCOP version
3287670886676 56 4105157 876 1764T:aes-ni-and-pclmulgcc_-march=native_-mtune=native_-O3_-fwrapv_-fPIC_-fPIE_-gdwarf-4_-Wall2024062920240625
4038920443932 56 461109 876 1764T:aes-ni-and-pclmulgcc_-march=native_-mtune=native_-O2_-fwrapv_-fPIC_-fPIE_-gdwarf-4_-Wall2024062920240625
4069141043078 56 459813 876 1764T:aes-ni-and-pclmulgcc_-march=native_-mtune=native_-O_-fwrapv_-fPIC_-fPIE_-gdwarf-4_-Wall2024062920240625
4425022430897 56 446701 868 1732T:aes-ni-and-pclmulgcc_-march=native_-mtune=native_-Os_-fwrapv_-fPIC_-fPIE_-gdwarf-4_-Wall2024062920240625
6108868898019 48 4117158 940 1764T:portablegcc_-march=native_-mtune=native_-O3_-fwrapv_-fPIC_-fPIE_-gdwarf-4_-Wall2024062920240625
61205461101884 56 4120333 876 1764T:aes-ni-onlygcc_-march=native_-mtune=native_-O3_-fwrapv_-fPIC_-fPIE_-gdwarf-4_-Wall2024062920240625
9893560540653 48 458494 940 1764T:portablegcc_-march=native_-mtune=native_-O2_-fwrapv_-fPIC_-fPIE_-gdwarf-4_-Wall2024062920240625
9904778343500 56 460661 876 1764T:aes-ni-onlygcc_-march=native_-mtune=native_-O2_-fwrapv_-fPIC_-fPIE_-gdwarf-4_-Wall2024062920240625
9945007928274 48 444718 932 1732T:portablegcc_-march=native_-mtune=native_-Os_-fwrapv_-fPIC_-fPIE_-gdwarf-4_-Wall2024062920240625
10118150440044 48 457534 940 1764T:portablegcc_-march=native_-mtune=native_-O_-fwrapv_-fPIC_-fPIE_-gdwarf-4_-Wall2024062920240625
10212401642391 56 459173 876 1764T:aes-ni-onlygcc_-march=native_-mtune=native_-O_-fwrapv_-fPIC_-fPIE_-gdwarf-4_-Wall2024062920240625
10301124130627 56 446349 868 1732T:aes-ni-onlygcc_-march=native_-mtune=native_-Os_-fwrapv_-fPIC_-fPIE_-gdwarf-4_-Wall2024062920240625

Compiler output


gf2x_ksqr_avx2.c: In file included from /usr/lib/gcc/x86_64-linux-gnu/11/include/immintrin.h:47,
gf2x_ksqr_avx2.c:                  from x86_64_intrinsic.h:20,
gf2x_ksqr_avx2.c:                  from defs.h:103,
gf2x_ksqr_avx2.c:                  from bike_defs.h:10,
gf2x_ksqr_avx2.c:                  from types.h:13,
gf2x_ksqr_avx2.c:                  from utilities.h:13,
gf2x_ksqr_avx2.c:                  from cleanup.h:10,
gf2x_ksqr_avx2.c:                  from gf2x_ksqr_avx2.c:13:
gf2x_ksqr_avx2.c: gf2x_ksqr_avx2.c: In function 'bytes_to_bin':
gf2x_ksqr_avx2.c: /usr/lib/gcc/x86_64-linux-gnu/11/include/avx2intrin.h:433:1: error: inlining failed in call to 'always_inline' '_mm256_movemask_epi8': target specific option mismatch
gf2x_ksqr_avx2.c:   433 | _mm256_movemask_epi8 (__m256i __A)
gf2x_ksqr_avx2.c:       | ^~~~~~~~~~~~~~~~~~~~
gf2x_ksqr_avx2.c: In file included from defs.h:103,
gf2x_ksqr_avx2.c:                  from bike_defs.h:10,
gf2x_ksqr_avx2.c:                  from types.h:13,
gf2x_ksqr_avx2.c:                  from utilities.h:13,
gf2x_ksqr_avx2.c:                  from cleanup.h:10,
gf2x_ksqr_avx2.c:                  from gf2x_ksqr_avx2.c:13:
gf2x_ksqr_avx2.c: x86_64_intrinsic.h:79:23: note: called from here
gf2x_ksqr_avx2.c:    79 | #  define MOVEMASK(a) _mm256_movemask_epi8(a)
gf2x_ksqr_avx2.c:       |                       ^~~~~~~~~~~~~~~~~~~~~~~
gf2x_ksqr_avx2.c: gf2x_ksqr_avx2.c:85:17: note: in expansion of macro 'MOVEMASK'
gf2x_ksqr_avx2.c:    85 |     bin32[i]  = MOVEMASK(t);
gf2x_ksqr_avx2.c:       |                 ^~~~~~~~
gf2x_ksqr_avx2.c: In file included from /usr/lib/gcc/x86_64-linux-gnu/11/include/immintrin.h:47,
gf2x_ksqr_avx2.c: ...

Number of similar (implementation,compiler) pairs: 4, namely:
ImplementationCompiler
T:avx2gcc -march=native -mtune=native -O2 -fwrapv -fPIC -fPIE -gdwarf-4 -Wall (11.4.0)
T:avx2gcc -march=native -mtune=native -O3 -fwrapv -fPIC -fPIE -gdwarf-4 -Wall (11.4.0)
T:avx2gcc -march=native -mtune=native -O -fwrapv -fPIC -fPIE -gdwarf-4 -Wall (11.4.0)
T:avx2gcc -march=native -mtune=native -Os -fwrapv -fPIC -fPIE -gdwarf-4 -Wall (11.4.0)

Compiler output


decode.c: In file included from decode.c:39:
decode.c: gf2x.h: In function 'gf2x_mod_add':
decode.c: gf2x.h:22:8: warning: AVX512F vector return without AVX512F enabled changes the ABI [-Wpsabi]
decode.c:    22 |     va = LOAD(&a_qwords[i]);
decode.c: In file included from /usr/lib/gcc/x86_64-linux-gnu/11/include/immintrin.h:49,
decode.c:                  from x86_64_intrinsic.h:20,
decode.c:                  from defs.h:103,
decode.c:                  from bike_defs.h:10,
decode.c:                  from types.h:13,
decode.c:                  from decode.h:10,
decode.c:                  from decode.c:37:
decode.c: /usr/lib/gcc/x86_64-linux-gnu/11/include/avx512fintrin.h:6481:1: error: inlining failed in call to 'always_inline' '_mm512_storeu_si512': target specific option mismatch
decode.c:  6481 | _mm512_storeu_si512 (void *__P, __m512i __A)
decode.c:       | ^~~~~~~~~~~~~~~~~~~
decode.c: In file included from defs.h:103,
decode.c:                  from bike_defs.h:10,
decode.c:                  from types.h:13,
decode.c:                  from decode.h:10,
decode.c:                  from decode.c:37:
decode.c: x86_64_intrinsic.h:41:27: note: called from here
decode.c:    41 | #  define STORE(mem, reg) _mm512_storeu_si512((mem), (reg))
decode.c:       |                           ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
decode.c: gf2x.h:25:5: note: in expansion of macro 'STORE'
decode.c:    25 |     STORE(&c_qwords[i], va ^ vb);
decode.c:       |     ^~~~~
decode.c: ...

Number of similar (implementation,compiler) pairs: 8, namely:
ImplementationCompiler
T:avx512gcc -march=native -mtune=native -O2 -fwrapv -fPIC -fPIE -gdwarf-4 -Wall (11.4.0)
T:avx512gcc -march=native -mtune=native -O3 -fwrapv -fPIC -fPIE -gdwarf-4 -Wall (11.4.0)
T:avx512gcc -march=native -mtune=native -O -fwrapv -fPIC -fPIE -gdwarf-4 -Wall (11.4.0)
T:avx512gcc -march=native -mtune=native -Os -fwrapv -fPIC -fPIE -gdwarf-4 -Wall (11.4.0)
T:avx512-vpclmulgcc -march=native -mtune=native -O2 -fwrapv -fPIC -fPIE -gdwarf-4 -Wall (11.4.0)
T:avx512-vpclmulgcc -march=native -mtune=native -O3 -fwrapv -fPIC -fPIE -gdwarf-4 -Wall (11.4.0)
T:avx512-vpclmulgcc -march=native -mtune=native -O -fwrapv -fPIC -fPIE -gdwarf-4 -Wall (11.4.0)
T:avx512-vpclmulgcc -march=native -mtune=native -Os -fwrapv -fPIC -fPIE -gdwarf-4 -Wall (11.4.0)

Compiler output


decode.c: In file included from /usr/lib/gcc/x86_64-linux-gnu/11/include/immintrin.h:47,
decode.c:                  from x86_64_intrinsic.h:20,
decode.c:                  from defs.h:106,
decode.c:                  from bike_defs.h:10,
decode.c:                  from types.h:15,
decode.c:                  from decode.h:17,
decode.c:                  from decode.c:39:
decode.c: decode.c: In function 'find_err1':
decode.c: /usr/lib/gcc/x86_64-linux-gnu/11/include/avx2intrin.h:1084:1: error: inlining failed in call to 'always_inline' '_mm256_permute2x128_si256': target specific option mismatch
decode.c:  1084 | _mm256_permute2x128_si256 (__m256i __X, __m256i __Y, const int __M)
decode.c:       | ^~~~~~~~~~~~~~~~~~~~~~~~~
decode.c: decode.c:359:29: note: called from here
decode.c:   359 |                         y = _mm256_permute2x128_si256(buf[j], buf[j+128], 0x31);
decode.c:       |                             ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
decode.c: In file included from /usr/lib/gcc/x86_64-linux-gnu/11/include/immintrin.h:47,
decode.c:                  from x86_64_intrinsic.h:20,
decode.c:                  from defs.h:106,
decode.c:                  from bike_defs.h:10,
decode.c:                  from types.h:15,
decode.c:                  from decode.h:17,
decode.c:                  from decode.c:39:
decode.c: /usr/lib/gcc/x86_64-linux-gnu/11/include/avx2intrin.h:1084:1: error: inlining failed in call to 'always_inline' '_mm256_permute2x128_si256': target specific option mismatch
decode.c:  1084 | _mm256_permute2x128_si256 (__m256i __X, __m256i __Y, const int __M)
decode.c:       | ^~~~~~~~~~~~~~~~~~~~~~~~~
decode.c: decode.c:358:29: note: called from here
decode.c: ...

Number of similar (implementation,compiler) pairs: 4, namely:
ImplementationCompiler
T:ches2021gcc -march=native -mtune=native -O2 -fwrapv -fPIC -fPIE -gdwarf-4 -Wall (11.4.0)
T:ches2021gcc -march=native -mtune=native -O3 -fwrapv -fPIC -fPIE -gdwarf-4 -Wall (11.4.0)
T:ches2021gcc -march=native -mtune=native -O -fwrapv -fPIC -fPIE -gdwarf-4 -Wall (11.4.0)
T:ches2021gcc -march=native -mtune=native -Os -fwrapv -fPIC -fPIE -gdwarf-4 -Wall (11.4.0)