Implementation notes: amd64, whosthere, crypto_kem/ntskem1264

Computer: whosthere
Microarchitecture: amd64; KabyLake (806e9)
Architecture: amd64
CPU ID: GenuineIntel-000806e9-bfebfbff
SUPERCOP version: 20221122
Operation: crypto_kem
Primitive: ntskem1264
TimeObject sizeTest sizeImplementationCompilerBenchmark dateSUPERCOP version
41310296004 6228 16115882 7092 1720T:avx2clang_-march=native_-O2_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2022101920221005
416010152609 6228 16171850 7092 1720T:avx2clang_-march=native_-O3_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2022101920221005
463912102296 6228 16122522 7076 1784T:avx2gcc_-march=native_-mtune=native_-O3_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE2022101920221005
48607867087 6228 1684344 7084 1720T:avx2clang_-march=native_-Os_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2022101920221005
49225766075 6228 1684610 7076 1784T:avx2gcc_-march=native_-mtune=native_-O2_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE2022101920221005
50377364293 6228 1682346 7076 1784T:avx2gcc_-march=native_-mtune=native_-O_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE2022101920221005
507954146668 6228 16166018 7092 1720T:sse2clang_-march=native_-O3_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2022101920221005
51478590666 6228 16110650 7092 1720T:sse2clang_-march=native_-O2_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2022101920221005
54866496231 6228 16116490 7076 1784T:sse2gcc_-march=native_-mtune=native_-O3_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE2022101920221005
57540760483 6228 1678978 7076 1784T:sse2gcc_-march=native_-mtune=native_-O2_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE2022101920221005
58752958678 6228 1676770 7076 1784T:sse2gcc_-march=native_-mtune=native_-O_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE2022101920221005
62592762506 6228 1679776 7084 1720T:sse2clang_-march=native_-Os_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2022101920221005
64504059879 6228 1676905 7068 1752T:avx2gcc_-march=native_-mtune=native_-Os_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE2022101920221005
711911128976 6228 16148786 7092 1720T:sse2clang_-mcpu=native_-O3_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2022101920221005
722856126667 6228 16147538 7092 1720T:optclang_-march=native_-O3_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2022101920221005
73322878680 6228 1696440 7084 1720T:avx2clang_-march=native_-O_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2022101920221005
73511193995 6228 16115914 7076 1784T:optgcc_-march=native_-mtune=native_-O3_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE2022101920221005
74689788455 6228 16109898 7092 1720T:optclang_-march=native_-O2_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2022101920221005
77287454361 6228 1674562 7076 1784T:optgcc_-march=native_-mtune=native_-O2_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE2022101920221005
79055557447 6228 1676168 7084 1720T:optclang_-march=native_-Os_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2022101920221005
79351554746 6228 1671865 7068 1752T:sse2gcc_-march=native_-mtune=native_-Os_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE2022101920221005
86840752539 6228 1672234 7076 1784T:optgcc_-march=native_-mtune=native_-O_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE2022101920221005
87739474875 6228 1692696 7084 1720T:sse2clang_-march=native_-O_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2022101920221005
88314348961 6228 1667481 7068 1752T:optgcc_-march=native_-mtune=native_-Os_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE2022101920221005
937657121580 6228 16142866 7092 1720T:optclang_-mcpu=native_-O3_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2022101920221005
136056070028 6228 1689392 7084 1720T:optclang_-march=native_-O_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2022101920221005
2244270644922 76 1666858 932 1720T:refclang_-march=native_-O2_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2022101920221005
2296987141366 76 1662922 932 1720T:refclang_-mcpu=native_-O3_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2022101920221005
2310737649646 76 1671610 932 1720T:refclang_-march=native_-O3_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2022101920221005
2322259353557 76 1675658 900 1784T:refgcc_-march=native_-mtune=native_-O3_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE2022101920221005
2354598425121 76 1645338 900 1784T:refgcc_-march=native_-mtune=native_-O2_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE2022101920221005
2380623321995 76 1640872 924 1720T:refclang_-march=native_-Os_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2022101920221005
2490962019091 76 1637561 892 1752T:refgcc_-march=native_-mtune=native_-Os_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE2022101920221005
2563762024443 76 1643832 924 1720T:refclang_-march=native_-O_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2022101920221005
2618153222602 76 1642258 900 1784T:refgcc_-march=native_-mtune=native_-O_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE2022101920221005

Compiler output

Implementation: T:avx2
Security model: timingleaks
Compiler: clang -mcpu=native -O3 -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE
bitslice_fft_256.c: bitslice_fft_256.c:87:25: error: always_inline function '_mm256_set_epi64x' requires target feature 'avx', but would be inlined into function 'bitslice_butterflies12_256' that is compiled without support for 'avx'
bitslice_fft_256.c: out[i][b] = _mm256_set_epi64x(-((in[0][b] >> reversal[4*i+3]) & 1),
bitslice_fft_256.c: ^
bitslice_fft_256.c: bitslice_fft_256.c:87:25: error: AVX vector return of type '__m256i' (vector of 4 'long long' values) without 'avx' enabled changes the ABI
bitslice_fft_256.c: bitslice_fft_256.c:99:22: error: '__builtin_ia32_pshufd256' needs target feature avx2
bitslice_fft_256.c: vb = _mm256_shuffle_epi32(tmp[b], _MM_SHUFFLE(3, 2, 3, 2));
bitslice_fft_256.c: ^
bitslice_fft_256.c: /usr/lib/llvm-11/lib/clang/11.0.1/include/avx2intrin.h:470:12: note: expanded from macro '_mm256_shuffle_epi32'
bitslice_fft_256.c: (__m256i)__builtin_ia32_pshufd256((__v8si)(__m256i)(a), (int)(imm))
bitslice_fft_256.c: ^
bitslice_fft_256.c: bitslice_fft_256.c:100:22: error: '__builtin_ia32_pslldqi256_byteshift' needs target feature avx2
bitslice_fft_256.c: va = _mm256_slli_si256(out[k][b], 8);
bitslice_fft_256.c: ^
bitslice_fft_256.c: /usr/lib/llvm-11/lib/clang/11.0.1/include/avx2intrin.h:497:12: note: expanded from macro '_mm256_slli_si256'
bitslice_fft_256.c: (__m256i)__builtin_ia32_pslldqi256_byteshift((__v4di)(__m256i)(a), (int)(imm))
bitslice_fft_256.c: ^
bitslice_fft_256.c: bitslice_fft_256.c:101:22: error: always_inline function '_mm256_xor_si256' requires target feature 'avx2', but would be inlined into function 'bitslice_butterflies12_256' that is compiled without support for 'avx2'
bitslice_fft_256.c: vb = _mm256_xor_si256(va, vb);
bitslice_fft_256.c: ^
bitslice_fft_256.c: bitslice_fft_256.c:101:22: error: AVX vector argument of type '__m256i' (vector of 4 'long long' values) without 'avx' enabled changes the ABI
bitslice_fft_256.c: bitslice_fft_256.c:102:29: error: always_inline function '_mm256_xor_si256' requires target feature 'avx2', but would be inlined into function 'bitslice_butterflies12_256' that is compiled without support for 'avx2'
bitslice_fft_256.c: out[k][b] = _mm256_xor_si256(out[k][b], vb);
bitslice_fft_256.c: ^
bitslice_fft_256.c: bitslice_fft_256.c:102:29: error: AVX vector argument of type '__m256i' (vector of 4 'long long' values) without 'avx' enabled changes the ABI
bitslice_fft_256.c: bitslice_fft_256.c:112:22: error: always_inline function '_mm256_set_epi64x' requires target feature 'avx', but would be inlined into function 'bitslice_butterflies12_256' that is compiled without support for 'avx'
bitslice_fft_256.c: ...

Number of similar (compiler,implementation) pairs: 1, namely:
CompilerImplementations
clang -mcpu=native -O3 -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE T:avx2