Implementation notes: amd64, shoe, crypto_kem/kyber512

Computer: shoe
Microarchitecture: amd64; Broadwell+AES (306d4)
Architecture: amd64
CPU ID: GenuineIntel-000306d4-bfebfbff
SUPERCOP version: 20240425
Operation: crypto_kem
Primitive: kyber512
TimeObject sizeTest sizeImplementationCompilerBenchmark dateSUPERCOP version
67946139174 0 0161488 820 1752avx2clang_-march=native_-O2_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2024051520240425
68982126451 0 0145182 812 1752avx2clang_-march=native_-Os_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2024051520240425
69317141640 0 0164032 820 1752avx2clang_-march=native_-O3_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2024051520240425
71502151932 0 0173976 788 1784avx2gcc_-march=native_-mtune=native_-O3_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE2024051520240425
72294128294 0 0148624 788 1784avx2gcc_-march=native_-mtune=native_-O2_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE2024051520240425
73863124927 0 0143800 780 1752avx2gcc_-march=native_-mtune=native_-Os_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE2024051520240425
74170127973 0 0147422 812 1752avx2clang_-march=native_-O_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2024051520240425
74172125924 0 0145856 788 1784avx2gcc_-march=native_-mtune=native_-O_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE2024051520240425
23941364488 0 094464 820 1752compactclang_-march=native_-O3_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2024051520240425
25385736121 0 066464 820 1752compactclang_-march=native_-O2_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2024051520240425
26492359759 0 090560 788 1784compactgcc_-march=native_-mtune=native_-O3_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE2024051520240425
31506237339 0 067864 820 1752compactclang_-mcpu=native_-O3_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2024051520240425
32131937275 0 059232 820 1752refclang_-march=native_-O3_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2024051520240425
3232876932 0 035022 812 1752compactclang_-march=native_-Os_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2024051520240425
32661526762 0 048616 820 1752refclang_-march=native_-O2_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2024051520240425
33362642889 0 064496 788 1784refgcc_-march=native_-mtune=native_-O3_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE2024051520240425
34618812221 0 031342 812 1752refclang_-march=native_-Os_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2024051520240425
35650930966 0 052960 820 1752refclang_-mcpu=native_-O3_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2024051520240425
35927912575 0 032904 788 1784refgcc_-march=native_-mtune=native_-O2_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE2024051520240425
3638336451 0 035544 788 1784compactgcc_-march=native_-mtune=native_-O2_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE2024051520240425
38535911688 0 031608 788 1784refgcc_-march=native_-mtune=native_-O_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE2024051520240425
4219835827 0 034616 788 1784compactgcc_-march=native_-mtune=native_-O_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE2024051520240425
42802011062 0 029912 780 1752refgcc_-march=native_-mtune=native_-Os_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE2024051520240425
44392113228 0 032990 812 1752refclang_-march=native_-O_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2024051520240425
5294987819 0 036518 812 1752compactclang_-march=native_-O_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2024051520240425
5543295326 0 033032 780 1752compactgcc_-march=native_-mtune=native_-Os_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE2024051520240425

Compiler output

Implementation: avx2
Security model: constbranchindex
Compiler: clang -mcpu=native -O3 -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE
KeccakP-1600-times4-SIMD256.c: KeccakP-1600-times4-SIMD256.c:146:9: error: always_inline function '_mm256_loadu_si256' requires target feature 'avx', but would be inlined into function 'crypto_kem_kyber512_avx2_constbranchindex_KeccakP1600times4_AddLanesAll' that is compiled without support for 'avx'
KeccakP-1600-times4-SIMD256.c: Xor_In4( 0 );
KeccakP-1600-times4-SIMD256.c: ^
KeccakP-1600-times4-SIMD256.c: KeccakP-1600-times4-SIMD256.c:135:42: note: expanded from macro 'Xor_In4'
KeccakP-1600-times4-SIMD256.c: #define Xor_In4( argIndex ) lanes0 = LOAD256u( curData0[argIndex]),\
KeccakP-1600-times4-SIMD256.c: ^
KeccakP-1600-times4-SIMD256.c: KeccakP-1600-times4-SIMD256.c:44:37: note: expanded from macro 'LOAD256u'
KeccakP-1600-times4-SIMD256.c: #define LOAD256u(a) _mm256_loadu_si256((const V256 *)&(a))
KeccakP-1600-times4-SIMD256.c: ^
KeccakP-1600-times4-SIMD256.c: KeccakP-1600-times4-SIMD256.c:146:9: error: AVX vector return of type '__m256i' (vector of 4 'long long' values) without 'avx' enabled changes the ABI
KeccakP-1600-times4-SIMD256.c: KeccakP-1600-times4-SIMD256.c:135:42: note: expanded from macro 'Xor_In4'
KeccakP-1600-times4-SIMD256.c: #define Xor_In4( argIndex ) lanes0 = LOAD256u( curData0[argIndex]),\
KeccakP-1600-times4-SIMD256.c: ^
KeccakP-1600-times4-SIMD256.c: KeccakP-1600-times4-SIMD256.c:44:37: note: expanded from macro 'LOAD256u'
KeccakP-1600-times4-SIMD256.c: #define LOAD256u(a) _mm256_loadu_si256((const V256 *)&(a))
KeccakP-1600-times4-SIMD256.c: ^
KeccakP-1600-times4-SIMD256.c: KeccakP-1600-times4-SIMD256.c:146:9: error: always_inline function '_mm256_loadu_si256' requires target feature 'avx', but would be inlined into function 'crypto_kem_kyber512_avx2_constbranchindex_KeccakP1600times4_AddLanesAll' that is compiled without support for 'avx'
KeccakP-1600-times4-SIMD256.c: KeccakP-1600-times4-SIMD256.c:136:42: note: expanded from macro 'Xor_In4'
KeccakP-1600-times4-SIMD256.c: lanes1 = LOAD256u( curData1[argIndex]),\
KeccakP-1600-times4-SIMD256.c: ^
KeccakP-1600-times4-SIMD256.c: KeccakP-1600-times4-SIMD256.c:44:37: note: expanded from macro 'LOAD256u'
KeccakP-1600-times4-SIMD256.c: #define LOAD256u(a) _mm256_loadu_si256((const V256 *)&(a))
KeccakP-1600-times4-SIMD256.c: ^
KeccakP-1600-times4-SIMD256.c: KeccakP-1600-times4-SIMD256.c:146:9: error: AVX vector return of type '__m256i' (vector of 4 'long long' values) without 'avx' enabled changes the ABI
KeccakP-1600-times4-SIMD256.c: KeccakP-1600-times4-SIMD256.c:136:42: note: expanded from macro 'Xor_In4'
KeccakP-1600-times4-SIMD256.c: ...

Number of similar (compiler,implementation) pairs: 1, namely:
CompilerImplementations
clang -mcpu=native -O3 -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE avx2