Implementation notes: amd64, h8bobcat, crypto_hash/shake128

Computer: h8bobcat
Microarchitecture: amd64; Bobcat (500f10)
Architecture: amd64
CPU ID: AuthenticAMD-00500f20-178bfbff
SUPERCOP version: 20240107
Operation: crypto_hash
Primitive: shake128
TimeObject sizeTest sizeImplementationCompilerBenchmark dateSUPERCOP version
26196713 0 067822 776 800oncore64bitsgcc_-march=native_-mtune=native_-O2_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE2023121520231212
26220703 0 067518 776 800oncore64bitsgcc_-march=native_-mtune=native_-O_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE2023121520231212
262681902 0 070534 776 800oncore64bitsgcc_-march=native_-mtune=native_-O3_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE2023121520231212
26272581 0 066398 808 728oncore64bitsclang_-march=native_-Os_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2023121520231212
26277958 0 069508 816 728oncore64bitsclang_-march=native_-O3_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2023121520231212
26282958 0 068364 816 728oncore64bitsclang_-march=native_-O2_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2023121520231212
26311617 0 066542 768 768oncore64bitsgcc_-march=native_-mtune=native_-Os_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE2023121520231212
263581213 0 069980 816 728oncore64bitsclang_-mcpu=native_-O3_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2023121520231212
26439711 0 067140 816 728oncore64bitsclang_-march=native_-O_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2023121520231212
29640219 0 012920 864 728opensslclang_-march=native_-O3_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2023121520231212
29664219 0 010194 856 728opensslclang_-march=native_-Os_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2023121520231212
29702219 0 011776 864 728opensslclang_-march=native_-O2_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2023121520231212
29859217 0 010808 864 728opensslclang_-march=native_-O_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2023121520231212
29915219 0 013136 864 728opensslclang_-mcpu=native_-O3_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2023121520231212
30001295 0 013625 840 768opensslgcc_-march=native_-mtune=native_-O3_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE2023121520231212
30624266 0 011664 832 768opensslgcc_-march=native_-mtune=native_-O_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE2023121520231212
31151295 0 012081 840 768opensslgcc_-march=native_-mtune=native_-O2_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE2023121520231212
31592244 0 010628 816 768opensslgcc_-march=native_-mtune=native_-Os_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE2023121520231212
325284192 360 01719865 144664 10264cryptoppg++_-march=native_-mtune=native_-O2_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE2023121520231212
325333953 360 01721311 144656 10264cryptoppg++_-march=native_-mtune=native_-O3_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE2023121520231212
325473290 360 01716469 144696 10168cryptoppclang++_-march=native_-O_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2023121520231212
325573879 360 01719543 144664 10264cryptoppg++_-march=native_-mtune=native_-O_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE2023121520231212
325573043 360 01718159 144696 10232cryptoppg++_-march=native_-mtune=native_-Os_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE2023121520231212
325803259 360 01717433 144696 10168cryptoppclang++_-march=native_-O2_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2023121520231212
325803236 360 01718537 144696 10168cryptoppclang++_-march=native_-O3_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2023121520231212
325803534 360 01716647 144704 10168cryptoppclang++_-march=native_-Os_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2023121520231212
648233598 0 029015 824 728oncore32bitsclang_-mcpu=native_-O3_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2023121520231212
648332462 0 026519 824 728oncore32bitsclang_-march=native_-O2_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2023121520231212
648423630 0 028831 824 728oncore32bitsclang_-march=native_-O3_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2023121520231212
659026389 0 032316 792 800oncore32bitsgcc_-march=native_-mtune=native_-O3_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE2023121520231212
679011250 0 023809 816 728oncore32bitsclang_-march=native_-Os_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2023121520231212
681151304 0 024471 824 728oncore32bitsclang_-march=native_-O_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2023121520231212
687421817 0 026204 792 800oncore32bitsgcc_-march=native_-mtune=native_-O2_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE2023121520231212
704381850 0 025756 776 800oncore32bitsgcc_-march=native_-mtune=native_-O_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE2023121520231212
823222101 0 024959 768 768oncore32bitsgcc_-march=native_-mtune=native_-Os_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE2023121520231212

Compiler output

Implementation: kcp/optimized1600ARMv7A
Security model: constbranchindex
Compiler: clang -march=native -O2 -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE
keccak.s: keccak.s:1:1: error: unexpected token at start of statement
keccak.s: @
keccak.s: ^
keccak.s: keccak.s:2:1: error: unexpected token at start of statement
keccak.s: @ Implementation by the Keccak, Keyak and Ketje Teams, namely, Guido Bertoni,
keccak.s: ^
keccak.s: keccak.s:3:1: error: unexpected token at start of statement
keccak.s: @ Joan Daemen, Michaƫl Peeters, Gilles Van Assche and Ronny Van Keer, hereby
keccak.s: keccak.s:4:1: error: unexpected token at start of statement
keccak.s: @ denoted as "the implementer".
keccak.s: ^
keccak.s: keccak.s:5:1: error: unexpected token at start of statement
keccak.s: @
keccak.s: ^
keccak.s: keccak.s:6:1: error: unexpected token at start of statement
keccak.s: @ For more information, feedback or questions, please refer to our websites:
keccak.s: ^
keccak.s: keccak.s:7:1: error: unexpected token at start of statement
keccak.s: @ http://keccak.noekeon.org/
keccak.s: ^
keccak.s: keccak.s:8:1: error: unexpected token at start of statement
keccak.s: @ http://keyak.noekeon.org/
keccak.s: ^
keccak.s: keccak.s:9:1: error: unexpected token at start of statement
keccak.s: @ http://ketje.noekeon.org/
keccak.s: ...

Number of similar (compiler,implementation) pairs: 5, namely:
CompilerImplementations
clang -march=native -O2 -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE kcp/optimized1600ARMv7A
clang -march=native -O3 -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE kcp/optimized1600ARMv7A
clang -march=native -O -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE kcp/optimized1600ARMv7A
clang -march=native -Os -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE kcp/optimized1600ARMv7A
clang -mcpu=native -O3 -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE kcp/optimized1600ARMv7A

Compiler output

Implementation: kcp/optimized1600ARMv7A
Security model: constbranchindex
Compiler: gcc -march=native -mtune=native -O2 -fomit-frame-pointer -fwrapv -fPIC -fPIE
keccak.s: keccak.s: Assembler messages:
keccak.s: keccak.s:1: Error: junk at end of line, first unrecognized character is `@'
keccak.s: keccak.s:2: Error: junk at end of line, first unrecognized character is `@'
keccak.s: keccak.s:3: Error: junk at end of line, first unrecognized character is `@'
keccak.s: keccak.s:4: Error: junk at end of line, first unrecognized character is `@'
keccak.s: keccak.s:5: Error: junk at end of line, first unrecognized character is `@'
keccak.s: keccak.s:6: Error: junk at end of line, first unrecognized character is `@'
keccak.s: keccak.s:7: Error: junk at end of line, first unrecognized character is `@'
keccak.s: keccak.s:8: Error: junk at end of line, first unrecognized character is `@'
keccak.s: keccak.s:9: Error: junk at end of line, first unrecognized character is `@'
keccak.s: keccak.s:10: Error: junk at end of line, first unrecognized character is `@'
keccak.s: keccak.s:11: Error: junk at end of line, first unrecognized character is `@'
keccak.s: keccak.s:12: Error: junk at end of line, first unrecognized character is `@'
keccak.s: keccak.s:13: Error: junk at end of line, first unrecognized character is `@'
keccak.s: keccak.s:14: Error: junk at end of line, first unrecognized character is `@'
keccak.s: keccak.s:16: Error: junk at end of line, first unrecognized character is `@'
keccak.s: keccak.s:17: Error: junk at end of line, first unrecognized character is `@'
keccak.s: keccak.s:18: Error: junk at end of line, first unrecognized character is `@'
keccak.s: keccak.s:21: Error: unknown pseudo-op: `.syntax'
keccak.s: keccak.s:22: Error: unknown pseudo-op: `.fpu'
keccak.s: keccak.s:23: Error: unknown pseudo-op: `.arm'
keccak.s: keccak.s:26: Error: junk at end of line, first unrecognized character is `@'
keccak.s: keccak.s:27: Error: junk at end of line, first unrecognized character is `@'
keccak.s: keccak.s:29: Error: junk at end of line, first unrecognized character is `@'
keccak.s: keccak.s:56: Error: junk at end of line, first unrecognized character is `@'
keccak.s: ...

Number of similar (compiler,implementation) pairs: 4, namely:
CompilerImplementations
gcc -march=native -mtune=native -O2 -fomit-frame-pointer -fwrapv -fPIC -fPIE kcp/optimized1600ARMv7A
gcc -march=native -mtune=native -O3 -fomit-frame-pointer -fwrapv -fPIC -fPIE kcp/optimized1600ARMv7A
gcc -march=native -mtune=native -O -fomit-frame-pointer -fwrapv -fPIC -fPIE kcp/optimized1600ARMv7A
gcc -march=native -mtune=native -Os -fomit-frame-pointer -fwrapv -fPIC -fPIE kcp/optimized1600ARMv7A

Compiler output

Implementation: kcp/optimized1600ARMv8A
Security model: constbranchindex
Compiler: clang -march=native -O2 -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE
keccak.s: keccak.s:258:20: error: unknown token in expression
keccak.s: movi v0.2d, #0
keccak.s: ^
keccak.s: keccak.s:259:20: error: unknown token in expression
keccak.s: movi v1.2d, #0
keccak.s: ^
keccak.s: keccak.s:260:20: error: unknown token in expression
keccak.s: movi v2.2d, #0
keccak.s: ^
keccak.s: keccak.s:261:20: error: unknown token in expression
keccak.s: movi v3.2d, #0
keccak.s: ^
keccak.s: keccak.s:262:15: error: unknown token in expression
keccak.s: st4 { v0.2d, v1.2d, v2.2d, v3.2d }, [x0], #64 // Clear 8lanes=64 bytes at a time
keccak.s: ^
keccak.s: keccak.s:263:15: error: unknown token in expression
keccak.s: st4 { v0.2d, v1.2d, v2.2d, v3.2d }, [x0], #64
keccak.s: ^
keccak.s: keccak.s:264:15: error: unknown token in expression
keccak.s: st4 { v0.2d, v1.2d, v2.2d, v3.2d }, [x0], #64
keccak.s: ^
keccak.s: keccak.s:265:15: error: unknown token in expression
keccak.s: st1 { v0.d }[0], [x0], #8
keccak.s: ^
keccak.s: keccak.s:276:20: error: expected ']' in brackets expression
keccak.s: ...

Number of similar (compiler,implementation) pairs: 5, namely:
CompilerImplementations
clang -march=native -O2 -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE kcp/optimized1600ARMv8A
clang -march=native -O3 -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE kcp/optimized1600ARMv8A
clang -march=native -O -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE kcp/optimized1600ARMv8A
clang -march=native -Os -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE kcp/optimized1600ARMv8A
clang -mcpu=native -O3 -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE kcp/optimized1600ARMv8A

Compiler output

Implementation: kcp/optimized1600ARMv8A
Security model: constbranchindex
Compiler: gcc -march=native -mtune=native -O2 -fomit-frame-pointer -fwrapv -fPIC -fPIE
keccak.s: keccak.s: Assembler messages:
keccak.s: keccak.s:258: Error: no such instruction: `movi v0.2d,'
keccak.s: keccak.s:259: Error: no such instruction: `movi v1.2d,'
keccak.s: keccak.s:260: Error: no such instruction: `movi v2.2d,'
keccak.s: keccak.s:261: Error: no such instruction: `movi v3.2d,'
keccak.s: keccak.s:262: Error: no such instruction: `st4 { v0.2d,v1.2d,v2.2d,v3.2d },[x0],'
keccak.s: keccak.s:263: Error: no such instruction: `st4 { v0.2d,v1.2d,v2.2d,v3.2d },[x0],'
keccak.s: keccak.s:264: Error: no such instruction: `st4 { v0.2d,v1.2d,v2.2d,v3.2d },[x0],'
keccak.s: keccak.s:265: Error: no such instruction: `st1 { v0.d }[0],[x0],'
keccak.s: keccak.s:276: Error: no such instruction: `ldrb w3,[x0,x2]'
keccak.s: keccak.s:277: Error: no such instruction: `eor w3,w3,w1'
keccak.s: keccak.s:278: Error: too many memory references for `str'
keccak.s: keccak.s:289: Error: too many memory references for `add'
keccak.s: keccak.s:290: Error: too many memory references for `sub'
keccak.s: keccak.s:291: Error: no such instruction: `b.cc KeccakP1600_AddBytes_Exit//length 0,move along'
keccak.s: keccak.s:293: Error: too many memory references for `sub'
keccak.s: keccak.s:294: Error: no such instruction: `b.cc KeccakP1600_AddBytes_Lanes//Jump if length is negative'
keccak.s: keccak.s:295: Error: no such instruction: `ld4 { v0.2d,v1.2d,v2.2d,v3.2d },[x0]'
keccak.s: keccak.s:296: Error: no such instruction: `ld4 { v4.2d,v5.2d,v6.2d,v7.2d },[x1],'
keccak.s: keccak.s:297: Error: no such instruction: `eor v0.16b,v0.16b,v4.16b'
keccak.s: keccak.s:298: Error: no such instruction: `eor v1.16b,v1.16b,v5.16b'
keccak.s: keccak.s:299: Error: no such instruction: `eor v2.16b,v2.16b,v6.16b'
keccak.s: keccak.s:300: Error: no such instruction: `eor v3.16b,v3.16b,v7.16b'
keccak.s: keccak.s:301: Error: no such instruction: `st4 { v0.2d,v1.2d,v2.2d,v3.2d },[x0],'
keccak.s: keccak.s:302: Error: no such instruction: `b KeccakP1600_AddBytes_8LanesLoop'
keccak.s: ...

Number of similar (compiler,implementation) pairs: 4, namely:
CompilerImplementations
gcc -march=native -mtune=native -O2 -fomit-frame-pointer -fwrapv -fPIC -fPIE kcp/optimized1600ARMv8A
gcc -march=native -mtune=native -O3 -fomit-frame-pointer -fwrapv -fPIC -fPIE kcp/optimized1600ARMv8A
gcc -march=native -mtune=native -O -fomit-frame-pointer -fwrapv -fPIC -fPIE kcp/optimized1600ARMv8A
gcc -march=native -mtune=native -Os -fomit-frame-pointer -fwrapv -fPIC -fPIE kcp/optimized1600ARMv8A

Compiler output

Implementation: kcp/optimized1600AVX2
Security model: constbranchindex
Compiler: clang -march=native -O2 -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE
KeccakP-1600-AVX2.c: KeccakP-1600-AVX2.c:506:13: error: always_inline function '_mm256_loadu_si256' requires target feature 'avx', but would be inlined into function 'crypto_hash_shake128_kcp_optimized1600AVX2_constbranchindex_KeccakP1600_AddBytes' that is compiled without support for 'avx'
KeccakP-1600-AVX2.c: s->a0 = LOAD(t + 0*5);
KeccakP-1600-AVX2.c: ^
KeccakP-1600-AVX2.c: KeccakP-1600-AVX2.c:62:41: note: expanded from macro 'LOAD'
KeccakP-1600-AVX2.c: #define LOAD(p) _mm256_loadu_si256((const __m256i *)(p))
KeccakP-1600-AVX2.c: ^
KeccakP-1600-AVX2.c: KeccakP-1600-AVX2.c:506:13: error: AVX vector return of type '__m256i' (vector of 4 'long long' values) without 'avx' enabled changes the ABI
KeccakP-1600-AVX2.c: KeccakP-1600-AVX2.c:62:41: note: expanded from macro 'LOAD'
KeccakP-1600-AVX2.c: #define LOAD(p) _mm256_loadu_si256((const __m256i *)(p))
KeccakP-1600-AVX2.c: ^
KeccakP-1600-AVX2.c: KeccakP-1600-AVX2.c:507:13: error: always_inline function '_mm256_loadu_si256' requires target feature 'avx', but would be inlined into function 'crypto_hash_shake128_kcp_optimized1600AVX2_constbranchindex_KeccakP1600_AddBytes' that is compiled without support for 'avx'
KeccakP-1600-AVX2.c: s->a1 = LOAD(t + 1*5);
KeccakP-1600-AVX2.c: ^
KeccakP-1600-AVX2.c: KeccakP-1600-AVX2.c:62:41: note: expanded from macro 'LOAD'
KeccakP-1600-AVX2.c: #define LOAD(p) _mm256_loadu_si256((const __m256i *)(p))
KeccakP-1600-AVX2.c: ^
KeccakP-1600-AVX2.c: KeccakP-1600-AVX2.c:507:13: error: AVX vector return of type '__m256i' (vector of 4 'long long' values) without 'avx' enabled changes the ABI
KeccakP-1600-AVX2.c: KeccakP-1600-AVX2.c:62:41: note: expanded from macro 'LOAD'
KeccakP-1600-AVX2.c: #define LOAD(p) _mm256_loadu_si256((const __m256i *)(p))
KeccakP-1600-AVX2.c: ^
KeccakP-1600-AVX2.c: KeccakP-1600-AVX2.c:508:13: error: always_inline function '_mm256_loadu_si256' requires target feature 'avx', but would be inlined into function 'crypto_hash_shake128_kcp_optimized1600AVX2_constbranchindex_KeccakP1600_AddBytes' that is compiled without support for 'avx'
KeccakP-1600-AVX2.c: s->a2 = LOAD(t + 2*5);
KeccakP-1600-AVX2.c: ^
KeccakP-1600-AVX2.c: KeccakP-1600-AVX2.c:62:41: note: expanded from macro 'LOAD'
KeccakP-1600-AVX2.c: #define LOAD(p) _mm256_loadu_si256((const __m256i *)(p))
KeccakP-1600-AVX2.c: ...

Number of similar (compiler,implementation) pairs: 5, namely:
CompilerImplementations
clang -march=native -O2 -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE kcp/optimized1600AVX2
clang -march=native -O3 -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE kcp/optimized1600AVX2
clang -march=native -O -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE kcp/optimized1600AVX2
clang -march=native -Os -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE kcp/optimized1600AVX2
clang -mcpu=native -O3 -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE kcp/optimized1600AVX2

Compiler output

Implementation: kcp/optimized1600AVX2
Security model: constbranchindex
Compiler: gcc -march=native -mtune=native -O2 -fomit-frame-pointer -fwrapv -fPIC -fPIE
KeccakP-1600-AVX2.c: KeccakP-1600-AVX2.c: In function 'crypto_hash_shake128_kcp_optimized1600AVX2_constbranchindex_KeccakP1600_AddBytes':
KeccakP-1600-AVX2.c: KeccakP-1600-AVX2.c:506:11: warning: AVX vector return without AVX enabled changes the ABI [-Wpsabi]
KeccakP-1600-AVX2.c: 506 | s->a0 = LOAD(t + 0*5);
KeccakP-1600-AVX2.c: | ^
KeccakP-1600-AVX2.c: In file included from /usr/lib/gcc/x86_64-linux-gnu/11/include/immintrin.h:43,
KeccakP-1600-AVX2.c: from /usr/lib/gcc/x86_64-linux-gnu/11/include/x86intrin.h:32,
KeccakP-1600-AVX2.c: from KeccakP-1600-AVX2.c:20:
KeccakP-1600-AVX2.c: KeccakP-1600-AVX2.c: In function 'crypto_hash_shake128_kcp_optimized1600AVX2_constbranchindex_KeccakP1600_ExtractBytes':
KeccakP-1600-AVX2.c: /usr/lib/gcc/x86_64-linux-gnu/11/include/avxintrin.h:933:1: error: inlining failed in call to 'always_inline' '_mm256_storeu_si256': target specific option mismatch
KeccakP-1600-AVX2.c: 933 | _mm256_storeu_si256 (__m256i_u *__P, __m256i __A)
KeccakP-1600-AVX2.c: | ^~~~~~~~~~~~~~~~~~~
KeccakP-1600-AVX2.c: KeccakP-1600-AVX2.c:63:41: note: called from here
KeccakP-1600-AVX2.c: 63 | #define STORE(p, a) _mm256_storeu_si256((__m256i *)(p), a)
KeccakP-1600-AVX2.c: | ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
KeccakP-1600-AVX2.c: KeccakP-1600-AVX2.c:589:5: note: in expansion of macro 'STORE'
KeccakP-1600-AVX2.c: 589 | STORE(d + 4*5, s->a4);
KeccakP-1600-AVX2.c: | ^~~~~
KeccakP-1600-AVX2.c: In file included from /usr/lib/gcc/x86_64-linux-gnu/11/include/immintrin.h:43,
KeccakP-1600-AVX2.c: from /usr/lib/gcc/x86_64-linux-gnu/11/include/x86intrin.h:32,
KeccakP-1600-AVX2.c: from KeccakP-1600-AVX2.c:20:
KeccakP-1600-AVX2.c: /usr/lib/gcc/x86_64-linux-gnu/11/include/avxintrin.h:933:1: error: inlining failed in call to 'always_inline' '_mm256_storeu_si256': target specific option mismatch
KeccakP-1600-AVX2.c: 933 | _mm256_storeu_si256 (__m256i_u *__P, __m256i __A)
KeccakP-1600-AVX2.c: | ^~~~~~~~~~~~~~~~~~~
KeccakP-1600-AVX2.c: KeccakP-1600-AVX2.c:63:41: note: called from here
KeccakP-1600-AVX2.c: 63 | #define STORE(p, a) _mm256_storeu_si256((__m256i *)(p), a)
KeccakP-1600-AVX2.c: ...

Number of similar (compiler,implementation) pairs: 4, namely:
CompilerImplementations
gcc -march=native -mtune=native -O2 -fomit-frame-pointer -fwrapv -fPIC -fPIE kcp/optimized1600AVX2
gcc -march=native -mtune=native -O3 -fomit-frame-pointer -fwrapv -fPIC -fPIE kcp/optimized1600AVX2
gcc -march=native -mtune=native -O -fomit-frame-pointer -fwrapv -fPIC -fPIE kcp/optimized1600AVX2
gcc -march=native -mtune=native -Os -fomit-frame-pointer -fwrapv -fPIC -fPIE kcp/optimized1600AVX2

Compiler output

Implementation: kcp/optimized1600AVX512
Security model: constbranchindex
Compiler: clang -march=native -O2 -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE
KeccakP-1600-AVX512.c: KeccakP-1600-AVX512.c:238:41: error: always_inline function '_mm512_maskz_loadu_epi64' requires target feature 'avx512f', but would be inlined into function 'KeccakP1600_AddBytes' that is compiled without support for 'avx512f'
KeccakP-1600-AVX512.c: STORE_8Lanes( stateAsLanes, XOR(LOAD_8Lanes(stateAsLanes), LOAD_8Lanes((const UINT64*)data)));
KeccakP-1600-AVX512.c: ^
KeccakP-1600-AVX512.c: KeccakP-1600-AVX512.c:215:37: note: expanded from macro 'LOAD_8Lanes'
KeccakP-1600-AVX512.c: #define LOAD_8Lanes(a) LOAD_Lanes(0xFF,a)
KeccakP-1600-AVX512.c: ^
KeccakP-1600-AVX512.c: KeccakP-1600-AVX512.c:212:37: note: expanded from macro 'LOAD_Lanes'
KeccakP-1600-AVX512.c: #define LOAD_Lanes(m,a) _mm512_maskz_loadu_epi64(m,a)
KeccakP-1600-AVX512.c: ^
KeccakP-1600-AVX512.c: KeccakP-1600-AVX512.c:238:41: error: AVX vector return of type '__m512i' (vector of 8 'long long' values) without 'avx512f' enabled changes the ABI
KeccakP-1600-AVX512.c: KeccakP-1600-AVX512.c:215:37: note: expanded from macro 'LOAD_8Lanes'
KeccakP-1600-AVX512.c: #define LOAD_8Lanes(a) LOAD_Lanes(0xFF,a)
KeccakP-1600-AVX512.c: ^
KeccakP-1600-AVX512.c: KeccakP-1600-AVX512.c:212:37: note: expanded from macro 'LOAD_Lanes'
KeccakP-1600-AVX512.c: #define LOAD_Lanes(m,a) _mm512_maskz_loadu_epi64(m,a)
KeccakP-1600-AVX512.c: ^
KeccakP-1600-AVX512.c: KeccakP-1600-AVX512.c:238:68: error: always_inline function '_mm512_maskz_loadu_epi64' requires target feature 'avx512f', but would be inlined into function 'KeccakP1600_AddBytes' that is compiled without support for 'avx512f'
KeccakP-1600-AVX512.c: STORE_8Lanes( stateAsLanes, XOR(LOAD_8Lanes(stateAsLanes), LOAD_8Lanes((const UINT64*)data)));
KeccakP-1600-AVX512.c: ^
KeccakP-1600-AVX512.c: KeccakP-1600-AVX512.c:215:37: note: expanded from macro 'LOAD_8Lanes'
KeccakP-1600-AVX512.c: #define LOAD_8Lanes(a) LOAD_Lanes(0xFF,a)
KeccakP-1600-AVX512.c: ^
KeccakP-1600-AVX512.c: KeccakP-1600-AVX512.c:212:37: note: expanded from macro 'LOAD_Lanes'
KeccakP-1600-AVX512.c: #define LOAD_Lanes(m,a) _mm512_maskz_loadu_epi64(m,a)
KeccakP-1600-AVX512.c: ^
KeccakP-1600-AVX512.c: ...

Number of similar (compiler,implementation) pairs: 5, namely:
CompilerImplementations
clang -march=native -O2 -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE kcp/optimized1600AVX512
clang -march=native -O3 -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE kcp/optimized1600AVX512
clang -march=native -O -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE kcp/optimized1600AVX512
clang -march=native -Os -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE kcp/optimized1600AVX512
clang -mcpu=native -O3 -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE kcp/optimized1600AVX512

Compiler output

Implementation: kcp/optimized1600AVX512
Security model: constbranchindex
Compiler: gcc -march=native -mtune=native -O2 -fomit-frame-pointer -fwrapv -fPIC -fPIE
KeccakP-1600-AVX512.c: KeccakP-1600-AVX512.c: In function 'KeccakP1600_AddBytes':
KeccakP-1600-AVX512.c: KeccakP-1600-AVX512.c:216:37: warning: AVX512F vector return without AVX512F enabled changes the ABI [-Wpsabi]
KeccakP-1600-AVX512.c: 216 | #define STORE_Lanes(a,m,v) _mm512_mask_storeu_epi64(a,m,v)
KeccakP-1600-AVX512.c: | ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
KeccakP-1600-AVX512.c: KeccakP-1600-AVX512.c:219:37: note: in expansion of macro 'STORE_Lanes'
KeccakP-1600-AVX512.c: 219 | #define STORE_8Lanes(a,v) STORE_Lanes(a,0xFF,v)
KeccakP-1600-AVX512.c: | ^~~~~~~~~~~
KeccakP-1600-AVX512.c: KeccakP-1600-AVX512.c:238:9: note: in expansion of macro 'STORE_8Lanes'
KeccakP-1600-AVX512.c: 238 | STORE_8Lanes( stateAsLanes, XOR(LOAD_8Lanes(stateAsLanes), LOAD_8Lanes((const UINT64*)data)));
KeccakP-1600-AVX512.c: | ^~~~~~~~~~~~
KeccakP-1600-AVX512.c: In file included from /usr/lib/gcc/x86_64-linux-gnu/11/include/immintrin.h:49,
KeccakP-1600-AVX512.c: from KeccakP-1600-AVX512.c:26:
KeccakP-1600-AVX512.c: /usr/lib/gcc/x86_64-linux-gnu/11/include/avx512fintrin.h:6440:1: error: inlining failed in call to 'always_inline' '_mm512_mask_storeu_epi64': target specific option mismatch
KeccakP-1600-AVX512.c: 6440 | _mm512_mask_storeu_epi64 (void *__P, __mmask8 __U, __m512i __A)
KeccakP-1600-AVX512.c: | ^~~~~~~~~~~~~~~~~~~~~~~~
KeccakP-1600-AVX512.c: KeccakP-1600-AVX512.c:216:37: note: called from here
KeccakP-1600-AVX512.c: 216 | #define STORE_Lanes(a,m,v) _mm512_mask_storeu_epi64(a,m,v)
KeccakP-1600-AVX512.c: | ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
KeccakP-1600-AVX512.c: KeccakP-1600-AVX512.c:219:37: note: in expansion of macro 'STORE_Lanes'
KeccakP-1600-AVX512.c: 219 | #define STORE_8Lanes(a,v) STORE_Lanes(a,0xFF,v)
KeccakP-1600-AVX512.c: | ^~~~~~~~~~~
KeccakP-1600-AVX512.c: KeccakP-1600-AVX512.c:238:9: note: in expansion of macro 'STORE_8Lanes'
KeccakP-1600-AVX512.c: 238 | STORE_8Lanes( stateAsLanes, XOR(LOAD_8Lanes(stateAsLanes), LOAD_8Lanes((const UINT64*)data)));
KeccakP-1600-AVX512.c: | ^~~~~~~~~~~~
KeccakP-1600-AVX512.c: In file included from /usr/lib/gcc/x86_64-linux-gnu/11/include/immintrin.h:49,
KeccakP-1600-AVX512.c: ...

Number of similar (compiler,implementation) pairs: 4, namely:
CompilerImplementations
gcc -march=native -mtune=native -O2 -fomit-frame-pointer -fwrapv -fPIC -fPIE kcp/optimized1600AVX512
gcc -march=native -mtune=native -O3 -fomit-frame-pointer -fwrapv -fPIC -fPIE kcp/optimized1600AVX512
gcc -march=native -mtune=native -O -fomit-frame-pointer -fwrapv -fPIC -fPIE kcp/optimized1600AVX512
gcc -march=native -mtune=native -Os -fomit-frame-pointer -fwrapv -fPIC -fPIE kcp/optimized1600AVX512