Implementation notes: x86, bolero, crypto_sign/sphincss256shake256

Computer: bolero
Architecture: x86
CPU ID: GenuineIntel-000406f1-bfebfbff
SUPERCOP version: 20190110
Operation: crypto_sign
Primitive: sphincss256shake256
TimeImplementationCompilerBenchmark dateSUPERCOP version
7675812556avx2gcc -m32 -march=core-avx2 -O3 -fomit-frame-pointer2018101020180818
7677282248avx2gcc -m32 -march=native -mtune=native -O3 -fomit-frame-pointer2018101020180818
12882438944avx2gcc -m32 -march=core-avx2 -O -fomit-frame-pointer2018101020180818
12884727500avx2gcc -m32 -march=native -mtune=native -O -fomit-frame-pointer2018101020180818
12954575608avx2gcc -m32 -march=native -mtune=native -O2 -fomit-frame-pointer2018101020180818
12955991960avx2gcc -m32 -march=core-avx2 -O2 -fomit-frame-pointer2018101020180818
13438171808avx2gcc -m32 -march=core-avx2 -Os -fomit-frame-pointer2018101020180818
13447379644avx2gcc -m32 -march=native -mtune=native -Os -fomit-frame-pointer2018101020180818
37506458612refgcc -m32 -march=core-avx2 -O3 -fomit-frame-pointer2018101120180818
37571147648refgcc -m32 -march=native -mtune=native -O3 -fomit-frame-pointer2018101020180818
40536906432refgcc -m32 -march=core-avx-i -O3 -fomit-frame-pointer2018101120180818
40540482856refgcc -m32 -march=barcelona -O3 -fomit-frame-pointer2018101020180818
40550753308refgcc -funroll-loops -m32 -march=k8 -O3 -fomit-frame-pointer2018101220180818
40554729624refgcc -m32 -march=corei7-avx -O3 -fomit-frame-pointer2018101120180818
40659492808refgcc -m32 -march=core2 -msse4 -O3 -fomit-frame-pointer2018101020180818
40662118124refgcc -funroll-loops -m32 -march=athlon -O3 -fomit-frame-pointer2018101220180818
40749502268refgcc -m32 -march=corei7 -O3 -fomit-frame-pointer2018101120180818
40844636188refgcc -m32 -march=core2 -msse4.1 -O3 -fomit-frame-pointer2018101120180818
40848254572refgcc -m32 -march=core2 -O3 -fomit-frame-pointer2018101120180818
40931247884refgcc -funroll-loops -m32 -march=barcelona -O3 -fomit-frame-pointer2018101220180818
40935051892refgcc -funroll-loops -m32 -march=k6 -O3 -fomit-frame-pointer2018101220180818
40970583288refgcc -m32 -march=k6-3 -O3 -fomit-frame-pointer2018101120180818
41011356976refgcc -funroll-loops -m32 -march=k6-3 -O3 -fomit-frame-pointer2018101220180818
41011516708refgcc -m32 -march=athlon -O3 -fomit-frame-pointer2018101120180818
41013858552refgcc -m32 -march=k6-2 -O3 -fomit-frame-pointer2018101120180818
41048344024refgcc -funroll-loops -m32 -O3 -fomit-frame-pointer2018101220180818
41072655596refgcc -funroll-loops -m32 -march=k6-2 -O3 -fomit-frame-pointer2018101220180818
41081117296refgcc -m32 -march=k6 -O3 -fomit-frame-pointer2018101120180818
41163008680refgcc -m32 -O3 -fomit-frame-pointer2018101020180818
41258467332refgcc -m32 -march=k8 -O3 -fomit-frame-pointer2018101120180818
41322778300refgcc -m32 -march=nocona -O3 -fomit-frame-pointer2018101120180818
41359487936refgcc -m32 -march=prescott -O3 -fomit-frame-pointer2018101120180818
41453629676refgcc -m32 -march=pentium4 -O3 -fomit-frame-pointer2018101120180818
41532982684refgcc -funroll-loops -m32 -march=pentium4 -O3 -fomit-frame-pointer2018101320180818
41694269596refgcc -funroll-loops -m32 -march=prescott -O3 -fomit-frame-pointer2018101220180818
41703706184refgcc -funroll-loops -m32 -march=nocona -O3 -fomit-frame-pointer2018101220180818
42938879928refgcc -m32 -march=i486 -O3 -fomit-frame-pointer2018101220180818
43033783908refgcc -funroll-loops -m32 -march=i386 -O3 -fomit-frame-pointer2018101320180818
43388795640refgcc -m32 -march=i386 -O3 -fomit-frame-pointer2018101220180818
43607407840refgcc -funroll-loops -m32 -march=i486 -O3 -fomit-frame-pointer2018101320180818
44131192916refgcc -m32 -march=native -mtune=native -O2 -fomit-frame-pointer2018101020180818
44169649424refgcc -m32 -march=core-avx2 -O2 -fomit-frame-pointer2018101120180818
44275109476refgcc -m32 -march=native -mtune=native -O -fomit-frame-pointer2018101020180818
44294151088refgcc -m32 -march=core-avx2 -O -fomit-frame-pointer2018101120180818
44664954240refgcc -funroll-loops -m32 -O -fomit-frame-pointer2018101220180818
44832868208refgcc -m32 -march=pentium-m -O3 -fomit-frame-pointer2018101120180818
44930804800refgcc -m32 -march=pentium2 -O3 -fomit-frame-pointer2018101220180818
44936667532refgcc -funroll-loops -m32 -march=barcelona -O -fomit-frame-pointer2018101220180818
44952240000refgcc -m32 -march=pentiumpro -O3 -fomit-frame-pointer2018101220180818
45045250624refgcc -funroll-loops -m32 -march=prescott -O -fomit-frame-pointer2018101320180818
45048030736refgcc -funroll-loops -m32 -march=k8 -O -fomit-frame-pointer2018101220180818
45051698652refgcc -m32 -march=pentium3 -O3 -fomit-frame-pointer2018101220180818
45056360072refgcc -funroll-loops -m32 -march=nocona -O -fomit-frame-pointer2018101220180818
45132653864refgcc -funroll-loops -m32 -march=k8 -O2 -fomit-frame-pointer2018101220180818
45173536872refgcc -funroll-loops -m32 -march=athlon -O -fomit-frame-pointer2018101220180818
45256807064refgcc -funroll-loops -m32 -march=athlon -O2 -fomit-frame-pointer2018101220180818
45288045140refgcc -funroll-loops -m32 -march=pentium-m -O3 -fomit-frame-pointer2018101320180818
45297540628refgcc -funroll-loops -m32 -O2 -fomit-frame-pointer2018101220180818
45414981996refgcc -funroll-loops -m32 -march=barcelona -O2 -fomit-frame-pointer2018101220180818
45444522180refgcc -funroll-loops -m32 -march=k6-3 -O -fomit-frame-pointer2018101220180818
45445432932refgcc -funroll-loops -m32 -march=k6 -O -fomit-frame-pointer2018101220180818
45454008656refgcc -funroll-loops -m32 -march=k6-2 -O -fomit-frame-pointer2018101220180818
45535116216refgcc -funroll-loops -m32 -march=pentium3 -O3 -fomit-frame-pointer2018101320180818

Test failure

Implementation: crypto_sign/sphincss256shake256/ref
Compiler: gcc -funroll-loops -m32 -Os -fomit-frame-pointer
error 142
Alarm clock

Number of similar (compiler,implementation) pairs: 118, namely:
CompilerImplementations
gcc -funroll-loops -m32 -Os -fomit-frame-pointer ref
gcc -funroll-loops -m32 -march=athlon -Os -fomit-frame-pointer ref
gcc -funroll-loops -m32 -march=barcelona -Os -fomit-frame-pointer ref
gcc -funroll-loops -m32 -march=i386 -O2 -fomit-frame-pointer ref
gcc -funroll-loops -m32 -march=i386 -O -fomit-frame-pointer ref
gcc -funroll-loops -m32 -march=i386 -Os -fomit-frame-pointer ref
gcc -funroll-loops -m32 -march=i486 -O2 -fomit-frame-pointer ref
gcc -funroll-loops -m32 -march=i486 -O -fomit-frame-pointer ref
gcc -funroll-loops -m32 -march=i486 -Os -fomit-frame-pointer ref
gcc -funroll-loops -m32 -march=k6-2 -O2 -fomit-frame-pointer ref
gcc -funroll-loops -m32 -march=k6-2 -Os -fomit-frame-pointer ref
gcc -funroll-loops -m32 -march=k6-3 -O2 -fomit-frame-pointer ref
gcc -funroll-loops -m32 -march=k6-3 -Os -fomit-frame-pointer ref
gcc -funroll-loops -m32 -march=k6 -O2 -fomit-frame-pointer ref
gcc -funroll-loops -m32 -march=k6 -Os -fomit-frame-pointer ref
gcc -funroll-loops -m32 -march=k8 -Os -fomit-frame-pointer ref
gcc -funroll-loops -m32 -march=nocona -O2 -fomit-frame-pointer ref
gcc -funroll-loops -m32 -march=nocona -Os -fomit-frame-pointer ref
gcc -funroll-loops -m32 -march=pentium-m -O2 -fomit-frame-pointer ref
gcc -funroll-loops -m32 -march=pentium-m -O -fomit-frame-pointer ref
gcc -funroll-loops -m32 -march=pentium-m -Os -fomit-frame-pointer ref
gcc -funroll-loops -m32 -march=pentium-mmx -O2 -fomit-frame-pointer ref
gcc -funroll-loops -m32 -march=pentium-mmx -O3 -fomit-frame-pointer ref
gcc -funroll-loops -m32 -march=pentium-mmx -O -fomit-frame-pointer ref
gcc -funroll-loops -m32 -march=pentium-mmx -Os -fomit-frame-pointer ref
gcc -funroll-loops -m32 -march=pentium2 -O2 -fomit-frame-pointer ref
gcc -funroll-loops -m32 -march=pentium2 -O3 -fomit-frame-pointer ref
gcc -funroll-loops -m32 -march=pentium2 -O -fomit-frame-pointer ref
gcc -funroll-loops -m32 -march=pentium2 -Os -fomit-frame-pointer ref
gcc -funroll-loops -m32 -march=pentium3 -O2 -fomit-frame-pointer ref
gcc -funroll-loops -m32 -march=pentium3 -O -fomit-frame-pointer ref
gcc -funroll-loops -m32 -march=pentium3 -Os -fomit-frame-pointer ref
gcc -funroll-loops -m32 -march=pentium4 -O2 -fomit-frame-pointer ref
gcc -funroll-loops -m32 -march=pentium4 -O -fomit-frame-pointer ref
gcc -funroll-loops -m32 -march=pentium4 -Os -fomit-frame-pointer ref
gcc -funroll-loops -m32 -march=pentium -O2 -fomit-frame-pointer ref
gcc -funroll-loops -m32 -march=pentium -O3 -fomit-frame-pointer ref
gcc -funroll-loops -m32 -march=pentium -O -fomit-frame-pointer ref
gcc -funroll-loops -m32 -march=pentium -Os -fomit-frame-pointer ref
gcc -funroll-loops -m32 -march=pentiumpro -O2 -fomit-frame-pointer ref
gcc -funroll-loops -m32 -march=pentiumpro -O3 -fomit-frame-pointer ref
gcc -funroll-loops -m32 -march=pentiumpro -O -fomit-frame-pointer ref
gcc -funroll-loops -m32 -march=pentiumpro -Os -fomit-frame-pointer ref
gcc -funroll-loops -m32 -march=prescott -O2 -fomit-frame-pointer ref
gcc -funroll-loops -m32 -march=prescott -Os -fomit-frame-pointer ref
gcc -m32 -O2 -fomit-frame-pointer ref
gcc -m32 -O -fomit-frame-pointer ref
gcc -m32 -Os -fomit-frame-pointer ref
gcc -m32 -march=athlon -O2 -fomit-frame-pointer ref
gcc -m32 -march=athlon -O -fomit-frame-pointer ref
gcc -m32 -march=athlon -Os -fomit-frame-pointer ref
gcc -m32 -march=core-avx-i -O2 -fomit-frame-pointer ref
gcc -m32 -march=core-avx-i -O -fomit-frame-pointer ref
gcc -m32 -march=core-avx-i -Os -fomit-frame-pointer ref
gcc -m32 -march=core-avx2 -Os -fomit-frame-pointer ref
gcc -m32 -march=core2 -O2 -fomit-frame-pointer ref
gcc -m32 -march=core2 -O -fomit-frame-pointer ref
gcc -m32 -march=core2 -Os -fomit-frame-pointer ref
gcc -m32 -march=core2 -msse4.1 -O2 -fomit-frame-pointer ref
gcc -m32 -march=core2 -msse4.1 -O -fomit-frame-pointer ref
gcc -m32 -march=core2 -msse4.1 -Os -fomit-frame-pointer ref
gcc -m32 -march=core2 -msse4 -O2 -fomit-frame-pointer ref
gcc -m32 -march=core2 -msse4 -O -fomit-frame-pointer ref
gcc -m32 -march=core2 -msse4 -Os -fomit-frame-pointer ref
gcc -m32 -march=corei7-avx -O2 -fomit-frame-pointer ref
gcc -m32 -march=corei7-avx -O -fomit-frame-pointer ref
gcc -m32 -march=corei7-avx -Os -fomit-frame-pointer ref
gcc -m32 -march=corei7 -O2 -fomit-frame-pointer ref
gcc -m32 -march=corei7 -O -fomit-frame-pointer ref
gcc -m32 -march=corei7 -Os -fomit-frame-pointer ref
gcc -m32 -march=i386 -O2 -fomit-frame-pointer ref
gcc -m32 -march=i386 -O -fomit-frame-pointer ref
gcc -m32 -march=i386 -Os -fomit-frame-pointer ref
gcc -m32 -march=i486 -O2 -fomit-frame-pointer ref
gcc -m32 -march=i486 -O -fomit-frame-pointer ref
gcc -m32 -march=i486 -Os -fomit-frame-pointer ref
gcc -m32 -march=k6-2 -O2 -fomit-frame-pointer ref
gcc -m32 -march=k6-2 -O -fomit-frame-pointer ref
gcc -m32 -march=k6-2 -Os -fomit-frame-pointer ref
gcc -m32 -march=k6-3 -O2 -fomit-frame-pointer ref
gcc -m32 -march=k6-3 -O -fomit-frame-pointer ref
gcc -m32 -march=k6-3 -Os -fomit-frame-pointer ref
gcc -m32 -march=k6 -O2 -fomit-frame-pointer ref
gcc -m32 -march=k6 -O -fomit-frame-pointer ref
gcc -m32 -march=k6 -Os -fomit-frame-pointer ref
gcc -m32 -march=k8 -O2 -fomit-frame-pointer ref
gcc -m32 -march=k8 -O -fomit-frame-pointer ref
gcc -m32 -march=k8 -Os -fomit-frame-pointer ref
gcc -m32 -march=native -mtune=native -Os -fomit-frame-pointer ref
gcc -m32 -march=nocona -O2 -fomit-frame-pointer ref
gcc -m32 -march=nocona -O -fomit-frame-pointer ref
gcc -m32 -march=nocona -Os -fomit-frame-pointer ref
gcc -m32 -march=pentium-m -O2 -fomit-frame-pointer ref
gcc -m32 -march=pentium-m -O -fomit-frame-pointer ref
gcc -m32 -march=pentium-m -Os -fomit-frame-pointer ref
gcc -m32 -march=pentium-mmx -O2 -fomit-frame-pointer ref
gcc -m32 -march=pentium-mmx -O3 -fomit-frame-pointer ref
gcc -m32 -march=pentium-mmx -O -fomit-frame-pointer ref
gcc -m32 -march=pentium-mmx -Os -fomit-frame-pointer ref
gcc -m32 -march=pentium2 -O2 -fomit-frame-pointer ref
gcc -m32 -march=pentium2 -O -fomit-frame-pointer ref
gcc -m32 -march=pentium2 -Os -fomit-frame-pointer ref
gcc -m32 -march=pentium3 -O2 -fomit-frame-pointer ref
gcc -m32 -march=pentium3 -O -fomit-frame-pointer ref
gcc -m32 -march=pentium3 -Os -fomit-frame-pointer ref
gcc -m32 -march=pentium4 -O2 -fomit-frame-pointer ref
gcc -m32 -march=pentium4 -O -fomit-frame-pointer ref
gcc -m32 -march=pentium4 -Os -fomit-frame-pointer ref
gcc -m32 -march=pentium -O2 -fomit-frame-pointer ref
gcc -m32 -march=pentium -O3 -fomit-frame-pointer ref
gcc -m32 -march=pentium -O -fomit-frame-pointer ref
gcc -m32 -march=pentium -Os -fomit-frame-pointer ref
gcc -m32 -march=pentiumpro -O2 -fomit-frame-pointer ref
gcc -m32 -march=pentiumpro -O -fomit-frame-pointer ref
gcc -m32 -march=pentiumpro -Os -fomit-frame-pointer ref
gcc -m32 -march=prescott -O2 -fomit-frame-pointer ref
gcc -m32 -march=prescott -O -fomit-frame-pointer ref
gcc -m32 -march=prescott -Os -fomit-frame-pointer ref

Test failure

Implementation: crypto_sign/sphincss256shake256/ref
Compiler: gcc -m32 -march=barcelona -O2 -fomit-frame-pointer
error 142
Alarm clock
error 142
Alarm clock

Number of similar (compiler,implementation) pairs: 3, namely:
CompilerImplementations
gcc -m32 -march=barcelona -O2 -fomit-frame-pointer ref
gcc -m32 -march=barcelona -O -fomit-frame-pointer ref
gcc -m32 -march=barcelona -Os -fomit-frame-pointer ref

Compiler output

Implementation: crypto_sign/sphincss256shake256/avx2
Compiler: gcc -funroll-loops -m32 -O2 -fomit-frame-pointer
KeccakP-1600-times4-SIMD256.c: KeccakP-1600-times4-SIMD256.c: In function 'KeccakP1600times4_AddLanesAll':
KeccakP-1600-times4-SIMD256.c: KeccakP-1600-times4-SIMD256.c:135:40: warning: AVX vector return without AVX enabled changes the ABI [-Wpsabi]
KeccakP-1600-times4-SIMD256.c: #define Xor_In4( argIndex ) lanes0 = LOAD256u( curData0[argIndex]),\
KeccakP-1600-times4-SIMD256.c: ^
KeccakP-1600-times4-SIMD256.c: KeccakP-1600-times4-SIMD256.c:146:9: note: in expansion of macro 'Xor_In4'
KeccakP-1600-times4-SIMD256.c: Xor_In4( 0 );
KeccakP-1600-times4-SIMD256.c: ^
KeccakP-1600-times4-SIMD256.c: In file included from /usr/lib/gcc/x86_64-linux-gnu/5/include/immintrin.h:43:0,
KeccakP-1600-times4-SIMD256.c: from KeccakP-1600-times4-SIMD256.c:21:
KeccakP-1600-times4-SIMD256.c: /usr/lib/gcc/x86_64-linux-gnu/5/include/avx2intrin.h:913:1: error: inlining failed in call to always_inline '_mm256_xor_si256': target specific option mismatch
KeccakP-1600-times4-SIMD256.c: _mm256_xor_si256 (__m256i __A, __m256i __B)
KeccakP-1600-times4-SIMD256.c: ^
KeccakP-1600-times4-SIMD256.c: KeccakP-1600-times4-SIMD256.c:55:41: error: called from here
KeccakP-1600-times4-SIMD256.c: #define XOReq256(a, b) a = _mm256_xor_si256(a, b)
KeccakP-1600-times4-SIMD256.c: ^
KeccakP-1600-times4-SIMD256.c: KeccakP-1600-times4-SIMD256.c:143:33: note: in expansion of macro 'XOReq256'
KeccakP-1600-times4-SIMD256.c: XOReq256( stateAsLanes[argIndex+3], lanes3 )
KeccakP-1600-times4-SIMD256.c: ^
KeccakP-1600-times4-SIMD256.c: KeccakP-1600-times4-SIMD256.c:149:9: note: in expansion of macro 'Xor_In4'
KeccakP-1600-times4-SIMD256.c: Xor_In4( 12 );
KeccakP-1600-times4-SIMD256.c: ^
KeccakP-1600-times4-SIMD256.c: In file included from /usr/lib/gcc/x86_64-linux-gnu/5/include/immintrin.h:43:0,
KeccakP-1600-times4-SIMD256.c: from KeccakP-1600-times4-SIMD256.c:21:
KeccakP-1600-times4-SIMD256.c: /usr/lib/gcc/x86_64-linux-gnu/5/include/avx2intrin.h:913:1: error: inlining failed in call to always_inline '_mm256_xor_si256': target specific option mismatch
KeccakP-1600-times4-SIMD256.c: _mm256_xor_si256 (__m256i __A, __m256i __B)
KeccakP-1600-times4-SIMD256.c: ...

Number of similar (compiler,implementation) pairs: 156, namely:
CompilerImplementations
gcc -funroll-loops -m32 -O2 -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -O3 -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -O -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -Os -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=athlon -O2 -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=athlon -O3 -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=athlon -O -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=athlon -Os -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=barcelona -O2 -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=barcelona -O3 -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=barcelona -O -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=barcelona -Os -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=i386 -O2 -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=i386 -O3 -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=i386 -O -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=i386 -Os -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=i486 -O2 -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=i486 -O3 -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=i486 -O -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=i486 -Os -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=k6-2 -O2 -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=k6-2 -O3 -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=k6-2 -O -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=k6-2 -Os -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=k6-3 -O2 -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=k6-3 -O3 -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=k6-3 -O -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=k6-3 -Os -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=k6 -O2 -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=k6 -O3 -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=k6 -O -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=k6 -Os -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=k8 -O2 -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=k8 -O3 -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=k8 -O -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=k8 -Os -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=nocona -O2 -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=nocona -O3 -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=nocona -O -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=nocona -Os -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=pentium-m -O2 -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=pentium-m -O3 -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=pentium-m -O -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=pentium-m -Os -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=pentium-mmx -O2 -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=pentium-mmx -O3 -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=pentium-mmx -O -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=pentium-mmx -Os -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=pentium2 -O2 -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=pentium2 -O3 -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=pentium2 -O -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=pentium2 -Os -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=pentium3 -O2 -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=pentium3 -O3 -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=pentium3 -O -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=pentium3 -Os -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=pentium4 -O2 -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=pentium4 -O3 -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=pentium4 -O -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=pentium4 -Os -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=pentium -O2 -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=pentium -O3 -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=pentium -O -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=pentium -Os -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=pentiumpro -O2 -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=pentiumpro -O3 -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=pentiumpro -O -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=pentiumpro -Os -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=prescott -O2 -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=prescott -O3 -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=prescott -O -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=prescott -Os -fomit-frame-pointer avx2
gcc -m32 -O2 -fomit-frame-pointer avx2
gcc -m32 -O3 -fomit-frame-pointer avx2
gcc -m32 -O -fomit-frame-pointer avx2
gcc -m32 -Os -fomit-frame-pointer avx2
gcc -m32 -march=athlon -O2 -fomit-frame-pointer avx2
gcc -m32 -march=athlon -O3 -fomit-frame-pointer avx2
gcc -m32 -march=athlon -O -fomit-frame-pointer avx2
gcc -m32 -march=athlon -Os -fomit-frame-pointer avx2
gcc -m32 -march=core2 -O2 -fomit-frame-pointer avx2
gcc -m32 -march=core2 -O3 -fomit-frame-pointer avx2
gcc -m32 -march=core2 -O -fomit-frame-pointer avx2
gcc -m32 -march=core2 -Os -fomit-frame-pointer avx2
gcc -m32 -march=core2 -msse4.1 -O2 -fomit-frame-pointer avx2
gcc -m32 -march=core2 -msse4.1 -O3 -fomit-frame-pointer avx2
gcc -m32 -march=core2 -msse4.1 -O -fomit-frame-pointer avx2
gcc -m32 -march=core2 -msse4.1 -Os -fomit-frame-pointer avx2
gcc -m32 -march=core2 -msse4 -O2 -fomit-frame-pointer avx2
gcc -m32 -march=core2 -msse4 -O3 -fomit-frame-pointer avx2
gcc -m32 -march=core2 -msse4 -O -fomit-frame-pointer avx2
gcc -m32 -march=core2 -msse4 -Os -fomit-frame-pointer avx2
gcc -m32 -march=corei7 -O2 -fomit-frame-pointer avx2
gcc -m32 -march=corei7 -O3 -fomit-frame-pointer avx2
gcc -m32 -march=corei7 -O -fomit-frame-pointer avx2
gcc -m32 -march=corei7 -Os -fomit-frame-pointer avx2
gcc -m32 -march=i386 -O2 -fomit-frame-pointer avx2
gcc -m32 -march=i386 -O3 -fomit-frame-pointer avx2
gcc -m32 -march=i386 -O -fomit-frame-pointer avx2
gcc -m32 -march=i386 -Os -fomit-frame-pointer avx2
gcc -m32 -march=i486 -O2 -fomit-frame-pointer avx2
gcc -m32 -march=i486 -O3 -fomit-frame-pointer avx2
gcc -m32 -march=i486 -O -fomit-frame-pointer avx2
gcc -m32 -march=i486 -Os -fomit-frame-pointer avx2
gcc -m32 -march=k6-2 -O2 -fomit-frame-pointer avx2
gcc -m32 -march=k6-2 -O3 -fomit-frame-pointer avx2
gcc -m32 -march=k6-2 -O -fomit-frame-pointer avx2
gcc -m32 -march=k6-2 -Os -fomit-frame-pointer avx2
gcc -m32 -march=k6-3 -O2 -fomit-frame-pointer avx2
gcc -m32 -march=k6-3 -O3 -fomit-frame-pointer avx2
gcc -m32 -march=k6-3 -O -fomit-frame-pointer avx2
gcc -m32 -march=k6-3 -Os -fomit-frame-pointer avx2
gcc -m32 -march=k6 -O2 -fomit-frame-pointer avx2
gcc -m32 -march=k6 -O3 -fomit-frame-pointer avx2
gcc -m32 -march=k6 -O -fomit-frame-pointer avx2
gcc -m32 -march=k6 -Os -fomit-frame-pointer avx2
gcc -m32 -march=k8 -O2 -fomit-frame-pointer avx2
gcc -m32 -march=k8 -O3 -fomit-frame-pointer avx2
gcc -m32 -march=k8 -O -fomit-frame-pointer avx2
gcc -m32 -march=k8 -Os -fomit-frame-pointer avx2
gcc -m32 -march=nocona -O2 -fomit-frame-pointer avx2
gcc -m32 -march=nocona -O3 -fomit-frame-pointer avx2
gcc -m32 -march=nocona -O -fomit-frame-pointer avx2
gcc -m32 -march=nocona -Os -fomit-frame-pointer avx2
gcc -m32 -march=pentium-m -O2 -fomit-frame-pointer avx2
gcc -m32 -march=pentium-m -O3 -fomit-frame-pointer avx2
gcc -m32 -march=pentium-m -O -fomit-frame-pointer avx2
gcc -m32 -march=pentium-m -Os -fomit-frame-pointer avx2
gcc -m32 -march=pentium-mmx -O2 -fomit-frame-pointer avx2
gcc -m32 -march=pentium-mmx -O3 -fomit-frame-pointer avx2
gcc -m32 -march=pentium-mmx -O -fomit-frame-pointer avx2
gcc -m32 -march=pentium-mmx -Os -fomit-frame-pointer avx2
gcc -m32 -march=pentium2 -O2 -fomit-frame-pointer avx2
gcc -m32 -march=pentium2 -O3 -fomit-frame-pointer avx2
gcc -m32 -march=pentium2 -O -fomit-frame-pointer avx2
gcc -m32 -march=pentium2 -Os -fomit-frame-pointer avx2
gcc -m32 -march=pentium3 -O2 -fomit-frame-pointer avx2
gcc -m32 -march=pentium3 -O3 -fomit-frame-pointer avx2
gcc -m32 -march=pentium3 -O -fomit-frame-pointer avx2
gcc -m32 -march=pentium3 -Os -fomit-frame-pointer avx2
gcc -m32 -march=pentium4 -O2 -fomit-frame-pointer avx2
gcc -m32 -march=pentium4 -O3 -fomit-frame-pointer avx2
gcc -m32 -march=pentium4 -O -fomit-frame-pointer avx2
gcc -m32 -march=pentium4 -Os -fomit-frame-pointer avx2
gcc -m32 -march=pentium -O2 -fomit-frame-pointer avx2
gcc -m32 -march=pentium -O3 -fomit-frame-pointer avx2
gcc -m32 -march=pentium -O -fomit-frame-pointer avx2
gcc -m32 -march=pentium -Os -fomit-frame-pointer avx2
gcc -m32 -march=pentiumpro -O2 -fomit-frame-pointer avx2
gcc -m32 -march=pentiumpro -O3 -fomit-frame-pointer avx2
gcc -m32 -march=pentiumpro -O -fomit-frame-pointer avx2
gcc -m32 -march=pentiumpro -Os -fomit-frame-pointer avx2
gcc -m32 -march=prescott -O2 -fomit-frame-pointer avx2
gcc -m32 -march=prescott -O3 -fomit-frame-pointer avx2
gcc -m32 -march=prescott -O -fomit-frame-pointer avx2
gcc -m32 -march=prescott -Os -fomit-frame-pointer avx2

Compiler output

Implementation: crypto_sign/sphincss256shake256/avx2
Compiler: gcc -m32 -march=barcelona -O2 -fomit-frame-pointer
KeccakP-1600-times4-SIMD256.c: KeccakP-1600-times4-SIMD256.c: In function 'KeccakP1600times4_AddLanesAll':
KeccakP-1600-times4-SIMD256.c: KeccakP-1600-times4-SIMD256.c:135:40: warning: AVX vector return without AVX enabled changes the ABI [-Wpsabi]
KeccakP-1600-times4-SIMD256.c: #define Xor_In4( argIndex ) lanes0 = LOAD256u( curData0[argIndex]),\
KeccakP-1600-times4-SIMD256.c: ^
KeccakP-1600-times4-SIMD256.c: KeccakP-1600-times4-SIMD256.c:146:9: note: in expansion of macro 'Xor_In4'
KeccakP-1600-times4-SIMD256.c: Xor_In4( 0 );
KeccakP-1600-times4-SIMD256.c: ^
KeccakP-1600-times4-SIMD256.c: In file included from /usr/lib/gcc/x86_64-linux-gnu/5/include/immintrin.h:43:0,
KeccakP-1600-times4-SIMD256.c: from KeccakP-1600-times4-SIMD256.c:21:
KeccakP-1600-times4-SIMD256.c: /usr/lib/gcc/x86_64-linux-gnu/5/include/avx2intrin.h:913:1: error: inlining failed in call to always_inline '_mm256_xor_si256': target specific option mismatch
KeccakP-1600-times4-SIMD256.c: _mm256_xor_si256 (__m256i __A, __m256i __B)
KeccakP-1600-times4-SIMD256.c: ^
KeccakP-1600-times4-SIMD256.c: KeccakP-1600-times4-SIMD256.c:55:41: error: called from here
KeccakP-1600-times4-SIMD256.c: #define XOReq256(a, b) a = _mm256_xor_si256(a, b)
KeccakP-1600-times4-SIMD256.c: ^
KeccakP-1600-times4-SIMD256.c: KeccakP-1600-times4-SIMD256.c:143:33: note: in expansion of macro 'XOReq256'
KeccakP-1600-times4-SIMD256.c: XOReq256( stateAsLanes[argIndex+3], lanes3 )
KeccakP-1600-times4-SIMD256.c: ^
KeccakP-1600-times4-SIMD256.c: KeccakP-1600-times4-SIMD256.c:149:9: note: in expansion of macro 'Xor_In4'
KeccakP-1600-times4-SIMD256.c: Xor_In4( 12 );
KeccakP-1600-times4-SIMD256.c: ^
KeccakP-1600-times4-SIMD256.c: In file included from /usr/lib/gcc/x86_64-linux-gnu/5/include/immintrin.h:43:0,
KeccakP-1600-times4-SIMD256.c: from KeccakP-1600-times4-SIMD256.c:21:
KeccakP-1600-times4-SIMD256.c: /usr/lib/gcc/x86_64-linux-gnu/5/include/avx2intrin.h:913:1: error: inlining failed in call to always_inline '_mm256_xor_si256': target specific option mismatch
KeccakP-1600-times4-SIMD256.c: _mm256_xor_si256 (__m256i __A, __m256i __B)
KeccakP-1600-times4-SIMD256.c: ...
KeccakP-1600-times4-SIMD256.c: KeccakP-1600-times4-SIMD256.c: In function 'KeccakP1600times4_AddLanesAll':
KeccakP-1600-times4-SIMD256.c: KeccakP-1600-times4-SIMD256.c:135:40: warning: AVX vector return without AVX enabled changes the ABI [-Wpsabi]
KeccakP-1600-times4-SIMD256.c: #define Xor_In4( argIndex ) lanes0 = LOAD256u( curData0[argIndex]),\
KeccakP-1600-times4-SIMD256.c: ^
KeccakP-1600-times4-SIMD256.c: KeccakP-1600-times4-SIMD256.c:146:9: note: in expansion of macro 'Xor_In4'
KeccakP-1600-times4-SIMD256.c: Xor_In4( 0 );
KeccakP-1600-times4-SIMD256.c: ^
KeccakP-1600-times4-SIMD256.c: In file included from /usr/lib/gcc/x86_64-linux-gnu/5/include/immintrin.h:43:0,
KeccakP-1600-times4-SIMD256.c: from KeccakP-1600-times4-SIMD256.c:21:
KeccakP-1600-times4-SIMD256.c: /usr/lib/gcc/x86_64-linux-gnu/5/include/avx2intrin.h:913:1: error: inlining failed in call to always_inline '_mm256_xor_si256': target specific option mismatch
KeccakP-1600-times4-SIMD256.c: _mm256_xor_si256 (__m256i __A, __m256i __B)
KeccakP-1600-times4-SIMD256.c: ^
KeccakP-1600-times4-SIMD256.c: KeccakP-1600-times4-SIMD256.c:55:41: error: called from here
KeccakP-1600-times4-SIMD256.c: #define XOReq256(a, b) a = _mm256_xor_si256(a, b)
KeccakP-1600-times4-SIMD256.c: ^
KeccakP-1600-times4-SIMD256.c: KeccakP-1600-times4-SIMD256.c:143:33: note: in expansion of macro 'XOReq256'
KeccakP-1600-times4-SIMD256.c: XOReq256( stateAsLanes[argIndex+3], lanes3 )
KeccakP-1600-times4-SIMD256.c: ^
KeccakP-1600-times4-SIMD256.c: KeccakP-1600-times4-SIMD256.c:149:9: note: in expansion of macro 'Xor_In4'
KeccakP-1600-times4-SIMD256.c: Xor_In4( 12 );
KeccakP-1600-times4-SIMD256.c: ^
KeccakP-1600-times4-SIMD256.c: In file included from /usr/lib/gcc/x86_64-linux-gnu/5/include/immintrin.h:43:0,
KeccakP-1600-times4-SIMD256.c: from KeccakP-1600-times4-SIMD256.c:21:
KeccakP-1600-times4-SIMD256.c: /usr/lib/gcc/x86_64-linux-gnu/5/include/avx2intrin.h:913:1: error: inlining failed in call to always_inline '_mm256_xor_si256': target specific option mismatch
KeccakP-1600-times4-SIMD256.c: _mm256_xor_si256 (__m256i __A, __m256i __B)
KeccakP-1600-times4-SIMD256.c: ...

Number of similar (compiler,implementation) pairs: 4, namely:
CompilerImplementations
gcc -m32 -march=barcelona -O2 -fomit-frame-pointer avx2
gcc -m32 -march=barcelona -O3 -fomit-frame-pointer avx2
gcc -m32 -march=barcelona -O -fomit-frame-pointer avx2
gcc -m32 -march=barcelona -Os -fomit-frame-pointer avx2

Compiler output

Implementation: crypto_sign/sphincss256shake256/avx2
Compiler: gcc -m32 -march=core-avx-i -O2 -fomit-frame-pointer
KeccakP-1600-times4-SIMD256.c: In file included from /usr/lib/gcc/x86_64-linux-gnu/5/include/immintrin.h:43:0,
KeccakP-1600-times4-SIMD256.c: from KeccakP-1600-times4-SIMD256.c:21:
KeccakP-1600-times4-SIMD256.c: KeccakP-1600-times4-SIMD256.c: In function 'KeccakP1600times4_AddLanesAll':
KeccakP-1600-times4-SIMD256.c: /usr/lib/gcc/x86_64-linux-gnu/5/include/avx2intrin.h:913:1: error: inlining failed in call to always_inline '_mm256_xor_si256': target specific option mismatch
KeccakP-1600-times4-SIMD256.c: _mm256_xor_si256 (__m256i __A, __m256i __B)
KeccakP-1600-times4-SIMD256.c: ^
KeccakP-1600-times4-SIMD256.c: KeccakP-1600-times4-SIMD256.c:55:41: error: called from here
KeccakP-1600-times4-SIMD256.c: #define XOReq256(a, b) a = _mm256_xor_si256(a, b)
KeccakP-1600-times4-SIMD256.c: ^
KeccakP-1600-times4-SIMD256.c: KeccakP-1600-times4-SIMD256.c:143:33: note: in expansion of macro 'XOReq256'
KeccakP-1600-times4-SIMD256.c: XOReq256( stateAsLanes[argIndex+3], lanes3 )
KeccakP-1600-times4-SIMD256.c: ^
KeccakP-1600-times4-SIMD256.c: KeccakP-1600-times4-SIMD256.c:149:9: note: in expansion of macro 'Xor_In4'
KeccakP-1600-times4-SIMD256.c: Xor_In4( 12 );
KeccakP-1600-times4-SIMD256.c: ^
KeccakP-1600-times4-SIMD256.c: In file included from /usr/lib/gcc/x86_64-linux-gnu/5/include/immintrin.h:43:0,
KeccakP-1600-times4-SIMD256.c: from KeccakP-1600-times4-SIMD256.c:21:
KeccakP-1600-times4-SIMD256.c: /usr/lib/gcc/x86_64-linux-gnu/5/include/avx2intrin.h:913:1: error: inlining failed in call to always_inline '_mm256_xor_si256': target specific option mismatch
KeccakP-1600-times4-SIMD256.c: _mm256_xor_si256 (__m256i __A, __m256i __B)
KeccakP-1600-times4-SIMD256.c: ^
KeccakP-1600-times4-SIMD256.c: KeccakP-1600-times4-SIMD256.c:55:41: error: called from here
KeccakP-1600-times4-SIMD256.c: #define XOReq256(a, b) a = _mm256_xor_si256(a, b)
KeccakP-1600-times4-SIMD256.c: ^
KeccakP-1600-times4-SIMD256.c: KeccakP-1600-times4-SIMD256.c:142:33: note: in expansion of macro 'XOReq256'
KeccakP-1600-times4-SIMD256.c: XOReq256( stateAsLanes[argIndex+2], lanes2 ),\
KeccakP-1600-times4-SIMD256.c: ...

Number of similar (compiler,implementation) pairs: 8, namely:
CompilerImplementations
gcc -m32 -march=core-avx-i -O2 -fomit-frame-pointer avx2
gcc -m32 -march=core-avx-i -O3 -fomit-frame-pointer avx2
gcc -m32 -march=core-avx-i -O -fomit-frame-pointer avx2
gcc -m32 -march=core-avx-i -Os -fomit-frame-pointer avx2
gcc -m32 -march=corei7-avx -O2 -fomit-frame-pointer avx2
gcc -m32 -march=corei7-avx -O3 -fomit-frame-pointer avx2
gcc -m32 -march=corei7-avx -O -fomit-frame-pointer avx2
gcc -m32 -march=corei7-avx -Os -fomit-frame-pointer avx2