Implementation notes: x86, titan0, crypto_kem/kyber1024

Computer: titan0
Architecture: x86
CPU ID: GenuineIntel-000306c3-bfebfbff
SUPERCOP version: 20190803
Operation: crypto_kem
Primitive: kyber1024
Time  Implementation  Compiler  Benchmark date  SUPERCOP version
1531088  ref  gcc -funroll-loops -m32 -march=pentium-m -O3 -fomit-frame-pointer  20190810  20190803
1573740  ref  gcc -funroll-loops -m32 -march=pentium-m -O2 -fomit-frame-pointer  20190810  20190803
1575644  ref  gcc -m32 -march=pentium-m -O3 -fomit-frame-pointer  20190810  20190803
1578780  ref  gcc -m32 -march=corei7 -O3 -fomit-frame-pointer  20190810  20190803
1584476  ref  gcc -m32 -march=core2 -msse4.1 -O3 -fomit-frame-pointer  20190810  20190803
1586212  ref  gcc -m32 -march=core2 -msse4 -O3 -fomit-frame-pointer  20190810  20190803
1597168  ref  gcc -m32 -march=core2 -O3 -fomit-frame-pointer  20190810  20190803
1643424  ref  gcc -m32 -march=native -mtune=native -O3 -fomit-frame-pointer  20190810  20190803
1644540  ref  gcc -m32 -march=core-avx2 -O3 -fomit-frame-pointer  20190810  20190803
1655728  ref  gcc -m32 -march=core-avx-i -O2 -fomit-frame-pointer  20190810  20190803
1664032  ref  gcc -m32 -march=corei7-avx -O2 -fomit-frame-pointer  20190810  20190803
1671516  ref  gcc -m32 -march=core-avx2 -O2 -fomit-frame-pointer  20190810  20190803
1674992  ref  gcc -m32 -march=native -mtune=native -O2 -fomit-frame-pointer  20190810  20190803
1684308  ref  gcc -funroll-loops -m32 -march=pentium-m -O -fomit-frame-pointer  20190810  20190803
1687436  ref  gcc -m32 -march=corei7 -O2 -fomit-frame-pointer  20190810  20190803
1691864  ref  gcc -m32 -march=core2 -msse4 -O2 -fomit-frame-pointer  20190810  20190803
1693244  ref  gcc -m32 -march=pentium-m -O2 -fomit-frame-pointer  20190810  20190803
1701216  ref  gcc -m32 -march=core2 -msse4.1 -O2 -fomit-frame-pointer  20190810  20190803
1706976  ref  gcc -funroll-loops -m32 -march=k8 -O -fomit-frame-pointer  20190810  20190803
1721528  ref  gcc -funroll-loops -m32 -march=barcelona -O -fomit-frame-pointer  20190810  20190803
1733108  ref  gcc -m32 -march=native -mtune=native -Os -fomit-frame-pointer  20190810  20190803
1737012  ref  gcc -m32 -march=core-avx-i -O3 -fomit-frame-pointer  20190810  20190803
1737504  ref  gcc -m32 -march=core-avx2 -Os -fomit-frame-pointer  20190810  20190803
1740532  ref  gcc -m32 -march=core2 -O2 -fomit-frame-pointer  20190810  20190803
1748772  ref  gcc -m32 -march=core-avx-i -Os -fomit-frame-pointer  20190810  20190803
1752496  ref  gcc -m32 -march=corei7-avx -O3 -fomit-frame-pointer  20190810  20190803
1756600  ref  gcc -m32 -march=corei7-avx -Os -fomit-frame-pointer  20190810  20190803
1780512  ref  gcc -m32 -march=native -mtune=native -O -fomit-frame-pointer  20190810  20190803
1781156  ref  gcc -m32 -march=core-avx2 -O -fomit-frame-pointer  20190810  20190803
1811228  ref  gcc -m32 -march=pentium-m -O -fomit-frame-pointer  20190810  20190803
1837420  ref  gcc -m32 -march=k8 -O -fomit-frame-pointer  20190810  20190803
1861236  ref  gcc -m32 -march=barcelona -O -fomit-frame-pointer  20190810  20190803
1875200  ref  gcc -funroll-loops -m32 -march=nocona -O3 -fomit-frame-pointer  20190810  20190803
1883100  ref  gcc -funroll-loops -m32 -march=athlon -O3 -fomit-frame-pointer  20190810  20190803
1888500  ref  gcc -funroll-loops -m32 -march=nocona -O2 -fomit-frame-pointer  20190810  20190803
1888792  ref  gcc -m32 -march=core2 -O -fomit-frame-pointer  20190810  20190803
1892072  ref  gcc -funroll-loops -m32 -march=prescott -O2 -fomit-frame-pointer  20190810  20190803
1893404  ref  gcc -funroll-loops -m32 -march=prescott -O3 -fomit-frame-pointer  20190810  20190803
1898520  ref  gcc -m32 -march=core2 -msse4.1 -O -fomit-frame-pointer  20190810  20190803
1904196  ref  gcc -m32 -march=corei7 -O -fomit-frame-pointer  20190810  20190803
1904316  ref  gcc -m32 -march=prescott -O3 -fomit-frame-pointer  20190810  20190803
1906164  ref  gcc -m32 -march=core-avx-i -O -fomit-frame-pointer  20190810  20190803
1906356  ref  gcc -m32 -march=core2 -msse4 -O -fomit-frame-pointer  20190810  20190803
1906480  ref  gcc -m32 -march=corei7-avx -O -fomit-frame-pointer  20190810  20190803
1910320  ref  gcc -m32 -march=nocona -O3 -fomit-frame-pointer  20190810  20190803
1915704  ref  gcc -funroll-loops -m32 -march=nocona -O -fomit-frame-pointer  20190810  20190803
1916808  ref  gcc -funroll-loops -m32 -march=prescott -O -fomit-frame-pointer  20190810  20190803
1929692  ref  gcc -funroll-loops -m32 -O3 -fomit-frame-pointer  20190810  20190803
1946100  ref  gcc -funroll-loops -m32 -march=athlon -O2 -fomit-frame-pointer  20190810  20190803
1955488  ref  gcc -funroll-loops -m32 -march=pentium2 -O3 -fomit-frame-pointer  20190810  20190803
1958676  ref  gcc -funroll-loops -m32 -march=pentium3 -O -fomit-frame-pointer  20190810  20190803
1964344  ref  gcc -m32 -march=athlon -O3 -fomit-frame-pointer  20190810  20190803
1965952  ref  gcc -funroll-loops -m32 -march=pentiumpro -O -fomit-frame-pointer  20190810  20190803
1967604  ref  gcc -funroll-loops -m32 -march=pentium3 -O3 -fomit-frame-pointer  20190810  20190803
1968852  ref  gcc -funroll-loops -m32 -march=athlon -O -fomit-frame-pointer  20190810  20190803
1972756  ref  gcc -funroll-loops -m32 -march=pentiumpro -O3 -fomit-frame-pointer  20190810  20190803
1973312  ref  gcc -funroll-loops -m32 -march=pentium2 -O -fomit-frame-pointer  20190810  20190803
1989360  ref  gcc -m32 -O3 -fomit-frame-pointer  20190810  20190803
1991924  ref  gcc -funroll-loops -m32 -O -fomit-frame-pointer  20190810  20190803
1998544  ref  gcc -funroll-loops -m32 -march=pentium2 -O2 -fomit-frame-pointer  20190810  20190803
1999508  ref  gcc -funroll-loops -m32 -march=pentium-m -Os -fomit-frame-pointer  20190810  20190803
2000876  ref  gcc -funroll-loops -m32 -O2 -fomit-frame-pointer  20190810  20190803
2002376  ref  gcc -funroll-loops -m32 -march=pentium4 -O3 -fomit-frame-pointer  20190810  20190803
2011412  ref  gcc -funroll-loops -m32 -march=k6 -O -fomit-frame-pointer  20190810  20190803
2014864  ref  gcc -m32 -march=pentium-m -Os -fomit-frame-pointer  20190810  20190803
2017628  ref  gcc -funroll-loops -m32 -march=pentiumpro -O2 -fomit-frame-pointer  20190810  20190803
2018180  ref  gcc -m32 -march=prescott -O2 -fomit-frame-pointer  20190810  20190803
2018644  ref  gcc -funroll-loops -m32 -march=k6-3 -O3 -fomit-frame-pointer  20190810  20190803
2020704  ref  gcc -m32 -march=core2 -Os -fomit-frame-pointer  20190810  20190803
2021324  ref  gcc -funroll-loops -m32 -march=k6-2 -O -fomit-frame-pointer  20190810  20190803
2022772  ref  gcc -funroll-loops -m32 -march=k6-3 -O -fomit-frame-pointer  20190810  20190803
2023396  ref  gcc -m32 -march=k6-2 -O3 -fomit-frame-pointer  20190810  20190803
2024184  ref  gcc -m32 -march=nocona -O -fomit-frame-pointer  20190810  20190803
2029900  ref  gcc -funroll-loops -m32 -march=pentium3 -O2 -fomit-frame-pointer  20190810  20190803
2034964  ref  gcc -m32 -march=pentium4 -Os -fomit-frame-pointer  20190810  20190803
2036412  ref  gcc -funroll-loops -m32 -march=k6-2 -O3 -fomit-frame-pointer  20190810  20190803
2036620  ref  gcc -m32 -march=pentiumpro -O3 -fomit-frame-pointer  20190810  20190803
2039692  ref  gcc -m32 -march=prescott -O -fomit-frame-pointer  20190810  20190803
2041684  ref  gcc -m32 -march=nocona -O2 -fomit-frame-pointer  20190810  20190803
2046468  ref  gcc -m32 -march=nocona -Os -fomit-frame-pointer  20190810  20190803
2047788  ref  gcc -m32 -march=prescott -Os -fomit-frame-pointer  20190810  20190803
2047852  ref  gcc -m32 -march=k6-3 -O3 -fomit-frame-pointer  20190810  20190803
2051272  ref  gcc -funroll-loops -m32 -march=pentium4 -Os -fomit-frame-pointer  20190810  20190803
2051864  ref  gcc -m32 -march=pentium2 -O3 -fomit-frame-pointer  20190810  20190803
2052420  ref  gcc -m32 -march=corei7 -Os -fomit-frame-pointer  20190810  20190803
2055828  ref  gcc -m32 -march=core2 -msse4 -Os -fomit-frame-pointer  20190810  20190803
2056264  ref  gcc -funroll-loops -m32 -march=k6 -O3 -fomit-frame-pointer  20190810  20190803
2056836  ref  gcc -funroll-loops -m32 -march=prescott -Os -fomit-frame-pointer  20190810  20190803
2058232  ref  gcc -funroll-loops -m32 -march=nocona -Os -fomit-frame-pointer  20190810  20190803
2059000  ref  gcc -m32 -march=k6 -O3 -fomit-frame-pointer  20190810  20190803
2060824  ref  gcc -m32 -march=pentium4 -O3 -fomit-frame-pointer  20190810  20190803
2061052  ref  gcc -funroll-loops -m32 -march=pentium4 -O -fomit-frame-pointer  20190810  20190803
2063208  ref  gcc -m32 -march=core2 -msse4.1 -Os -fomit-frame-pointer  20190810  20190803
2063300  ref  gcc -m32 -march=pentium3 -O3 -fomit-frame-pointer  20190810  20190803
2063852  ref  gcc -funroll-loops -m32 -march=pentium4 -O2 -fomit-frame-pointer  20190810  20190803
2066996  ref  gcc -funroll-loops -m32 -march=k6-3 -O2 -fomit-frame-pointer  20190810  20190803
2070256  ref  gcc -funroll-loops -m32 -march=k6 -O2 -fomit-frame-pointer  20190810  20190803
2070380  ref  gcc -funroll-loops -m32 -march=k6-2 -O2 -fomit-frame-pointer  20190810  20190803
2102676  ref  gcc -m32 -march=pentium2 -O -fomit-frame-pointer  20190810  20190803
2105936  ref  gcc -funroll-loops -m32 -march=k6-3 -Os -fomit-frame-pointer  20190810  20190803
2106484  ref  gcc -m32 -march=pentium3 -O -fomit-frame-pointer  20190810  20190803
2109680  ref  gcc -m32 -march=pentiumpro -O -fomit-frame-pointer  20190810  20190803
2116144  ref  gcc -funroll-loops -m32 -march=k6-2 -Os -fomit-frame-pointer  20190810  20190803
2120548  ref  gcc -funroll-loops -m32 -march=k6 -Os -fomit-frame-pointer  20190810  20190803
2121888  ref  gcc -m32 -march=k6 -Os -fomit-frame-pointer  20190810  20190803
2125852  ref  gcc -m32 -march=athlon -O -fomit-frame-pointer  20190810  20190803
2133532  ref  gcc -funroll-loops -m32 -march=pentium-mmx -Os -fomit-frame-pointer  20190810  20190803
2134224  ref  gcc -m32 -march=k6-3 -Os -fomit-frame-pointer  20190810  20190803
2139708  ref  gcc -m32 -Os -fomit-frame-pointer  20190810  20190803
2140560  ref  gcc -m32 -march=k6-2 -Os -fomit-frame-pointer  20190810  20190803
2147844  ref  gcc -funroll-loops -m32 -march=pentium -Os -fomit-frame-pointer  20190810  20190803
2159216  ref  gcc -funroll-loops -m32 -march=athlon -Os -fomit-frame-pointer  20190810  20190803
2160032  ref  gcc -funroll-loops -m32 -Os -fomit-frame-pointer  20190810  20190803
2160452  ref  gcc -funroll-loops -m32 -march=i486 -Os -fomit-frame-pointer  20190810  20190803
2160744  ref  gcc -funroll-loops -m32 -march=i386 -Os -fomit-frame-pointer  20190810  20190803
2163952  ref  gcc -m32 -march=athlon -Os -fomit-frame-pointer  20190810  20190803
2170848  ref  gcc -m32 -march=pentium-mmx -Os -fomit-frame-pointer  20190810  20190803
2175608  ref  gcc -m32 -march=i486 -Os -fomit-frame-pointer  20190810  20190803
2175752  ref  gcc -m32 -march=athlon -O2 -fomit-frame-pointer  20190810  20190803
2177244  ref  gcc -m32 -O -fomit-frame-pointer  20190810  20190803
2178668  ref  gcc -m32 -march=pentium -Os -fomit-frame-pointer  20190810  20190803
2181984  ref  gcc -m32 -O2 -fomit-frame-pointer  20190810  20190803
2184156  ref  gcc -m32 -march=i386 -Os -fomit-frame-pointer  20190810  20190803
2184184  ref  gcc -m32 -march=pentiumpro -O2 -fomit-frame-pointer  20190810  20190803
2188656  ref  gcc -funroll-loops -m32 -march=pentium3 -Os -fomit-frame-pointer  20190810  20190803
2188840  ref  gcc -m32 -march=pentium4 -O2 -fomit-frame-pointer  20190810  20190803
2195688  ref  gcc -m32 -march=pentium3 -O2 -fomit-frame-pointer  20190810  20190803
2198380  ref  gcc -m32 -march=k6-2 -O -fomit-frame-pointer  20190810  20190803
2200000  ref  gcc -m32 -march=pentium4 -O -fomit-frame-pointer  20190810  20190803
2201148  ref  gcc -funroll-loops -m32 -march=pentiumpro -Os -fomit-frame-pointer  20190810  20190803
2201616  ref  gcc -m32 -march=k6 -O -fomit-frame-pointer  20190810  20190803
2203716  ref  gcc -m32 -march=k6-3 -O -fomit-frame-pointer  20190810  20190803
2206896  ref  gcc -m32 -march=pentium2 -O2 -fomit-frame-pointer  20190810  20190803
2212504  ref  gcc -funroll-loops -m32 -march=pentium2 -Os -fomit-frame-pointer  20190810  20190803
2212532  ref  gcc -m32 -march=pentiumpro -Os -fomit-frame-pointer  20190810  20190803
2222928  ref  gcc -m32 -march=pentium3 -Os -fomit-frame-pointer  20190810  20190803
2225592  ref  gcc -m32 -march=pentium2 -Os -fomit-frame-pointer  20190810  20190803
2270932  ref  gcc -funroll-loops -m32 -march=pentium -O -fomit-frame-pointer  20190810  20190803
2273928  ref  gcc -m32 -march=k6-3 -O2 -fomit-frame-pointer  20190810  20190803
2275048  ref  gcc -m32 -march=k6 -O2 -fomit-frame-pointer  20190810  20190803
2276820  ref  gcc -m32 -march=k6-2 -O2 -fomit-frame-pointer  20190810  20190803
2287684  ref  gcc -funroll-loops -m32 -march=pentium-mmx -O -fomit-frame-pointer  20190810  20190803
2303052  ref  gcc -funroll-loops -m32 -march=i486 -O -fomit-frame-pointer  20190810  20190803
2330272  ref  gcc -funroll-loops -m32 -march=i386 -O -fomit-frame-pointer  20190810  20190803
2331924  ref  gcc -funroll-loops -m32 -march=i386 -O3 -fomit-frame-pointer  20190810  20190803
2334240  ref  gcc -funroll-loops -m32 -march=i486 -O3 -fomit-frame-pointer  20190810  20190803
2362208  ref  gcc -funroll-loops -m32 -march=i486 -O2 -fomit-frame-pointer  20190810  20190803
2364936  ref  gcc -m32 -march=i386 -O3 -fomit-frame-pointer  20190810  20190803
2364948  ref  gcc -m32 -march=i486 -O3 -fomit-frame-pointer  20190810  20190803
2379604  ref  gcc -funroll-loops -m32 -march=i386 -O2 -fomit-frame-pointer  20190810  20190803
2436884  ref  gcc -m32 -march=pentium-mmx -O -fomit-frame-pointer  20190810  20190803
2453496  ref  gcc -m32 -march=pentium -O -fomit-frame-pointer  20190810  20190803
2466196  ref  gcc -funroll-loops -m32 -march=pentium -O3 -fomit-frame-pointer  20190810  20190803
2467860  ref  gcc -m32 -march=i486 -O -fomit-frame-pointer  20190810  20190803
2470984  ref  gcc -funroll-loops -m32 -march=pentium-mmx -O3 -fomit-frame-pointer  20190810  20190803
2486632  ref  gcc -m32 -march=i386 -O -fomit-frame-pointer  20190810  20190803
2515008  ref  gcc -funroll-loops -m32 -march=pentium-mmx -O2 -fomit-frame-pointer  20190810  20190803
2525792  ref  gcc -funroll-loops -m32 -march=pentium -O2 -fomit-frame-pointer  20190810  20190803
2537572  ref  gcc -m32 -march=i386 -O2 -fomit-frame-pointer  20190810  20190803
2544708  ref  gcc -m32 -march=i486 -O2 -fomit-frame-pointer  20190810  20190803
2563812  ref  gcc -m32 -march=pentium -O3 -fomit-frame-pointer  20190810  20190803
2567116  ref  gcc -m32 -march=pentium-mmx -O3 -fomit-frame-pointer  20190810  20190803
2650632  ref  gcc -m32 -march=pentium -O2 -fomit-frame-pointer  20190810  20190803
2656216  ref  gcc -m32 -march=pentium-mmx -O2 -fomit-frame-pointer  20190810  20190803
3616772  ref  gcc -funroll-loops -m32 -march=barcelona -O3 -fomit-frame-pointer  20190810  20190803
3674520  ref  gcc -m32 -march=barcelona -O3 -fomit-frame-pointer  20190810  20190803
3685384  ref  gcc -funroll-loops -m32 -march=k8 -O2 -fomit-frame-pointer  20190810  20190803
3712672  ref  gcc -funroll-loops -m32 -march=barcelona -O2 -fomit-frame-pointer  20190810  20190803
3750632  ref  gcc -funroll-loops -m32 -march=k8 -O3 -fomit-frame-pointer  20190810  20190803
3782716  ref  gcc -funroll-loops -m32 -march=barcelona -Os -fomit-frame-pointer  20190810  20190803
3801416  ref  gcc -m32 -march=barcelona -Os -fomit-frame-pointer  20190810  20190803
3833496  ref  gcc -m32 -march=k8 -O3 -fomit-frame-pointer  20190810  20190803
3853636  ref  gcc -m32 -march=barcelona -O2 -fomit-frame-pointer  20190810  20190803
3860608  ref  gcc -m32 -march=k8 -O2 -fomit-frame-pointer  20190810  20190803
3904508  ref  gcc -funroll-loops -m32 -march=k8 -Os -fomit-frame-pointer  20190810  20190803
3905968  ref  gcc -m32 -march=k8 -Os -fomit-frame-pointer  20190810  20190803

Compiler output

Implementation: crypto_kem/kyber1024/avx2
Compiler: gcc -funroll-loops -m32 -O2 -fomit-frame-pointer
KeccakP-1600-times4-SIMD256.c: KeccakP-1600-times4-SIMD256.c: In function 'KeccakP1600times4_AddLanesAll':
KeccakP-1600-times4-SIMD256.c: KeccakP-1600-times4-SIMD256.c:135:40: warning: AVX vector return without AVX enabled changes the ABI [-Wpsabi]
KeccakP-1600-times4-SIMD256.c: #define Xor_In4( argIndex ) lanes0 = LOAD256u( curData0[argIndex]),\
KeccakP-1600-times4-SIMD256.c: ^
KeccakP-1600-times4-SIMD256.c: KeccakP-1600-times4-SIMD256.c:146:9: note: in expansion of macro 'Xor_In4'
KeccakP-1600-times4-SIMD256.c: Xor_In4( 0 );
KeccakP-1600-times4-SIMD256.c: ^~~~~~~
KeccakP-1600-times4-SIMD256.c: In file included from /usr/lib/gcc/x86_64-linux-gnu/7/include/immintrin.h:43:0,
KeccakP-1600-times4-SIMD256.c: from KeccakP-1600-times4-SIMD256.c:21:
KeccakP-1600-times4-SIMD256.c: /usr/lib/gcc/x86_64-linux-gnu/7/include/avx2intrin.h:913:1: error: inlining failed in call to always_inline '_mm256_xor_si256': target specific option mismatch
KeccakP-1600-times4-SIMD256.c: _mm256_xor_si256 (__m256i __A, __m256i __B)
KeccakP-1600-times4-SIMD256.c: ^~~~~~~~~~~~~~~~
KeccakP-1600-times4-SIMD256.c: KeccakP-1600-times4-SIMD256.c:55:41: note: called from here
KeccakP-1600-times4-SIMD256.c: #define XOReq256(a, b) a = _mm256_xor_si256(a, b)
KeccakP-1600-times4-SIMD256.c: ^~~~~~~~~~~~~~~~~~~~~~
KeccakP-1600-times4-SIMD256.c: KeccakP-1600-times4-SIMD256.c:143:33: note: in expansion of macro 'XOReq256'
KeccakP-1600-times4-SIMD256.c: XOReq256( stateAsLanes[argIndex+3], lanes3 )
KeccakP-1600-times4-SIMD256.c: ^~~~~~~~
KeccakP-1600-times4-SIMD256.c: KeccakP-1600-times4-SIMD256.c:149:9: note: in expansion of macro 'Xor_In4'
KeccakP-1600-times4-SIMD256.c: Xor_In4( 12 );
KeccakP-1600-times4-SIMD256.c: ^~~~~~~
KeccakP-1600-times4-SIMD256.c: In file included from /usr/lib/gcc/x86_64-linux-gnu/7/include/immintrin.h:43:0,
KeccakP-1600-times4-SIMD256.c: from KeccakP-1600-times4-SIMD256.c:21:
KeccakP-1600-times4-SIMD256.c: /usr/lib/gcc/x86_64-linux-gnu/7/include/avx2intrin.h:913:1: error: inlining failed in call to always_inline '_mm256_xor_si256': target specific option mismatch
KeccakP-1600-times4-SIMD256.c: _mm256_xor_si256 (__m256i __A, __m256i __B)
KeccakP-1600-times4-SIMD256.c: ...

Number of similar (compiler,implementation) pairs: 156, namely:
Compiler  Implementations
gcc -funroll-loops -m32 -O2 -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -O3 -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -O -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -Os -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=athlon -O2 -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=athlon -O3 -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=athlon -O -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=athlon -Os -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=barcelona -O2 -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=barcelona -O3 -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=barcelona -O -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=barcelona -Os -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=i386 -O2 -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=i386 -O3 -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=i386 -O -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=i386 -Os -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=i486 -O2 -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=i486 -O3 -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=i486 -O -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=i486 -Os -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=k6-2 -O2 -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=k6-2 -O3 -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=k6-2 -O -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=k6-2 -Os -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=k6-3 -O2 -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=k6-3 -O3 -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=k6-3 -O -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=k6-3 -Os -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=k6 -O2 -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=k6 -O3 -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=k6 -O -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=k6 -Os -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=k8 -O2 -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=k8 -O3 -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=k8 -O -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=k8 -Os -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=nocona -O2 -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=nocona -O3 -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=nocona -O -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=nocona -Os -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=pentium-m -O2 -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=pentium-m -O3 -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=pentium-m -O -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=pentium-m -Os -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=pentium-mmx -O2 -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=pentium-mmx -O3 -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=pentium-mmx -O -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=pentium-mmx -Os -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=pentium2 -O2 -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=pentium2 -O3 -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=pentium2 -O -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=pentium2 -Os -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=pentium3 -O2 -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=pentium3 -O3 -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=pentium3 -O -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=pentium3 -Os -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=pentium4 -O2 -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=pentium4 -O3 -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=pentium4 -O -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=pentium4 -Os -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=pentium -O2 -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=pentium -O3 -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=pentium -O -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=pentium -Os -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=pentiumpro -O2 -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=pentiumpro -O3 -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=pentiumpro -O -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=pentiumpro -Os -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=prescott -O2 -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=prescott -O3 -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=prescott -O -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=prescott -Os -fomit-frame-pointer avx2
gcc -m32 -O2 -fomit-frame-pointer avx2
gcc -m32 -O3 -fomit-frame-pointer avx2
gcc -m32 -O -fomit-frame-pointer avx2
gcc -m32 -Os -fomit-frame-pointer avx2
gcc -m32 -march=athlon -O2 -fomit-frame-pointer avx2
gcc -m32 -march=athlon -O3 -fomit-frame-pointer avx2
gcc -m32 -march=athlon -O -fomit-frame-pointer avx2
gcc -m32 -march=athlon -Os -fomit-frame-pointer avx2
gcc -m32 -march=core2 -O2 -fomit-frame-pointer avx2
gcc -m32 -march=core2 -O3 -fomit-frame-pointer avx2
gcc -m32 -march=core2 -O -fomit-frame-pointer avx2
gcc -m32 -march=core2 -Os -fomit-frame-pointer avx2
gcc -m32 -march=core2 -msse4.1 -O2 -fomit-frame-pointer avx2
gcc -m32 -march=core2 -msse4.1 -O3 -fomit-frame-pointer avx2
gcc -m32 -march=core2 -msse4.1 -O -fomit-frame-pointer avx2
gcc -m32 -march=core2 -msse4.1 -Os -fomit-frame-pointer avx2
gcc -m32 -march=core2 -msse4 -O2 -fomit-frame-pointer avx2
gcc -m32 -march=core2 -msse4 -O3 -fomit-frame-pointer avx2
gcc -m32 -march=core2 -msse4 -O -fomit-frame-pointer avx2
gcc -m32 -march=core2 -msse4 -Os -fomit-frame-pointer avx2
gcc -m32 -march=corei7 -O2 -fomit-frame-pointer avx2
gcc -m32 -march=corei7 -O3 -fomit-frame-pointer avx2
gcc -m32 -march=corei7 -O -fomit-frame-pointer avx2
gcc -m32 -march=corei7 -Os -fomit-frame-pointer avx2
gcc -m32 -march=i386 -O2 -fomit-frame-pointer avx2
gcc -m32 -march=i386 -O3 -fomit-frame-pointer avx2
gcc -m32 -march=i386 -O -fomit-frame-pointer avx2
gcc -m32 -march=i386 -Os -fomit-frame-pointer avx2
gcc -m32 -march=i486 -O2 -fomit-frame-pointer avx2
gcc -m32 -march=i486 -O3 -fomit-frame-pointer avx2
gcc -m32 -march=i486 -O -fomit-frame-pointer avx2
gcc -m32 -march=i486 -Os -fomit-frame-pointer avx2
gcc -m32 -march=k6-2 -O2 -fomit-frame-pointer avx2
gcc -m32 -march=k6-2 -O3 -fomit-frame-pointer avx2
gcc -m32 -march=k6-2 -O -fomit-frame-pointer avx2
gcc -m32 -march=k6-2 -Os -fomit-frame-pointer avx2
gcc -m32 -march=k6-3 -O2 -fomit-frame-pointer avx2
gcc -m32 -march=k6-3 -O3 -fomit-frame-pointer avx2
gcc -m32 -march=k6-3 -O -fomit-frame-pointer avx2
gcc -m32 -march=k6-3 -Os -fomit-frame-pointer avx2
gcc -m32 -march=k6 -O2 -fomit-frame-pointer avx2
gcc -m32 -march=k6 -O3 -fomit-frame-pointer avx2
gcc -m32 -march=k6 -O -fomit-frame-pointer avx2
gcc -m32 -march=k6 -Os -fomit-frame-pointer avx2
gcc -m32 -march=k8 -O2 -fomit-frame-pointer avx2
gcc -m32 -march=k8 -O3 -fomit-frame-pointer avx2
gcc -m32 -march=k8 -O -fomit-frame-pointer avx2
gcc -m32 -march=k8 -Os -fomit-frame-pointer avx2
gcc -m32 -march=nocona -O2 -fomit-frame-pointer avx2
gcc -m32 -march=nocona -O3 -fomit-frame-pointer avx2
gcc -m32 -march=nocona -O -fomit-frame-pointer avx2
gcc -m32 -march=nocona -Os -fomit-frame-pointer avx2
gcc -m32 -march=pentium-m -O2 -fomit-frame-pointer avx2
gcc -m32 -march=pentium-m -O3 -fomit-frame-pointer avx2
gcc -m32 -march=pentium-m -O -fomit-frame-pointer avx2
gcc -m32 -march=pentium-m -Os -fomit-frame-pointer avx2
gcc -m32 -march=pentium-mmx -O2 -fomit-frame-pointer avx2
gcc -m32 -march=pentium-mmx -O3 -fomit-frame-pointer avx2
gcc -m32 -march=pentium-mmx -O -fomit-frame-pointer avx2
gcc -m32 -march=pentium-mmx -Os -fomit-frame-pointer avx2
gcc -m32 -march=pentium2 -O2 -fomit-frame-pointer avx2
gcc -m32 -march=pentium2 -O3 -fomit-frame-pointer avx2
gcc -m32 -march=pentium2 -O -fomit-frame-pointer avx2
gcc -m32 -march=pentium2 -Os -fomit-frame-pointer avx2
gcc -m32 -march=pentium3 -O2 -fomit-frame-pointer avx2
gcc -m32 -march=pentium3 -O3 -fomit-frame-pointer avx2
gcc -m32 -march=pentium3 -O -fomit-frame-pointer avx2
gcc -m32 -march=pentium3 -Os -fomit-frame-pointer avx2
gcc -m32 -march=pentium4 -O2 -fomit-frame-pointer avx2
gcc -m32 -march=pentium4 -O3 -fomit-frame-pointer avx2
gcc -m32 -march=pentium4 -O -fomit-frame-pointer avx2
gcc -m32 -march=pentium4 -Os -fomit-frame-pointer avx2
gcc -m32 -march=pentium -O2 -fomit-frame-pointer avx2
gcc -m32 -march=pentium -O3 -fomit-frame-pointer avx2
gcc -m32 -march=pentium -O -fomit-frame-pointer avx2
gcc -m32 -march=pentium -Os -fomit-frame-pointer avx2
gcc -m32 -march=pentiumpro -O2 -fomit-frame-pointer avx2
gcc -m32 -march=pentiumpro -O3 -fomit-frame-pointer avx2
gcc -m32 -march=pentiumpro -O -fomit-frame-pointer avx2
gcc -m32 -march=pentiumpro -Os -fomit-frame-pointer avx2
gcc -m32 -march=prescott -O2 -fomit-frame-pointer avx2
gcc -m32 -march=prescott -O3 -fomit-frame-pointer avx2
gcc -m32 -march=prescott -O -fomit-frame-pointer avx2
gcc -m32 -march=prescott -Os -fomit-frame-pointer avx2

Compiler output

Implementation: crypto_kem/kyber1024/avx2
Compiler: gcc -m32 -march=barcelona -O2 -fomit-frame-pointer
KeccakP-1600-times4-SIMD256.c: KeccakP-1600-times4-SIMD256.c: In function 'KeccakP1600times4_AddLanesAll':
KeccakP-1600-times4-SIMD256.c: KeccakP-1600-times4-SIMD256.c:135:40: warning: AVX vector return without AVX enabled changes the ABI [-Wpsabi]
KeccakP-1600-times4-SIMD256.c: #define Xor_In4( argIndex ) lanes0 = LOAD256u( curData0[argIndex]),\
KeccakP-1600-times4-SIMD256.c: ^
KeccakP-1600-times4-SIMD256.c: KeccakP-1600-times4-SIMD256.c:146:9: note: in expansion of macro 'Xor_In4'
KeccakP-1600-times4-SIMD256.c: Xor_In4( 0 );
KeccakP-1600-times4-SIMD256.c: ^~~~~~~
KeccakP-1600-times4-SIMD256.c: In file included from /usr/lib/gcc/x86_64-linux-gnu/7/include/immintrin.h:43:0,
KeccakP-1600-times4-SIMD256.c: from KeccakP-1600-times4-SIMD256.c:21:
KeccakP-1600-times4-SIMD256.c: /usr/lib/gcc/x86_64-linux-gnu/7/include/avx2intrin.h:913:1: error: inlining failed in call to always_inline '_mm256_xor_si256': target specific option mismatch
KeccakP-1600-times4-SIMD256.c: _mm256_xor_si256 (__m256i __A, __m256i __B)
KeccakP-1600-times4-SIMD256.c: ^~~~~~~~~~~~~~~~
KeccakP-1600-times4-SIMD256.c: KeccakP-1600-times4-SIMD256.c:55:41: note: called from here
KeccakP-1600-times4-SIMD256.c: #define XOReq256(a, b) a = _mm256_xor_si256(a, b)
KeccakP-1600-times4-SIMD256.c: ^~~~~~~~~~~~~~~~~~~~~~
KeccakP-1600-times4-SIMD256.c: KeccakP-1600-times4-SIMD256.c:143:33: note: in expansion of macro 'XOReq256'
KeccakP-1600-times4-SIMD256.c: XOReq256( stateAsLanes[argIndex+3], lanes3 )
KeccakP-1600-times4-SIMD256.c: ^~~~~~~~
KeccakP-1600-times4-SIMD256.c: KeccakP-1600-times4-SIMD256.c:149:9: note: in expansion of macro 'Xor_In4'
KeccakP-1600-times4-SIMD256.c: Xor_In4( 12 );
KeccakP-1600-times4-SIMD256.c: ^~~~~~~
KeccakP-1600-times4-SIMD256.c: In file included from /usr/lib/gcc/x86_64-linux-gnu/7/include/immintrin.h:43:0,
KeccakP-1600-times4-SIMD256.c: from KeccakP-1600-times4-SIMD256.c:21:
KeccakP-1600-times4-SIMD256.c: /usr/lib/gcc/x86_64-linux-gnu/7/include/avx2intrin.h:913:1: error: inlining failed in call to always_inline '_mm256_xor_si256': target specific option mismatch
KeccakP-1600-times4-SIMD256.c: _mm256_xor_si256 (__m256i __A, __m256i __B)
KeccakP-1600-times4-SIMD256.c: ...
KeccakP-1600-times4-SIMD256.c: KeccakP-1600-times4-SIMD256.c: In function 'KeccakP1600times4_AddLanesAll':
KeccakP-1600-times4-SIMD256.c: KeccakP-1600-times4-SIMD256.c:135:40: warning: AVX vector return without AVX enabled changes the ABI [-Wpsabi]
KeccakP-1600-times4-SIMD256.c: #define Xor_In4( argIndex ) lanes0 = LOAD256u( curData0[argIndex]),\
KeccakP-1600-times4-SIMD256.c: ^
KeccakP-1600-times4-SIMD256.c: KeccakP-1600-times4-SIMD256.c:146:9: note: in expansion of macro 'Xor_In4'
KeccakP-1600-times4-SIMD256.c: Xor_In4( 0 );
KeccakP-1600-times4-SIMD256.c: ^~~~~~~
KeccakP-1600-times4-SIMD256.c: In file included from /usr/lib/gcc/x86_64-linux-gnu/7/include/immintrin.h:43:0,
KeccakP-1600-times4-SIMD256.c: from KeccakP-1600-times4-SIMD256.c:21:
KeccakP-1600-times4-SIMD256.c: /usr/lib/gcc/x86_64-linux-gnu/7/include/avx2intrin.h:913:1: error: inlining failed in call to always_inline '_mm256_xor_si256': target specific option mismatch
KeccakP-1600-times4-SIMD256.c: _mm256_xor_si256 (__m256i __A, __m256i __B)
KeccakP-1600-times4-SIMD256.c: ^~~~~~~~~~~~~~~~
KeccakP-1600-times4-SIMD256.c: KeccakP-1600-times4-SIMD256.c:55:41: note: called from here
KeccakP-1600-times4-SIMD256.c: #define XOReq256(a, b) a = _mm256_xor_si256(a, b)
KeccakP-1600-times4-SIMD256.c: ^~~~~~~~~~~~~~~~~~~~~~
KeccakP-1600-times4-SIMD256.c: KeccakP-1600-times4-SIMD256.c:143:33: note: in expansion of macro 'XOReq256'
KeccakP-1600-times4-SIMD256.c: XOReq256( stateAsLanes[argIndex+3], lanes3 )
KeccakP-1600-times4-SIMD256.c: ^~~~~~~~
KeccakP-1600-times4-SIMD256.c: KeccakP-1600-times4-SIMD256.c:149:9: note: in expansion of macro 'Xor_In4'
KeccakP-1600-times4-SIMD256.c: Xor_In4( 12 );
KeccakP-1600-times4-SIMD256.c: ^~~~~~~
KeccakP-1600-times4-SIMD256.c: In file included from /usr/lib/gcc/x86_64-linux-gnu/7/include/immintrin.h:43:0,
KeccakP-1600-times4-SIMD256.c: from KeccakP-1600-times4-SIMD256.c:21:
KeccakP-1600-times4-SIMD256.c: /usr/lib/gcc/x86_64-linux-gnu/7/include/avx2intrin.h:913:1: error: inlining failed in call to always_inline '_mm256_xor_si256': target specific option mismatch
KeccakP-1600-times4-SIMD256.c: _mm256_xor_si256 (__m256i __A, __m256i __B)
KeccakP-1600-times4-SIMD256.c: ...

Number of similar (compiler,implementation) pairs: 4, namely:
Compiler  Implementations
gcc -m32 -march=barcelona -O2 -fomit-frame-pointer avx2
gcc -m32 -march=barcelona -O3 -fomit-frame-pointer avx2
gcc -m32 -march=barcelona -O -fomit-frame-pointer avx2
gcc -m32 -march=barcelona -Os -fomit-frame-pointer avx2

Compiler output

Implementation: crypto_kem/kyber1024/avx2
Compiler: gcc -m32 -march=core-avx-i -O2 -fomit-frame-pointer
KeccakP-1600-times4-SIMD256.c: In file included from /usr/lib/gcc/x86_64-linux-gnu/7/include/immintrin.h:43:0,
KeccakP-1600-times4-SIMD256.c: from KeccakP-1600-times4-SIMD256.c:21:
KeccakP-1600-times4-SIMD256.c: KeccakP-1600-times4-SIMD256.c: In function 'KeccakP1600times4_AddLanesAll':
KeccakP-1600-times4-SIMD256.c: /usr/lib/gcc/x86_64-linux-gnu/7/include/avx2intrin.h:913:1: error: inlining failed in call to always_inline '_mm256_xor_si256': target specific option mismatch
KeccakP-1600-times4-SIMD256.c: _mm256_xor_si256 (__m256i __A, __m256i __B)
KeccakP-1600-times4-SIMD256.c: ^~~~~~~~~~~~~~~~
KeccakP-1600-times4-SIMD256.c: KeccakP-1600-times4-SIMD256.c:55:41: note: called from here
KeccakP-1600-times4-SIMD256.c: #define XOReq256(a, b) a = _mm256_xor_si256(a, b)
KeccakP-1600-times4-SIMD256.c: ^~~~~~~~~~~~~~~~~~~~~~
KeccakP-1600-times4-SIMD256.c: KeccakP-1600-times4-SIMD256.c:143:33: note: in expansion of macro 'XOReq256'
KeccakP-1600-times4-SIMD256.c: XOReq256( stateAsLanes[argIndex+3], lanes3 )
KeccakP-1600-times4-SIMD256.c: ^~~~~~~~
KeccakP-1600-times4-SIMD256.c: KeccakP-1600-times4-SIMD256.c:149:9: note: in expansion of macro 'Xor_In4'
KeccakP-1600-times4-SIMD256.c: Xor_In4( 12 );
KeccakP-1600-times4-SIMD256.c: ^~~~~~~
KeccakP-1600-times4-SIMD256.c: In file included from /usr/lib/gcc/x86_64-linux-gnu/7/include/immintrin.h:43:0,
KeccakP-1600-times4-SIMD256.c: from KeccakP-1600-times4-SIMD256.c:21:
KeccakP-1600-times4-SIMD256.c: /usr/lib/gcc/x86_64-linux-gnu/7/include/avx2intrin.h:913:1: error: inlining failed in call to always_inline '_mm256_xor_si256': target specific option mismatch
KeccakP-1600-times4-SIMD256.c: _mm256_xor_si256 (__m256i __A, __m256i __B)
KeccakP-1600-times4-SIMD256.c: ^~~~~~~~~~~~~~~~
KeccakP-1600-times4-SIMD256.c: KeccakP-1600-times4-SIMD256.c:55:41: note: called from here
KeccakP-1600-times4-SIMD256.c: #define XOReq256(a, b) a = _mm256_xor_si256(a, b)
KeccakP-1600-times4-SIMD256.c: ^~~~~~~~~~~~~~~~~~~~~~
KeccakP-1600-times4-SIMD256.c: KeccakP-1600-times4-SIMD256.c:142:33: note: in expansion of macro 'XOReq256'
KeccakP-1600-times4-SIMD256.c: XOReq256( stateAsLanes[argIndex+2], lanes2 ),\
KeccakP-1600-times4-SIMD256.c: ...

Number of similar (compiler,implementation) pairs: 8, namely:
Compiler  Implementations
gcc -m32 -march=core-avx-i -O2 -fomit-frame-pointer avx2
gcc -m32 -march=core-avx-i -O3 -fomit-frame-pointer avx2
gcc -m32 -march=core-avx-i -O -fomit-frame-pointer avx2
gcc -m32 -march=core-avx-i -Os -fomit-frame-pointer avx2
gcc -m32 -march=corei7-avx -O2 -fomit-frame-pointer avx2
gcc -m32 -march=corei7-avx -O3 -fomit-frame-pointer avx2
gcc -m32 -march=corei7-avx -O -fomit-frame-pointer avx2
gcc -m32 -march=corei7-avx -Os -fomit-frame-pointer avx2

Compiler output

Implementation: crypto_kem/kyber1024/avx2
Compiler: gcc -m32 -march=core-avx2 -O2 -fomit-frame-pointer
basemul.S: basemul.S: Assembler messages:
basemul.S: basemul.S:79: Error: bad register name `%rip)'
basemul.S: basemul.S:80: Error: bad register name `%rip)'
basemul.S: basemul.S:81: Error: bad register name `%rcx)'
basemul.S: basemul.S:84: Error: bad register name `%rsi)'
basemul.S: basemul.S:84: Error: bad register name `%rdx)'
basemul.S: basemul.S:84: Error: bad register name `%rsi)'
basemul.S: basemul.S:84: Error: bad register name `%rdx)'
basemul.S: basemul.S:84: Error: bad register name `%ymm8'
basemul.S: basemul.S:84: Error: bad register name `%ymm8'
basemul.S: basemul.S:84: Error: bad register name `%ymm10'
basemul.S: basemul.S:84: Error: bad register name `%ymm10'
basemul.S: basemul.S:84: Error: bad register name `%ymm9'
basemul.S: basemul.S:84: Error: bad register name `%ymm9'
basemul.S: basemul.S:84: Error: bad register name `%ymm9'
basemul.S: basemul.S:84: Error: bad register name `%ymm9'
basemul.S: basemul.S:84: Error: bad register name `%ymm11'
basemul.S: basemul.S:84: Error: bad register name `%ymm11'
basemul.S: basemul.S:84: Error: bad register name `%ymm11'
basemul.S: basemul.S:84: Error: bad register name `%ymm11'
basemul.S: basemul.S:84: Error: bad register name `%ymm11'
basemul.S: basemul.S:84: Error: bad register name `%ymm13'
basemul.S: basemul.S:84: Error: bad register name `%ymm13'
basemul.S: basemul.S:84: Error: bad register name `%ymm8'
basemul.S: basemul.S:84: Error: bad register name `%ymm8'
basemul.S: ...

Number of similar (compiler,implementation) pairs: 8, namely:
Compiler  Implementations
gcc -m32 -march=core-avx2 -O2 -fomit-frame-pointer avx2
gcc -m32 -march=core-avx2 -O3 -fomit-frame-pointer avx2
gcc -m32 -march=core-avx2 -O -fomit-frame-pointer avx2
gcc -m32 -march=core-avx2 -Os -fomit-frame-pointer avx2
gcc -m32 -march=native -mtune=native -O2 -fomit-frame-pointer avx2
gcc -m32 -march=native -mtune=native -O3 -fomit-frame-pointer avx2
gcc -m32 -march=native -mtune=native -O -fomit-frame-pointer avx2
gcc -m32 -march=native -mtune=native -Os -fomit-frame-pointer avx2