Implementation notes: x86, samba, crypto_kem/kyber1024

Computer: samba
Architecture: x86
CPU ID: GenuineIntel-000506e3-bfebfbff
SUPERCOP version: 20190803
Operation: crypto_kem
Primitive: kyber1024
Time Implementation Compiler Benchmark date SUPERCOP version
1480881 ref gcc -funroll-loops -m32 -march=pentium-m -O3 -fomit-frame-pointer 20190810 20190803
1525601 ref gcc -funroll-loops -m32 -march=pentium-m -O2 -fomit-frame-pointer 20190810 20190803
1529954 ref gcc -m32 -march=core2 -msse4.1 -O3 -fomit-frame-pointer 20190810 20190803
1534627 ref gcc -m32 -march=pentium-m -O3 -fomit-frame-pointer 20190810 20190803
1539149 ref gcc -m32 -march=core2 -msse4 -O3 -fomit-frame-pointer 20190810 20190803
1550508 ref gcc -m32 -march=corei7-avx -O2 -fomit-frame-pointer 20190810 20190803
1552554 ref gcc -m32 -march=core2 -O3 -fomit-frame-pointer 20190810 20190803
1558880 ref gcc -m32 -march=core-avx2 -O2 -fomit-frame-pointer 20190810 20190803
1559095 ref gcc -m32 -march=native -mtune=native -O3 -fomit-frame-pointer 20190810 20190803
1559218 ref gcc -m32 -march=corei7 -O3 -fomit-frame-pointer 20190810 20190803
1561679 ref gcc -m32 -march=core-avx-i -O2 -fomit-frame-pointer 20190810 20190803
1566581 ref gcc -m32 -march=native -mtune=native -O2 -fomit-frame-pointer 20190810 20190803
1568294 ref gcc -m32 -march=core-avx2 -O3 -fomit-frame-pointer 20190810 20190803
1583988 ref gcc -m32 -march=native -mtune=native -Os -fomit-frame-pointer 20190810 20190803
1603906 ref gcc -m32 -march=core-avx2 -Os -fomit-frame-pointer 20190810 20190803
1609651 ref gcc -m32 -march=pentium-m -O2 -fomit-frame-pointer 20190810 20190803
1613760 ref gcc -m32 -march=core2 -msse4.1 -O2 -fomit-frame-pointer 20190810 20190803
1614415 ref gcc -funroll-loops -m32 -march=pentium-m -O -fomit-frame-pointer 20190810 20190803
1616387 ref gcc -m32 -march=corei7 -O2 -fomit-frame-pointer 20190810 20190803
1624760 ref gcc -m32 -march=core-avx-i -Os -fomit-frame-pointer 20190810 20190803
1626962 ref gcc -funroll-loops -m32 -march=k8 -O -fomit-frame-pointer 20190810 20190803
1627443 ref gcc -m32 -march=core2 -msse4 -Os -fomit-frame-pointer 20190810 20190803
1632479 ref gcc -m32 -march=core2 -msse4 -O2 -fomit-frame-pointer 20190810 20190803
1635276 ref gcc -m32 -march=corei7-avx -Os -fomit-frame-pointer 20190810 20190803
1640246 ref gcc -m32 -march=corei7 -Os -fomit-frame-pointer 20190810 20190803
1643916 ref gcc -m32 -march=core2 -msse4.1 -Os -fomit-frame-pointer 20190810 20190803
1655583 ref gcc -funroll-loops -m32 -march=barcelona -O -fomit-frame-pointer 20190810 20190803
1659012 ref gcc -m32 -march=nocona -Os -fomit-frame-pointer 20190810 20190803
1662582 ref gcc -m32 -march=corei7-avx -O3 -fomit-frame-pointer 20190810 20190803
1669483 ref gcc -funroll-loops -m32 -march=pentium-m -Os -fomit-frame-pointer 20190810 20190803
1673552 ref gcc -m32 -march=core2 -Os -fomit-frame-pointer 20190810 20190803
1673627 ref gcc -m32 -march=pentium-m -Os -fomit-frame-pointer 20190810 20190803
1673863 ref gcc -m32 -march=prescott -Os -fomit-frame-pointer 20190810 20190803
1678433 ref gcc -m32 -march=core-avx-i -O3 -fomit-frame-pointer 20190810 20190803
1681744 ref gcc -m32 -march=pentium4 -Os -fomit-frame-pointer 20190810 20190803
1697893 ref gcc -funroll-loops -m32 -march=nocona -Os -fomit-frame-pointer 20190810 20190803
1702447 ref gcc -funroll-loops -m32 -march=pentium4 -Os -fomit-frame-pointer 20190810 20190803
1702867 ref gcc -m32 -march=core2 -O2 -fomit-frame-pointer 20190810 20190803
1718077 ref gcc -funroll-loops -m32 -march=prescott -Os -fomit-frame-pointer 20190810 20190803
1725271 ref gcc -m32 -march=core-avx2 -O -fomit-frame-pointer 20190810 20190803
1729123 ref gcc -m32 -march=k8 -O -fomit-frame-pointer 20190810 20190803
1735434 ref gcc -m32 -march=native -mtune=native -O -fomit-frame-pointer 20190810 20190803
1737248 ref gcc -m32 -march=pentium-m -O -fomit-frame-pointer 20190810 20190803
1744835 ref gcc -funroll-loops -m32 -march=prescott -O3 -fomit-frame-pointer 20190810 20190803
1749703 ref gcc -funroll-loops -m32 -march=nocona -O3 -fomit-frame-pointer 20190810 20190803
1756149 ref gcc -funroll-loops -m32 -march=nocona -O2 -fomit-frame-pointer 20190810 20190803
1760540 ref gcc -funroll-loops -m32 -march=prescott -O -fomit-frame-pointer 20190810 20190803
1765170 ref gcc -funroll-loops -m32 -march=prescott -O2 -fomit-frame-pointer 20190810 20190803
1778778 ref gcc -m32 -march=corei7 -O -fomit-frame-pointer 20190810 20190803
1781719 ref gcc -m32 -march=prescott -O3 -fomit-frame-pointer 20190810 20190803
1787903 ref gcc -m32 -march=barcelona -O -fomit-frame-pointer 20190810 20190803
1789133 ref gcc -funroll-loops -m32 -march=nocona -O -fomit-frame-pointer 20190810 20190803
1794152 ref gcc -m32 -march=core2 -O -fomit-frame-pointer 20190810 20190803
1801285 ref gcc -m32 -march=core2 -msse4.1 -O -fomit-frame-pointer 20190810 20190803
1804207 ref gcc -funroll-loops -m32 -march=athlon -O3 -fomit-frame-pointer 20190810 20190803
1806046 ref gcc -m32 -march=core-avx-i -O -fomit-frame-pointer 20190810 20190803
1806063 ref gcc -m32 -march=core2 -msse4 -O -fomit-frame-pointer 20190810 20190803
1823319 ref gcc -m32 -march=corei7-avx -O -fomit-frame-pointer 20190810 20190803
1824971 ref gcc -m32 -march=nocona -O3 -fomit-frame-pointer 20190810 20190803
1828897 ref gcc -funroll-loops -m32 -march=pentium2 -O -fomit-frame-pointer 20190810 20190803
1830659 ref gcc -funroll-loops -m32 -O3 -fomit-frame-pointer 20190810 20190803
1846670 ref gcc -funroll-loops -m32 -march=pentium3 -O -fomit-frame-pointer 20190810 20190803
1857235 ref gcc -funroll-loops -m32 -march=pentium4 -O3 -fomit-frame-pointer 20190810 20190803
1862847 ref gcc -funroll-loops -m32 -march=pentium4 -O2 -fomit-frame-pointer 20190810 20190803
1868077 ref gcc -funroll-loops -m32 -march=pentiumpro -O -fomit-frame-pointer 20190810 20190803
1870673 ref gcc -funroll-loops -m32 -march=k6 -O3 -fomit-frame-pointer 20190810 20190803
1874068 ref gcc -funroll-loops -m32 -march=k6 -O -fomit-frame-pointer 20190810 20190803
1880216 ref gcc -m32 -march=athlon -O3 -fomit-frame-pointer 20190810 20190803
1884337 ref gcc -funroll-loops -m32 -march=pentium2 -O3 -fomit-frame-pointer 20190810 20190803
1887563 ref gcc -funroll-loops -m32 -O -fomit-frame-pointer 20190810 20190803
1888530 ref gcc -funroll-loops -m32 -march=k6-3 -O3 -fomit-frame-pointer 20190810 20190803
1889355 ref gcc -funroll-loops -m32 -march=k6-2 -O -fomit-frame-pointer 20190810 20190803
1892423 ref gcc -funroll-loops -m32 -march=pentium4 -O -fomit-frame-pointer 20190810 20190803
1893553 ref gcc -funroll-loops -m32 -march=k6-3 -O -fomit-frame-pointer 20190810 20190803
1894171 ref gcc -funroll-loops -m32 -march=pentium3 -O3 -fomit-frame-pointer 20190810 20190803
1896726 ref gcc -funroll-loops -m32 -march=athlon -O2 -fomit-frame-pointer 20190810 20190803
1899440 ref gcc -m32 -march=nocona -O2 -fomit-frame-pointer 20190810 20190803
1906256 ref gcc -m32 -march=pentium4 -O3 -fomit-frame-pointer 20190810 20190803
1910080 ref gcc -m32 -march=nocona -O -fomit-frame-pointer 20190810 20190803
1910417 ref gcc -m32 -march=prescott -O2 -fomit-frame-pointer 20190810 20190803
1910717 ref gcc -funroll-loops -m32 -march=athlon -O -fomit-frame-pointer 20190810 20190803
1912551 ref gcc -funroll-loops -m32 -march=k6-2 -O3 -fomit-frame-pointer 20190810 20190803
1914032 ref gcc -funroll-loops -m32 -march=pentiumpro -O3 -fomit-frame-pointer 20190810 20190803
1917164 ref gcc -funroll-loops -m32 -march=k6 -O2 -fomit-frame-pointer 20190810 20190803
1919232 ref gcc -funroll-loops -m32 -O2 -fomit-frame-pointer 20190810 20190803
1923861 ref gcc -m32 -march=prescott -O -fomit-frame-pointer 20190810 20190803
1924649 ref gcc -funroll-loops -m32 -march=k6-3 -O2 -fomit-frame-pointer 20190810 20190803
1927099 ref gcc -m32 -O3 -fomit-frame-pointer 20190810 20190803
1931520 ref gcc -m32 -march=k6-3 -O3 -fomit-frame-pointer 20190810 20190803
1932302 ref gcc -funroll-loops -m32 -march=k6 -Os -fomit-frame-pointer 20190810 20190803
1932952 ref gcc -funroll-loops -m32 -march=pentium3 -O2 -fomit-frame-pointer 20190810 20190803
1935430 ref gcc -funroll-loops -m32 -march=k6-2 -Os -fomit-frame-pointer 20190810 20190803
1938723 ref gcc -funroll-loops -m32 -march=pentiumpro -O2 -fomit-frame-pointer 20190810 20190803
1943200 ref gcc -m32 -march=pentium3 -O3 -fomit-frame-pointer 20190810 20190803
1948238 ref gcc -funroll-loops -m32 -march=k6-3 -Os -fomit-frame-pointer 20190810 20190803
1951288 ref gcc -funroll-loops -m32 -march=k6-2 -O2 -fomit-frame-pointer 20190810 20190803
1952594 ref gcc -m32 -march=k6-2 -Os -fomit-frame-pointer 20190810 20190803
1957151 ref gcc -m32 -march=k6-3 -Os -fomit-frame-pointer 20190810 20190803
1958720 ref gcc -funroll-loops -m32 -march=pentium2 -O2 -fomit-frame-pointer 20190810 20190803
1966379 ref gcc -m32 -march=k6 -Os -fomit-frame-pointer 20190810 20190803
1968736 ref gcc -m32 -march=k6-2 -O3 -fomit-frame-pointer 20190810 20190803
1969012 ref gcc -funroll-loops -m32 -Os -fomit-frame-pointer 20190810 20190803
1972069 ref gcc -m32 -march=athlon -O -fomit-frame-pointer 20190810 20190803
1973351 ref gcc -m32 -march=pentium2 -O -fomit-frame-pointer 20190810 20190803
1977722 ref gcc -funroll-loops -m32 -march=i486 -Os -fomit-frame-pointer 20190810 20190803
1979395 ref gcc -m32 -march=i386 -Os -fomit-frame-pointer 20190810 20190803
1980941 ref gcc -funroll-loops -m32 -march=pentium -Os -fomit-frame-pointer 20190810 20190803
1981387 ref gcc -funroll-loops -m32 -march=athlon -Os -fomit-frame-pointer 20190810 20190803
1982110 ref gcc -m32 -march=i486 -Os -fomit-frame-pointer 20190810 20190803
1984340 ref gcc -funroll-loops -m32 -march=pentium-mmx -Os -fomit-frame-pointer 20190810 20190803
1987796 ref gcc -m32 -march=k6 -O3 -fomit-frame-pointer 20190810 20190803
1990424 ref gcc -funroll-loops -m32 -march=i386 -Os -fomit-frame-pointer 20190810 20190803
1990692 ref gcc -m32 -march=pentiumpro -O -fomit-frame-pointer 20190810 20190803
1993636 ref gcc -m32 -march=pentium-mmx -Os -fomit-frame-pointer 20190810 20190803
1995640 ref gcc -m32 -Os -fomit-frame-pointer 20190810 20190803
1996114 ref gcc -m32 -march=pentiumpro -O3 -fomit-frame-pointer 20190810 20190803
1996688 ref gcc -m32 -march=pentium2 -O3 -fomit-frame-pointer 20190810 20190803
2002373 ref gcc -m32 -march=pentium -Os -fomit-frame-pointer 20190810 20190803
2009909 ref gcc -m32 -march=pentium4 -O -fomit-frame-pointer 20190810 20190803
2010204 ref gcc -m32 -march=pentium3 -O -fomit-frame-pointer 20190810 20190803
2013931 ref gcc -m32 -march=pentium3 -Os -fomit-frame-pointer 20190810 20190803
2018572 ref gcc -m32 -march=pentium2 -Os -fomit-frame-pointer 20190810 20190803
2022117 ref gcc -m32 -march=athlon -Os -fomit-frame-pointer 20190810 20190803
2024528 ref gcc -m32 -march=pentiumpro -Os -fomit-frame-pointer 20190810 20190803
2030028 ref gcc -m32 -march=pentium4 -O2 -fomit-frame-pointer 20190810 20190803
2031549 ref gcc -funroll-loops -m32 -march=pentiumpro -Os -fomit-frame-pointer 20190810 20190803
2034123 ref gcc -funroll-loops -m32 -march=pentium2 -Os -fomit-frame-pointer 20190810 20190803
2040583 ref gcc -m32 -march=athlon -O2 -fomit-frame-pointer 20190810 20190803
2042903 ref gcc -m32 -O2 -fomit-frame-pointer 20190810 20190803
2047409 ref gcc -funroll-loops -m32 -march=pentium3 -Os -fomit-frame-pointer 20190810 20190803
2056143 ref gcc -m32 -O -fomit-frame-pointer 20190810 20190803
2081373 ref gcc -m32 -march=k6 -O -fomit-frame-pointer 20190810 20190803
2085407 ref gcc -m32 -march=k6-3 -O -fomit-frame-pointer 20190810 20190803
2086228 ref gcc -m32 -march=pentiumpro -O2 -fomit-frame-pointer 20190810 20190803
2087121 ref gcc -m32 -march=pentium2 -O2 -fomit-frame-pointer 20190810 20190803
2091990 ref gcc -funroll-loops -m32 -march=pentium-mmx -O -fomit-frame-pointer 20190810 20190803
2096570 ref gcc -m32 -march=pentium3 -O2 -fomit-frame-pointer 20190810 20190803
2097716 ref gcc -m32 -march=k6-2 -O -fomit-frame-pointer 20190810 20190803
2107069 ref gcc -funroll-loops -m32 -march=i486 -O -fomit-frame-pointer 20190810 20190803
2117518 ref gcc -funroll-loops -m32 -march=i486 -O3 -fomit-frame-pointer 20190810 20190803
2122482 ref gcc -funroll-loops -m32 -march=pentium -O -fomit-frame-pointer 20190810 20190803
2124304 ref gcc -m32 -march=k6 -O2 -fomit-frame-pointer 20190810 20190803
2128887 ref gcc -funroll-loops -m32 -march=i386 -O3 -fomit-frame-pointer 20190810 20190803
2131067 ref gcc -m32 -march=k6-2 -O2 -fomit-frame-pointer 20190810 20190803
2133338 ref gcc -m32 -march=k6-3 -O2 -fomit-frame-pointer 20190810 20190803
2144270 ref gcc -funroll-loops -m32 -march=i386 -O -fomit-frame-pointer 20190810 20190803
2152482 ref gcc -funroll-loops -m32 -march=i386 -O2 -fomit-frame-pointer 20190810 20190803
2159151 ref gcc -funroll-loops -m32 -march=i486 -O2 -fomit-frame-pointer 20190810 20190803
2210008 ref gcc -m32 -march=i386 -O3 -fomit-frame-pointer 20190810 20190803
2241230 ref gcc -m32 -march=i486 -O3 -fomit-frame-pointer 20190810 20190803
2249482 ref gcc -m32 -march=pentium -O -fomit-frame-pointer 20190810 20190803
2259325 ref gcc -m32 -march=pentium-mmx -O -fomit-frame-pointer 20190810 20190803
2285000 ref gcc -m32 -march=i386 -O -fomit-frame-pointer 20190810 20190803
2294797 ref gcc -m32 -march=i486 -O -fomit-frame-pointer 20190810 20190803
2299752 ref gcc -funroll-loops -m32 -march=pentium-mmx -O3 -fomit-frame-pointer 20190810 20190803
2305498 ref gcc -funroll-loops -m32 -march=pentium -O3 -fomit-frame-pointer 20190810 20190803
2313195 ref gcc -m32 -march=i386 -O2 -fomit-frame-pointer 20190810 20190803
2314343 ref gcc -m32 -march=i486 -O2 -fomit-frame-pointer 20190810 20190803
2349137 ref gcc -funroll-loops -m32 -march=pentium-mmx -O2 -fomit-frame-pointer 20190810 20190803
2376923 ref gcc -funroll-loops -m32 -march=pentium -O2 -fomit-frame-pointer 20190810 20190803
2412351 ref gcc -m32 -march=pentium-mmx -O3 -fomit-frame-pointer 20190810 20190803
2418823 ref gcc -m32 -march=pentium -O3 -fomit-frame-pointer 20190810 20190803
2460234 ref gcc -m32 -march=pentium -O2 -fomit-frame-pointer 20190810 20190803
2463855 ref gcc -m32 -march=pentium-mmx -O2 -fomit-frame-pointer 20190810 20190803
3533075 ref gcc -funroll-loops -m32 -march=barcelona -O3 -fomit-frame-pointer 20190810 20190803
3594102 ref gcc -m32 -march=barcelona -O3 -fomit-frame-pointer 20190810 20190803
3640216 ref gcc -funroll-loops -m32 -march=k8 -O2 -fomit-frame-pointer 20190810 20190803
3661726 ref gcc -m32 -march=barcelona -Os -fomit-frame-pointer 20190810 20190803
3673619 ref gcc -funroll-loops -m32 -march=barcelona -O2 -fomit-frame-pointer 20190810 20190803
3676625 ref gcc -funroll-loops -m32 -march=barcelona -Os -fomit-frame-pointer 20190810 20190803
3717881 ref gcc -funroll-loops -m32 -march=k8 -O3 -fomit-frame-pointer 20190810 20190803
3751873 ref gcc -m32 -march=k8 -O3 -fomit-frame-pointer 20190810 20190803
3822411 ref gcc -m32 -march=k8 -Os -fomit-frame-pointer 20190810 20190803
3835565 ref gcc -funroll-loops -m32 -march=k8 -Os -fomit-frame-pointer 20190810 20190803
3852036 ref gcc -m32 -march=barcelona -O2 -fomit-frame-pointer 20190810 20190803
3884782 ref gcc -m32 -march=k8 -O2 -fomit-frame-pointer 20190810 20190803

Compiler output

Implementation: crypto_kem/kyber1024/avx2
Compiler: gcc -funroll-loops -m32 -O2 -fomit-frame-pointer
KeccakP-1600-times4-SIMD256.c: KeccakP-1600-times4-SIMD256.c: In function 'KeccakP1600times4_AddLanesAll':
KeccakP-1600-times4-SIMD256.c: KeccakP-1600-times4-SIMD256.c:135:40: warning: AVX vector return without AVX enabled changes the ABI [-Wpsabi]
KeccakP-1600-times4-SIMD256.c: #define Xor_In4( argIndex ) lanes0 = LOAD256u( curData0[argIndex]),\
KeccakP-1600-times4-SIMD256.c: ^
KeccakP-1600-times4-SIMD256.c: KeccakP-1600-times4-SIMD256.c:146:9: note: in expansion of macro 'Xor_In4'
KeccakP-1600-times4-SIMD256.c: Xor_In4( 0 );
KeccakP-1600-times4-SIMD256.c: ^~~~~~~
KeccakP-1600-times4-SIMD256.c: In file included from /usr/lib/gcc/x86_64-linux-gnu/7/include/immintrin.h:43:0,
KeccakP-1600-times4-SIMD256.c: from KeccakP-1600-times4-SIMD256.c:21:
KeccakP-1600-times4-SIMD256.c: /usr/lib/gcc/x86_64-linux-gnu/7/include/avx2intrin.h:913:1: error: inlining failed in call to always_inline '_mm256_xor_si256': target specific option mismatch
KeccakP-1600-times4-SIMD256.c: _mm256_xor_si256 (__m256i __A, __m256i __B)
KeccakP-1600-times4-SIMD256.c: ^~~~~~~~~~~~~~~~
KeccakP-1600-times4-SIMD256.c: KeccakP-1600-times4-SIMD256.c:55:41: note: called from here
KeccakP-1600-times4-SIMD256.c: #define XOReq256(a, b) a = _mm256_xor_si256(a, b)
KeccakP-1600-times4-SIMD256.c: ^~~~~~~~~~~~~~~~~~~~~~
KeccakP-1600-times4-SIMD256.c: KeccakP-1600-times4-SIMD256.c:143:33: note: in expansion of macro 'XOReq256'
KeccakP-1600-times4-SIMD256.c: XOReq256( stateAsLanes[argIndex+3], lanes3 )
KeccakP-1600-times4-SIMD256.c: ^~~~~~~~
KeccakP-1600-times4-SIMD256.c: KeccakP-1600-times4-SIMD256.c:149:9: note: in expansion of macro 'Xor_In4'
KeccakP-1600-times4-SIMD256.c: Xor_In4( 12 );
KeccakP-1600-times4-SIMD256.c: ^~~~~~~
KeccakP-1600-times4-SIMD256.c: In file included from /usr/lib/gcc/x86_64-linux-gnu/7/include/immintrin.h:43:0,
KeccakP-1600-times4-SIMD256.c: from KeccakP-1600-times4-SIMD256.c:21:
KeccakP-1600-times4-SIMD256.c: /usr/lib/gcc/x86_64-linux-gnu/7/include/avx2intrin.h:913:1: error: inlining failed in call to always_inline '_mm256_xor_si256': target specific option mismatch
KeccakP-1600-times4-SIMD256.c: _mm256_xor_si256 (__m256i __A, __m256i __B)
KeccakP-1600-times4-SIMD256.c: ...

Number of similar (compiler,implementation) pairs: 156, namely:
Compiler Implementations
gcc -funroll-loops -m32 -O2 -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -O3 -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -O -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -Os -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=athlon -O2 -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=athlon -O3 -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=athlon -O -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=athlon -Os -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=barcelona -O2 -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=barcelona -O3 -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=barcelona -O -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=barcelona -Os -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=i386 -O2 -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=i386 -O3 -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=i386 -O -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=i386 -Os -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=i486 -O2 -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=i486 -O3 -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=i486 -O -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=i486 -Os -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=k6-2 -O2 -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=k6-2 -O3 -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=k6-2 -O -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=k6-2 -Os -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=k6-3 -O2 -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=k6-3 -O3 -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=k6-3 -O -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=k6-3 -Os -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=k6 -O2 -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=k6 -O3 -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=k6 -O -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=k6 -Os -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=k8 -O2 -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=k8 -O3 -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=k8 -O -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=k8 -Os -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=nocona -O2 -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=nocona -O3 -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=nocona -O -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=nocona -Os -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=pentium-m -O2 -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=pentium-m -O3 -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=pentium-m -O -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=pentium-m -Os -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=pentium-mmx -O2 -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=pentium-mmx -O3 -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=pentium-mmx -O -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=pentium-mmx -Os -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=pentium2 -O2 -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=pentium2 -O3 -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=pentium2 -O -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=pentium2 -Os -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=pentium3 -O2 -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=pentium3 -O3 -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=pentium3 -O -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=pentium3 -Os -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=pentium4 -O2 -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=pentium4 -O3 -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=pentium4 -O -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=pentium4 -Os -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=pentium -O2 -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=pentium -O3 -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=pentium -O -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=pentium -Os -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=pentiumpro -O2 -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=pentiumpro -O3 -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=pentiumpro -O -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=pentiumpro -Os -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=prescott -O2 -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=prescott -O3 -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=prescott -O -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=prescott -Os -fomit-frame-pointer avx2
gcc -m32 -O2 -fomit-frame-pointer avx2
gcc -m32 -O3 -fomit-frame-pointer avx2
gcc -m32 -O -fomit-frame-pointer avx2
gcc -m32 -Os -fomit-frame-pointer avx2
gcc -m32 -march=athlon -O2 -fomit-frame-pointer avx2
gcc -m32 -march=athlon -O3 -fomit-frame-pointer avx2
gcc -m32 -march=athlon -O -fomit-frame-pointer avx2
gcc -m32 -march=athlon -Os -fomit-frame-pointer avx2
gcc -m32 -march=core2 -O2 -fomit-frame-pointer avx2
gcc -m32 -march=core2 -O3 -fomit-frame-pointer avx2
gcc -m32 -march=core2 -O -fomit-frame-pointer avx2
gcc -m32 -march=core2 -Os -fomit-frame-pointer avx2
gcc -m32 -march=core2 -msse4.1 -O2 -fomit-frame-pointer avx2
gcc -m32 -march=core2 -msse4.1 -O3 -fomit-frame-pointer avx2
gcc -m32 -march=core2 -msse4.1 -O -fomit-frame-pointer avx2
gcc -m32 -march=core2 -msse4.1 -Os -fomit-frame-pointer avx2
gcc -m32 -march=core2 -msse4 -O2 -fomit-frame-pointer avx2
gcc -m32 -march=core2 -msse4 -O3 -fomit-frame-pointer avx2
gcc -m32 -march=core2 -msse4 -O -fomit-frame-pointer avx2
gcc -m32 -march=core2 -msse4 -Os -fomit-frame-pointer avx2
gcc -m32 -march=corei7 -O2 -fomit-frame-pointer avx2
gcc -m32 -march=corei7 -O3 -fomit-frame-pointer avx2
gcc -m32 -march=corei7 -O -fomit-frame-pointer avx2
gcc -m32 -march=corei7 -Os -fomit-frame-pointer avx2
gcc -m32 -march=i386 -O2 -fomit-frame-pointer avx2
gcc -m32 -march=i386 -O3 -fomit-frame-pointer avx2
gcc -m32 -march=i386 -O -fomit-frame-pointer avx2
gcc -m32 -march=i386 -Os -fomit-frame-pointer avx2
gcc -m32 -march=i486 -O2 -fomit-frame-pointer avx2
gcc -m32 -march=i486 -O3 -fomit-frame-pointer avx2
gcc -m32 -march=i486 -O -fomit-frame-pointer avx2
gcc -m32 -march=i486 -Os -fomit-frame-pointer avx2
gcc -m32 -march=k6-2 -O2 -fomit-frame-pointer avx2
gcc -m32 -march=k6-2 -O3 -fomit-frame-pointer avx2
gcc -m32 -march=k6-2 -O -fomit-frame-pointer avx2
gcc -m32 -march=k6-2 -Os -fomit-frame-pointer avx2
gcc -m32 -march=k6-3 -O2 -fomit-frame-pointer avx2
gcc -m32 -march=k6-3 -O3 -fomit-frame-pointer avx2
gcc -m32 -march=k6-3 -O -fomit-frame-pointer avx2
gcc -m32 -march=k6-3 -Os -fomit-frame-pointer avx2
gcc -m32 -march=k6 -O2 -fomit-frame-pointer avx2
gcc -m32 -march=k6 -O3 -fomit-frame-pointer avx2
gcc -m32 -march=k6 -O -fomit-frame-pointer avx2
gcc -m32 -march=k6 -Os -fomit-frame-pointer avx2
gcc -m32 -march=k8 -O2 -fomit-frame-pointer avx2
gcc -m32 -march=k8 -O3 -fomit-frame-pointer avx2
gcc -m32 -march=k8 -O -fomit-frame-pointer avx2
gcc -m32 -march=k8 -Os -fomit-frame-pointer avx2
gcc -m32 -march=nocona -O2 -fomit-frame-pointer avx2
gcc -m32 -march=nocona -O3 -fomit-frame-pointer avx2
gcc -m32 -march=nocona -O -fomit-frame-pointer avx2
gcc -m32 -march=nocona -Os -fomit-frame-pointer avx2
gcc -m32 -march=pentium-m -O2 -fomit-frame-pointer avx2
gcc -m32 -march=pentium-m -O3 -fomit-frame-pointer avx2
gcc -m32 -march=pentium-m -O -fomit-frame-pointer avx2
gcc -m32 -march=pentium-m -Os -fomit-frame-pointer avx2
gcc -m32 -march=pentium-mmx -O2 -fomit-frame-pointer avx2
gcc -m32 -march=pentium-mmx -O3 -fomit-frame-pointer avx2
gcc -m32 -march=pentium-mmx -O -fomit-frame-pointer avx2
gcc -m32 -march=pentium-mmx -Os -fomit-frame-pointer avx2
gcc -m32 -march=pentium2 -O2 -fomit-frame-pointer avx2
gcc -m32 -march=pentium2 -O3 -fomit-frame-pointer avx2
gcc -m32 -march=pentium2 -O -fomit-frame-pointer avx2
gcc -m32 -march=pentium2 -Os -fomit-frame-pointer avx2
gcc -m32 -march=pentium3 -O2 -fomit-frame-pointer avx2
gcc -m32 -march=pentium3 -O3 -fomit-frame-pointer avx2
gcc -m32 -march=pentium3 -O -fomit-frame-pointer avx2
gcc -m32 -march=pentium3 -Os -fomit-frame-pointer avx2
gcc -m32 -march=pentium4 -O2 -fomit-frame-pointer avx2
gcc -m32 -march=pentium4 -O3 -fomit-frame-pointer avx2
gcc -m32 -march=pentium4 -O -fomit-frame-pointer avx2
gcc -m32 -march=pentium4 -Os -fomit-frame-pointer avx2
gcc -m32 -march=pentium -O2 -fomit-frame-pointer avx2
gcc -m32 -march=pentium -O3 -fomit-frame-pointer avx2
gcc -m32 -march=pentium -O -fomit-frame-pointer avx2
gcc -m32 -march=pentium -Os -fomit-frame-pointer avx2
gcc -m32 -march=pentiumpro -O2 -fomit-frame-pointer avx2
gcc -m32 -march=pentiumpro -O3 -fomit-frame-pointer avx2
gcc -m32 -march=pentiumpro -O -fomit-frame-pointer avx2
gcc -m32 -march=pentiumpro -Os -fomit-frame-pointer avx2
gcc -m32 -march=prescott -O2 -fomit-frame-pointer avx2
gcc -m32 -march=prescott -O3 -fomit-frame-pointer avx2
gcc -m32 -march=prescott -O -fomit-frame-pointer avx2
gcc -m32 -march=prescott -Os -fomit-frame-pointer avx2
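
None of the option sets listed above enables AVX2, which matches the "target specific option mismatch" error: GCC cannot inline `_mm256_xor_si256` into a function compiled without AVX2. The sketch below shows one common way to keep such code buildable under any flag set, by selecting the intrinsic path only when the compiler defines `__AVX2__`. It is an illustration under assumptions, not how the avx2 implementation or SUPERCOP is organized; the names `xor_lanes4` and `built_with_avx2` are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#if defined(__AVX2__)
#include <immintrin.h>  /* only pulled in when AVX2 code generation is enabled */
#endif

/* Hypothetical helper: XOR four 64-bit lanes of src into dst. */
void xor_lanes4(uint64_t *dst, const uint64_t *src)
{
    assert(dst != NULL && src != NULL);
#if defined(__AVX2__)
    /* Built with -mavx2 (or a -march that implies it): use the 256-bit intrinsic. */
    __m256i a = _mm256_loadu_si256((const __m256i *)dst);
    __m256i b = _mm256_loadu_si256((const __m256i *)src);
    _mm256_storeu_si256((__m256i *)dst, _mm256_xor_si256(a, b));
#else
    /* AVX2 not enabled at compile time: portable scalar fallback. */
    for (size_t i = 0; i < 4; i++)
        dst[i] ^= src[i];
#endif
}

/* Returns 1 if this translation unit was compiled with AVX2 enabled, else 0. */
int built_with_avx2(void)
{
#if defined(__AVX2__)
    return 1;
#else
    return 0;
#endif
}
```

With a flag set such as `gcc -m32 -O2 -fomit-frame-pointer` this compiles the scalar loop; adding `-mavx2` selects the intrinsic path, which then needs a CPU with AVX2 at run time.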

Compiler output

Implementation: crypto_kem/kyber1024/avx2
Compiler: gcc -m32 -march=barcelona -O2 -fomit-frame-pointer
KeccakP-1600-times4-SIMD256.c: KeccakP-1600-times4-SIMD256.c: In function 'KeccakP1600times4_AddLanesAll':
KeccakP-1600-times4-SIMD256.c: KeccakP-1600-times4-SIMD256.c:135:40: warning: AVX vector return without AVX enabled changes the ABI [-Wpsabi]
KeccakP-1600-times4-SIMD256.c: #define Xor_In4( argIndex ) lanes0 = LOAD256u( curData0[argIndex]),\
KeccakP-1600-times4-SIMD256.c: ^
KeccakP-1600-times4-SIMD256.c: KeccakP-1600-times4-SIMD256.c:146:9: note: in expansion of macro 'Xor_In4'
KeccakP-1600-times4-SIMD256.c: Xor_In4( 0 );
KeccakP-1600-times4-SIMD256.c: ^~~~~~~
KeccakP-1600-times4-SIMD256.c: In file included from /usr/lib/gcc/x86_64-linux-gnu/7/include/immintrin.h:43:0,
KeccakP-1600-times4-SIMD256.c: from KeccakP-1600-times4-SIMD256.c:21:
KeccakP-1600-times4-SIMD256.c: /usr/lib/gcc/x86_64-linux-gnu/7/include/avx2intrin.h:913:1: error: inlining failed in call to always_inline '_mm256_xor_si256': target specific option mismatch
KeccakP-1600-times4-SIMD256.c: _mm256_xor_si256 (__m256i __A, __m256i __B)
KeccakP-1600-times4-SIMD256.c: ^~~~~~~~~~~~~~~~
KeccakP-1600-times4-SIMD256.c: KeccakP-1600-times4-SIMD256.c:55:41: note: called from here
KeccakP-1600-times4-SIMD256.c: #define XOReq256(a, b) a = _mm256_xor_si256(a, b)
KeccakP-1600-times4-SIMD256.c: ^~~~~~~~~~~~~~~~~~~~~~
KeccakP-1600-times4-SIMD256.c: KeccakP-1600-times4-SIMD256.c:143:33: note: in expansion of macro 'XOReq256'
KeccakP-1600-times4-SIMD256.c: XOReq256( stateAsLanes[argIndex+3], lanes3 )
KeccakP-1600-times4-SIMD256.c: ^~~~~~~~
KeccakP-1600-times4-SIMD256.c: KeccakP-1600-times4-SIMD256.c:149:9: note: in expansion of macro 'Xor_In4'
KeccakP-1600-times4-SIMD256.c: Xor_In4( 12 );
KeccakP-1600-times4-SIMD256.c: ^~~~~~~
KeccakP-1600-times4-SIMD256.c: In file included from /usr/lib/gcc/x86_64-linux-gnu/7/include/immintrin.h:43:0,
KeccakP-1600-times4-SIMD256.c: from KeccakP-1600-times4-SIMD256.c:21:
KeccakP-1600-times4-SIMD256.c: /usr/lib/gcc/x86_64-linux-gnu/7/include/avx2intrin.h:913:1: error: inlining failed in call to always_inline '_mm256_xor_si256': target specific option mismatch
KeccakP-1600-times4-SIMD256.c: _mm256_xor_si256 (__m256i __A, __m256i __B)
KeccakP-1600-times4-SIMD256.c: ...
KeccakP-1600-times4-SIMD256.c: KeccakP-1600-times4-SIMD256.c: In function 'KeccakP1600times4_AddLanesAll':
KeccakP-1600-times4-SIMD256.c: KeccakP-1600-times4-SIMD256.c:135:40: warning: AVX vector return without AVX enabled changes the ABI [-Wpsabi]
KeccakP-1600-times4-SIMD256.c: #define Xor_In4( argIndex ) lanes0 = LOAD256u( curData0[argIndex]),\
KeccakP-1600-times4-SIMD256.c: ^
KeccakP-1600-times4-SIMD256.c: KeccakP-1600-times4-SIMD256.c:146:9: note: in expansion of macro 'Xor_In4'
KeccakP-1600-times4-SIMD256.c: Xor_In4( 0 );
KeccakP-1600-times4-SIMD256.c: ^~~~~~~
KeccakP-1600-times4-SIMD256.c: In file included from /usr/lib/gcc/x86_64-linux-gnu/7/include/immintrin.h:43:0,
KeccakP-1600-times4-SIMD256.c: from KeccakP-1600-times4-SIMD256.c:21:
KeccakP-1600-times4-SIMD256.c: /usr/lib/gcc/x86_64-linux-gnu/7/include/avx2intrin.h:913:1: error: inlining failed in call to always_inline '_mm256_xor_si256': target specific option mismatch
KeccakP-1600-times4-SIMD256.c: _mm256_xor_si256 (__m256i __A, __m256i __B)
KeccakP-1600-times4-SIMD256.c: ^~~~~~~~~~~~~~~~
KeccakP-1600-times4-SIMD256.c: KeccakP-1600-times4-SIMD256.c:55:41: note: called from here
KeccakP-1600-times4-SIMD256.c: #define XOReq256(a, b) a = _mm256_xor_si256(a, b)
KeccakP-1600-times4-SIMD256.c: ^~~~~~~~~~~~~~~~~~~~~~
KeccakP-1600-times4-SIMD256.c: KeccakP-1600-times4-SIMD256.c:143:33: note: in expansion of macro 'XOReq256'
KeccakP-1600-times4-SIMD256.c: XOReq256( stateAsLanes[argIndex+3], lanes3 )
KeccakP-1600-times4-SIMD256.c: ^~~~~~~~
KeccakP-1600-times4-SIMD256.c: KeccakP-1600-times4-SIMD256.c:149:9: note: in expansion of macro 'Xor_In4'
KeccakP-1600-times4-SIMD256.c: Xor_In4( 12 );
KeccakP-1600-times4-SIMD256.c: ^~~~~~~
KeccakP-1600-times4-SIMD256.c: In file included from /usr/lib/gcc/x86_64-linux-gnu/7/include/immintrin.h:43:0,
KeccakP-1600-times4-SIMD256.c: from KeccakP-1600-times4-SIMD256.c:21:
KeccakP-1600-times4-SIMD256.c: /usr/lib/gcc/x86_64-linux-gnu/7/include/avx2intrin.h:913:1: error: inlining failed in call to always_inline '_mm256_xor_si256': target specific option mismatch
KeccakP-1600-times4-SIMD256.c: _mm256_xor_si256 (__m256i __A, __m256i __B)
KeccakP-1600-times4-SIMD256.c: ...

Number of similar (compiler,implementation) pairs: 4, namely:
Compiler Implementations
gcc -m32 -march=barcelona -O2 -fomit-frame-pointer avx2
gcc -m32 -march=barcelona -O3 -fomit-frame-pointer avx2
gcc -m32 -march=barcelona -O -fomit-frame-pointer avx2
gcc -m32 -march=barcelona -Os -fomit-frame-pointer avx2

Compiler output

Implementation: crypto_kem/kyber1024/avx2
Compiler: gcc -m32 -march=core-avx-i -O2 -fomit-frame-pointer
KeccakP-1600-times4-SIMD256.c: In file included from /usr/lib/gcc/x86_64-linux-gnu/7/include/immintrin.h:43:0,
KeccakP-1600-times4-SIMD256.c: from KeccakP-1600-times4-SIMD256.c:21:
KeccakP-1600-times4-SIMD256.c: KeccakP-1600-times4-SIMD256.c: In function 'KeccakP1600times4_AddLanesAll':
KeccakP-1600-times4-SIMD256.c: /usr/lib/gcc/x86_64-linux-gnu/7/include/avx2intrin.h:913:1: error: inlining failed in call to always_inline '_mm256_xor_si256': target specific option mismatch
KeccakP-1600-times4-SIMD256.c: _mm256_xor_si256 (__m256i __A, __m256i __B)
KeccakP-1600-times4-SIMD256.c: ^~~~~~~~~~~~~~~~
KeccakP-1600-times4-SIMD256.c: KeccakP-1600-times4-SIMD256.c:55:41: note: called from here
KeccakP-1600-times4-SIMD256.c: #define XOReq256(a, b) a = _mm256_xor_si256(a, b)
KeccakP-1600-times4-SIMD256.c: ^~~~~~~~~~~~~~~~~~~~~~
KeccakP-1600-times4-SIMD256.c: KeccakP-1600-times4-SIMD256.c:143:33: note: in expansion of macro 'XOReq256'
KeccakP-1600-times4-SIMD256.c: XOReq256( stateAsLanes[argIndex+3], lanes3 )
KeccakP-1600-times4-SIMD256.c: ^~~~~~~~
KeccakP-1600-times4-SIMD256.c: KeccakP-1600-times4-SIMD256.c:149:9: note: in expansion of macro 'Xor_In4'
KeccakP-1600-times4-SIMD256.c: Xor_In4( 12 );
KeccakP-1600-times4-SIMD256.c: ^~~~~~~
KeccakP-1600-times4-SIMD256.c: In file included from /usr/lib/gcc/x86_64-linux-gnu/7/include/immintrin.h:43:0,
KeccakP-1600-times4-SIMD256.c: from KeccakP-1600-times4-SIMD256.c:21:
KeccakP-1600-times4-SIMD256.c: /usr/lib/gcc/x86_64-linux-gnu/7/include/avx2intrin.h:913:1: error: inlining failed in call to always_inline '_mm256_xor_si256': target specific option mismatch
KeccakP-1600-times4-SIMD256.c: _mm256_xor_si256 (__m256i __A, __m256i __B)
KeccakP-1600-times4-SIMD256.c: ^~~~~~~~~~~~~~~~
KeccakP-1600-times4-SIMD256.c: KeccakP-1600-times4-SIMD256.c:55:41: note: called from here
KeccakP-1600-times4-SIMD256.c: #define XOReq256(a, b) a = _mm256_xor_si256(a, b)
KeccakP-1600-times4-SIMD256.c: ^~~~~~~~~~~~~~~~~~~~~~
KeccakP-1600-times4-SIMD256.c: KeccakP-1600-times4-SIMD256.c:142:33: note: in expansion of macro 'XOReq256'
KeccakP-1600-times4-SIMD256.c: XOReq256( stateAsLanes[argIndex+2], lanes2 ),\
KeccakP-1600-times4-SIMD256.c: ...

Number of similar (compiler,implementation) pairs: 8, namely:
Compiler Implementations
gcc -m32 -march=core-avx-i -O2 -fomit-frame-pointer avx2
gcc -m32 -march=core-avx-i -O3 -fomit-frame-pointer avx2
gcc -m32 -march=core-avx-i -O -fomit-frame-pointer avx2
gcc -m32 -march=core-avx-i -Os -fomit-frame-pointer avx2
gcc -m32 -march=corei7-avx -O2 -fomit-frame-pointer avx2
gcc -m32 -march=corei7-avx -O3 -fomit-frame-pointer avx2
gcc -m32 -march=corei7-avx -O -fomit-frame-pointer avx2
gcc -m32 -march=corei7-avx -Os -fomit-frame-pointer avx2
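
In this group the `-Wpsabi` warning no longer appears but the inlining error remains. That is consistent with `-march=core-avx-i` and `-march=corei7-avx` enabling AVX but not AVX2, while `_mm256_xor_si256` is declared in `avx2intrin.h`. The following sketch reports which vector level a given flag set turns on, using GCC's predefined macros `__AVX2__`, `__AVX__` and `__SSE2__`; the function names and the report format are hypothetical.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Highest vector extension enabled for this translation unit at compile time. */
const char *compiled_vector_level(void)
{
#if defined(__AVX2__)
    return "avx2";
#elif defined(__AVX__)
    return "avx";
#elif defined(__SSE2__)
    return "sse2";
#else
    return "baseline";
#endif
}

/* Writes a one-line report into buf; returns 1 if AVX2 intrinsics can be inlined. */
int describe_build(char *buf, size_t n)
{
    assert(buf != NULL && n > 0);
    const char *level = compiled_vector_level();
    int avx2_ok = strcmp(level, "avx2") == 0;
    snprintf(buf, n, "vector level: %s, AVX2 intrinsics usable: %s",
             level, avx2_ok ? "yes" : "no");
    return avx2_ok;
}
```

Compiling this once per flag set and printing the report is a quick way to predict which rows of a table like the one above would hit the inlining error.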

Compiler output

Implementation: crypto_kem/kyber1024/avx2
Compiler: gcc -m32 -march=core-avx2 -O2 -fomit-frame-pointer
basemul.S: basemul.S: Assembler messages:
basemul.S: basemul.S:79: Error: bad register name `%rip)'
basemul.S: basemul.S:80: Error: bad register name `%rip)'
basemul.S: basemul.S:81: Error: bad register name `%rcx)'
basemul.S: basemul.S:84: Error: bad register name `%rsi)'
basemul.S: basemul.S:84: Error: bad register name `%rdx)'
basemul.S: basemul.S:84: Error: bad register name `%rsi)'
basemul.S: basemul.S:84: Error: bad register name `%rdx)'
basemul.S: basemul.S:84: Error: bad register name `%ymm8'
basemul.S: basemul.S:84: Error: bad register name `%ymm8'
basemul.S: basemul.S:84: Error: bad register name `%ymm10'
basemul.S: basemul.S:84: Error: bad register name `%ymm10'
basemul.S: basemul.S:84: Error: bad register name `%ymm9'
basemul.S: basemul.S:84: Error: bad register name `%ymm9'
basemul.S: basemul.S:84: Error: bad register name `%ymm9'
basemul.S: basemul.S:84: Error: bad register name `%ymm9'
basemul.S: basemul.S:84: Error: bad register name `%ymm11'
basemul.S: basemul.S:84: Error: bad register name `%ymm11'
basemul.S: basemul.S:84: Error: bad register name `%ymm11'
basemul.S: basemul.S:84: Error: bad register name `%ymm11'
basemul.S: basemul.S:84: Error: bad register name `%ymm11'
basemul.S: basemul.S:84: Error: bad register name `%ymm13'
basemul.S: basemul.S:84: Error: bad register name `%ymm13'
basemul.S: basemul.S:84: Error: bad register name `%ymm8'
basemul.S: basemul.S:84: Error: bad register name `%ymm8'
basemul.S: ...

Number of similar (compiler,implementation) pairs: 8, namely:
Compiler Implementations
gcc -m32 -march=core-avx2 -O2 -fomit-frame-pointer avx2
gcc -m32 -march=core-avx2 -O3 -fomit-frame-pointer avx2
gcc -m32 -march=core-avx2 -O -fomit-frame-pointer avx2
gcc -m32 -march=core-avx2 -Os -fomit-frame-pointer avx2
gcc -m32 -march=native -mtune=native -O2 -fomit-frame-pointer avx2
gcc -m32 -march=native -mtune=native -O3 -fomit-frame-pointer avx2
gcc -m32 -march=native -mtune=native -O -fomit-frame-pointer avx2
gcc -m32 -march=native -mtune=native -Os -fomit-frame-pointer avx2
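
For these flag sets the reported errors move to `basemul.S`. The registers the assembler rejects (`%rip`, `%rsi`, `%rdx`, `%rcx`, and `%ymm8` and above) exist only in 64-bit mode, whereas `-m32` selects 32-bit code. The sketch below shows a typical guard for 64-bit-only assembly, using GCC-style inline assembly for brevity. `add64` and `HAVE_X86_64_ASM` are hypothetical and unrelated to what `basemul.S` computes.

```c
#include <assert.h>
#include <stdint.h>

/* Use the 64-bit assembly path only on x86-64 with a GCC-compatible compiler. */
#if defined(__x86_64__) && defined(__GNUC__)
#define HAVE_X86_64_ASM 1
#else
#define HAVE_X86_64_ASM 0
#endif

/* Hypothetical example: 64-bit addition with wraparound. */
uint64_t add64(uint64_t a, uint64_t b)
{
#if HAVE_X86_64_ASM
    /* addq needs 64-bit registers, so it cannot be assembled under -m32. */
    __asm__("addq %1, %0" : "+r"(a) : "r"(b) : "cc");
    return a;
#else
    /* Portable fallback for 32-bit or non-x86 targets. */
    return a + b;
#endif
}

/* Small self-check that both paths must satisfy. */
void add64_self_check(void)
{
    assert(add64(2, 3) == 5);
    assert(add64(UINT64_MAX, 1) == 0);
}
```

A standalone `.S` file can apply the same idea by wrapping its body in `#if defined(__x86_64__)`, since GCC runs the C preprocessor on `.S` inputs.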