Implementation notes: x86, titan0, crypto_kem/kyber768

Computer: titan0
Architecture: x86
CPU ID: GenuineIntel-000306c3-bfebfbff
SUPERCOP version: 20190803
Operation: crypto_kem
Primitive: kyber768
Time  Implementation  Compiler  Benchmark date  SUPERCOP version
1019236  ref  gcc -funroll-loops -m32 -march=pentium-m -O3 -fomit-frame-pointer  20190810  20190803
1045992  ref  gcc -m32 -march=core2 -msse4.1 -O3 -fomit-frame-pointer  20190810  20190803
1048376  ref  gcc -m32 -march=pentium-m -O3 -fomit-frame-pointer  20190810  20190803
1054084  ref  gcc -m32 -march=core2 -msse4 -O3 -fomit-frame-pointer  20190810  20190803
1056300  ref  gcc -funroll-loops -m32 -march=pentium-m -O2 -fomit-frame-pointer  20190810  20190803
1060832  ref  gcc -m32 -march=corei7 -O3 -fomit-frame-pointer  20190810  20190803
1062452  ref  gcc -m32 -march=core2 -O3 -fomit-frame-pointer  20190810  20190803
1094096  ref  gcc -m32 -march=core-avx2 -O3 -fomit-frame-pointer  20190810  20190803
1099064  ref  gcc -m32 -march=native -mtune=native -O3 -fomit-frame-pointer  20190810  20190803
1109252  ref  gcc -funroll-loops -m32 -march=pentium-m -O -fomit-frame-pointer  20190810  20190803
1113716  ref  gcc -m32 -march=core-avx2 -O2 -fomit-frame-pointer  20190810  20190803
1115636  ref  gcc -m32 -march=native -mtune=native -O2 -fomit-frame-pointer  20190810  20190803
1121940  ref  gcc -m32 -march=corei7-avx -O2 -fomit-frame-pointer  20190810  20190803
1122704  ref  gcc -m32 -march=core-avx-i -O2 -fomit-frame-pointer  20190810  20190803
1131604  ref  gcc -m32 -march=core2 -msse4.1 -O2 -fomit-frame-pointer  20190810  20190803
1133824  ref  gcc -m32 -march=pentium-m -O2 -fomit-frame-pointer  20190810  20190803
1136600  ref  gcc -m32 -march=corei7 -O2 -fomit-frame-pointer  20190810  20190803
1138852  ref  gcc -m32 -march=core2 -msse4 -O2 -fomit-frame-pointer  20190810  20190803
1141652  ref  gcc -funroll-loops -m32 -march=k8 -O -fomit-frame-pointer  20190810  20190803
1144384  ref  gcc -funroll-loops -m32 -march=barcelona -O -fomit-frame-pointer  20190810  20190803
1147448  ref  gcc -m32 -march=corei7-avx -O3 -fomit-frame-pointer  20190810  20190803
1156784  ref  gcc -m32 -march=core-avx-i -O3 -fomit-frame-pointer  20190810  20190803
1167176  ref  gcc -m32 -march=native -mtune=native -Os -fomit-frame-pointer  20190810  20190803
1172388  ref  gcc -m32 -march=core2 -O2 -fomit-frame-pointer  20190810  20190803
1177232  ref  gcc -m32 -march=core-avx2 -Os -fomit-frame-pointer  20190810  20190803
1191176  ref  gcc -m32 -march=core-avx-i -Os -fomit-frame-pointer  20190810  20190803
1194084  ref  gcc -m32 -march=corei7-avx -Os -fomit-frame-pointer  20190810  20190803
1196320  ref  gcc -m32 -march=core-avx2 -O -fomit-frame-pointer  20190810  20190803
1197000  ref  gcc -m32 -march=native -mtune=native -O -fomit-frame-pointer  20190810  20190803
1208008  ref  gcc -m32 -march=pentium-m -O -fomit-frame-pointer  20190810  20190803
1225804  ref  gcc -m32 -march=k8 -O -fomit-frame-pointer  20190810  20190803
1249152  ref  gcc -funroll-loops -m32 -march=nocona -O3 -fomit-frame-pointer  20190810  20190803
1254116  ref  gcc -m32 -march=barcelona -O -fomit-frame-pointer  20190810  20190803
1256620  ref  gcc -funroll-loops -m32 -march=athlon -O3 -fomit-frame-pointer  20190810  20190803
1258320  ref  gcc -m32 -march=core2 -msse4.1 -O -fomit-frame-pointer  20190810  20190803
1258424  ref  gcc -funroll-loops -m32 -march=nocona -O2 -fomit-frame-pointer  20190810  20190803
1261500  ref  gcc -m32 -march=core2 -msse4 -O -fomit-frame-pointer  20190810  20190803
1265812  ref  gcc -m32 -march=corei7-avx -O -fomit-frame-pointer  20190810  20190803
1267056  ref  gcc -m32 -march=core-avx-i -O -fomit-frame-pointer  20190810  20190803
1267132  ref  gcc -m32 -march=core2 -O -fomit-frame-pointer  20190810  20190803
1268740  ref  gcc -m32 -march=corei7 -O -fomit-frame-pointer  20190810  20190803
1269236  ref  gcc -m32 -march=prescott -O3 -fomit-frame-pointer  20190810  20190803
1269428  ref  gcc -m32 -march=nocona -O3 -fomit-frame-pointer  20190810  20190803
1270688  ref  gcc -funroll-loops -m32 -march=prescott -O2 -fomit-frame-pointer  20190810  20190803
1273336  ref  gcc -funroll-loops -m32 -march=prescott -O3 -fomit-frame-pointer  20190810  20190803
1278032  ref  gcc -funroll-loops -m32 -march=pentium2 -O -fomit-frame-pointer  20190810  20190803
1281408  ref  gcc -funroll-loops -m32 -march=athlon -O2 -fomit-frame-pointer  20190810  20190803
1282800  ref  gcc -funroll-loops -m32 -march=pentiumpro -O -fomit-frame-pointer  20190810  20190803
1283248  ref  gcc -funroll-loops -m32 -O3 -fomit-frame-pointer  20190810  20190803
1284288  ref  gcc -m32 -march=athlon -O3 -fomit-frame-pointer  20190810  20190803
1289948  ref  gcc -funroll-loops -m32 -march=pentiumpro -O3 -fomit-frame-pointer  20190810  20190803
1291116  ref  gcc -funroll-loops -m32 -march=nocona -O -fomit-frame-pointer  20190810  20190803
1291848  ref  gcc -funroll-loops -m32 -march=prescott -O -fomit-frame-pointer  20190810  20190803
1292760  ref  gcc -funroll-loops -m32 -march=pentium3 -O -fomit-frame-pointer  20190810  20190803
1295112  ref  gcc -funroll-loops -m32 -march=pentium2 -O3 -fomit-frame-pointer  20190810  20190803
1300604  ref  gcc -funroll-loops -m32 -O2 -fomit-frame-pointer  20190810  20190803
1303984  ref  gcc -funroll-loops -m32 -march=athlon -O -fomit-frame-pointer  20190810  20190803
1304664  ref  gcc -funroll-loops -m32 -march=pentium3 -O3 -fomit-frame-pointer  20190810  20190803
1309092  ref  gcc -funroll-loops -m32 -O -fomit-frame-pointer  20190810  20190803
1318212  ref  gcc -funroll-loops -m32 -march=pentiumpro -O2 -fomit-frame-pointer  20190810  20190803
1322952  ref  gcc -funroll-loops -m32 -march=pentium3 -O2 -fomit-frame-pointer  20190810  20190803
1324568  ref  gcc -m32 -O3 -fomit-frame-pointer  20190810  20190803
1325336  ref  gcc -funroll-loops -m32 -march=pentium2 -O2 -fomit-frame-pointer  20190810  20190803
1326532  ref  gcc -funroll-loops -m32 -march=k6-3 -O -fomit-frame-pointer  20190810  20190803
1331804  ref  gcc -funroll-loops -m32 -march=k6-2 -O -fomit-frame-pointer  20190810  20190803
1332388  ref  gcc -funroll-loops -m32 -march=pentium-m -Os -fomit-frame-pointer  20190810  20190803
1336460  ref  gcc -funroll-loops -m32 -march=k6 -O -fomit-frame-pointer  20190810  20190803
1338004  ref  gcc -m32 -march=pentium2 -O3 -fomit-frame-pointer  20190810  20190803
1342484  ref  gcc -funroll-loops -m32 -march=k6-2 -O3 -fomit-frame-pointer  20190810  20190803
1347468  ref  gcc -funroll-loops -m32 -march=pentium4 -O3 -fomit-frame-pointer  20190810  20190803
1351096  ref  gcc -m32 -march=pentium3 -O3 -fomit-frame-pointer  20190810  20190803
1351124  ref  gcc -m32 -march=k6-3 -O3 -fomit-frame-pointer  20190810  20190803
1352520  ref  gcc -funroll-loops -m32 -march=k6-3 -O3 -fomit-frame-pointer  20190810  20190803
1354052  ref  gcc -m32 -march=prescott -O2 -fomit-frame-pointer  20190810  20190803
1355384  ref  gcc -m32 -march=nocona -O2 -fomit-frame-pointer  20190810  20190803
1356736  ref  gcc -m32 -march=k6-2 -O3 -fomit-frame-pointer  20190810  20190803
1359592  ref  gcc -m32 -march=pentiumpro -O3 -fomit-frame-pointer  20190810  20190803
1363516  ref  gcc -funroll-loops -m32 -march=k6 -O3 -fomit-frame-pointer  20190810  20190803
1366732  ref  gcc -m32 -march=core2 -Os -fomit-frame-pointer  20190810  20190803
1369788  ref  gcc -m32 -march=nocona -O -fomit-frame-pointer  20190810  20190803
1370192  ref  gcc -m32 -march=pentium4 -O3 -fomit-frame-pointer  20190810  20190803
1370516  ref  gcc -m32 -march=pentium-m -Os -fomit-frame-pointer  20190810  20190803
1371976  ref  gcc -m32 -march=k6 -O3 -fomit-frame-pointer  20190810  20190803
1372536  ref  gcc -m32 -march=prescott -Os -fomit-frame-pointer  20190810  20190803
1373836  ref  gcc -funroll-loops -m32 -march=k6-3 -O2 -fomit-frame-pointer  20190810  20190803
1374888  ref  gcc -funroll-loops -m32 -march=prescott -Os -fomit-frame-pointer  20190810  20190803
1376188  ref  gcc -m32 -march=core2 -msse4.1 -Os -fomit-frame-pointer  20190810  20190803
1376304  ref  gcc -funroll-loops -m32 -march=pentium4 -Os -fomit-frame-pointer  20190810  20190803
1377016  ref  gcc -m32 -march=core2 -msse4 -Os -fomit-frame-pointer  20190810  20190803
1377360  ref  gcc -m32 -march=prescott -O -fomit-frame-pointer  20190810  20190803
1377732  ref  gcc -m32 -march=nocona -Os -fomit-frame-pointer  20190810  20190803
1378600  ref  gcc -funroll-loops -m32 -march=k6 -O2 -fomit-frame-pointer  20190810  20190803
1379672  ref  gcc -m32 -march=pentiumpro -O -fomit-frame-pointer  20190810  20190803
1381628  ref  gcc -funroll-loops -m32 -march=k6-2 -O2 -fomit-frame-pointer  20190810  20190803
1382080  ref  gcc -m32 -march=corei7 -Os -fomit-frame-pointer  20190810  20190803
1382680  ref  gcc -funroll-loops -m32 -march=nocona -Os -fomit-frame-pointer  20190810  20190803
1382832  ref  gcc -funroll-loops -m32 -march=pentium4 -O2 -fomit-frame-pointer  20190810  20190803
1382864  ref  gcc -m32 -march=pentium3 -O -fomit-frame-pointer  20190810  20190803
1386648  ref  gcc -m32 -march=pentium2 -O -fomit-frame-pointer  20190810  20190803
1387160  ref  gcc -funroll-loops -m32 -march=pentium4 -O -fomit-frame-pointer  20190810  20190803
1387888  ref  gcc -m32 -march=pentium4 -Os -fomit-frame-pointer  20190810  20190803
1393812  ref  gcc -m32 -march=athlon -O -fomit-frame-pointer  20190810  20190803
1411828  ref  gcc -funroll-loops -m32 -march=k6-2 -Os -fomit-frame-pointer  20190810  20190803
1413320  ref  gcc -funroll-loops -m32 -march=k6 -Os -fomit-frame-pointer  20190810  20190803
1418520  ref  gcc -funroll-loops -m32 -march=k6-3 -Os -fomit-frame-pointer  20190810  20190803
1419920  ref  gcc -m32 -march=k6 -Os -fomit-frame-pointer  20190810  20190803
1425180  ref  gcc -m32 -march=k6-3 -Os -fomit-frame-pointer  20190810  20190803
1430640  ref  gcc -m32 -march=k6-2 -Os -fomit-frame-pointer  20190810  20190803
1433164  ref  gcc -funroll-loops -m32 -march=pentium-mmx -Os -fomit-frame-pointer  20190810  20190803
1436388  ref  gcc -funroll-loops -m32 -Os -fomit-frame-pointer  20190810  20190803
1436768  ref  gcc -m32 -march=athlon -Os -fomit-frame-pointer  20190810  20190803
1438924  ref  gcc -funroll-loops -m32 -march=pentium -Os -fomit-frame-pointer  20190810  20190803
1441072  ref  gcc -m32 -march=pentiumpro -O2 -fomit-frame-pointer  20190810  20190803
1441632  ref  gcc -m32 -march=pentium3 -O2 -fomit-frame-pointer  20190810  20190803
1442644  ref  gcc -m32 -march=i486 -Os -fomit-frame-pointer  20190810  20190803
1444228  ref  gcc -m32 -march=i386 -Os -fomit-frame-pointer  20190810  20190803
1444344  ref  gcc -funroll-loops -m32 -march=athlon -Os -fomit-frame-pointer  20190810  20190803
1445560  ref  gcc -m32 -march=pentium-mmx -Os -fomit-frame-pointer  20190810  20190803
1446496  ref  gcc -m32 -march=athlon -O2 -fomit-frame-pointer  20190810  20190803
1447048  ref  gcc -m32 -O -fomit-frame-pointer  20190810  20190803
1448040  ref  gcc -funroll-loops -m32 -march=i386 -Os -fomit-frame-pointer  20190810  20190803
1448980  ref  gcc -funroll-loops -m32 -march=i486 -Os -fomit-frame-pointer  20190810  20190803
1452468  ref  gcc -m32 -Os -fomit-frame-pointer  20190810  20190803
1454432  ref  gcc -m32 -march=pentium2 -O2 -fomit-frame-pointer  20190810  20190803
1455516  ref  gcc -m32 -O2 -fomit-frame-pointer  20190810  20190803
1460740  ref  gcc -m32 -march=pentium3 -Os -fomit-frame-pointer  20190810  20190803
1462176  ref  gcc -m32 -march=pentium -Os -fomit-frame-pointer  20190810  20190803
1463472  ref  gcc -m32 -march=pentiumpro -Os -fomit-frame-pointer  20190810  20190803
1463988  ref  gcc -funroll-loops -m32 -march=pentiumpro -Os -fomit-frame-pointer  20190810  20190803
1469164  ref  gcc -m32 -march=pentium2 -Os -fomit-frame-pointer  20190810  20190803
1472048  ref  gcc -m32 -march=k6 -O -fomit-frame-pointer  20190810  20190803
1477432  ref  gcc -m32 -march=pentium4 -O -fomit-frame-pointer  20190810  20190803
1478420  ref  gcc -funroll-loops -m32 -march=pentium3 -Os -fomit-frame-pointer  20190810  20190803
1479132  ref  gcc -m32 -march=k6-2 -O -fomit-frame-pointer  20190810  20190803
1479228  ref  gcc -funroll-loops -m32 -march=pentium2 -Os -fomit-frame-pointer  20190810  20190803
1483620  ref  gcc -m32 -march=k6-3 -O -fomit-frame-pointer  20190810  20190803
1490144  ref  gcc -m32 -march=pentium4 -O2 -fomit-frame-pointer  20190810  20190803
1508660  ref  gcc -m32 -march=k6 -O2 -fomit-frame-pointer  20190810  20190803
1509884  ref  gcc -m32 -march=k6-3 -O2 -fomit-frame-pointer  20190810  20190803
1516972  ref  gcc -funroll-loops -m32 -march=pentium-mmx -O -fomit-frame-pointer  20190810  20190803
1518248  ref  gcc -m32 -march=k6-2 -O2 -fomit-frame-pointer  20190810  20190803
1535284  ref  gcc -funroll-loops -m32 -march=pentium -O -fomit-frame-pointer  20190810  20190803
1537436  ref  gcc -funroll-loops -m32 -march=i486 -O -fomit-frame-pointer  20190810  20190803
1543380  ref  gcc -funroll-loops -m32 -march=i486 -O3 -fomit-frame-pointer  20190810  20190803
1548732  ref  gcc -funroll-loops -m32 -march=i386 -O -fomit-frame-pointer  20190810  20190803
1559408  ref  gcc -funroll-loops -m32 -march=i386 -O3 -fomit-frame-pointer  20190810  20190803
1589052  ref  gcc -m32 -march=i486 -O3 -fomit-frame-pointer  20190810  20190803
1592576  ref  gcc -funroll-loops -m32 -march=i386 -O2 -fomit-frame-pointer  20190810  20190803
1593740  ref  gcc -funroll-loops -m32 -march=i486 -O2 -fomit-frame-pointer  20190810  20190803
1607208  ref  gcc -m32 -march=i386 -O3 -fomit-frame-pointer  20190810  20190803
1615804  ref  gcc -m32 -march=pentium-mmx -O -fomit-frame-pointer  20190810  20190803
1625900  ref  gcc -m32 -march=pentium -O -fomit-frame-pointer  20190810  20190803
1656072  ref  gcc -m32 -march=i386 -O -fomit-frame-pointer  20190810  20190803
1666464  ref  gcc -m32 -march=i486 -O -fomit-frame-pointer  20190810  20190803
1674600  ref  gcc -funroll-loops -m32 -march=pentium -O3 -fomit-frame-pointer  20190810  20190803
1682656  ref  gcc -funroll-loops -m32 -march=pentium-mmx -O3 -fomit-frame-pointer  20190810  20190803
1705000  ref  gcc -funroll-loops -m32 -march=pentium -O2 -fomit-frame-pointer  20190810  20190803
1705160  ref  gcc -m32 -march=i386 -O2 -fomit-frame-pointer  20190810  20190803
1705536  ref  gcc -m32 -march=i486 -O2 -fomit-frame-pointer  20190810  20190803
1709948  ref  gcc -m32 -march=pentium -O3 -fomit-frame-pointer  20190810  20190803
1724616  ref  gcc -funroll-loops -m32 -march=pentium-mmx -O2 -fomit-frame-pointer  20190810  20190803
1726032  ref  gcc -m32 -march=pentium-mmx -O3 -fomit-frame-pointer  20190810  20190803
1792056  ref  gcc -m32 -march=pentium-mmx -O2 -fomit-frame-pointer  20190810  20190803
1804048  ref  gcc -m32 -march=pentium -O2 -fomit-frame-pointer  20190810  20190803
2324900  ref  gcc -funroll-loops -m32 -march=barcelona -O3 -fomit-frame-pointer  20190810  20190803
2367952  ref  gcc -m32 -march=barcelona -O3 -fomit-frame-pointer  20190810  20190803
2401476  ref  gcc -funroll-loops -m32 -march=k8 -O2 -fomit-frame-pointer  20190810  20190803
2403264  ref  gcc -funroll-loops -m32 -march=barcelona -O2 -fomit-frame-pointer  20190810  20190803
2421036  ref  gcc -funroll-loops -m32 -march=k8 -O3 -fomit-frame-pointer  20190810  20190803
2467308  ref  gcc -m32 -march=k8 -O3 -fomit-frame-pointer  20190810  20190803
2476352  ref  gcc -funroll-loops -m32 -march=barcelona -Os -fomit-frame-pointer  20190810  20190803
2501816  ref  gcc -m32 -march=k8 -O2 -fomit-frame-pointer  20190810  20190803
2509440  ref  gcc -m32 -march=barcelona -Os -fomit-frame-pointer  20190810  20190803
2522184  ref  gcc -m32 -march=barcelona -O2 -fomit-frame-pointer  20190810  20190803
2556992  ref  gcc -funroll-loops -m32 -march=k8 -Os -fomit-frame-pointer  20190810  20190803
2562164  ref  gcc -m32 -march=k8 -Os -fomit-frame-pointer  20190810  20190803
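
Each row of the table above carries five fields: time, implementation, compiler command line, benchmark date, and SUPERCOP version. The following is a minimal sketch of how such a row could be split for further analysis. The helper name `parse_row` and the regular expression are illustrative assumptions, not part of SUPERCOP, and the pattern assumes every compiler command starts with `gcc`, as it does on this page. It accepts rows whether the fields are run together or separated by whitespace. The unit of the time column is not stated on this page, so the value is kept as a plain integer.

```python
import re

# Hypothetical row pattern: time, implementation name, compiler command,
# then two 8-digit stamps (benchmark date, SUPERCOP version).
ROW = re.compile(
    r"^(?P<time>\d+)\s*"
    r"(?P<implementation>[a-z][a-z0-9]*?)\s*"
    r"(?P<compiler>gcc\b.*?)\s*"
    r"(?P<date>\d{8})\s*(?P<version>\d{8})$"
)


def parse_row(line: str) -> dict:
    """Split one results row into its five fields."""
    match = ROW.match(line.strip())
    if match is None:
        raise ValueError(f"unrecognized row: {line!r}")
    fields = match.groupdict()
    fields["time"] = int(fields["time"])
    return fields


if __name__ == "__main__":
    # First and last rows of the table, copied from this page.
    fastest = parse_row(
        "1019236refgcc -funroll-loops -m32 -march=pentium-m -O3 "
        "-fomit-frame-pointer2019081020190803"
    )
    slowest = parse_row(
        "2562164refgcc -m32 -march=k8 -Os -fomit-frame-pointer2019081020190803"
    )
    ratio = slowest["time"] / fastest["time"]
    print(f"{fastest['compiler']}: {fastest['time']}")
    print(f"{slowest['compiler']}: {slowest['time']}")
    print(f"slowest / fastest = {ratio:.2f}")
```

On the two rows used here, the slowest option set takes roughly 2.5 times as long as the fastest, which matches the visible jump in the table between the `pentium -O2` row and the `barcelona`/`k8` rows at `-O2`, `-O3`, and `-Os`.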

Compiler output

Implementation: crypto_kem/kyber768/avx2
Compiler: gcc -funroll-loops -m32 -O2 -fomit-frame-pointer
KeccakP-1600-times4-SIMD256.c: KeccakP-1600-times4-SIMD256.c: In function 'KeccakP1600times4_AddLanesAll':
KeccakP-1600-times4-SIMD256.c: KeccakP-1600-times4-SIMD256.c:135:40: warning: AVX vector return without AVX enabled changes the ABI [-Wpsabi]
KeccakP-1600-times4-SIMD256.c: #define Xor_In4( argIndex ) lanes0 = LOAD256u( curData0[argIndex]),\
KeccakP-1600-times4-SIMD256.c: ^
KeccakP-1600-times4-SIMD256.c: KeccakP-1600-times4-SIMD256.c:146:9: note: in expansion of macro 'Xor_In4'
KeccakP-1600-times4-SIMD256.c: Xor_In4( 0 );
KeccakP-1600-times4-SIMD256.c: ^~~~~~~
KeccakP-1600-times4-SIMD256.c: In file included from /usr/lib/gcc/x86_64-linux-gnu/7/include/immintrin.h:43:0,
KeccakP-1600-times4-SIMD256.c: from KeccakP-1600-times4-SIMD256.c:21:
KeccakP-1600-times4-SIMD256.c: /usr/lib/gcc/x86_64-linux-gnu/7/include/avx2intrin.h:913:1: error: inlining failed in call to always_inline '_mm256_xor_si256': target specific option mismatch
KeccakP-1600-times4-SIMD256.c: _mm256_xor_si256 (__m256i __A, __m256i __B)
KeccakP-1600-times4-SIMD256.c: ^~~~~~~~~~~~~~~~
KeccakP-1600-times4-SIMD256.c: KeccakP-1600-times4-SIMD256.c:55:41: note: called from here
KeccakP-1600-times4-SIMD256.c: #define XOReq256(a, b) a = _mm256_xor_si256(a, b)
KeccakP-1600-times4-SIMD256.c: ^~~~~~~~~~~~~~~~~~~~~~
KeccakP-1600-times4-SIMD256.c: KeccakP-1600-times4-SIMD256.c:143:33: note: in expansion of macro 'XOReq256'
KeccakP-1600-times4-SIMD256.c: XOReq256( stateAsLanes[argIndex+3], lanes3 )
KeccakP-1600-times4-SIMD256.c: ^~~~~~~~
KeccakP-1600-times4-SIMD256.c: KeccakP-1600-times4-SIMD256.c:149:9: note: in expansion of macro 'Xor_In4'
KeccakP-1600-times4-SIMD256.c: Xor_In4( 12 );
KeccakP-1600-times4-SIMD256.c: ^~~~~~~
KeccakP-1600-times4-SIMD256.c: In file included from /usr/lib/gcc/x86_64-linux-gnu/7/include/immintrin.h:43:0,
KeccakP-1600-times4-SIMD256.c: from KeccakP-1600-times4-SIMD256.c:21:
KeccakP-1600-times4-SIMD256.c: /usr/lib/gcc/x86_64-linux-gnu/7/include/avx2intrin.h:913:1: error: inlining failed in call to always_inline '_mm256_xor_si256': target specific option mismatch
KeccakP-1600-times4-SIMD256.c: _mm256_xor_si256 (__m256i __A, __m256i __B)
KeccakP-1600-times4-SIMD256.c: ...

Number of similar (compiler,implementation) pairs: 156, namely:
Compiler  Implementations
gcc -funroll-loops -m32 -O2 -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -O3 -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -O -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -Os -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=athlon -O2 -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=athlon -O3 -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=athlon -O -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=athlon -Os -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=barcelona -O2 -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=barcelona -O3 -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=barcelona -O -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=barcelona -Os -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=i386 -O2 -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=i386 -O3 -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=i386 -O -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=i386 -Os -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=i486 -O2 -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=i486 -O3 -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=i486 -O -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=i486 -Os -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=k6-2 -O2 -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=k6-2 -O3 -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=k6-2 -O -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=k6-2 -Os -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=k6-3 -O2 -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=k6-3 -O3 -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=k6-3 -O -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=k6-3 -Os -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=k6 -O2 -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=k6 -O3 -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=k6 -O -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=k6 -Os -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=k8 -O2 -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=k8 -O3 -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=k8 -O -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=k8 -Os -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=nocona -O2 -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=nocona -O3 -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=nocona -O -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=nocona -Os -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=pentium-m -O2 -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=pentium-m -O3 -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=pentium-m -O -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=pentium-m -Os -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=pentium-mmx -O2 -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=pentium-mmx -O3 -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=pentium-mmx -O -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=pentium-mmx -Os -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=pentium2 -O2 -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=pentium2 -O3 -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=pentium2 -O -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=pentium2 -Os -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=pentium3 -O2 -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=pentium3 -O3 -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=pentium3 -O -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=pentium3 -Os -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=pentium4 -O2 -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=pentium4 -O3 -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=pentium4 -O -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=pentium4 -Os -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=pentium -O2 -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=pentium -O3 -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=pentium -O -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=pentium -Os -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=pentiumpro -O2 -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=pentiumpro -O3 -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=pentiumpro -O -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=pentiumpro -Os -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=prescott -O2 -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=prescott -O3 -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=prescott -O -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=prescott -Os -fomit-frame-pointer avx2
gcc -m32 -O2 -fomit-frame-pointer avx2
gcc -m32 -O3 -fomit-frame-pointer avx2
gcc -m32 -O -fomit-frame-pointer avx2
gcc -m32 -Os -fomit-frame-pointer avx2
gcc -m32 -march=athlon -O2 -fomit-frame-pointer avx2
gcc -m32 -march=athlon -O3 -fomit-frame-pointer avx2
gcc -m32 -march=athlon -O -fomit-frame-pointer avx2
gcc -m32 -march=athlon -Os -fomit-frame-pointer avx2
gcc -m32 -march=core2 -O2 -fomit-frame-pointer avx2
gcc -m32 -march=core2 -O3 -fomit-frame-pointer avx2
gcc -m32 -march=core2 -O -fomit-frame-pointer avx2
gcc -m32 -march=core2 -Os -fomit-frame-pointer avx2
gcc -m32 -march=core2 -msse4.1 -O2 -fomit-frame-pointer avx2
gcc -m32 -march=core2 -msse4.1 -O3 -fomit-frame-pointer avx2
gcc -m32 -march=core2 -msse4.1 -O -fomit-frame-pointer avx2
gcc -m32 -march=core2 -msse4.1 -Os -fomit-frame-pointer avx2
gcc -m32 -march=core2 -msse4 -O2 -fomit-frame-pointer avx2
gcc -m32 -march=core2 -msse4 -O3 -fomit-frame-pointer avx2
gcc -m32 -march=core2 -msse4 -O -fomit-frame-pointer avx2
gcc -m32 -march=core2 -msse4 -Os -fomit-frame-pointer avx2
gcc -m32 -march=corei7 -O2 -fomit-frame-pointer avx2
gcc -m32 -march=corei7 -O3 -fomit-frame-pointer avx2
gcc -m32 -march=corei7 -O -fomit-frame-pointer avx2
gcc -m32 -march=corei7 -Os -fomit-frame-pointer avx2
gcc -m32 -march=i386 -O2 -fomit-frame-pointer avx2
gcc -m32 -march=i386 -O3 -fomit-frame-pointer avx2
gcc -m32 -march=i386 -O -fomit-frame-pointer avx2
gcc -m32 -march=i386 -Os -fomit-frame-pointer avx2
gcc -m32 -march=i486 -O2 -fomit-frame-pointer avx2
gcc -m32 -march=i486 -O3 -fomit-frame-pointer avx2
gcc -m32 -march=i486 -O -fomit-frame-pointer avx2
gcc -m32 -march=i486 -Os -fomit-frame-pointer avx2
gcc -m32 -march=k6-2 -O2 -fomit-frame-pointer avx2
gcc -m32 -march=k6-2 -O3 -fomit-frame-pointer avx2
gcc -m32 -march=k6-2 -O -fomit-frame-pointer avx2
gcc -m32 -march=k6-2 -Os -fomit-frame-pointer avx2
gcc -m32 -march=k6-3 -O2 -fomit-frame-pointer avx2
gcc -m32 -march=k6-3 -O3 -fomit-frame-pointer avx2
gcc -m32 -march=k6-3 -O -fomit-frame-pointer avx2
gcc -m32 -march=k6-3 -Os -fomit-frame-pointer avx2
gcc -m32 -march=k6 -O2 -fomit-frame-pointer avx2
gcc -m32 -march=k6 -O3 -fomit-frame-pointer avx2
gcc -m32 -march=k6 -O -fomit-frame-pointer avx2
gcc -m32 -march=k6 -Os -fomit-frame-pointer avx2
gcc -m32 -march=k8 -O2 -fomit-frame-pointer avx2
gcc -m32 -march=k8 -O3 -fomit-frame-pointer avx2
gcc -m32 -march=k8 -O -fomit-frame-pointer avx2
gcc -m32 -march=k8 -Os -fomit-frame-pointer avx2
gcc -m32 -march=nocona -O2 -fomit-frame-pointer avx2
gcc -m32 -march=nocona -O3 -fomit-frame-pointer avx2
gcc -m32 -march=nocona -O -fomit-frame-pointer avx2
gcc -m32 -march=nocona -Os -fomit-frame-pointer avx2
gcc -m32 -march=pentium-m -O2 -fomit-frame-pointer avx2
gcc -m32 -march=pentium-m -O3 -fomit-frame-pointer avx2
gcc -m32 -march=pentium-m -O -fomit-frame-pointer avx2
gcc -m32 -march=pentium-m -Os -fomit-frame-pointer avx2
gcc -m32 -march=pentium-mmx -O2 -fomit-frame-pointer avx2
gcc -m32 -march=pentium-mmx -O3 -fomit-frame-pointer avx2
gcc -m32 -march=pentium-mmx -O -fomit-frame-pointer avx2
gcc -m32 -march=pentium-mmx -Os -fomit-frame-pointer avx2
gcc -m32 -march=pentium2 -O2 -fomit-frame-pointer avx2
gcc -m32 -march=pentium2 -O3 -fomit-frame-pointer avx2
gcc -m32 -march=pentium2 -O -fomit-frame-pointer avx2
gcc -m32 -march=pentium2 -Os -fomit-frame-pointer avx2
gcc -m32 -march=pentium3 -O2 -fomit-frame-pointer avx2
gcc -m32 -march=pentium3 -O3 -fomit-frame-pointer avx2
gcc -m32 -march=pentium3 -O -fomit-frame-pointer avx2
gcc -m32 -march=pentium3 -Os -fomit-frame-pointer avx2
gcc -m32 -march=pentium4 -O2 -fomit-frame-pointer avx2
gcc -m32 -march=pentium4 -O3 -fomit-frame-pointer avx2
gcc -m32 -march=pentium4 -O -fomit-frame-pointer avx2
gcc -m32 -march=pentium4 -Os -fomit-frame-pointer avx2
gcc -m32 -march=pentium -O2 -fomit-frame-pointer avx2
gcc -m32 -march=pentium -O3 -fomit-frame-pointer avx2
gcc -m32 -march=pentium -O -fomit-frame-pointer avx2
gcc -m32 -march=pentium -Os -fomit-frame-pointer avx2
gcc -m32 -march=pentiumpro -O2 -fomit-frame-pointer avx2
gcc -m32 -march=pentiumpro -O3 -fomit-frame-pointer avx2
gcc -m32 -march=pentiumpro -O -fomit-frame-pointer avx2
gcc -m32 -march=pentiumpro -Os -fomit-frame-pointer avx2
gcc -m32 -march=prescott -O2 -fomit-frame-pointer avx2
gcc -m32 -march=prescott -O3 -fomit-frame-pointer avx2
gcc -m32 -march=prescott -O -fomit-frame-pointer avx2
gcc -m32 -march=prescott -Os -fomit-frame-pointer avx2

Compiler output

Implementation: crypto_kem/kyber768/avx2
Compiler: gcc -m32 -march=barcelona -O2 -fomit-frame-pointer
KeccakP-1600-times4-SIMD256.c: KeccakP-1600-times4-SIMD256.c: In function 'KeccakP1600times4_AddLanesAll':
KeccakP-1600-times4-SIMD256.c: KeccakP-1600-times4-SIMD256.c:135:40: warning: AVX vector return without AVX enabled changes the ABI [-Wpsabi]
KeccakP-1600-times4-SIMD256.c: #define Xor_In4( argIndex ) lanes0 = LOAD256u( curData0[argIndex]),\
KeccakP-1600-times4-SIMD256.c: ^
KeccakP-1600-times4-SIMD256.c: KeccakP-1600-times4-SIMD256.c:146:9: note: in expansion of macro 'Xor_In4'
KeccakP-1600-times4-SIMD256.c: Xor_In4( 0 );
KeccakP-1600-times4-SIMD256.c: ^~~~~~~
KeccakP-1600-times4-SIMD256.c: In file included from /usr/lib/gcc/x86_64-linux-gnu/7/include/immintrin.h:43:0,
KeccakP-1600-times4-SIMD256.c: from KeccakP-1600-times4-SIMD256.c:21:
KeccakP-1600-times4-SIMD256.c: /usr/lib/gcc/x86_64-linux-gnu/7/include/avx2intrin.h:913:1: error: inlining failed in call to always_inline '_mm256_xor_si256': target specific option mismatch
KeccakP-1600-times4-SIMD256.c: _mm256_xor_si256 (__m256i __A, __m256i __B)
KeccakP-1600-times4-SIMD256.c: ^~~~~~~~~~~~~~~~
KeccakP-1600-times4-SIMD256.c: KeccakP-1600-times4-SIMD256.c:55:41: note: called from here
KeccakP-1600-times4-SIMD256.c: #define XOReq256(a, b) a = _mm256_xor_si256(a, b)
KeccakP-1600-times4-SIMD256.c: ^~~~~~~~~~~~~~~~~~~~~~
KeccakP-1600-times4-SIMD256.c: KeccakP-1600-times4-SIMD256.c:143:33: note: in expansion of macro 'XOReq256'
KeccakP-1600-times4-SIMD256.c: XOReq256( stateAsLanes[argIndex+3], lanes3 )
KeccakP-1600-times4-SIMD256.c: ^~~~~~~~
KeccakP-1600-times4-SIMD256.c: KeccakP-1600-times4-SIMD256.c:149:9: note: in expansion of macro 'Xor_In4'
KeccakP-1600-times4-SIMD256.c: Xor_In4( 12 );
KeccakP-1600-times4-SIMD256.c: ^~~~~~~
KeccakP-1600-times4-SIMD256.c: In file included from /usr/lib/gcc/x86_64-linux-gnu/7/include/immintrin.h:43:0,
KeccakP-1600-times4-SIMD256.c: from KeccakP-1600-times4-SIMD256.c:21:
KeccakP-1600-times4-SIMD256.c: /usr/lib/gcc/x86_64-linux-gnu/7/include/avx2intrin.h:913:1: error: inlining failed in call to always_inline '_mm256_xor_si256': target specific option mismatch
KeccakP-1600-times4-SIMD256.c: _mm256_xor_si256 (__m256i __A, __m256i __B)
KeccakP-1600-times4-SIMD256.c: ...
KeccakP-1600-times4-SIMD256.c: KeccakP-1600-times4-SIMD256.c: In function 'KeccakP1600times4_AddLanesAll':
KeccakP-1600-times4-SIMD256.c: KeccakP-1600-times4-SIMD256.c:135:40: warning: AVX vector return without AVX enabled changes the ABI [-Wpsabi]
KeccakP-1600-times4-SIMD256.c: #define Xor_In4( argIndex ) lanes0 = LOAD256u( curData0[argIndex]),\
KeccakP-1600-times4-SIMD256.c: ^
KeccakP-1600-times4-SIMD256.c: KeccakP-1600-times4-SIMD256.c:146:9: note: in expansion of macro 'Xor_In4'
KeccakP-1600-times4-SIMD256.c: Xor_In4( 0 );
KeccakP-1600-times4-SIMD256.c: ^~~~~~~
KeccakP-1600-times4-SIMD256.c: In file included from /usr/lib/gcc/x86_64-linux-gnu/7/include/immintrin.h:43:0,
KeccakP-1600-times4-SIMD256.c: from KeccakP-1600-times4-SIMD256.c:21:
KeccakP-1600-times4-SIMD256.c: /usr/lib/gcc/x86_64-linux-gnu/7/include/avx2intrin.h:913:1: error: inlining failed in call to always_inline '_mm256_xor_si256': target specific option mismatch
KeccakP-1600-times4-SIMD256.c: _mm256_xor_si256 (__m256i __A, __m256i __B)
KeccakP-1600-times4-SIMD256.c: ^~~~~~~~~~~~~~~~
KeccakP-1600-times4-SIMD256.c: KeccakP-1600-times4-SIMD256.c:55:41: note: called from here
KeccakP-1600-times4-SIMD256.c: #define XOReq256(a, b) a = _mm256_xor_si256(a, b)
KeccakP-1600-times4-SIMD256.c: ^~~~~~~~~~~~~~~~~~~~~~
KeccakP-1600-times4-SIMD256.c: KeccakP-1600-times4-SIMD256.c:143:33: note: in expansion of macro 'XOReq256'
KeccakP-1600-times4-SIMD256.c: XOReq256( stateAsLanes[argIndex+3], lanes3 )
KeccakP-1600-times4-SIMD256.c: ^~~~~~~~
KeccakP-1600-times4-SIMD256.c: KeccakP-1600-times4-SIMD256.c:149:9: note: in expansion of macro 'Xor_In4'
KeccakP-1600-times4-SIMD256.c: Xor_In4( 12 );
KeccakP-1600-times4-SIMD256.c: ^~~~~~~
KeccakP-1600-times4-SIMD256.c: In file included from /usr/lib/gcc/x86_64-linux-gnu/7/include/immintrin.h:43:0,
KeccakP-1600-times4-SIMD256.c: from KeccakP-1600-times4-SIMD256.c:21:
KeccakP-1600-times4-SIMD256.c: /usr/lib/gcc/x86_64-linux-gnu/7/include/avx2intrin.h:913:1: error: inlining failed in call to always_inline '_mm256_xor_si256': target specific option mismatch
KeccakP-1600-times4-SIMD256.c: _mm256_xor_si256 (__m256i __A, __m256i __B)
KeccakP-1600-times4-SIMD256.c: ...

Number of similar (compiler,implementation) pairs: 4, namely:
Compiler  Implementations
gcc -m32 -march=barcelona -O2 -fomit-frame-pointer avx2
gcc -m32 -march=barcelona -O3 -fomit-frame-pointer avx2
gcc -m32 -march=barcelona -O -fomit-frame-pointer avx2
gcc -m32 -march=barcelona -Os -fomit-frame-pointer avx2

Compiler output

Implementation: crypto_kem/kyber768/avx2
Compiler: gcc -m32 -march=core-avx-i -O2 -fomit-frame-pointer
KeccakP-1600-times4-SIMD256.c: In file included from /usr/lib/gcc/x86_64-linux-gnu/7/include/immintrin.h:43:0,
KeccakP-1600-times4-SIMD256.c: from KeccakP-1600-times4-SIMD256.c:21:
KeccakP-1600-times4-SIMD256.c: KeccakP-1600-times4-SIMD256.c: In function 'KeccakP1600times4_AddLanesAll':
KeccakP-1600-times4-SIMD256.c: /usr/lib/gcc/x86_64-linux-gnu/7/include/avx2intrin.h:913:1: error: inlining failed in call to always_inline '_mm256_xor_si256': target specific option mismatch
KeccakP-1600-times4-SIMD256.c: _mm256_xor_si256 (__m256i __A, __m256i __B)
KeccakP-1600-times4-SIMD256.c: ^~~~~~~~~~~~~~~~
KeccakP-1600-times4-SIMD256.c: KeccakP-1600-times4-SIMD256.c:55:41: note: called from here
KeccakP-1600-times4-SIMD256.c: #define XOReq256(a, b) a = _mm256_xor_si256(a, b)
KeccakP-1600-times4-SIMD256.c: ^~~~~~~~~~~~~~~~~~~~~~
KeccakP-1600-times4-SIMD256.c: KeccakP-1600-times4-SIMD256.c:143:33: note: in expansion of macro 'XOReq256'
KeccakP-1600-times4-SIMD256.c: XOReq256( stateAsLanes[argIndex+3], lanes3 )
KeccakP-1600-times4-SIMD256.c: ^~~~~~~~
KeccakP-1600-times4-SIMD256.c: KeccakP-1600-times4-SIMD256.c:149:9: note: in expansion of macro 'Xor_In4'
KeccakP-1600-times4-SIMD256.c: Xor_In4( 12 );
KeccakP-1600-times4-SIMD256.c: ^~~~~~~
KeccakP-1600-times4-SIMD256.c: In file included from /usr/lib/gcc/x86_64-linux-gnu/7/include/immintrin.h:43:0,
KeccakP-1600-times4-SIMD256.c: from KeccakP-1600-times4-SIMD256.c:21:
KeccakP-1600-times4-SIMD256.c: /usr/lib/gcc/x86_64-linux-gnu/7/include/avx2intrin.h:913:1: error: inlining failed in call to always_inline '_mm256_xor_si256': target specific option mismatch
KeccakP-1600-times4-SIMD256.c: _mm256_xor_si256 (__m256i __A, __m256i __B)
KeccakP-1600-times4-SIMD256.c: ^~~~~~~~~~~~~~~~
KeccakP-1600-times4-SIMD256.c: KeccakP-1600-times4-SIMD256.c:55:41: note: called from here
KeccakP-1600-times4-SIMD256.c: #define XOReq256(a, b) a = _mm256_xor_si256(a, b)
KeccakP-1600-times4-SIMD256.c: ^~~~~~~~~~~~~~~~~~~~~~
KeccakP-1600-times4-SIMD256.c: KeccakP-1600-times4-SIMD256.c:142:33: note: in expansion of macro 'XOReq256'
KeccakP-1600-times4-SIMD256.c: XOReq256( stateAsLanes[argIndex+2], lanes2 ),\
KeccakP-1600-times4-SIMD256.c: ...

Number of similar (compiler,implementation) pairs: 8, namely:
Compiler  Implementations
gcc -m32 -march=core-avx-i -O2 -fomit-frame-pointer avx2
gcc -m32 -march=core-avx-i -O3 -fomit-frame-pointer avx2
gcc -m32 -march=core-avx-i -O -fomit-frame-pointer avx2
gcc -m32 -march=core-avx-i -Os -fomit-frame-pointer avx2
gcc -m32 -march=corei7-avx -O2 -fomit-frame-pointer avx2
gcc -m32 -march=corei7-avx -O3 -fomit-frame-pointer avx2
gcc -m32 -march=corei7-avx -O -fomit-frame-pointer avx2
gcc -m32 -march=corei7-avx -Os -fomit-frame-pointer avx2

Compiler output

Implementation: crypto_kem/kyber768/avx2
Compiler: gcc -m32 -march=core-avx2 -O2 -fomit-frame-pointer
basemul.S: basemul.S: Assembler messages:
basemul.S: basemul.S:79: Error: bad register name `%rip)'
basemul.S: basemul.S:80: Error: bad register name `%rip)'
basemul.S: basemul.S:81: Error: bad register name `%rcx)'
basemul.S: basemul.S:84: Error: bad register name `%rsi)'
basemul.S: basemul.S:84: Error: bad register name `%rdx)'
basemul.S: basemul.S:84: Error: bad register name `%rsi)'
basemul.S: basemul.S:84: Error: bad register name `%rdx)'
basemul.S: basemul.S:84: Error: bad register name `%ymm8'
basemul.S: basemul.S:84: Error: bad register name `%ymm8'
basemul.S: basemul.S:84: Error: bad register name `%ymm10'
basemul.S: basemul.S:84: Error: bad register name `%ymm10'
basemul.S: basemul.S:84: Error: bad register name `%ymm9'
basemul.S: basemul.S:84: Error: bad register name `%ymm9'
basemul.S: basemul.S:84: Error: bad register name `%ymm9'
basemul.S: basemul.S:84: Error: bad register name `%ymm9'
basemul.S: basemul.S:84: Error: bad register name `%ymm11'
basemul.S: basemul.S:84: Error: bad register name `%ymm11'
basemul.S: basemul.S:84: Error: bad register name `%ymm11'
basemul.S: basemul.S:84: Error: bad register name `%ymm11'
basemul.S: basemul.S:84: Error: bad register name `%ymm11'
basemul.S: basemul.S:84: Error: bad register name `%ymm13'
basemul.S: basemul.S:84: Error: bad register name `%ymm13'
basemul.S: basemul.S:84: Error: bad register name `%ymm8'
basemul.S: basemul.S:84: Error: bad register name `%ymm8'
basemul.S: ...

Number of similar (compiler,implementation) pairs: 8, namely:
Compiler  Implementations
gcc -m32 -march=core-avx2 -O2 -fomit-frame-pointer avx2
gcc -m32 -march=core-avx2 -O3 -fomit-frame-pointer avx2
gcc -m32 -march=core-avx2 -O -fomit-frame-pointer avx2
gcc -m32 -march=core-avx2 -Os -fomit-frame-pointer avx2
gcc -m32 -march=native -mtune=native -O2 -fomit-frame-pointer avx2
gcc -m32 -march=native -mtune=native -O3 -fomit-frame-pointer avx2
gcc -m32 -march=native -mtune=native -O -fomit-frame-pointer avx2
gcc -m32 -march=native -mtune=native -Os -fomit-frame-pointer avx2
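
Every register the assembler rejects in the basemul.S output above (`%rip`, `%rcx`, `%rsi`, `%rdx`, `%ymm8` through `%ymm13`) exists only in 64-bit mode, which is consistent with the file being written for x86-64 while `-m32` makes gcc assemble for 32-bit x86. The sketch below checks this against the log lines. The helper names are hypothetical, and the test for "64-bit only" is a deliberately simple heuristic: a name starting with `r`, or a vector register numbered 8 or higher.

```python
import re

# Matches the register name in messages such as:
#   basemul.S: basemul.S:79: Error: bad register name `%rip)'
BAD_REG = re.compile(r"bad register name `%(\w+)")


def rejected_registers(log_lines):
    """Return the set of register names the assembler rejected."""
    found = set()
    for line in log_lines:
        match = BAD_REG.search(line)
        if match:
            found.add(match.group(1))
    return found


def needs_64bit(reg: str) -> bool:
    """Heuristic: True if the register is only available in 64-bit mode."""
    # rax..rdi, r8..r15 and rip do not exist in 32-bit mode.
    if reg.startswith("r"):
        return True
    # 32-bit mode exposes only vector registers 0-7.
    vec = re.fullmatch(r"[xyz]mm(\d+)", reg)
    return bool(vec) and int(vec.group(1)) >= 8


if __name__ == "__main__":
    # A few lines copied from the assembler output on this page.
    log = [
        "basemul.S: basemul.S:79: Error: bad register name `%rip)'",
        "basemul.S: basemul.S:81: Error: bad register name `%rcx)'",
        "basemul.S: basemul.S:84: Error: bad register name `%ymm8'",
        "basemul.S: basemul.S:84: Error: bad register name `%ymm13'",
    ]
    for reg in sorted(rejected_registers(log)):
        print(reg, "64-bit only" if needs_64bit(reg) else "valid in 32-bit mode")
```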