Implementation notes: x86, samba, crypto_kem/kyber768

Computer: samba
Architecture: x86
CPU ID: GenuineIntel-000506e3-bfebfbff
SUPERCOP version: 20190803
Operation: crypto_kem
Primitive: kyber768
Time  Implementation  Compiler  Benchmark date  SUPERCOP version
981468  ref  gcc -funroll-loops -m32 -march=pentium-m -O3 -fomit-frame-pointer  20190810  20190803
1012714  ref  gcc -m32 -march=core2 -O3 -fomit-frame-pointer  20190810  20190803
1023329  ref  gcc -m32 -march=core2 -msse4 -O3 -fomit-frame-pointer  20190810  20190803
1025528  ref  gcc -funroll-loops -m32 -march=pentium-m -O2 -fomit-frame-pointer  20190810  20190803
1031280  ref  gcc -m32 -march=native -mtune=native -O3 -fomit-frame-pointer  20190810  20190803
1034778  ref  gcc -m32 -march=pentium-m -O3 -fomit-frame-pointer  20190810  20190803
1040151  ref  gcc -m32 -march=corei7-avx -O2 -fomit-frame-pointer  20190810  20190803
1043100  ref  gcc -m32 -march=corei7 -O3 -fomit-frame-pointer  20190810  20190803
1044615  ref  gcc -m32 -march=core2 -msse4.1 -O3 -fomit-frame-pointer  20190810  20190803
1049691  ref  gcc -m32 -march=core-avx-i -O2 -fomit-frame-pointer  20190810  20190803
1052111  ref  gcc -m32 -march=core-avx2 -O3 -fomit-frame-pointer  20190810  20190803
1052325  ref  gcc -m32 -march=core-avx2 -O2 -fomit-frame-pointer  20190810  20190803
1052338  ref  gcc -m32 -march=native -mtune=native -O2 -fomit-frame-pointer  20190810  20190803
1058839  ref  gcc -funroll-loops -m32 -march=pentium-m -O -fomit-frame-pointer  20190810  20190803
1067901  ref  gcc -m32 -march=corei7 -O2 -fomit-frame-pointer  20190810  20190803
1069970  ref  gcc -funroll-loops -m32 -march=k8 -O -fomit-frame-pointer  20190810  20190803
1073494  ref  gcc -m32 -march=native -mtune=native -Os -fomit-frame-pointer  20190810  20190803
1074204  ref  gcc -m32 -march=core-avx2 -Os -fomit-frame-pointer  20190810  20190803
1079169  ref  gcc -m32 -march=core2 -msse4.1 -O2 -fomit-frame-pointer  20190810  20190803
1081575  ref  gcc -m32 -march=pentium-m -O2 -fomit-frame-pointer  20190810  20190803
1088594  ref  gcc -m32 -march=core-avx-i -O3 -fomit-frame-pointer  20190810  20190803
1090304  ref  gcc -m32 -march=core2 -msse4 -O2 -fomit-frame-pointer  20190810  20190803
1094254  ref  gcc -m32 -march=corei7-avx -O3 -fomit-frame-pointer  20190810  20190803
1096440  ref  gcc -funroll-loops -m32 -march=barcelona -O -fomit-frame-pointer  20190810  20190803
1102689  ref  gcc -m32 -march=core-avx-i -Os -fomit-frame-pointer  20190810  20190803
1104190  ref  gcc -m32 -march=corei7 -Os -fomit-frame-pointer  20190810  20190803
1106861  ref  gcc -m32 -march=core2 -msse4.1 -Os -fomit-frame-pointer  20190810  20190803
1107451  ref  gcc -m32 -march=core2 -msse4 -Os -fomit-frame-pointer  20190810  20190803
1109425  ref  gcc -m32 -march=core2 -O2 -fomit-frame-pointer  20190810  20190803
1110075  ref  gcc -m32 -march=corei7-avx -Os -fomit-frame-pointer  20190810  20190803
1125359  ref  gcc -m32 -march=native -mtune=native -O -fomit-frame-pointer  20190810  20190803
1128778  ref  gcc -m32 -march=pentium-m -Os -fomit-frame-pointer  20190810  20190803
1131684  ref  gcc -m32 -march=pentium4 -Os -fomit-frame-pointer  20190810  20190803
1138752  ref  gcc -m32 -march=prescott -Os -fomit-frame-pointer  20190810  20190803
1140317  ref  gcc -m32 -march=core2 -Os -fomit-frame-pointer  20190810  20190803
1142651  ref  gcc -funroll-loops -m32 -march=prescott -Os -fomit-frame-pointer  20190810  20190803
1144655  ref  gcc -funroll-loops -m32 -march=pentium-m -Os -fomit-frame-pointer  20190810  20190803
1146186  ref  gcc -m32 -march=pentium-m -O -fomit-frame-pointer  20190810  20190803
1146296  ref  gcc -m32 -march=nocona -Os -fomit-frame-pointer  20190810  20190803
1146367  ref  gcc -funroll-loops -m32 -march=nocona -Os -fomit-frame-pointer  20190810  20190803
1146900  ref  gcc -m32 -march=core-avx2 -O -fomit-frame-pointer  20190810  20190803
1147841  ref  gcc -m32 -march=k8 -O -fomit-frame-pointer  20190810  20190803
1148853  ref  gcc -funroll-loops -m32 -march=pentium4 -Os -fomit-frame-pointer  20190810  20190803
1157857  ref  gcc -funroll-loops -m32 -march=prescott -O3 -fomit-frame-pointer  20190810  20190803
1171853  ref  gcc -funroll-loops -m32 -march=prescott -O2 -fomit-frame-pointer  20190810  20190803
1181613  ref  gcc -funroll-loops -m32 -march=athlon -O3 -fomit-frame-pointer  20190810  20190803
1181930  ref  gcc -funroll-loops -m32 -march=nocona -O -fomit-frame-pointer  20190810  20190803
1182111  ref  gcc -m32 -march=corei7 -O -fomit-frame-pointer  20190810  20190803
1182684  ref  gcc -funroll-loops -m32 -march=nocona -O3 -fomit-frame-pointer  20190810  20190803
1191056  ref  gcc -m32 -march=prescott -O3 -fomit-frame-pointer  20190810  20190803
1192077  ref  gcc -m32 -march=nocona -O3 -fomit-frame-pointer  20190810  20190803
1192831  ref  gcc -m32 -march=core2 -msse4.1 -O -fomit-frame-pointer  20190810  20190803
1193327  ref  gcc -m32 -march=barcelona -O -fomit-frame-pointer  20190810  20190803
1198478  ref  gcc -funroll-loops -m32 -march=prescott -O -fomit-frame-pointer  20190810  20190803
1198960  ref  gcc -funroll-loops -m32 -march=nocona -O2 -fomit-frame-pointer  20190810  20190803
1201369  ref  gcc -m32 -march=core2 -O -fomit-frame-pointer  20190810  20190803
1204369  ref  gcc -m32 -march=core-avx-i -O -fomit-frame-pointer  20190810  20190803
1205621  ref  gcc -m32 -march=core2 -msse4 -O -fomit-frame-pointer  20190810  20190803
1207391  ref  gcc -m32 -march=corei7-avx -O -fomit-frame-pointer  20190810  20190803
1209978  ref  gcc -funroll-loops -m32 -march=pentium2 -O -fomit-frame-pointer  20190810  20190803
1220313  ref  gcc -funroll-loops -m32 -O3 -fomit-frame-pointer  20190810  20190803
1228327  ref  gcc -funroll-loops -m32 -march=athlon -O2 -fomit-frame-pointer  20190810  20190803
1231309  ref  gcc -funroll-loops -m32 -march=athlon -O -fomit-frame-pointer  20190810  20190803
1233614  ref  gcc -funroll-loops -m32 -march=pentium3 -O -fomit-frame-pointer  20190810  20190803
1234982  ref  gcc -m32 -march=athlon -O3 -fomit-frame-pointer  20190810  20190803
1235909  ref  gcc -funroll-loops -m32 -march=pentiumpro -O -fomit-frame-pointer  20190810  20190803
1240692  ref  gcc -funroll-loops -m32 -march=k6-2 -O3 -fomit-frame-pointer  20190810  20190803
1240933  ref  gcc -funroll-loops -m32 -march=pentium4 -O3 -fomit-frame-pointer  20190810  20190803
1244690  ref  gcc -funroll-loops -m32 -march=pentium3 -O3 -fomit-frame-pointer  20190810  20190803
1244903  ref  gcc -funroll-loops -m32 -march=k6-2 -O -fomit-frame-pointer  20190810  20190803
1245431  ref  gcc -funroll-loops -m32 -march=k6-3 -O -fomit-frame-pointer  20190810  20190803
1246308  ref  gcc -funroll-loops -m32 -march=k6 -O -fomit-frame-pointer  20190810  20190803
1250522  ref  gcc -funroll-loops -m32 -O -fomit-frame-pointer  20190810  20190803
1254299  ref  gcc -funroll-loops -m32 -O2 -fomit-frame-pointer  20190810  20190803
1254309  ref  gcc -m32 -march=nocona -O2 -fomit-frame-pointer  20190810  20190803
1257840  ref  gcc -funroll-loops -m32 -march=k6-3 -O3 -fomit-frame-pointer  20190810  20190803
1259107  ref  gcc -m32 -O3 -fomit-frame-pointer  20190810  20190803
1259205  ref  gcc -funroll-loops -m32 -march=pentiumpro -O3 -fomit-frame-pointer  20190810  20190803
1262000  ref  gcc -funroll-loops -m32 -march=pentium4 -O2 -fomit-frame-pointer  20190810  20190803
1263982  ref  gcc -m32 -march=prescott -O2 -fomit-frame-pointer  20190810  20190803
1265280  ref  gcc -funroll-loops -m32 -march=pentium2 -O3 -fomit-frame-pointer  20190810  20190803
1269292  ref  gcc -m32 -march=nocona -O -fomit-frame-pointer  20190810  20190803
1269302  ref  gcc -funroll-loops -m32 -march=pentium4 -O -fomit-frame-pointer  20190810  20190803
1271744  ref  gcc -m32 -march=prescott -O -fomit-frame-pointer  20190810  20190803
1271863  ref  gcc -m32 -march=k6-3 -O3 -fomit-frame-pointer  20190810  20190803
1273292  ref  gcc -funroll-loops -m32 -march=k6 -O2 -fomit-frame-pointer  20190810  20190803
1274226  ref  gcc -m32 -march=pentium4 -O3 -fomit-frame-pointer  20190810  20190803
1275967  ref  gcc -funroll-loops -m32 -march=pentium2 -O2 -fomit-frame-pointer  20190810  20190803
1276308  ref  gcc -funroll-loops -m32 -march=k6 -O3 -fomit-frame-pointer  20190810  20190803
1276895  ref  gcc -funroll-loops -m32 -march=pentium3 -O2 -fomit-frame-pointer  20190810  20190803
1277504  ref  gcc -m32 -march=k6-2 -O3 -fomit-frame-pointer  20190810  20190803
1281631  ref  gcc -funroll-loops -m32 -march=pentiumpro -O2 -fomit-frame-pointer  20190810  20190803
1284131  ref  gcc -m32 -march=pentium2 -O3 -fomit-frame-pointer  20190810  20190803
1284805  ref  gcc -m32 -march=pentium3 -O3 -fomit-frame-pointer  20190810  20190803
1285331  ref  gcc -m32 -march=pentiumpro -O -fomit-frame-pointer  20190810  20190803
1289460  ref  gcc -m32 -march=pentium2 -O -fomit-frame-pointer  20190810  20190803
1291752  ref  gcc -funroll-loops -m32 -march=k6-3 -Os -fomit-frame-pointer  20190810  20190803
1292308  ref  gcc -funroll-loops -m32 -march=k6-2 -O2 -fomit-frame-pointer  20190810  20190803
1300627  ref  gcc -funroll-loops -m32 -march=k6-2 -Os -fomit-frame-pointer  20190810  20190803
1303756  ref  gcc -m32 -march=k6 -Os -fomit-frame-pointer  20190810  20190803
1303807  ref  gcc -m32 -march=pentiumpro -O3 -fomit-frame-pointer  20190810  20190803
1305514  ref  gcc -funroll-loops -m32 -march=k6 -Os -fomit-frame-pointer  20190810  20190803
1308131  ref  gcc -m32 -march=pentium3 -O -fomit-frame-pointer  20190810  20190803
1309566  ref  gcc -funroll-loops -m32 -march=k6-3 -O2 -fomit-frame-pointer  20190810  20190803
1311033  ref  gcc -m32 -march=i386 -Os -fomit-frame-pointer  20190810  20190803
1313012  ref  gcc -m32 -march=athlon -O -fomit-frame-pointer  20190810  20190803
1314082  ref  gcc -m32 -march=k6-2 -Os -fomit-frame-pointer  20190810  20190803
1315734  ref  gcc -m32 -march=k6-3 -Os -fomit-frame-pointer  20190810  20190803
1316966  ref  gcc -funroll-loops -m32 -march=i486 -Os -fomit-frame-pointer  20190810  20190803
1317536  ref  gcc -funroll-loops -m32 -march=i386 -Os -fomit-frame-pointer  20190810  20190803
1317680  ref  gcc -m32 -Os -fomit-frame-pointer  20190810  20190803
1318270  ref  gcc -m32 -march=pentium-mmx -Os -fomit-frame-pointer  20190810  20190803
1322447  ref  gcc -m32 -march=i486 -Os -fomit-frame-pointer  20190810  20190803
1328030  ref  gcc -funroll-loops -m32 -march=athlon -Os -fomit-frame-pointer  20190810  20190803
1329714  ref  gcc -funroll-loops -m32 -Os -fomit-frame-pointer  20190810  20190803
1330585  ref  gcc -m32 -march=athlon -Os -fomit-frame-pointer  20190810  20190803
1331274  ref  gcc -funroll-loops -m32 -march=pentium-mmx -Os -fomit-frame-pointer  20190810  20190803
1332982  ref  gcc -m32 -march=pentiumpro -Os -fomit-frame-pointer  20190810  20190803
1333785  ref  gcc -funroll-loops -m32 -march=pentium -Os -fomit-frame-pointer  20190810  20190803
1334041  ref  gcc -m32 -march=k6 -O3 -fomit-frame-pointer  20190810  20190803
1335413  ref  gcc -m32 -march=pentium -Os -fomit-frame-pointer  20190810  20190803
1337155  ref  gcc -m32 -march=pentium2 -Os -fomit-frame-pointer  20190810  20190803
1339145  ref  gcc -m32 -march=pentium3 -Os -fomit-frame-pointer  20190810  20190803
1339730  ref  gcc -m32 -march=athlon -O2 -fomit-frame-pointer  20190810  20190803
1342216  ref  gcc -funroll-loops -m32 -march=pentium2 -Os -fomit-frame-pointer  20190810  20190803
1348422  ref  gcc -m32 -march=pentium4 -O2 -fomit-frame-pointer  20190810  20190803
1348835  ref  gcc -funroll-loops -m32 -march=pentium3 -Os -fomit-frame-pointer  20190810  20190803
1349592  ref  gcc -funroll-loops -m32 -march=pentiumpro -Os -fomit-frame-pointer  20190810  20190803
1354215  ref  gcc -m32 -march=pentium4 -O -fomit-frame-pointer  20190810  20190803
1358635  ref  gcc -m32 -O -fomit-frame-pointer  20190810  20190803
1365218  ref  gcc -m32 -O2 -fomit-frame-pointer  20190810  20190803
1370442  ref  gcc -m32 -march=pentium3 -O2 -fomit-frame-pointer  20190810  20190803
1375865  ref  gcc -m32 -march=pentiumpro -O2 -fomit-frame-pointer  20190810  20190803
1383410  ref  gcc -m32 -march=k6-2 -O -fomit-frame-pointer  20190810  20190803
1384028  ref  gcc -m32 -march=k6-3 -O -fomit-frame-pointer  20190810  20190803
1384659  ref  gcc -m32 -march=pentium2 -O2 -fomit-frame-pointer  20190810  20190803
1387474  ref  gcc -funroll-loops -m32 -march=pentium-mmx -O -fomit-frame-pointer  20190810  20190803
1395240  ref  gcc -m32 -march=k6 -O -fomit-frame-pointer  20190810  20190803
1395518  ref  gcc -funroll-loops -m32 -march=i386 -O -fomit-frame-pointer  20190810  20190803
1396843  ref  gcc -funroll-loops -m32 -march=pentium -O -fomit-frame-pointer  20190810  20190803
1411051  ref  gcc -funroll-loops -m32 -march=i486 -O3 -fomit-frame-pointer  20190810  20190803
1417700  ref  gcc -funroll-loops -m32 -march=i386 -O2 -fomit-frame-pointer  20190810  20190803
1418190  ref  gcc -funroll-loops -m32 -march=i486 -O -fomit-frame-pointer  20190810  20190803
1424780  ref  gcc -m32 -march=k6-3 -O2 -fomit-frame-pointer  20190810  20190803
1428667  ref  gcc -m32 -march=k6 -O2 -fomit-frame-pointer  20190810  20190803
1433073  ref  gcc -m32 -march=k6-2 -O2 -fomit-frame-pointer  20190810  20190803
1438522  ref  gcc -funroll-loops -m32 -march=i386 -O3 -fomit-frame-pointer  20190810  20190803
1444081  ref  gcc -funroll-loops -m32 -march=i486 -O2 -fomit-frame-pointer  20190810  20190803
1456559  ref  gcc -m32 -march=i386 -O3 -fomit-frame-pointer  20190810  20190803
1473787  ref  gcc -m32 -march=i486 -O3 -fomit-frame-pointer  20190810  20190803
1490714  ref  gcc -m32 -march=pentium-mmx -O -fomit-frame-pointer  20190810  20190803
1502982  ref  gcc -m32 -march=i386 -O -fomit-frame-pointer  20190810  20190803
1506883  ref  gcc -m32 -march=pentium -O -fomit-frame-pointer  20190810  20190803
1551843  ref  gcc -m32 -march=i486 -O2 -fomit-frame-pointer  20190810  20190803
1555837  ref  gcc -m32 -march=i386 -O2 -fomit-frame-pointer  20190810  20190803
1557696  ref  gcc -m32 -march=i486 -O -fomit-frame-pointer  20190810  20190803
1564633  ref  gcc -funroll-loops -m32 -march=pentium-mmx -O3 -fomit-frame-pointer  20190810  20190803
1583184  ref  gcc -funroll-loops -m32 -march=pentium -O3 -fomit-frame-pointer  20190810  20190803
1583454  ref  gcc -funroll-loops -m32 -march=pentium-mmx -O2 -fomit-frame-pointer  20190810  20190803
1599903  ref  gcc -funroll-loops -m32 -march=pentium -O2 -fomit-frame-pointer  20190810  20190803
1633780  ref  gcc -m32 -march=pentium -O3 -fomit-frame-pointer  20190810  20190803
1643208  ref  gcc -m32 -march=pentium-mmx -O3 -fomit-frame-pointer  20190810  20190803
1676101  ref  gcc -m32 -march=pentium -O2 -fomit-frame-pointer  20190810  20190803
1689419  ref  gcc -m32 -march=pentium-mmx -O2 -fomit-frame-pointer  20190810  20190803
2258861  ref  gcc -funroll-loops -m32 -march=barcelona -O3 -fomit-frame-pointer  20190810  20190803
2311889  ref  gcc -m32 -march=barcelona -O3 -fomit-frame-pointer  20190810  20190803
2357275  ref  gcc -funroll-loops -m32 -march=barcelona -O2 -fomit-frame-pointer  20190810  20190803
2360085  ref  gcc -funroll-loops -m32 -march=k8 -O2 -fomit-frame-pointer  20190810  20190803
2397351  ref  gcc -m32 -march=barcelona -Os -fomit-frame-pointer  20190810  20190803
2404960  ref  gcc -funroll-loops -m32 -march=k8 -O3 -fomit-frame-pointer  20190810  20190803
2410673  ref  gcc -m32 -march=k8 -O3 -fomit-frame-pointer  20190810  20190803
2413495  ref  gcc -funroll-loops -m32 -march=barcelona -Os -fomit-frame-pointer  20190810  20190803
2483495  ref  gcc -m32 -march=barcelona -O2 -fomit-frame-pointer  20190810  20190803
2493834  ref  gcc -m32 -march=k8 -Os -fomit-frame-pointer  20190810  20190803
2499494  ref  gcc -m32 -march=k8 -O2 -fomit-frame-pointer  20190810  20190803
2515976  ref  gcc -funroll-loops -m32 -march=k8 -Os -fomit-frame-pointer  20190810  20190803
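
The times above range from 981468 (gcc -funroll-loops -m32 -march=pentium-m -O3 -fomit-frame-pointer) to 2515976 (gcc -funroll-loops -m32 -march=k8 -Os -fomit-frame-pointer). As a rough way to quantify that spread, the following minimal sketch computes the ratio of the slowest to the fastest entry in a list of times. The function name time_spread is hypothetical and is not part of SUPERCOP; the caller is assumed to pass at least one entry.

```c
#include <stddef.h>

/* Hypothetical helper (not part of SUPERCOP): returns slowest/fastest
   for a list of n >= 1 measured times taken from the table. */
double time_spread(const unsigned long *times, size_t n)
{
    unsigned long fastest = times[0];
    unsigned long slowest = times[0];

    for (size_t i = 1; i < n; i++) {
        if (times[i] < fastest)
            fastest = times[i];
        if (times[i] > slowest)
            slowest = times[i];
    }
    return (double)slowest / (double)fastest;
}
```

Applied to the first and last rows of the table, the ratio is a little above 2.5, which reflects the group of -march=k8 and -march=barcelona builds at the bottom of the ranking.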

Compiler output

Implementation: crypto_kem/kyber768/avx2
Compiler: gcc -funroll-loops -m32 -O2 -fomit-frame-pointer
KeccakP-1600-times4-SIMD256.c: KeccakP-1600-times4-SIMD256.c: In function 'KeccakP1600times4_AddLanesAll':
KeccakP-1600-times4-SIMD256.c: KeccakP-1600-times4-SIMD256.c:135:40: warning: AVX vector return without AVX enabled changes the ABI [-Wpsabi]
KeccakP-1600-times4-SIMD256.c: #define Xor_In4( argIndex ) lanes0 = LOAD256u( curData0[argIndex]),\
KeccakP-1600-times4-SIMD256.c: ^
KeccakP-1600-times4-SIMD256.c: KeccakP-1600-times4-SIMD256.c:146:9: note: in expansion of macro 'Xor_In4'
KeccakP-1600-times4-SIMD256.c: Xor_In4( 0 );
KeccakP-1600-times4-SIMD256.c: ^~~~~~~
KeccakP-1600-times4-SIMD256.c: In file included from /usr/lib/gcc/x86_64-linux-gnu/7/include/immintrin.h:43:0,
KeccakP-1600-times4-SIMD256.c: from KeccakP-1600-times4-SIMD256.c:21:
KeccakP-1600-times4-SIMD256.c: /usr/lib/gcc/x86_64-linux-gnu/7/include/avx2intrin.h:913:1: error: inlining failed in call to always_inline '_mm256_xor_si256': target specific option mismatch
KeccakP-1600-times4-SIMD256.c: _mm256_xor_si256 (__m256i __A, __m256i __B)
KeccakP-1600-times4-SIMD256.c: ^~~~~~~~~~~~~~~~
KeccakP-1600-times4-SIMD256.c: KeccakP-1600-times4-SIMD256.c:55:41: note: called from here
KeccakP-1600-times4-SIMD256.c: #define XOReq256(a, b) a = _mm256_xor_si256(a, b)
KeccakP-1600-times4-SIMD256.c: ^~~~~~~~~~~~~~~~~~~~~~
KeccakP-1600-times4-SIMD256.c: KeccakP-1600-times4-SIMD256.c:143:33: note: in expansion of macro 'XOReq256'
KeccakP-1600-times4-SIMD256.c: XOReq256( stateAsLanes[argIndex+3], lanes3 )
KeccakP-1600-times4-SIMD256.c: ^~~~~~~~
KeccakP-1600-times4-SIMD256.c: KeccakP-1600-times4-SIMD256.c:149:9: note: in expansion of macro 'Xor_In4'
KeccakP-1600-times4-SIMD256.c: Xor_In4( 12 );
KeccakP-1600-times4-SIMD256.c: ^~~~~~~
KeccakP-1600-times4-SIMD256.c: In file included from /usr/lib/gcc/x86_64-linux-gnu/7/include/immintrin.h:43:0,
KeccakP-1600-times4-SIMD256.c: from KeccakP-1600-times4-SIMD256.c:21:
KeccakP-1600-times4-SIMD256.c: /usr/lib/gcc/x86_64-linux-gnu/7/include/avx2intrin.h:913:1: error: inlining failed in call to always_inline '_mm256_xor_si256': target specific option mismatch
KeccakP-1600-times4-SIMD256.c: _mm256_xor_si256 (__m256i __A, __m256i __B)
KeccakP-1600-times4-SIMD256.c: ...

Number of similar (compiler,implementation) pairs: 156, namely:
Compiler Implementations
gcc -funroll-loops -m32 -O2 -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -O3 -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -O -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -Os -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=athlon -O2 -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=athlon -O3 -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=athlon -O -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=athlon -Os -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=barcelona -O2 -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=barcelona -O3 -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=barcelona -O -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=barcelona -Os -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=i386 -O2 -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=i386 -O3 -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=i386 -O -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=i386 -Os -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=i486 -O2 -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=i486 -O3 -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=i486 -O -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=i486 -Os -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=k6-2 -O2 -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=k6-2 -O3 -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=k6-2 -O -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=k6-2 -Os -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=k6-3 -O2 -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=k6-3 -O3 -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=k6-3 -O -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=k6-3 -Os -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=k6 -O2 -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=k6 -O3 -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=k6 -O -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=k6 -Os -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=k8 -O2 -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=k8 -O3 -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=k8 -O -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=k8 -Os -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=nocona -O2 -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=nocona -O3 -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=nocona -O -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=nocona -Os -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=pentium-m -O2 -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=pentium-m -O3 -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=pentium-m -O -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=pentium-m -Os -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=pentium-mmx -O2 -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=pentium-mmx -O3 -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=pentium-mmx -O -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=pentium-mmx -Os -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=pentium2 -O2 -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=pentium2 -O3 -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=pentium2 -O -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=pentium2 -Os -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=pentium3 -O2 -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=pentium3 -O3 -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=pentium3 -O -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=pentium3 -Os -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=pentium4 -O2 -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=pentium4 -O3 -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=pentium4 -O -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=pentium4 -Os -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=pentium -O2 -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=pentium -O3 -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=pentium -O -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=pentium -Os -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=pentiumpro -O2 -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=pentiumpro -O3 -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=pentiumpro -O -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=pentiumpro -Os -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=prescott -O2 -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=prescott -O3 -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=prescott -O -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=prescott -Os -fomit-frame-pointer avx2
gcc -m32 -O2 -fomit-frame-pointer avx2
gcc -m32 -O3 -fomit-frame-pointer avx2
gcc -m32 -O -fomit-frame-pointer avx2
gcc -m32 -Os -fomit-frame-pointer avx2
gcc -m32 -march=athlon -O2 -fomit-frame-pointer avx2
gcc -m32 -march=athlon -O3 -fomit-frame-pointer avx2
gcc -m32 -march=athlon -O -fomit-frame-pointer avx2
gcc -m32 -march=athlon -Os -fomit-frame-pointer avx2
gcc -m32 -march=core2 -O2 -fomit-frame-pointer avx2
gcc -m32 -march=core2 -O3 -fomit-frame-pointer avx2
gcc -m32 -march=core2 -O -fomit-frame-pointer avx2
gcc -m32 -march=core2 -Os -fomit-frame-pointer avx2
gcc -m32 -march=core2 -msse4.1 -O2 -fomit-frame-pointer avx2
gcc -m32 -march=core2 -msse4.1 -O3 -fomit-frame-pointer avx2
gcc -m32 -march=core2 -msse4.1 -O -fomit-frame-pointer avx2
gcc -m32 -march=core2 -msse4.1 -Os -fomit-frame-pointer avx2
gcc -m32 -march=core2 -msse4 -O2 -fomit-frame-pointer avx2
gcc -m32 -march=core2 -msse4 -O3 -fomit-frame-pointer avx2
gcc -m32 -march=core2 -msse4 -O -fomit-frame-pointer avx2
gcc -m32 -march=core2 -msse4 -Os -fomit-frame-pointer avx2
gcc -m32 -march=corei7 -O2 -fomit-frame-pointer avx2
gcc -m32 -march=corei7 -O3 -fomit-frame-pointer avx2
gcc -m32 -march=corei7 -O -fomit-frame-pointer avx2
gcc -m32 -march=corei7 -Os -fomit-frame-pointer avx2
gcc -m32 -march=i386 -O2 -fomit-frame-pointer avx2
gcc -m32 -march=i386 -O3 -fomit-frame-pointer avx2
gcc -m32 -march=i386 -O -fomit-frame-pointer avx2
gcc -m32 -march=i386 -Os -fomit-frame-pointer avx2
gcc -m32 -march=i486 -O2 -fomit-frame-pointer avx2
gcc -m32 -march=i486 -O3 -fomit-frame-pointer avx2
gcc -m32 -march=i486 -O -fomit-frame-pointer avx2
gcc -m32 -march=i486 -Os -fomit-frame-pointer avx2
gcc -m32 -march=k6-2 -O2 -fomit-frame-pointer avx2
gcc -m32 -march=k6-2 -O3 -fomit-frame-pointer avx2
gcc -m32 -march=k6-2 -O -fomit-frame-pointer avx2
gcc -m32 -march=k6-2 -Os -fomit-frame-pointer avx2
gcc -m32 -march=k6-3 -O2 -fomit-frame-pointer avx2
gcc -m32 -march=k6-3 -O3 -fomit-frame-pointer avx2
gcc -m32 -march=k6-3 -O -fomit-frame-pointer avx2
gcc -m32 -march=k6-3 -Os -fomit-frame-pointer avx2
gcc -m32 -march=k6 -O2 -fomit-frame-pointer avx2
gcc -m32 -march=k6 -O3 -fomit-frame-pointer avx2
gcc -m32 -march=k6 -O -fomit-frame-pointer avx2
gcc -m32 -march=k6 -Os -fomit-frame-pointer avx2
gcc -m32 -march=k8 -O2 -fomit-frame-pointer avx2
gcc -m32 -march=k8 -O3 -fomit-frame-pointer avx2
gcc -m32 -march=k8 -O -fomit-frame-pointer avx2
gcc -m32 -march=k8 -Os -fomit-frame-pointer avx2
gcc -m32 -march=nocona -O2 -fomit-frame-pointer avx2
gcc -m32 -march=nocona -O3 -fomit-frame-pointer avx2
gcc -m32 -march=nocona -O -fomit-frame-pointer avx2
gcc -m32 -march=nocona -Os -fomit-frame-pointer avx2
gcc -m32 -march=pentium-m -O2 -fomit-frame-pointer avx2
gcc -m32 -march=pentium-m -O3 -fomit-frame-pointer avx2
gcc -m32 -march=pentium-m -O -fomit-frame-pointer avx2
gcc -m32 -march=pentium-m -Os -fomit-frame-pointer avx2
gcc -m32 -march=pentium-mmx -O2 -fomit-frame-pointer avx2
gcc -m32 -march=pentium-mmx -O3 -fomit-frame-pointer avx2
gcc -m32 -march=pentium-mmx -O -fomit-frame-pointer avx2
gcc -m32 -march=pentium-mmx -Os -fomit-frame-pointer avx2
gcc -m32 -march=pentium2 -O2 -fomit-frame-pointer avx2
gcc -m32 -march=pentium2 -O3 -fomit-frame-pointer avx2
gcc -m32 -march=pentium2 -O -fomit-frame-pointer avx2
gcc -m32 -march=pentium2 -Os -fomit-frame-pointer avx2
gcc -m32 -march=pentium3 -O2 -fomit-frame-pointer avx2
gcc -m32 -march=pentium3 -O3 -fomit-frame-pointer avx2
gcc -m32 -march=pentium3 -O -fomit-frame-pointer avx2
gcc -m32 -march=pentium3 -Os -fomit-frame-pointer avx2
gcc -m32 -march=pentium4 -O2 -fomit-frame-pointer avx2
gcc -m32 -march=pentium4 -O3 -fomit-frame-pointer avx2
gcc -m32 -march=pentium4 -O -fomit-frame-pointer avx2
gcc -m32 -march=pentium4 -Os -fomit-frame-pointer avx2
gcc -m32 -march=pentium -O2 -fomit-frame-pointer avx2
gcc -m32 -march=pentium -O3 -fomit-frame-pointer avx2
gcc -m32 -march=pentium -O -fomit-frame-pointer avx2
gcc -m32 -march=pentium -Os -fomit-frame-pointer avx2
gcc -m32 -march=pentiumpro -O2 -fomit-frame-pointer avx2
gcc -m32 -march=pentiumpro -O3 -fomit-frame-pointer avx2
gcc -m32 -march=pentiumpro -O -fomit-frame-pointer avx2
gcc -m32 -march=pentiumpro -Os -fomit-frame-pointer avx2
gcc -m32 -march=prescott -O2 -fomit-frame-pointer avx2
gcc -m32 -march=prescott -O3 -fomit-frame-pointer avx2
gcc -m32 -march=prescott -O -fomit-frame-pointer avx2
gcc -m32 -march=prescott -Os -fomit-frame-pointer avx2
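
The "target specific option mismatch" errors above are what GCC reports when an always_inline intrinsic such as _mm256_xor_si256 is called from code compiled without the matching instruction set enabled; none of the compiler lines in this group enables AVX2. In standalone code, one common way to avoid this class of error is to enable AVX2 per function with the GCC/Clang target attribute and to dispatch at run time. The sketch below illustrates that technique only; the function names are hypothetical and are not taken from the Kyber or Keccak sources, and this is not how SUPERCOP itself selects implementations.

```c
#include <stdint.h>

#if (defined(__x86_64__) || defined(__i386__)) && defined(__GNUC__)
#include <immintrin.h>
#define HAVE_AVX2_PATH 1

/* Hypothetical helper: XOR four 64-bit lanes of src into dst.
   The target attribute enables AVX2 for this function only, so the
   intrinsics can be inlined even if the command line lacks -mavx2. */
__attribute__((target("avx2")))
static void xor_lanes_avx2(uint64_t dst[4], const uint64_t src[4])
{
    __m256i a = _mm256_loadu_si256((const __m256i *)dst);
    __m256i b = _mm256_loadu_si256((const __m256i *)src);
    _mm256_storeu_si256((__m256i *)dst, _mm256_xor_si256(a, b));
}
#endif

/* Portable fallback with the same result. */
static void xor_lanes_scalar(uint64_t dst[4], const uint64_t src[4])
{
    for (int i = 0; i < 4; i++)
        dst[i] ^= src[i];
}

/* Uses the AVX2 path only when it was compiled in and the CPU
   reports AVX2 support at run time. */
void xor_lanes(uint64_t dst[4], const uint64_t src[4])
{
#ifdef HAVE_AVX2_PATH
    if (__builtin_cpu_supports("avx2")) {
        xor_lanes_avx2(dst, src);
        return;
    }
#endif
    xor_lanes_scalar(dst, src);
}
```

Passing pointers rather than __m256i values across function boundaries also sidesteps the -Wpsabi warning shown in the log, since no AVX vector is passed or returned by a function compiled without AVX.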

Compiler output

Implementation: crypto_kem/kyber768/avx2
Compiler: gcc -m32 -march=barcelona -O2 -fomit-frame-pointer
KeccakP-1600-times4-SIMD256.c: KeccakP-1600-times4-SIMD256.c: In function 'KeccakP1600times4_AddLanesAll':
KeccakP-1600-times4-SIMD256.c: KeccakP-1600-times4-SIMD256.c:135:40: warning: AVX vector return without AVX enabled changes the ABI [-Wpsabi]
KeccakP-1600-times4-SIMD256.c: #define Xor_In4( argIndex ) lanes0 = LOAD256u( curData0[argIndex]),\
KeccakP-1600-times4-SIMD256.c: ^
KeccakP-1600-times4-SIMD256.c: KeccakP-1600-times4-SIMD256.c:146:9: note: in expansion of macro 'Xor_In4'
KeccakP-1600-times4-SIMD256.c: Xor_In4( 0 );
KeccakP-1600-times4-SIMD256.c: ^~~~~~~
KeccakP-1600-times4-SIMD256.c: In file included from /usr/lib/gcc/x86_64-linux-gnu/7/include/immintrin.h:43:0,
KeccakP-1600-times4-SIMD256.c: from KeccakP-1600-times4-SIMD256.c:21:
KeccakP-1600-times4-SIMD256.c: /usr/lib/gcc/x86_64-linux-gnu/7/include/avx2intrin.h:913:1: error: inlining failed in call to always_inline '_mm256_xor_si256': target specific option mismatch
KeccakP-1600-times4-SIMD256.c: _mm256_xor_si256 (__m256i __A, __m256i __B)
KeccakP-1600-times4-SIMD256.c: ^~~~~~~~~~~~~~~~
KeccakP-1600-times4-SIMD256.c: KeccakP-1600-times4-SIMD256.c:55:41: note: called from here
KeccakP-1600-times4-SIMD256.c: #define XOReq256(a, b) a = _mm256_xor_si256(a, b)
KeccakP-1600-times4-SIMD256.c: ^~~~~~~~~~~~~~~~~~~~~~
KeccakP-1600-times4-SIMD256.c: KeccakP-1600-times4-SIMD256.c:143:33: note: in expansion of macro 'XOReq256'
KeccakP-1600-times4-SIMD256.c: XOReq256( stateAsLanes[argIndex+3], lanes3 )
KeccakP-1600-times4-SIMD256.c: ^~~~~~~~
KeccakP-1600-times4-SIMD256.c: KeccakP-1600-times4-SIMD256.c:149:9: note: in expansion of macro 'Xor_In4'
KeccakP-1600-times4-SIMD256.c: Xor_In4( 12 );
KeccakP-1600-times4-SIMD256.c: ^~~~~~~
KeccakP-1600-times4-SIMD256.c: In file included from /usr/lib/gcc/x86_64-linux-gnu/7/include/immintrin.h:43:0,
KeccakP-1600-times4-SIMD256.c: from KeccakP-1600-times4-SIMD256.c:21:
KeccakP-1600-times4-SIMD256.c: /usr/lib/gcc/x86_64-linux-gnu/7/include/avx2intrin.h:913:1: error: inlining failed in call to always_inline '_mm256_xor_si256': target specific option mismatch
KeccakP-1600-times4-SIMD256.c: _mm256_xor_si256 (__m256i __A, __m256i __B)
KeccakP-1600-times4-SIMD256.c: ...
KeccakP-1600-times4-SIMD256.c: KeccakP-1600-times4-SIMD256.c: In function 'KeccakP1600times4_AddLanesAll':
KeccakP-1600-times4-SIMD256.c: KeccakP-1600-times4-SIMD256.c:135:40: warning: AVX vector return without AVX enabled changes the ABI [-Wpsabi]
KeccakP-1600-times4-SIMD256.c: #define Xor_In4( argIndex ) lanes0 = LOAD256u( curData0[argIndex]),\
KeccakP-1600-times4-SIMD256.c: ^
KeccakP-1600-times4-SIMD256.c: KeccakP-1600-times4-SIMD256.c:146:9: note: in expansion of macro 'Xor_In4'
KeccakP-1600-times4-SIMD256.c: Xor_In4( 0 );
KeccakP-1600-times4-SIMD256.c: ^~~~~~~
KeccakP-1600-times4-SIMD256.c: In file included from /usr/lib/gcc/x86_64-linux-gnu/7/include/immintrin.h:43:0,
KeccakP-1600-times4-SIMD256.c: from KeccakP-1600-times4-SIMD256.c:21:
KeccakP-1600-times4-SIMD256.c: /usr/lib/gcc/x86_64-linux-gnu/7/include/avx2intrin.h:913:1: error: inlining failed in call to always_inline '_mm256_xor_si256': target specific option mismatch
KeccakP-1600-times4-SIMD256.c: _mm256_xor_si256 (__m256i __A, __m256i __B)
KeccakP-1600-times4-SIMD256.c: ^~~~~~~~~~~~~~~~
KeccakP-1600-times4-SIMD256.c: KeccakP-1600-times4-SIMD256.c:55:41: note: called from here
KeccakP-1600-times4-SIMD256.c: #define XOReq256(a, b) a = _mm256_xor_si256(a, b)
KeccakP-1600-times4-SIMD256.c: ^~~~~~~~~~~~~~~~~~~~~~
KeccakP-1600-times4-SIMD256.c: KeccakP-1600-times4-SIMD256.c:143:33: note: in expansion of macro 'XOReq256'
KeccakP-1600-times4-SIMD256.c: XOReq256( stateAsLanes[argIndex+3], lanes3 )
KeccakP-1600-times4-SIMD256.c: ^~~~~~~~
KeccakP-1600-times4-SIMD256.c: KeccakP-1600-times4-SIMD256.c:149:9: note: in expansion of macro 'Xor_In4'
KeccakP-1600-times4-SIMD256.c: Xor_In4( 12 );
KeccakP-1600-times4-SIMD256.c: ^~~~~~~
KeccakP-1600-times4-SIMD256.c: In file included from /usr/lib/gcc/x86_64-linux-gnu/7/include/immintrin.h:43:0,
KeccakP-1600-times4-SIMD256.c: from KeccakP-1600-times4-SIMD256.c:21:
KeccakP-1600-times4-SIMD256.c: /usr/lib/gcc/x86_64-linux-gnu/7/include/avx2intrin.h:913:1: error: inlining failed in call to always_inline '_mm256_xor_si256': target specific option mismatch
KeccakP-1600-times4-SIMD256.c: _mm256_xor_si256 (__m256i __A, __m256i __B)
KeccakP-1600-times4-SIMD256.c: ...

Number of similar (compiler,implementation) pairs: 4, namely:
Compiler Implementations
gcc -m32 -march=barcelona -O2 -fomit-frame-pointer avx2
gcc -m32 -march=barcelona -O3 -fomit-frame-pointer avx2
gcc -m32 -march=barcelona -O -fomit-frame-pointer avx2
gcc -m32 -march=barcelona -Os -fomit-frame-pointer avx2

Compiler output

Implementation: crypto_kem/kyber768/avx2
Compiler: gcc -m32 -march=core-avx-i -O2 -fomit-frame-pointer
KeccakP-1600-times4-SIMD256.c: In file included from /usr/lib/gcc/x86_64-linux-gnu/7/include/immintrin.h:43:0,
KeccakP-1600-times4-SIMD256.c: from KeccakP-1600-times4-SIMD256.c:21:
KeccakP-1600-times4-SIMD256.c: KeccakP-1600-times4-SIMD256.c: In function 'KeccakP1600times4_AddLanesAll':
KeccakP-1600-times4-SIMD256.c: /usr/lib/gcc/x86_64-linux-gnu/7/include/avx2intrin.h:913:1: error: inlining failed in call to always_inline '_mm256_xor_si256': target specific option mismatch
KeccakP-1600-times4-SIMD256.c: _mm256_xor_si256 (__m256i __A, __m256i __B)
KeccakP-1600-times4-SIMD256.c: ^~~~~~~~~~~~~~~~
KeccakP-1600-times4-SIMD256.c: KeccakP-1600-times4-SIMD256.c:55:41: note: called from here
KeccakP-1600-times4-SIMD256.c: #define XOReq256(a, b) a = _mm256_xor_si256(a, b)
KeccakP-1600-times4-SIMD256.c: ^~~~~~~~~~~~~~~~~~~~~~
KeccakP-1600-times4-SIMD256.c: KeccakP-1600-times4-SIMD256.c:143:33: note: in expansion of macro 'XOReq256'
KeccakP-1600-times4-SIMD256.c: XOReq256( stateAsLanes[argIndex+3], lanes3 )
KeccakP-1600-times4-SIMD256.c: ^~~~~~~~
KeccakP-1600-times4-SIMD256.c: KeccakP-1600-times4-SIMD256.c:149:9: note: in expansion of macro 'Xor_In4'
KeccakP-1600-times4-SIMD256.c: Xor_In4( 12 );
KeccakP-1600-times4-SIMD256.c: ^~~~~~~
KeccakP-1600-times4-SIMD256.c: In file included from /usr/lib/gcc/x86_64-linux-gnu/7/include/immintrin.h:43:0,
KeccakP-1600-times4-SIMD256.c: from KeccakP-1600-times4-SIMD256.c:21:
KeccakP-1600-times4-SIMD256.c: /usr/lib/gcc/x86_64-linux-gnu/7/include/avx2intrin.h:913:1: error: inlining failed in call to always_inline '_mm256_xor_si256': target specific option mismatch
KeccakP-1600-times4-SIMD256.c: _mm256_xor_si256 (__m256i __A, __m256i __B)
KeccakP-1600-times4-SIMD256.c: ^~~~~~~~~~~~~~~~
KeccakP-1600-times4-SIMD256.c: KeccakP-1600-times4-SIMD256.c:55:41: note: called from here
KeccakP-1600-times4-SIMD256.c: #define XOReq256(a, b) a = _mm256_xor_si256(a, b)
KeccakP-1600-times4-SIMD256.c: ^~~~~~~~~~~~~~~~~~~~~~
KeccakP-1600-times4-SIMD256.c: KeccakP-1600-times4-SIMD256.c:142:33: note: in expansion of macro 'XOReq256'
KeccakP-1600-times4-SIMD256.c: XOReq256( stateAsLanes[argIndex+2], lanes2 ),\
KeccakP-1600-times4-SIMD256.c: ...

Number of similar (compiler,implementation) pairs: 8, namely:
Compiler Implementations
gcc -m32 -march=core-avx-i -O2 -fomit-frame-pointer avx2
gcc -m32 -march=core-avx-i -O3 -fomit-frame-pointer avx2
gcc -m32 -march=core-avx-i -O -fomit-frame-pointer avx2
gcc -m32 -march=core-avx-i -Os -fomit-frame-pointer avx2
gcc -m32 -march=corei7-avx -O2 -fomit-frame-pointer avx2
gcc -m32 -march=corei7-avx -O3 -fomit-frame-pointer avx2
gcc -m32 -march=corei7-avx -O -fomit-frame-pointer avx2
gcc -m32 -march=corei7-avx -Os -fomit-frame-pointer avx2

Compiler output

Implementation: crypto_kem/kyber768/avx2
Compiler: gcc -m32 -march=core-avx2 -O2 -fomit-frame-pointer
basemul.S: basemul.S: Assembler messages:
basemul.S: basemul.S:79: Error: bad register name `%rip)'
basemul.S: basemul.S:80: Error: bad register name `%rip)'
basemul.S: basemul.S:81: Error: bad register name `%rcx)'
basemul.S: basemul.S:84: Error: bad register name `%rsi)'
basemul.S: basemul.S:84: Error: bad register name `%rdx)'
basemul.S: basemul.S:84: Error: bad register name `%rsi)'
basemul.S: basemul.S:84: Error: bad register name `%rdx)'
basemul.S: basemul.S:84: Error: bad register name `%ymm8'
basemul.S: basemul.S:84: Error: bad register name `%ymm8'
basemul.S: basemul.S:84: Error: bad register name `%ymm10'
basemul.S: basemul.S:84: Error: bad register name `%ymm10'
basemul.S: basemul.S:84: Error: bad register name `%ymm9'
basemul.S: basemul.S:84: Error: bad register name `%ymm9'
basemul.S: basemul.S:84: Error: bad register name `%ymm9'
basemul.S: basemul.S:84: Error: bad register name `%ymm9'
basemul.S: basemul.S:84: Error: bad register name `%ymm11'
basemul.S: basemul.S:84: Error: bad register name `%ymm11'
basemul.S: basemul.S:84: Error: bad register name `%ymm11'
basemul.S: basemul.S:84: Error: bad register name `%ymm11'
basemul.S: basemul.S:84: Error: bad register name `%ymm11'
basemul.S: basemul.S:84: Error: bad register name `%ymm13'
basemul.S: basemul.S:84: Error: bad register name `%ymm13'
basemul.S: basemul.S:84: Error: bad register name `%ymm8'
basemul.S: basemul.S:84: Error: bad register name `%ymm8'
basemul.S: ...

Number of similar (compiler,implementation) pairs: 8, namely:
Compiler Implementations
gcc -m32 -march=core-avx2 -O2 -fomit-frame-pointer avx2
gcc -m32 -march=core-avx2 -O3 -fomit-frame-pointer avx2
gcc -m32 -march=core-avx2 -O -fomit-frame-pointer avx2
gcc -m32 -march=core-avx2 -Os -fomit-frame-pointer avx2
gcc -m32 -march=native -mtune=native -O2 -fomit-frame-pointer avx2
gcc -m32 -march=native -mtune=native -O3 -fomit-frame-pointer avx2
gcc -m32 -march=native -mtune=native -O -fomit-frame-pointer avx2
gcc -m32 -march=native -mtune=native -Os -fomit-frame-pointer avx2
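
The assembler errors in this last group have a different cause from the intrinsic errors: %rip, %rcx, %rsi, %rdx and the registers %ymm8 and above exist only in 64-bit mode, so basemul.S cannot be assembled with -m32 even when AVX2 is enabled. A minimal sketch of a compile-time guard for this situation follows, using the __x86_64__ and __AVX2__ macros predefined by GCC and Clang. The function pick_implementation is hypothetical and only illustrates the check; it is not part of SUPERCOP or the Kyber sources.

```c
/* Hypothetical selector: names the implementation that can be built
   with the current compiler options. The avx2 code needs both a
   64-bit x86 target and AVX2 enabled; anything else falls back to
   the portable reference code. */
const char *pick_implementation(void)
{
#if defined(__x86_64__) && defined(__AVX2__)
    return "avx2";
#else
    return "ref";
#endif
}
```

With gcc -m32 -march=core-avx2, __AVX2__ is defined but __x86_64__ is not, so this guard selects "ref", which matches the table above where only ref produced timings on this 32-bit ABI.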