Implementation notes: x86, samba, crypto_sign/dilithium4aes

Computer: samba
Architecture: x86
CPU ID: GenuineIntel-000506e3-bfebfbff
SUPERCOP version: 20190803
Operation: crypto_sign
Primitive: dilithium4aes
Time  Implementation  Compiler  Benchmark date  SUPERCOP version
5755512  ref  gcc -m32 -march=core-avx2 -Os -fomit-frame-pointer  20190805  20190803
5777502  ref  gcc -m32 -march=native -mtune=native -Os -fomit-frame-pointer  20190805  20190803
5900712  ref  gcc -m32 -march=corei7-avx -Os -fomit-frame-pointer  20190805  20190803
5924116  ref  gcc -m32 -march=core-avx-i -Os -fomit-frame-pointer  20190805  20190803
5953687  ref  gcc -funroll-loops -m32 -march=pentium-m -Os -fomit-frame-pointer  20190805  20190803
6021706  ref  gcc -funroll-loops -m32 -march=athlon -O3 -fomit-frame-pointer  20190805  20190803
6134914  ref  gcc -funroll-loops -m32 -O3 -fomit-frame-pointer  20190805  20190803
6157646  ref  gcc -funroll-loops -m32 -march=athlon -O2 -fomit-frame-pointer  20190805  20190803
6162323  ref  gcc -m32 -march=core2 -Os -fomit-frame-pointer  20190805  20190803
6200962  ref  gcc -funroll-loops -m32 -march=pentium2 -O -fomit-frame-pointer  20190805  20190803
6203661  ref  gcc -m32 -march=core2 -msse4.1 -Os -fomit-frame-pointer  20190805  20190803
6211069  ref  gcc -funroll-loops -m32 -march=pentium3 -O -fomit-frame-pointer  20190805  20190803
6212658  ref  gcc -funroll-loops -m32 -march=pentiumpro -O -fomit-frame-pointer  20190805  20190803
6213437  ref  gcc -m32 -march=corei7 -Os -fomit-frame-pointer  20190805  20190803
6237436  ref  gcc -m32 -march=core2 -msse4 -Os -fomit-frame-pointer  20190805  20190803
6241744  ref  gcc -funroll-loops -m32 -O -fomit-frame-pointer  20190805  20190803
6249852  ref  gcc -funroll-loops -m32 -march=pentium4 -Os -fomit-frame-pointer  20190805  20190803
6252981  ref  gcc -funroll-loops -m32 -O2 -fomit-frame-pointer  20190805  20190803
6253250  ref  gcc -m32 -march=pentium-m -Os -fomit-frame-pointer  20190805  20190803
6253357  ref  gcc -m32 -march=athlon -O3 -fomit-frame-pointer  20190805  20190803
6261705  ref  gcc -funroll-loops -m32 -march=prescott -Os -fomit-frame-pointer  20190805  20190803
6263108  ref  gcc -funroll-loops -m32 -march=athlon -O -fomit-frame-pointer  20190805  20190803
6264298  ref  gcc -funroll-loops -m32 -march=k6-3 -O3 -fomit-frame-pointer  20190805  20190803
6286657  ref  gcc -funroll-loops -m32 -march=k6 -O3 -fomit-frame-pointer  20190805  20190803
6292038  ref  gcc -funroll-loops -m32 -march=nocona -Os -fomit-frame-pointer  20190805  20190803
6313088  ref  gcc -m32 -march=prescott -Os -fomit-frame-pointer  20190805  20190803
6318556  ref  gcc -m32 -march=nocona -Os -fomit-frame-pointer  20190805  20190803
6324399  ref  gcc -m32 -march=athlon -O2 -fomit-frame-pointer  20190805  20190803
6330875  ref  gcc -m32 -march=pentium4 -Os -fomit-frame-pointer  20190805  20190803
6331693  ref  gcc -m32 -march=athlon -O -fomit-frame-pointer  20190805  20190803
6339369  ref  gcc -funroll-loops -m32 -march=pentium2 -O3 -fomit-frame-pointer  20190805  20190803
6347164  ref  gcc -funroll-loops -m32 -march=k6-2 -O3 -fomit-frame-pointer  20190805  20190803
6361311  ref  gcc -m32 -march=pentium2 -O -fomit-frame-pointer  20190805  20190803
6361743  ref  gcc -funroll-loops -m32 -march=pentium3 -O3 -fomit-frame-pointer  20190805  20190803
6387480  ref  gcc -m32 -march=pentiumpro -O -fomit-frame-pointer  20190805  20190803
6391202  ref  gcc -funroll-loops -m32 -march=k6-3 -O2 -fomit-frame-pointer  20190805  20190803
6395318  ref  gcc -funroll-loops -m32 -march=pentiumpro -O3 -fomit-frame-pointer  20190805  20190803
6415401  ref  gcc -funroll-loops -m32 -march=k6-2 -O2 -fomit-frame-pointer  20190805  20190803
6415605  ref  gcc -m32 -march=k6-3 -O3 -fomit-frame-pointer  20190805  20190803
6418180  ref  gcc -m32 -march=pentium3 -O -fomit-frame-pointer  20190805  20190803
6422133  ref  gcc -funroll-loops -m32 -march=k6-2 -O -fomit-frame-pointer  20190805  20190803
6445548  ref  gcc -funroll-loops -m32 -march=k6 -O -fomit-frame-pointer  20190805  20190803
6449270  ref  gcc -m32 -O3 -fomit-frame-pointer  20190805  20190803
6465246  ref  gcc -funroll-loops -m32 -march=k6 -O2 -fomit-frame-pointer  20190805  20190803
6469117  ref  gcc -funroll-loops -m32 -march=k6-3 -O -fomit-frame-pointer  20190805  20190803
6485030  ref  gcc -funroll-loops -m32 -march=pentium2 -O2 -fomit-frame-pointer  20190805  20190803
6486569  ref  gcc -funroll-loops -m32 -march=pentium3 -O2 -fomit-frame-pointer  20190805  20190803
6491943  ref  gcc -m32 -O -fomit-frame-pointer  20190805  20190803
6513794  ref  gcc -m32 -march=k6-2 -O3 -fomit-frame-pointer  20190805  20190803
6524058  ref  gcc -m32 -march=k6 -O3 -fomit-frame-pointer  20190805  20190803
6529319  ref  gcc -m32 -march=k6-2 -O -fomit-frame-pointer  20190805  20190803
6530958  ref  gcc -funroll-loops -m32 -march=pentiumpro -O2 -fomit-frame-pointer  20190805  20190803
6550728  ref  gcc -m32 -march=k6-3 -O -fomit-frame-pointer  20190805  20190803
6552938  ref  gcc -m32 -march=k6 -O -fomit-frame-pointer  20190805  20190803
6603877  ref  gcc -m32 -march=pentium3 -O3 -fomit-frame-pointer  20190805  20190803
6624512  ref  gcc -m32 -O2 -fomit-frame-pointer  20190805  20190803
6630952  ref  gcc -m32 -march=k6 -O2 -fomit-frame-pointer  20190805  20190803
6638911  ref  gcc -funroll-loops -m32 -march=k6-2 -Os -fomit-frame-pointer  20190805  20190803
6639347  ref  gcc -funroll-loops -m32 -march=pentium2 -Os -fomit-frame-pointer  20190805  20190803
6640135  ref  gcc -funroll-loops -m32 -march=k6 -Os -fomit-frame-pointer  20190805  20190803
6641653  ref  gcc -funroll-loops -m32 -march=k6-3 -Os -fomit-frame-pointer  20190805  20190803
6641718  ref  gcc -funroll-loops -m32 -march=pentiumpro -Os -fomit-frame-pointer  20190805  20190803
6644070  ref  gcc -funroll-loops -m32 -march=pentium3 -Os -fomit-frame-pointer  20190805  20190803
6653274  ref  gcc -funroll-loops -m32 -Os -fomit-frame-pointer  20190805  20190803
6655270  ref  gcc -m32 -march=k6-2 -Os -fomit-frame-pointer  20190805  20190803
6655774  ref  gcc -m32 -march=k6-3 -Os -fomit-frame-pointer  20190805  20190803
6656777  ref  gcc -m32 -march=k6 -Os -fomit-frame-pointer  20190805  20190803
6657447  ref  gcc -m32 -march=k6-2 -O2 -fomit-frame-pointer  20190805  20190803
6661240  ref  gcc -m32 -march=k6-3 -O2 -fomit-frame-pointer  20190805  20190803
6662927  ref  gcc -funroll-loops -m32 -march=pentium -Os -fomit-frame-pointer  20190805  20190803
6663452  ref  gcc -funroll-loops -m32 -march=pentium-mmx -Os -fomit-frame-pointer  20190805  20190803
6689119  ref  gcc -m32 -march=pentium -Os -fomit-frame-pointer  20190805  20190803
6692311  ref  gcc -funroll-loops -m32 -march=pentium-mmx -O -fomit-frame-pointer  20190805  20190803
6692728  ref  gcc -m32 -march=pentium-mmx -Os -fomit-frame-pointer  20190805  20190803
6692821  ref  gcc -funroll-loops -m32 -march=athlon -Os -fomit-frame-pointer  20190805  20190803
6692906  ref  gcc -m32 -march=pentium2 -Os -fomit-frame-pointer  20190805  20190803
6692919  ref  gcc -m32 -march=pentiumpro -Os -fomit-frame-pointer  20190805  20190803
6693720  ref  gcc -m32 -march=pentium3 -Os -fomit-frame-pointer  20190805  20190803
6695718  ref  gcc -m32 -march=pentiumpro -O2 -fomit-frame-pointer  20190805  20190803
6710801  ref  gcc -funroll-loops -m32 -march=pentium -O -fomit-frame-pointer  20190805  20190803
6737832  ref  gcc -funroll-loops -m32 -march=i386 -O -fomit-frame-pointer  20190805  20190803
6751235  ref  gcc -m32 -march=pentiumpro -O3 -fomit-frame-pointer  20190805  20190803
6752052  ref  gcc -funroll-loops -m32 -march=i486 -O -fomit-frame-pointer  20190805  20190803
6764327  ref  gcc -m32 -march=athlon -Os -fomit-frame-pointer  20190805  20190803
6770058  ref  gcc -m32 -march=pentium2 -O2 -fomit-frame-pointer  20190805  20190803
6771976  ref  gcc -m32 -Os -fomit-frame-pointer  20190805  20190803
6796954  ref  gcc -funroll-loops -m32 -march=i486 -O3 -fomit-frame-pointer  20190805  20190803
6804407  ref  gcc -m32 -march=pentium2 -O3 -fomit-frame-pointer  20190805  20190803
6816943  ref  gcc -m32 -march=pentium3 -O2 -fomit-frame-pointer  20190805  20190803
6841064  ref  gcc -funroll-loops -m32 -march=pentium-m -O -fomit-frame-pointer  20190805  20190803
6848915  ref  gcc -m32 -march=pentium -O -fomit-frame-pointer  20190805  20190803
6853030  ref  gcc -funroll-loops -m32 -march=i486 -O2 -fomit-frame-pointer  20190805  20190803
6860795  ref  gcc -m32 -march=core2 -O3 -fomit-frame-pointer  20190805  20190803
6863048  ref  gcc -m32 -march=core2 -msse4 -O3 -fomit-frame-pointer  20190805  20190803
6869752  ref  gcc -m32 -march=pentium-mmx -O -fomit-frame-pointer  20190805  20190803
6875509  ref  gcc -m32 -march=core2 -msse4.1 -O3 -fomit-frame-pointer  20190805  20190803
6879837  ref  gcc -funroll-loops -m32 -march=k8 -O -fomit-frame-pointer  20190805  20190803
6899082  ref  gcc -m32 -march=k8 -O -fomit-frame-pointer  20190805  20190803
6900694  ref  gcc -m32 -march=i486 -O -fomit-frame-pointer  20190805  20190803
6902222  ref  gcc -m32 -march=pentium-m -O -fomit-frame-pointer  20190805  20190803
6915330  ref  gcc -funroll-loops -m32 -march=barcelona -O -fomit-frame-pointer  20190805  20190803
6924081  ref  gcc -funroll-loops -m32 -march=pentium4 -O -fomit-frame-pointer  20190805  20190803
6924553  ref  gcc -m32 -march=i486 -Os -fomit-frame-pointer  20190805  20190803
6940645  ref  gcc -m32 -march=i386 -Os -fomit-frame-pointer  20190805  20190803
6955663  ref  gcc -m32 -march=corei7-avx -O -fomit-frame-pointer  20190805  20190803
6968244  ref  gcc -m32 -march=core-avx-i -O -fomit-frame-pointer  20190805  20190803
6976536  ref  gcc -m32 -march=core2 -O -fomit-frame-pointer  20190805  20190803
6977595  ref  gcc -m32 -march=barcelona -O -fomit-frame-pointer  20190805  20190803
6982587  ref  gcc -m32 -march=i386 -O -fomit-frame-pointer  20190805  20190803
6988296  ref  gcc -m32 -march=corei7 -O3 -fomit-frame-pointer  20190805  20190803
6989532  ref  gcc -m32 -march=core-avx2 -O -fomit-frame-pointer  20190805  20190803
6990042  ref  gcc -m32 -march=corei7 -O -fomit-frame-pointer  20190805  20190803
6997883  ref  gcc -funroll-loops -m32 -march=i486 -Os -fomit-frame-pointer  20190805  20190803
7000454  ref  gcc -m32 -march=core2 -msse4 -O -fomit-frame-pointer  20190805  20190803
7000577  ref  gcc -m32 -march=native -mtune=native -O -fomit-frame-pointer  20190805  20190803
7005453  ref  gcc -m32 -march=i486 -O3 -fomit-frame-pointer  20190805  20190803
7007197  ref  gcc -funroll-loops -m32 -march=i386 -Os -fomit-frame-pointer  20190805  20190803
7009945  ref  gcc -funroll-loops -m32 -march=pentium-mmx -O3 -fomit-frame-pointer  20190805  20190803
7016191  ref  gcc -funroll-loops -m32 -march=pentium -O3 -fomit-frame-pointer  20190805  20190803
7023895  ref  gcc -m32 -march=pentium4 -O -fomit-frame-pointer  20190805  20190803
7034583  ref  gcc -m32 -march=core2 -msse4.1 -O -fomit-frame-pointer  20190805  20190803
7092768  ref  gcc -m32 -march=i486 -O2 -fomit-frame-pointer  20190805  20190803
7098252  ref  gcc -m32 -march=prescott -O -fomit-frame-pointer  20190805  20190803
7102010  ref  gcc -funroll-loops -m32 -march=nocona -O -fomit-frame-pointer  20190805  20190803
7117520  ref  gcc -m32 -march=nocona -O -fomit-frame-pointer  20190805  20190803
7135411  ref  gcc -funroll-loops -m32 -march=i386 -O3 -fomit-frame-pointer  20190805  20190803
7169259  ref  gcc -funroll-loops -m32 -march=prescott -O -fomit-frame-pointer  20190805  20190803
7194702  ref  gcc -m32 -march=corei7 -O2 -fomit-frame-pointer  20190805  20190803
7196369  ref  gcc -funroll-loops -m32 -march=pentium-mmx -O2 -fomit-frame-pointer  20190805  20190803
7204369  ref  gcc -funroll-loops -m32 -march=pentium -O2 -fomit-frame-pointer  20190805  20190803
7246270  ref  gcc -funroll-loops -m32 -march=i386 -O2 -fomit-frame-pointer  20190805  20190803
7267195  ref  gcc -m32 -march=core2 -O2 -fomit-frame-pointer  20190805  20190803
7268520  ref  gcc -funroll-loops -m32 -march=pentium4 -O3 -fomit-frame-pointer  20190805  20190803
7294725  ref  gcc -m32 -march=core2 -msse4.1 -O2 -fomit-frame-pointer  20190805  20190803
7295278  ref  gcc -m32 -march=core2 -msse4 -O2 -fomit-frame-pointer  20190805  20190803
7340085  ref  gcc -m32 -march=pentium-mmx -O2 -fomit-frame-pointer  20190805  20190803
7363899  ref  gcc -m32 -march=pentium-mmx -O3 -fomit-frame-pointer  20190805  20190803
7383162  ref  gcc -m32 -march=pentium4 -O3 -fomit-frame-pointer  20190805  20190803
7406133  ref  gcc -m32 -march=i386 -O3 -fomit-frame-pointer  20190805  20190803
7406813  ref  gcc -m32 -march=core-avx-i -O2 -fomit-frame-pointer  20190805  20190803
7418161  ref  gcc -m32 -march=corei7-avx -O2 -fomit-frame-pointer  20190805  20190803
7418764  ref  gcc -m32 -march=pentium -O2 -fomit-frame-pointer  20190805  20190803
7425014  ref  gcc -m32 -march=native -mtune=native -O2 -fomit-frame-pointer  20190805  20190803
7433560  ref  gcc -m32 -march=pentium -O3 -fomit-frame-pointer  20190805  20190803
7440617  ref  gcc -funroll-loops -m32 -march=prescott -O3 -fomit-frame-pointer  20190805  20190803
7458285  ref  gcc -m32 -march=i386 -O2 -fomit-frame-pointer  20190805  20190803
7471974  ref  gcc -m32 -march=core-avx2 -O2 -fomit-frame-pointer  20190805  20190803
7475984  ref  gcc -funroll-loops -m32 -march=nocona -O3 -fomit-frame-pointer  20190805  20190803
7542901  ref  gcc -m32 -march=corei7-avx -O3 -fomit-frame-pointer  20190805  20190803
7556330  ref  gcc -m32 -march=core-avx-i -O3 -fomit-frame-pointer  20190805  20190803
7594688  ref  gcc -funroll-loops -m32 -march=pentium4 -O2 -fomit-frame-pointer  20190805  20190803
7596329  ref  gcc -m32 -march=prescott -O3 -fomit-frame-pointer  20190805  20190803
7596526  ref  gcc -m32 -march=native -mtune=native -O3 -fomit-frame-pointer  20190805  20190803
7621887  ref  gcc -m32 -march=core-avx2 -O3 -fomit-frame-pointer  20190805  20190803
7692780  ref  gcc -m32 -march=nocona -O3 -fomit-frame-pointer  20190805  20190803
7713464  ref  gcc -funroll-loops -m32 -march=nocona -O2 -fomit-frame-pointer  20190805  20190803
7860153  ref  gcc -funroll-loops -m32 -march=prescott -O2 -fomit-frame-pointer  20190805  20190803
7908373  ref  gcc -m32 -march=pentium4 -O2 -fomit-frame-pointer  20190805  20190803
7995520  ref  gcc -m32 -march=prescott -O2 -fomit-frame-pointer  20190805  20190803
8002034  ref  gcc -m32 -march=nocona -O2 -fomit-frame-pointer  20190805  20190803
9185465  ref  gcc -funroll-loops -m32 -march=barcelona -Os -fomit-frame-pointer  20190805  20190803
9272927  ref  gcc -m32 -march=pentium-m -O3 -fomit-frame-pointer  20190805  20190803
9280859  ref  gcc -funroll-loops -m32 -march=pentium-m -O3 -fomit-frame-pointer  20190805  20190803
9311847  ref  gcc -m32 -march=k8 -O2 -fomit-frame-pointer  20190805  20190803
9355508  ref  gcc -m32 -march=barcelona -Os -fomit-frame-pointer  20190805  20190803
9465659  ref  gcc -funroll-loops -m32 -march=k8 -Os -fomit-frame-pointer  20190805  20190803
9625724  ref  gcc -m32 -march=k8 -Os -fomit-frame-pointer  20190805  20190803
9703575  ref  gcc -funroll-loops -m32 -march=pentium-m -O2 -fomit-frame-pointer  20190805  20190803
10777782  ref  gcc -funroll-loops -m32 -march=k8 -O3 -fomit-frame-pointer  20190805  20190803
11053149  ref  gcc -m32 -march=k8 -O3 -fomit-frame-pointer  20190805  20190803
11057198  ref  gcc -m32 -march=pentium-m -O2 -fomit-frame-pointer  20190805  20190803
11166037  ref  gcc -funroll-loops -m32 -march=k8 -O2 -fomit-frame-pointer  20190805  20190803
12873395  ref  gcc -funroll-loops -m32 -march=barcelona -O3 -fomit-frame-pointer  20190805  20190803
12918139  ref  gcc -m32 -march=barcelona -O3 -fomit-frame-pointer  20190805  20190803
13636823  ref  gcc -funroll-loops -m32 -march=barcelona -O2 -fomit-frame-pointer  20190805  20190803
14190262  ref  gcc -m32 -march=barcelona -O2 -fomit-frame-pointer  20190805  20190803
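
To give a sense of the spread in the table above, the following sketch compares a handful of rows copied from it. It assumes the Time column is a cycle count, which this page does not state; the names `rows` and `slowdown` are illustrative only.

```python
# A few (time, compiler options) rows copied from the table above.
# Assumption: "Time" is a cycle count; the page does not state the unit.
rows = [
    (5755512, "gcc -m32 -march=core-avx2 -Os -fomit-frame-pointer"),
    (6134914, "gcc -funroll-loops -m32 -O3 -fomit-frame-pointer"),
    (6771976, "gcc -m32 -Os -fomit-frame-pointer"),
    (7621887, "gcc -m32 -march=core-avx2 -O3 -fomit-frame-pointer"),
    (14190262, "gcc -m32 -march=barcelona -O2 -fomit-frame-pointer"),
]


def slowdown(rows):
    """Return (time / fastest time, compiler) for each row, fastest first."""
    ordered = sorted(rows)
    fastest = ordered[0][0]
    return [(time / fastest, compiler) for time, compiler in ordered]


for ratio, compiler in slowdown(rows):
    print(f"{ratio:5.2f}x  {compiler}")
```

On these rows the same ref implementation varies by a factor of roughly 2.5 between the fastest and slowest compiler options.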

Compiler output

Implementation: crypto_sign/dilithium4aes/avx2
Compiler: gcc -funroll-loops -m32 -O2 -fomit-frame-pointer
KeccakP-1600-times4-SIMD256.c: KeccakP-1600-times4-SIMD256.c: In function 'KeccakP1600times4_AddLanesAll':
KeccakP-1600-times4-SIMD256.c: KeccakP-1600-times4-SIMD256.c:135:40: warning: AVX vector return without AVX enabled changes the ABI [-Wpsabi]
KeccakP-1600-times4-SIMD256.c: #define Xor_In4( argIndex ) lanes0 = LOAD256u( curData0[argIndex]),\
KeccakP-1600-times4-SIMD256.c: ^
KeccakP-1600-times4-SIMD256.c: KeccakP-1600-times4-SIMD256.c:146:9: note: in expansion of macro 'Xor_In4'
KeccakP-1600-times4-SIMD256.c: Xor_In4( 0 );
KeccakP-1600-times4-SIMD256.c: ^~~~~~~
KeccakP-1600-times4-SIMD256.c: In file included from /usr/lib/gcc/x86_64-linux-gnu/7/include/immintrin.h:43:0,
KeccakP-1600-times4-SIMD256.c: from KeccakP-1600-times4-SIMD256.c:21:
KeccakP-1600-times4-SIMD256.c: /usr/lib/gcc/x86_64-linux-gnu/7/include/avx2intrin.h:913:1: error: inlining failed in call to always_inline '_mm256_xor_si256': target specific option mismatch
KeccakP-1600-times4-SIMD256.c: _mm256_xor_si256 (__m256i __A, __m256i __B)
KeccakP-1600-times4-SIMD256.c: ^~~~~~~~~~~~~~~~
KeccakP-1600-times4-SIMD256.c: KeccakP-1600-times4-SIMD256.c:55:41: note: called from here
KeccakP-1600-times4-SIMD256.c: #define XOReq256(a, b) a = _mm256_xor_si256(a, b)
KeccakP-1600-times4-SIMD256.c: ^~~~~~~~~~~~~~~~~~~~~~
KeccakP-1600-times4-SIMD256.c: KeccakP-1600-times4-SIMD256.c:143:33: note: in expansion of macro 'XOReq256'
KeccakP-1600-times4-SIMD256.c: XOReq256( stateAsLanes[argIndex+3], lanes3 )
KeccakP-1600-times4-SIMD256.c: ^~~~~~~~
KeccakP-1600-times4-SIMD256.c: KeccakP-1600-times4-SIMD256.c:149:9: note: in expansion of macro 'Xor_In4'
KeccakP-1600-times4-SIMD256.c: Xor_In4( 12 );
KeccakP-1600-times4-SIMD256.c: ^~~~~~~
KeccakP-1600-times4-SIMD256.c: In file included from /usr/lib/gcc/x86_64-linux-gnu/7/include/immintrin.h:43:0,
KeccakP-1600-times4-SIMD256.c: from KeccakP-1600-times4-SIMD256.c:21:
KeccakP-1600-times4-SIMD256.c: /usr/lib/gcc/x86_64-linux-gnu/7/include/avx2intrin.h:913:1: error: inlining failed in call to always_inline '_mm256_xor_si256': target specific option mismatch
KeccakP-1600-times4-SIMD256.c: _mm256_xor_si256 (__m256i __A, __m256i __B)
KeccakP-1600-times4-SIMD256.c: ...

Number of similar (compiler,implementation) pairs: 156, namely:
Compiler  Implementations
gcc -funroll-loops -m32 -O2 -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -O3 -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -O -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -Os -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=athlon -O2 -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=athlon -O3 -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=athlon -O -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=athlon -Os -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=barcelona -O2 -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=barcelona -O3 -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=barcelona -O -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=barcelona -Os -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=i386 -O2 -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=i386 -O3 -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=i386 -O -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=i386 -Os -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=i486 -O2 -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=i486 -O3 -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=i486 -O -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=i486 -Os -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=k6-2 -O2 -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=k6-2 -O3 -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=k6-2 -O -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=k6-2 -Os -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=k6-3 -O2 -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=k6-3 -O3 -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=k6-3 -O -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=k6-3 -Os -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=k6 -O2 -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=k6 -O3 -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=k6 -O -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=k6 -Os -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=k8 -O2 -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=k8 -O3 -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=k8 -O -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=k8 -Os -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=nocona -O2 -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=nocona -O3 -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=nocona -O -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=nocona -Os -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=pentium-m -O2 -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=pentium-m -O3 -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=pentium-m -O -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=pentium-m -Os -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=pentium-mmx -O2 -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=pentium-mmx -O3 -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=pentium-mmx -O -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=pentium-mmx -Os -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=pentium2 -O2 -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=pentium2 -O3 -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=pentium2 -O -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=pentium2 -Os -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=pentium3 -O2 -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=pentium3 -O3 -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=pentium3 -O -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=pentium3 -Os -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=pentium4 -O2 -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=pentium4 -O3 -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=pentium4 -O -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=pentium4 -Os -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=pentium -O2 -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=pentium -O3 -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=pentium -O -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=pentium -Os -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=pentiumpro -O2 -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=pentiumpro -O3 -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=pentiumpro -O -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=pentiumpro -Os -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=prescott -O2 -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=prescott -O3 -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=prescott -O -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=prescott -Os -fomit-frame-pointer avx2
gcc -m32 -O2 -fomit-frame-pointer avx2
gcc -m32 -O3 -fomit-frame-pointer avx2
gcc -m32 -O -fomit-frame-pointer avx2
gcc -m32 -Os -fomit-frame-pointer avx2
gcc -m32 -march=athlon -O2 -fomit-frame-pointer avx2
gcc -m32 -march=athlon -O3 -fomit-frame-pointer avx2
gcc -m32 -march=athlon -O -fomit-frame-pointer avx2
gcc -m32 -march=athlon -Os -fomit-frame-pointer avx2
gcc -m32 -march=core2 -O2 -fomit-frame-pointer avx2
gcc -m32 -march=core2 -O3 -fomit-frame-pointer avx2
gcc -m32 -march=core2 -O -fomit-frame-pointer avx2
gcc -m32 -march=core2 -Os -fomit-frame-pointer avx2
gcc -m32 -march=core2 -msse4.1 -O2 -fomit-frame-pointer avx2
gcc -m32 -march=core2 -msse4.1 -O3 -fomit-frame-pointer avx2
gcc -m32 -march=core2 -msse4.1 -O -fomit-frame-pointer avx2
gcc -m32 -march=core2 -msse4.1 -Os -fomit-frame-pointer avx2
gcc -m32 -march=core2 -msse4 -O2 -fomit-frame-pointer avx2
gcc -m32 -march=core2 -msse4 -O3 -fomit-frame-pointer avx2
gcc -m32 -march=core2 -msse4 -O -fomit-frame-pointer avx2
gcc -m32 -march=core2 -msse4 -Os -fomit-frame-pointer avx2
gcc -m32 -march=corei7 -O2 -fomit-frame-pointer avx2
gcc -m32 -march=corei7 -O3 -fomit-frame-pointer avx2
gcc -m32 -march=corei7 -O -fomit-frame-pointer avx2
gcc -m32 -march=corei7 -Os -fomit-frame-pointer avx2
gcc -m32 -march=i386 -O2 -fomit-frame-pointer avx2
gcc -m32 -march=i386 -O3 -fomit-frame-pointer avx2
gcc -m32 -march=i386 -O -fomit-frame-pointer avx2
gcc -m32 -march=i386 -Os -fomit-frame-pointer avx2
gcc -m32 -march=i486 -O2 -fomit-frame-pointer avx2
gcc -m32 -march=i486 -O3 -fomit-frame-pointer avx2
gcc -m32 -march=i486 -O -fomit-frame-pointer avx2
gcc -m32 -march=i486 -Os -fomit-frame-pointer avx2
gcc -m32 -march=k6-2 -O2 -fomit-frame-pointer avx2
gcc -m32 -march=k6-2 -O3 -fomit-frame-pointer avx2
gcc -m32 -march=k6-2 -O -fomit-frame-pointer avx2
gcc -m32 -march=k6-2 -Os -fomit-frame-pointer avx2
gcc -m32 -march=k6-3 -O2 -fomit-frame-pointer avx2
gcc -m32 -march=k6-3 -O3 -fomit-frame-pointer avx2
gcc -m32 -march=k6-3 -O -fomit-frame-pointer avx2
gcc -m32 -march=k6-3 -Os -fomit-frame-pointer avx2
gcc -m32 -march=k6 -O2 -fomit-frame-pointer avx2
gcc -m32 -march=k6 -O3 -fomit-frame-pointer avx2
gcc -m32 -march=k6 -O -fomit-frame-pointer avx2
gcc -m32 -march=k6 -Os -fomit-frame-pointer avx2
gcc -m32 -march=k8 -O2 -fomit-frame-pointer avx2
gcc -m32 -march=k8 -O3 -fomit-frame-pointer avx2
gcc -m32 -march=k8 -O -fomit-frame-pointer avx2
gcc -m32 -march=k8 -Os -fomit-frame-pointer avx2
gcc -m32 -march=nocona -O2 -fomit-frame-pointer avx2
gcc -m32 -march=nocona -O3 -fomit-frame-pointer avx2
gcc -m32 -march=nocona -O -fomit-frame-pointer avx2
gcc -m32 -march=nocona -Os -fomit-frame-pointer avx2
gcc -m32 -march=pentium-m -O2 -fomit-frame-pointer avx2
gcc -m32 -march=pentium-m -O3 -fomit-frame-pointer avx2
gcc -m32 -march=pentium-m -O -fomit-frame-pointer avx2
gcc -m32 -march=pentium-m -Os -fomit-frame-pointer avx2
gcc -m32 -march=pentium-mmx -O2 -fomit-frame-pointer avx2
gcc -m32 -march=pentium-mmx -O3 -fomit-frame-pointer avx2
gcc -m32 -march=pentium-mmx -O -fomit-frame-pointer avx2
gcc -m32 -march=pentium-mmx -Os -fomit-frame-pointer avx2
gcc -m32 -march=pentium2 -O2 -fomit-frame-pointer avx2
gcc -m32 -march=pentium2 -O3 -fomit-frame-pointer avx2
gcc -m32 -march=pentium2 -O -fomit-frame-pointer avx2
gcc -m32 -march=pentium2 -Os -fomit-frame-pointer avx2
gcc -m32 -march=pentium3 -O2 -fomit-frame-pointer avx2
gcc -m32 -march=pentium3 -O3 -fomit-frame-pointer avx2
gcc -m32 -march=pentium3 -O -fomit-frame-pointer avx2
gcc -m32 -march=pentium3 -Os -fomit-frame-pointer avx2
gcc -m32 -march=pentium4 -O2 -fomit-frame-pointer avx2
gcc -m32 -march=pentium4 -O3 -fomit-frame-pointer avx2
gcc -m32 -march=pentium4 -O -fomit-frame-pointer avx2
gcc -m32 -march=pentium4 -Os -fomit-frame-pointer avx2
gcc -m32 -march=pentium -O2 -fomit-frame-pointer avx2
gcc -m32 -march=pentium -O3 -fomit-frame-pointer avx2
gcc -m32 -march=pentium -O -fomit-frame-pointer avx2
gcc -m32 -march=pentium -Os -fomit-frame-pointer avx2
gcc -m32 -march=pentiumpro -O2 -fomit-frame-pointer avx2
gcc -m32 -march=pentiumpro -O3 -fomit-frame-pointer avx2
gcc -m32 -march=pentiumpro -O -fomit-frame-pointer avx2
gcc -m32 -march=pentiumpro -Os -fomit-frame-pointer avx2
gcc -m32 -march=prescott -O2 -fomit-frame-pointer avx2
gcc -m32 -march=prescott -O3 -fomit-frame-pointer avx2
gcc -m32 -march=prescott -O -fomit-frame-pointer avx2
gcc -m32 -march=prescott -Os -fomit-frame-pointer avx2

Compiler output

Implementation: crypto_sign/dilithium4aes/avx2
Compiler: gcc -m32 -march=barcelona -O2 -fomit-frame-pointer
KeccakP-1600-times4-SIMD256.c: KeccakP-1600-times4-SIMD256.c: In function 'KeccakP1600times4_AddLanesAll':
KeccakP-1600-times4-SIMD256.c: KeccakP-1600-times4-SIMD256.c:135:40: warning: AVX vector return without AVX enabled changes the ABI [-Wpsabi]
KeccakP-1600-times4-SIMD256.c: #define Xor_In4( argIndex ) lanes0 = LOAD256u( curData0[argIndex]),\
KeccakP-1600-times4-SIMD256.c: ^
KeccakP-1600-times4-SIMD256.c: KeccakP-1600-times4-SIMD256.c:146:9: note: in expansion of macro 'Xor_In4'
KeccakP-1600-times4-SIMD256.c: Xor_In4( 0 );
KeccakP-1600-times4-SIMD256.c: ^~~~~~~
KeccakP-1600-times4-SIMD256.c: In file included from /usr/lib/gcc/x86_64-linux-gnu/7/include/immintrin.h:43:0,
KeccakP-1600-times4-SIMD256.c: from KeccakP-1600-times4-SIMD256.c:21:
KeccakP-1600-times4-SIMD256.c: /usr/lib/gcc/x86_64-linux-gnu/7/include/avx2intrin.h:913:1: error: inlining failed in call to always_inline '_mm256_xor_si256': target specific option mismatch
KeccakP-1600-times4-SIMD256.c: _mm256_xor_si256 (__m256i __A, __m256i __B)
KeccakP-1600-times4-SIMD256.c: ^~~~~~~~~~~~~~~~
KeccakP-1600-times4-SIMD256.c: KeccakP-1600-times4-SIMD256.c:55:41: note: called from here
KeccakP-1600-times4-SIMD256.c: #define XOReq256(a, b) a = _mm256_xor_si256(a, b)
KeccakP-1600-times4-SIMD256.c: ^~~~~~~~~~~~~~~~~~~~~~
KeccakP-1600-times4-SIMD256.c: KeccakP-1600-times4-SIMD256.c:143:33: note: in expansion of macro 'XOReq256'
KeccakP-1600-times4-SIMD256.c: XOReq256( stateAsLanes[argIndex+3], lanes3 )
KeccakP-1600-times4-SIMD256.c: ^~~~~~~~
KeccakP-1600-times4-SIMD256.c: KeccakP-1600-times4-SIMD256.c:149:9: note: in expansion of macro 'Xor_In4'
KeccakP-1600-times4-SIMD256.c: Xor_In4( 12 );
KeccakP-1600-times4-SIMD256.c: ^~~~~~~
KeccakP-1600-times4-SIMD256.c: In file included from /usr/lib/gcc/x86_64-linux-gnu/7/include/immintrin.h:43:0,
KeccakP-1600-times4-SIMD256.c: from KeccakP-1600-times4-SIMD256.c:21:
KeccakP-1600-times4-SIMD256.c: /usr/lib/gcc/x86_64-linux-gnu/7/include/avx2intrin.h:913:1: error: inlining failed in call to always_inline '_mm256_xor_si256': target specific option mismatch
KeccakP-1600-times4-SIMD256.c: _mm256_xor_si256 (__m256i __A, __m256i __B)
KeccakP-1600-times4-SIMD256.c: ...
KeccakP-1600-times4-SIMD256.c: KeccakP-1600-times4-SIMD256.c: In function 'KeccakP1600times4_AddLanesAll':
KeccakP-1600-times4-SIMD256.c: KeccakP-1600-times4-SIMD256.c:135:40: warning: AVX vector return without AVX enabled changes the ABI [-Wpsabi]
KeccakP-1600-times4-SIMD256.c: #define Xor_In4( argIndex ) lanes0 = LOAD256u( curData0[argIndex]),\
KeccakP-1600-times4-SIMD256.c: ^
KeccakP-1600-times4-SIMD256.c: KeccakP-1600-times4-SIMD256.c:146:9: note: in expansion of macro 'Xor_In4'
KeccakP-1600-times4-SIMD256.c: Xor_In4( 0 );
KeccakP-1600-times4-SIMD256.c: ^~~~~~~
KeccakP-1600-times4-SIMD256.c: In file included from /usr/lib/gcc/x86_64-linux-gnu/7/include/immintrin.h:43:0,
KeccakP-1600-times4-SIMD256.c: from KeccakP-1600-times4-SIMD256.c:21:
KeccakP-1600-times4-SIMD256.c: /usr/lib/gcc/x86_64-linux-gnu/7/include/avx2intrin.h:913:1: error: inlining failed in call to always_inline '_mm256_xor_si256': target specific option mismatch
KeccakP-1600-times4-SIMD256.c: _mm256_xor_si256 (__m256i __A, __m256i __B)
KeccakP-1600-times4-SIMD256.c: ^~~~~~~~~~~~~~~~
KeccakP-1600-times4-SIMD256.c: KeccakP-1600-times4-SIMD256.c:55:41: note: called from here
KeccakP-1600-times4-SIMD256.c: #define XOReq256(a, b) a = _mm256_xor_si256(a, b)
KeccakP-1600-times4-SIMD256.c: ^~~~~~~~~~~~~~~~~~~~~~
KeccakP-1600-times4-SIMD256.c: KeccakP-1600-times4-SIMD256.c:143:33: note: in expansion of macro 'XOReq256'
KeccakP-1600-times4-SIMD256.c: XOReq256( stateAsLanes[argIndex+3], lanes3 )
KeccakP-1600-times4-SIMD256.c: ^~~~~~~~
KeccakP-1600-times4-SIMD256.c: KeccakP-1600-times4-SIMD256.c:149:9: note: in expansion of macro 'Xor_In4'
KeccakP-1600-times4-SIMD256.c: Xor_In4( 12 );
KeccakP-1600-times4-SIMD256.c: ^~~~~~~
KeccakP-1600-times4-SIMD256.c: In file included from /usr/lib/gcc/x86_64-linux-gnu/7/include/immintrin.h:43:0,
KeccakP-1600-times4-SIMD256.c: from KeccakP-1600-times4-SIMD256.c:21:
KeccakP-1600-times4-SIMD256.c: /usr/lib/gcc/x86_64-linux-gnu/7/include/avx2intrin.h:913:1: error: inlining failed in call to always_inline '_mm256_xor_si256': target specific option mismatch
KeccakP-1600-times4-SIMD256.c: _mm256_xor_si256 (__m256i __A, __m256i __B)
KeccakP-1600-times4-SIMD256.c: ...

Number of similar (compiler,implementation) pairs: 4, namely:
Compiler  Implementations
gcc -m32 -march=barcelona -O2 -fomit-frame-pointer avx2
gcc -m32 -march=barcelona -O3 -fomit-frame-pointer avx2
gcc -m32 -march=barcelona -O -fomit-frame-pointer avx2
gcc -m32 -march=barcelona -Os -fomit-frame-pointer avx2

Compiler output

Implementation: crypto_sign/dilithium4aes/avx2
Compiler: gcc -m32 -march=core-avx-i -O2 -fomit-frame-pointer
KeccakP-1600-times4-SIMD256.c: In file included from /usr/lib/gcc/x86_64-linux-gnu/7/include/immintrin.h:43:0,
KeccakP-1600-times4-SIMD256.c: from KeccakP-1600-times4-SIMD256.c:21:
KeccakP-1600-times4-SIMD256.c: KeccakP-1600-times4-SIMD256.c: In function 'KeccakP1600times4_AddLanesAll':
KeccakP-1600-times4-SIMD256.c: /usr/lib/gcc/x86_64-linux-gnu/7/include/avx2intrin.h:913:1: error: inlining failed in call to always_inline '_mm256_xor_si256': target specific option mismatch
KeccakP-1600-times4-SIMD256.c: _mm256_xor_si256 (__m256i __A, __m256i __B)
KeccakP-1600-times4-SIMD256.c: ^~~~~~~~~~~~~~~~
KeccakP-1600-times4-SIMD256.c: KeccakP-1600-times4-SIMD256.c:55:41: note: called from here
KeccakP-1600-times4-SIMD256.c: #define XOReq256(a, b) a = _mm256_xor_si256(a, b)
KeccakP-1600-times4-SIMD256.c: ^~~~~~~~~~~~~~~~~~~~~~
KeccakP-1600-times4-SIMD256.c: KeccakP-1600-times4-SIMD256.c:143:33: note: in expansion of macro 'XOReq256'
KeccakP-1600-times4-SIMD256.c: XOReq256( stateAsLanes[argIndex+3], lanes3 )
KeccakP-1600-times4-SIMD256.c: ^~~~~~~~
KeccakP-1600-times4-SIMD256.c: KeccakP-1600-times4-SIMD256.c:149:9: note: in expansion of macro 'Xor_In4'
KeccakP-1600-times4-SIMD256.c: Xor_In4( 12 );
KeccakP-1600-times4-SIMD256.c: ^~~~~~~
KeccakP-1600-times4-SIMD256.c: In file included from /usr/lib/gcc/x86_64-linux-gnu/7/include/immintrin.h:43:0,
KeccakP-1600-times4-SIMD256.c: from KeccakP-1600-times4-SIMD256.c:21:
KeccakP-1600-times4-SIMD256.c: /usr/lib/gcc/x86_64-linux-gnu/7/include/avx2intrin.h:913:1: error: inlining failed in call to always_inline '_mm256_xor_si256': target specific option mismatch
KeccakP-1600-times4-SIMD256.c: _mm256_xor_si256 (__m256i __A, __m256i __B)
KeccakP-1600-times4-SIMD256.c: ^~~~~~~~~~~~~~~~
KeccakP-1600-times4-SIMD256.c: KeccakP-1600-times4-SIMD256.c:55:41: note: called from here
KeccakP-1600-times4-SIMD256.c: #define XOReq256(a, b) a = _mm256_xor_si256(a, b)
KeccakP-1600-times4-SIMD256.c: ^~~~~~~~~~~~~~~~~~~~~~
KeccakP-1600-times4-SIMD256.c: KeccakP-1600-times4-SIMD256.c:142:33: note: in expansion of macro 'XOReq256'
KeccakP-1600-times4-SIMD256.c: XOReq256( stateAsLanes[argIndex+2], lanes2 ),\
KeccakP-1600-times4-SIMD256.c: ...

Number of similar (compiler,implementation) pairs: 8, namely:
Compiler  Implementations
gcc -m32 -march=core-avx-i -O2 -fomit-frame-pointer avx2
gcc -m32 -march=core-avx-i -O3 -fomit-frame-pointer avx2
gcc -m32 -march=core-avx-i -O -fomit-frame-pointer avx2
gcc -m32 -march=core-avx-i -Os -fomit-frame-pointer avx2
gcc -m32 -march=corei7-avx -O2 -fomit-frame-pointer avx2
gcc -m32 -march=corei7-avx -O3 -fomit-frame-pointer avx2
gcc -m32 -march=corei7-avx -O -fomit-frame-pointer avx2
gcc -m32 -march=corei7-avx -Os -fomit-frame-pointer avx2

Compiler output

Implementation: crypto_sign/dilithium4aes/avx2
Compiler: gcc -m32 -march=core-avx2 -O2 -fomit-frame-pointer
invntt.s: invntt.s: Assembler messages:
invntt.s: invntt.s:47: Error: bad register name `%rip)'
invntt.s: invntt.s:48: Error: bad register name `%rip)'
invntt.s: invntt.s:49: Error: bad register name `%rip)'
invntt.s: invntt.s:52: Error: bad register name `%rsi)'
invntt.s: invntt.s:53: Error: bad register name `%rsi)'
invntt.s: invntt.s:54: Error: bad register name `%rsi)'
invntt.s: invntt.s:55: Error: bad register name `%rsi)'
invntt.s: invntt.s:58: Error: bad register name `%ymm8'
invntt.s: invntt.s:59: Error: bad register name `%ymm10'
invntt.s: invntt.s:59: Error: bad register name `%ymm10'
invntt.s: invntt.s:61: Error: bad register name `%ymm8'
invntt.s: invntt.s:61: Error: bad register name `%ymm8'
invntt.s: invntt.s:62: Error: bad register name `%ymm10'
invntt.s: invntt.s:62: Error: bad register name `%ymm10'
invntt.s: invntt.s:66: Error: bad register name `%ymm8'
invntt.s: invntt.s:67: Error: bad register name `%ymm10'
invntt.s: invntt.s:70: Error: bad register name `%rdx)'
invntt.s: invntt.s:71: Error: bad register name `%rdx)'
invntt.s: invntt.s:72: Error: bad register name `%ymm12'
invntt.s: invntt.s:73: Error: bad register name `%ymm13'
invntt.s: invntt.s:74: Error: bad register name `%ymm8'
invntt.s: invntt.s:76: Error: bad register name `%ymm12'
invntt.s: invntt.s:77: Error: bad register name `%ymm13'
invntt.s: invntt.s:78: Error: bad register name `%ymm9'
invntt.s: ...

Number of similar (compiler,implementation) pairs: 8, namely:
Compiler  Implementations
gcc -m32 -march=core-avx2 -O2 -fomit-frame-pointer avx2
gcc -m32 -march=core-avx2 -O3 -fomit-frame-pointer avx2
gcc -m32 -march=core-avx2 -O -fomit-frame-pointer avx2
gcc -m32 -march=core-avx2 -Os -fomit-frame-pointer avx2
gcc -m32 -march=native -mtune=native -O2 -fomit-frame-pointer avx2
gcc -m32 -march=native -mtune=native -O3 -fomit-frame-pointer avx2
gcc -m32 -march=native -mtune=native -O -fomit-frame-pointer avx2
gcc -m32 -march=native -mtune=native -Os -fomit-frame-pointer avx2
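
The groups of failed avx2 builds above can be told apart by the -march value alone. The sketch below restates that grouping as a lookup. The function names and labels are illustrative, not part of SUPERCOP, and treating -march=native like -march=core-avx2 is an observation about this machine (the two share the last group), not a general rule. The -march=barcelona group shows the same diagnostics as the first group, printed twice, so it falls into the default case here.

```python
# Hypothetical helper: maps a compiler command line from this page to the
# kind of avx2 build failure reported for it above.
AVX_ONLY = {"core-avx-i", "corei7-avx"}  # AVX enabled, AVX2 not
AVX2 = {"core-avx2", "native"}           # "native" grouped here on this machine only


def march_of(command):
    """Return the value of -march=..., or None if the option is absent."""
    for token in command.split():
        if token.startswith("-march="):
            return token[len("-march="):]
    return None


def failure_kind(command):
    """Return a short label for the failure group the command belongs to."""
    march = march_of(command)
    if march in AVX2:
        # Reported errors come from the assembler on invntt.s under -m32.
        return "assembler: bad register name"
    if march in AVX_ONLY:
        # Excerpt shows the always_inline error without the -Wpsabi warning.
        return "always_inline mismatch"
    # No AVX at all: the -Wpsabi warning precedes the always_inline error.
    return "always_inline mismatch after -Wpsabi warning"
```

For example, `failure_kind("gcc -m32 -march=core-avx2 -O2 -fomit-frame-pointer")` selects the assembler case: %rip, %rsi, %rdx and the registers %ymm8 and above exist only in 64-bit mode, so the assembler rejects them under -m32.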