Implementation notes: x86, titan0, crypto_sign/dilithium2

Computer: titan0
Architecture: x86
CPU ID: GenuineIntel-000306c3-bfebfbff
SUPERCOP version: 20190803
Operation: crypto_sign
Primitive: dilithium2
Time | Implementation | Compiler | Benchmark date | SUPERCOP version
4832372 | ref | gcc -funroll-loops -m32 -march=pentium4 -O3 -fomit-frame-pointer | 20190805 | 20190803
4880980 | ref | gcc -funroll-loops -m32 -march=athlon -O3 -fomit-frame-pointer | 20190805 | 20190803
4891320 | ref | gcc -funroll-loops -m32 -march=pentium4 -O2 -fomit-frame-pointer | 20190805 | 20190803
4933136 | ref | gcc -m32 -march=athlon -O3 -fomit-frame-pointer | 20190805 | 20190803
4955596 | ref | gcc -m32 -march=core2 -msse4 -O3 -fomit-frame-pointer | 20190805 | 20190803
4957412 | ref | gcc -funroll-loops -m32 -march=athlon -O2 -fomit-frame-pointer | 20190805 | 20190803
4961352 | ref | gcc -m32 -march=core2 -msse4.1 -O3 -fomit-frame-pointer | 20190805 | 20190803
4966892 | ref | gcc -m32 -march=core2 -O3 -fomit-frame-pointer | 20190805 | 20190803
4976664 | ref | gcc -m32 -march=corei7 -O3 -fomit-frame-pointer | 20190805 | 20190803
5019776 | ref | gcc -m32 -march=pentium4 -O3 -fomit-frame-pointer | 20190805 | 20190803
5076624 | ref | gcc -funroll-loops -m32 -march=pentium4 -O -fomit-frame-pointer | 20190805 | 20190803
5080852 | ref | gcc -funroll-loops -m32 -O3 -fomit-frame-pointer | 20190805 | 20190803
5102864 | ref | gcc -m32 -march=core-avx2 -O3 -fomit-frame-pointer | 20190805 | 20190803
5118380 | ref | gcc -funroll-loops -m32 -march=k8 -O -fomit-frame-pointer | 20190805 | 20190803
5121544 | ref | gcc -m32 -march=native -mtune=native -O3 -fomit-frame-pointer | 20190805 | 20190803
5137364 | ref | gcc -funroll-loops -m32 -march=pentium2 -O3 -fomit-frame-pointer | 20190805 | 20190803
5150192 | ref | gcc -funroll-loops -m32 -march=pentium2 -O2 -fomit-frame-pointer | 20190805 | 20190803
5154180 | ref | gcc -funroll-loops -m32 -march=pentium-m -O -fomit-frame-pointer | 20190805 | 20190803
5154776 | ref | gcc -funroll-loops -m32 -O2 -fomit-frame-pointer | 20190805 | 20190803
5160348 | ref | gcc -funroll-loops -m32 -march=pentiumpro -O3 -fomit-frame-pointer | 20190805 | 20190803
5162972 | ref | gcc -funroll-loops -m32 -march=pentium3 -O3 -fomit-frame-pointer | 20190805 | 20190803
5181540 | ref | gcc -funroll-loops -m32 -march=pentiumpro -O2 -fomit-frame-pointer | 20190805 | 20190803
5184192 | ref | gcc -funroll-loops -m32 -march=k6 -O3 -fomit-frame-pointer | 20190805 | 20190803
5185212 | ref | gcc -funroll-loops -m32 -march=barcelona -O -fomit-frame-pointer | 20190805 | 20190803
5195412 | ref | gcc -funroll-loops -m32 -march=pentium3 -O2 -fomit-frame-pointer | 20190805 | 20190803
5195700 | ref | gcc -m32 -march=k6-2 -O3 -fomit-frame-pointer | 20190805 | 20190803
5200824 | ref | gcc -funroll-loops -m32 -march=pentium2 -O -fomit-frame-pointer | 20190805 | 20190803
5208956 | ref | gcc -m32 -march=k6-3 -O3 -fomit-frame-pointer | 20190805 | 20190803
5209780 | ref | gcc -funroll-loops -m32 -march=k6-2 -O2 -fomit-frame-pointer | 20190805 | 20190803
5218208 | ref | gcc -m32 -march=core-avx2 -O2 -fomit-frame-pointer | 20190805 | 20190803
5231004 | ref | gcc -funroll-loops -m32 -march=athlon -O -fomit-frame-pointer | 20190805 | 20190803
5234500 | ref | gcc -m32 -march=pentium3 -O3 -fomit-frame-pointer | 20190805 | 20190803
5234920 | ref | gcc -funroll-loops -m32 -march=pentiumpro -O -fomit-frame-pointer | 20190805 | 20190803
5235204 | ref | gcc -m32 -march=corei7-avx -O3 -fomit-frame-pointer | 20190805 | 20190803
5243820 | ref | gcc -m32 -march=native -mtune=native -O2 -fomit-frame-pointer | 20190805 | 20190803
5244540 | ref | gcc -funroll-loops -m32 -march=k6-3 -O3 -fomit-frame-pointer | 20190805 | 20190803
5245012 | ref | gcc -funroll-loops -m32 -march=pentium3 -O -fomit-frame-pointer | 20190805 | 20190803
5250860 | ref | gcc -funroll-loops -m32 -march=k6 -O2 -fomit-frame-pointer | 20190805 | 20190803
5253348 | ref | gcc -m32 -march=core-avx-i -O3 -fomit-frame-pointer | 20190805 | 20190803
5261340 | ref | gcc -funroll-loops -m32 -march=k6-3 -O2 -fomit-frame-pointer | 20190805 | 20190803
5261476 | ref | gcc -m32 -march=core-avx-i -O2 -fomit-frame-pointer | 20190805 | 20190803
5265540 | ref | gcc -m32 -march=core2 -msse4.1 -O2 -fomit-frame-pointer | 20190805 | 20190803
5275004 | ref | gcc -funroll-loops -m32 -march=k6-2 -O3 -fomit-frame-pointer | 20190805 | 20190803
5275724 | ref | gcc -m32 -march=corei7-avx -O2 -fomit-frame-pointer | 20190805 | 20190803
5280652 | ref | gcc -m32 -march=pentiumpro -O3 -fomit-frame-pointer | 20190805 | 20190803
5282496 | ref | gcc -m32 -march=pentium2 -O3 -fomit-frame-pointer | 20190805 | 20190803
5301012 | ref | gcc -m32 -march=pentium4 -O2 -fomit-frame-pointer | 20190805 | 20190803
5302104 | ref | gcc -m32 -march=corei7 -O2 -fomit-frame-pointer | 20190805 | 20190803
5309916 | ref | gcc -m32 -march=core2 -msse4 -O2 -fomit-frame-pointer | 20190805 | 20190803
5319928 | ref | gcc -m32 -O3 -fomit-frame-pointer | 20190805 | 20190803
5391972 | ref | gcc -funroll-loops -m32 -O -fomit-frame-pointer | 20190805 | 20190803
5400260 | ref | gcc -m32 -march=core2 -O2 -fomit-frame-pointer | 20190805 | 20190803
5455892 | ref | gcc -m32 -march=athlon -O2 -fomit-frame-pointer | 20190805 | 20190803
5461116 | ref | gcc -funroll-loops -m32 -march=k6 -O -fomit-frame-pointer | 20190805 | 20190803
5467460 | ref | gcc -funroll-loops -m32 -march=k6-2 -O -fomit-frame-pointer | 20190805 | 20190803
5473192 | ref | gcc -funroll-loops -m32 -march=k6-3 -O -fomit-frame-pointer | 20190805 | 20190803
5476388 | ref | gcc -m32 -march=k6 -O3 -fomit-frame-pointer | 20190805 | 20190803
5502048 | ref | gcc -m32 -march=pentium-m -O -fomit-frame-pointer | 20190805 | 20190803
5509656 | ref | gcc -m32 -march=pentium4 -O -fomit-frame-pointer | 20190805 | 20190803
5515832 | ref | gcc -m32 -march=pentiumpro -O -fomit-frame-pointer | 20190805 | 20190803
5516152 | ref | gcc -m32 -march=pentium2 -O -fomit-frame-pointer | 20190805 | 20190803
5521796 | ref | gcc -m32 -march=pentium3 -O -fomit-frame-pointer | 20190805 | 20190803
5548676 | ref | gcc -m32 -march=athlon -O -fomit-frame-pointer | 20190805 | 20190803
5549800 | ref | gcc -m32 -march=corei7-avx -Os -fomit-frame-pointer | 20190805 | 20190803
5551596 | ref | gcc -m32 -march=core-avx-i -Os -fomit-frame-pointer | 20190805 | 20190803
5554984 | ref | gcc -m32 -march=pentiumpro -O2 -fomit-frame-pointer | 20190805 | 20190803
5555676 | ref | gcc -funroll-loops -m32 -march=prescott -O2 -fomit-frame-pointer | 20190805 | 20190803
5565024 | ref | gcc -m32 -march=pentium2 -O2 -fomit-frame-pointer | 20190805 | 20190803
5576748 | ref | gcc -m32 -march=pentium3 -O2 -fomit-frame-pointer | 20190805 | 20190803
5585808 | ref | gcc -m32 -march=core-avx2 -Os -fomit-frame-pointer | 20190805 | 20190803
5587112 | ref | gcc -m32 -march=native -mtune=native -Os -fomit-frame-pointer | 20190805 | 20190803
5589576 | ref | gcc -funroll-loops -m32 -march=nocona -O2 -fomit-frame-pointer | 20190805 | 20190803
5593256 | ref | gcc -m32 -march=k8 -O -fomit-frame-pointer | 20190805 | 20190803
5598404 | ref | gcc -m32 -march=nocona -O3 -fomit-frame-pointer | 20190805 | 20190803
5598672 | ref | gcc -m32 -march=core-avx2 -O -fomit-frame-pointer | 20190805 | 20190803
5615896 | ref | gcc -funroll-loops -m32 -march=prescott -O3 -fomit-frame-pointer | 20190805 | 20190803
5618060 | ref | gcc -m32 -march=native -mtune=native -O -fomit-frame-pointer | 20190805 | 20190803
5628260 | ref | gcc -m32 -march=prescott -O3 -fomit-frame-pointer | 20190805 | 20190803
5643076 | ref | gcc -funroll-loops -m32 -march=nocona -O3 -fomit-frame-pointer | 20190805 | 20190803
5696688 | ref | gcc -m32 -march=barcelona -O -fomit-frame-pointer | 20190805 | 20190803
5726040 | ref | gcc -funroll-loops -m32 -march=nocona -O -fomit-frame-pointer | 20190805 | 20190803
5739944 | ref | gcc -m32 -march=core2 -O -fomit-frame-pointer | 20190805 | 20190803
5740576 | ref | gcc -m32 -march=core-avx-i -O -fomit-frame-pointer | 20190805 | 20190803
5741068 | ref | gcc -m32 -march=corei7 -O -fomit-frame-pointer | 20190805 | 20190803
5746264 | ref | gcc -m32 -march=core2 -msse4.1 -O -fomit-frame-pointer | 20190805 | 20190803
5750576 | ref | gcc -m32 -march=core2 -msse4 -O -fomit-frame-pointer | 20190805 | 20190803
5751092 | ref | gcc -funroll-loops -m32 -march=prescott -O -fomit-frame-pointer | 20190805 | 20190803
5759876 | ref | gcc -m32 -march=k6 -O -fomit-frame-pointer | 20190805 | 20190803
5761840 | ref | gcc -m32 -march=k6-3 -O -fomit-frame-pointer | 20190805 | 20190803
5762644 | ref | gcc -m32 -march=k6-2 -O -fomit-frame-pointer | 20190805 | 20190803
5778508 | ref | gcc -m32 -march=corei7-avx -O -fomit-frame-pointer | 20190805 | 20190803
5785284 | ref | gcc -m32 -march=k6-2 -O2 -fomit-frame-pointer | 20190805 | 20190803
5793564 | ref | gcc -m32 -O2 -fomit-frame-pointer | 20190805 | 20190803
5800784 | ref | gcc -m32 -march=k6-3 -O2 -fomit-frame-pointer | 20190805 | 20190803
5802720 | ref | gcc -m32 -march=k6 -O2 -fomit-frame-pointer | 20190805 | 20190803
5818112 | ref | gcc -m32 -O -fomit-frame-pointer | 20190805 | 20190803
5844632 | ref | gcc -funroll-loops -m32 -march=k6 -Os -fomit-frame-pointer | 20190805 | 20190803
5851812 | ref | gcc -funroll-loops -m32 -march=k6-3 -Os -fomit-frame-pointer | 20190805 | 20190803
5861324 | ref | gcc -funroll-loops -m32 -march=k6-2 -Os -fomit-frame-pointer | 20190805 | 20190803
5877952 | ref | gcc -m32 -march=k6-3 -Os -fomit-frame-pointer | 20190805 | 20190803
5879768 | ref | gcc -m32 -march=k6 -Os -fomit-frame-pointer | 20190805 | 20190803
5886508 | ref | gcc -m32 -march=k6-2 -Os -fomit-frame-pointer | 20190805 | 20190803
5889484 | ref | gcc -m32 -march=prescott -O2 -fomit-frame-pointer | 20190805 | 20190803
5896372 | ref | gcc -m32 -march=nocona -O2 -fomit-frame-pointer | 20190805 | 20190803
5977228 | ref | gcc -funroll-loops -m32 -march=pentium -Os -fomit-frame-pointer | 20190805 | 20190803
5982224 | ref | gcc -funroll-loops -m32 -Os -fomit-frame-pointer | 20190805 | 20190803
5984740 | ref | gcc -funroll-loops -m32 -march=pentium-mmx -Os -fomit-frame-pointer | 20190805 | 20190803
6000184 | ref | gcc -m32 -march=pentium-mmx -Os -fomit-frame-pointer | 20190805 | 20190803
6003628 | ref | gcc -m32 -march=pentium -Os -fomit-frame-pointer | 20190805 | 20190803
6018592 | ref | gcc -funroll-loops -m32 -march=athlon -Os -fomit-frame-pointer | 20190805 | 20190803
6033052 | ref | gcc -m32 -march=athlon -Os -fomit-frame-pointer | 20190805 | 20190803
6034140 | ref | gcc -funroll-loops -m32 -march=i386 -Os -fomit-frame-pointer | 20190805 | 20190803
6035560 | ref | gcc -m32 -march=i386 -Os -fomit-frame-pointer | 20190805 | 20190803
6040412 | ref | gcc -m32 -Os -fomit-frame-pointer | 20190805 | 20190803
6042288 | ref | gcc -m32 -march=pentium2 -Os -fomit-frame-pointer | 20190805 | 20190803
6044844 | ref | gcc -m32 -march=pentiumpro -Os -fomit-frame-pointer | 20190805 | 20190803
6057536 | ref | gcc -m32 -march=core2 -Os -fomit-frame-pointer | 20190805 | 20190803
6059960 | ref | gcc -funroll-loops -m32 -march=i486 -Os -fomit-frame-pointer | 20190805 | 20190803
6065780 | ref | gcc -m32 -march=pentium3 -Os -fomit-frame-pointer | 20190805 | 20190803
6082568 | ref | gcc -m32 -march=i486 -Os -fomit-frame-pointer | 20190805 | 20190803
6091392 | ref | gcc -m32 -march=core2 -msse4 -Os -fomit-frame-pointer | 20190805 | 20190803
6096940 | ref | gcc -m32 -march=pentium-m -Os -fomit-frame-pointer | 20190805 | 20190803
6106388 | ref | gcc -m32 -march=corei7 -Os -fomit-frame-pointer | 20190805 | 20190803
6111828 | ref | gcc -funroll-loops -m32 -march=pentium3 -Os -fomit-frame-pointer | 20190805 | 20190803
6112992 | ref | gcc -funroll-loops -m32 -march=pentiumpro -Os -fomit-frame-pointer | 20190805 | 20190803
6115336 | ref | gcc -m32 -march=core2 -msse4.1 -Os -fomit-frame-pointer | 20190805 | 20190803
6122620 | ref | gcc -m32 -march=prescott -O -fomit-frame-pointer | 20190805 | 20190803
6122956 | ref | gcc -m32 -march=nocona -O -fomit-frame-pointer | 20190805 | 20190803
6124072 | ref | gcc -funroll-loops -m32 -march=pentium-m -Os -fomit-frame-pointer | 20190805 | 20190803
6133808 | ref | gcc -funroll-loops -m32 -march=pentium2 -Os -fomit-frame-pointer | 20190805 | 20190803
6155984 | ref | gcc -funroll-loops -m32 -march=nocona -Os -fomit-frame-pointer | 20190805 | 20190803
6169724 | ref | gcc -funroll-loops -m32 -march=prescott -Os -fomit-frame-pointer | 20190805 | 20190803
6173704 | ref | gcc -funroll-loops -m32 -march=pentium4 -Os -fomit-frame-pointer | 20190805 | 20190803
6173796 | ref | gcc -funroll-loops -m32 -march=i486 -O2 -fomit-frame-pointer | 20190805 | 20190803
6205056 | ref | gcc -m32 -march=pentium4 -Os -fomit-frame-pointer | 20190805 | 20190803
6206960 | ref | gcc -m32 -march=prescott -Os -fomit-frame-pointer | 20190805 | 20190803
6226920 | ref | gcc -m32 -march=nocona -Os -fomit-frame-pointer | 20190805 | 20190803
6230372 | ref | gcc -funroll-loops -m32 -march=i486 -O3 -fomit-frame-pointer | 20190805 | 20190803
6268352 | ref | gcc -funroll-loops -m32 -march=pentium -O2 -fomit-frame-pointer | 20190805 | 20190803
6301176 | ref | gcc -funroll-loops -m32 -march=pentium -O3 -fomit-frame-pointer | 20190805 | 20190803
6305796 | ref | gcc -funroll-loops -m32 -march=pentium-mmx -O2 -fomit-frame-pointer | 20190805 | 20190803
6315248 | ref | gcc -funroll-loops -m32 -march=pentium-mmx -O3 -fomit-frame-pointer | 20190805 | 20190803
6451760 | ref | gcc -m32 -march=i486 -O3 -fomit-frame-pointer | 20190805 | 20190803
6471908 | ref | gcc -funroll-loops -m32 -march=i486 -O -fomit-frame-pointer | 20190805 | 20190803
6498440 | ref | gcc -funroll-loops -m32 -march=pentium-mmx -O -fomit-frame-pointer | 20190805 | 20190803
6507828 | ref | gcc -funroll-loops -m32 -march=pentium -O -fomit-frame-pointer | 20190805 | 20190803
6627552 | ref | gcc -m32 -march=pentium-mmx -O3 -fomit-frame-pointer | 20190805 | 20190803
6665184 | ref | gcc -m32 -march=pentium -O3 -fomit-frame-pointer | 20190805 | 20190803
6720248 | ref | gcc -m32 -march=i486 -O2 -fomit-frame-pointer | 20190805 | 20190803
6801960 | ref | gcc -funroll-loops -m32 -march=i386 -O -fomit-frame-pointer | 20190805 | 20190803
6841472 | ref | gcc -m32 -march=i486 -O -fomit-frame-pointer | 20190805 | 20190803
6892096 | ref | gcc -m32 -march=pentium-mmx -O -fomit-frame-pointer | 20190805 | 20190803
6905128 | ref | gcc -m32 -march=pentium -O -fomit-frame-pointer | 20190805 | 20190803
6917200 | ref | gcc -m32 -march=pentium-mmx -O2 -fomit-frame-pointer | 20190805 | 20190803
6937828 | ref | gcc -m32 -march=pentium -O2 -fomit-frame-pointer | 20190805 | 20190803
7106244 | ref | gcc -funroll-loops -m32 -march=i386 -O3 -fomit-frame-pointer | 20190805 | 20190803
7128684 | ref | gcc -funroll-loops -m32 -march=i386 -O2 -fomit-frame-pointer | 20190805 | 20190803
7243052 | ref | gcc -m32 -march=i386 -O -fomit-frame-pointer | 20190805 | 20190803
7307524 | ref | gcc -m32 -march=i386 -O3 -fomit-frame-pointer | 20190805 | 20190803
7591060 | ref | gcc -m32 -march=i386 -O2 -fomit-frame-pointer | 20190805 | 20190803
8120812 | ref | gcc -funroll-loops -m32 -march=barcelona -O3 -fomit-frame-pointer | 20190805 | 20190803
8256100 | ref | gcc -funroll-loops -m32 -march=barcelona -O2 -fomit-frame-pointer | 20190805 | 20190803
8279380 | ref | gcc -m32 -march=barcelona -O3 -fomit-frame-pointer | 20190805 | 20190803
8421380 | ref | gcc -funroll-loops -m32 -march=k8 -O2 -fomit-frame-pointer | 20190805 | 20190803
8592880 | ref | gcc -funroll-loops -m32 -march=k8 -O3 -fomit-frame-pointer | 20190805 | 20190803
8650784 | ref | gcc -m32 -march=k8 -O3 -fomit-frame-pointer | 20190805 | 20190803
8841496 | ref | gcc -m32 -march=barcelona -O2 -fomit-frame-pointer | 20190805 | 20190803
8844024 | ref | gcc -m32 -march=k8 -O2 -fomit-frame-pointer | 20190805 | 20190803
8889764 | ref | gcc -funroll-loops -m32 -march=barcelona -Os -fomit-frame-pointer | 20190805 | 20190803
8951064 | ref | gcc -funroll-loops -m32 -march=pentium-m -O2 -fomit-frame-pointer | 20190805 | 20190803
8979948 | ref | gcc -m32 -march=barcelona -Os -fomit-frame-pointer | 20190805 | 20190803
8994232 | ref | gcc -m32 -march=pentium-m -O3 -fomit-frame-pointer | 20190805 | 20190803
8999276 | ref | gcc -funroll-loops -m32 -march=pentium-m -O3 -fomit-frame-pointer | 20190805 | 20190803
9074304 | ref | gcc -funroll-loops -m32 -march=k8 -Os -fomit-frame-pointer | 20190805 | 20190803
9099784 | ref | gcc -m32 -march=k8 -Os -fomit-frame-pointer | 20190805 | 20190803
9224376 | ref | gcc -m32 -march=pentium-m -O2 -fomit-frame-pointer | 20190805 | 20190803

Compiler output

Implementation: crypto_sign/dilithium2/avx2
Compiler: gcc -funroll-loops -m32 -O2 -fomit-frame-pointer
KeccakP-1600-times4-SIMD256.c: KeccakP-1600-times4-SIMD256.c: In function 'KeccakP1600times4_AddLanesAll':
KeccakP-1600-times4-SIMD256.c: KeccakP-1600-times4-SIMD256.c:135:40: warning: AVX vector return without AVX enabled changes the ABI [-Wpsabi]
KeccakP-1600-times4-SIMD256.c: #define Xor_In4( argIndex ) lanes0 = LOAD256u( curData0[argIndex]),\
KeccakP-1600-times4-SIMD256.c: ^
KeccakP-1600-times4-SIMD256.c: KeccakP-1600-times4-SIMD256.c:146:9: note: in expansion of macro 'Xor_In4'
KeccakP-1600-times4-SIMD256.c: Xor_In4( 0 );
KeccakP-1600-times4-SIMD256.c: ^~~~~~~
KeccakP-1600-times4-SIMD256.c: In file included from /usr/lib/gcc/x86_64-linux-gnu/7/include/immintrin.h:43:0,
KeccakP-1600-times4-SIMD256.c: from KeccakP-1600-times4-SIMD256.c:21:
KeccakP-1600-times4-SIMD256.c: /usr/lib/gcc/x86_64-linux-gnu/7/include/avx2intrin.h:913:1: error: inlining failed in call to always_inline '_mm256_xor_si256': target specific option mismatch
KeccakP-1600-times4-SIMD256.c: _mm256_xor_si256 (__m256i __A, __m256i __B)
KeccakP-1600-times4-SIMD256.c: ^~~~~~~~~~~~~~~~
KeccakP-1600-times4-SIMD256.c: KeccakP-1600-times4-SIMD256.c:55:41: note: called from here
KeccakP-1600-times4-SIMD256.c: #define XOReq256(a, b) a = _mm256_xor_si256(a, b)
KeccakP-1600-times4-SIMD256.c: ^~~~~~~~~~~~~~~~~~~~~~
KeccakP-1600-times4-SIMD256.c: KeccakP-1600-times4-SIMD256.c:143:33: note: in expansion of macro 'XOReq256'
KeccakP-1600-times4-SIMD256.c: XOReq256( stateAsLanes[argIndex+3], lanes3 )
KeccakP-1600-times4-SIMD256.c: ^~~~~~~~
KeccakP-1600-times4-SIMD256.c: KeccakP-1600-times4-SIMD256.c:149:9: note: in expansion of macro 'Xor_In4'
KeccakP-1600-times4-SIMD256.c: Xor_In4( 12 );
KeccakP-1600-times4-SIMD256.c: ^~~~~~~
KeccakP-1600-times4-SIMD256.c: In file included from /usr/lib/gcc/x86_64-linux-gnu/7/include/immintrin.h:43:0,
KeccakP-1600-times4-SIMD256.c: from KeccakP-1600-times4-SIMD256.c:21:
KeccakP-1600-times4-SIMD256.c: /usr/lib/gcc/x86_64-linux-gnu/7/include/avx2intrin.h:913:1: error: inlining failed in call to always_inline '_mm256_xor_si256': target specific option mismatch
KeccakP-1600-times4-SIMD256.c: _mm256_xor_si256 (__m256i __A, __m256i __B)
KeccakP-1600-times4-SIMD256.c: ...

Number of similar (compiler,implementation) pairs: 156, namely:
Compiler Implementations
gcc -funroll-loops -m32 -O2 -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -O3 -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -O -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -Os -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=athlon -O2 -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=athlon -O3 -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=athlon -O -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=athlon -Os -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=barcelona -O2 -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=barcelona -O3 -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=barcelona -O -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=barcelona -Os -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=i386 -O2 -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=i386 -O3 -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=i386 -O -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=i386 -Os -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=i486 -O2 -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=i486 -O3 -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=i486 -O -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=i486 -Os -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=k6-2 -O2 -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=k6-2 -O3 -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=k6-2 -O -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=k6-2 -Os -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=k6-3 -O2 -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=k6-3 -O3 -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=k6-3 -O -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=k6-3 -Os -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=k6 -O2 -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=k6 -O3 -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=k6 -O -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=k6 -Os -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=k8 -O2 -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=k8 -O3 -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=k8 -O -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=k8 -Os -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=nocona -O2 -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=nocona -O3 -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=nocona -O -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=nocona -Os -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=pentium-m -O2 -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=pentium-m -O3 -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=pentium-m -O -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=pentium-m -Os -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=pentium-mmx -O2 -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=pentium-mmx -O3 -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=pentium-mmx -O -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=pentium-mmx -Os -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=pentium2 -O2 -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=pentium2 -O3 -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=pentium2 -O -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=pentium2 -Os -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=pentium3 -O2 -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=pentium3 -O3 -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=pentium3 -O -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=pentium3 -Os -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=pentium4 -O2 -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=pentium4 -O3 -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=pentium4 -O -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=pentium4 -Os -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=pentium -O2 -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=pentium -O3 -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=pentium -O -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=pentium -Os -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=pentiumpro -O2 -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=pentiumpro -O3 -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=pentiumpro -O -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=pentiumpro -Os -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=prescott -O2 -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=prescott -O3 -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=prescott -O -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=prescott -Os -fomit-frame-pointer avx2
gcc -m32 -O2 -fomit-frame-pointer avx2
gcc -m32 -O3 -fomit-frame-pointer avx2
gcc -m32 -O -fomit-frame-pointer avx2
gcc -m32 -Os -fomit-frame-pointer avx2
gcc -m32 -march=athlon -O2 -fomit-frame-pointer avx2
gcc -m32 -march=athlon -O3 -fomit-frame-pointer avx2
gcc -m32 -march=athlon -O -fomit-frame-pointer avx2
gcc -m32 -march=athlon -Os -fomit-frame-pointer avx2
gcc -m32 -march=core2 -O2 -fomit-frame-pointer avx2
gcc -m32 -march=core2 -O3 -fomit-frame-pointer avx2
gcc -m32 -march=core2 -O -fomit-frame-pointer avx2
gcc -m32 -march=core2 -Os -fomit-frame-pointer avx2
gcc -m32 -march=core2 -msse4.1 -O2 -fomit-frame-pointer avx2
gcc -m32 -march=core2 -msse4.1 -O3 -fomit-frame-pointer avx2
gcc -m32 -march=core2 -msse4.1 -O -fomit-frame-pointer avx2
gcc -m32 -march=core2 -msse4.1 -Os -fomit-frame-pointer avx2
gcc -m32 -march=core2 -msse4 -O2 -fomit-frame-pointer avx2
gcc -m32 -march=core2 -msse4 -O3 -fomit-frame-pointer avx2
gcc -m32 -march=core2 -msse4 -O -fomit-frame-pointer avx2
gcc -m32 -march=core2 -msse4 -Os -fomit-frame-pointer avx2
gcc -m32 -march=corei7 -O2 -fomit-frame-pointer avx2
gcc -m32 -march=corei7 -O3 -fomit-frame-pointer avx2
gcc -m32 -march=corei7 -O -fomit-frame-pointer avx2
gcc -m32 -march=corei7 -Os -fomit-frame-pointer avx2
gcc -m32 -march=i386 -O2 -fomit-frame-pointer avx2
gcc -m32 -march=i386 -O3 -fomit-frame-pointer avx2
gcc -m32 -march=i386 -O -fomit-frame-pointer avx2
gcc -m32 -march=i386 -Os -fomit-frame-pointer avx2
gcc -m32 -march=i486 -O2 -fomit-frame-pointer avx2
gcc -m32 -march=i486 -O3 -fomit-frame-pointer avx2
gcc -m32 -march=i486 -O -fomit-frame-pointer avx2
gcc -m32 -march=i486 -Os -fomit-frame-pointer avx2
gcc -m32 -march=k6-2 -O2 -fomit-frame-pointer avx2
gcc -m32 -march=k6-2 -O3 -fomit-frame-pointer avx2
gcc -m32 -march=k6-2 -O -fomit-frame-pointer avx2
gcc -m32 -march=k6-2 -Os -fomit-frame-pointer avx2
gcc -m32 -march=k6-3 -O2 -fomit-frame-pointer avx2
gcc -m32 -march=k6-3 -O3 -fomit-frame-pointer avx2
gcc -m32 -march=k6-3 -O -fomit-frame-pointer avx2
gcc -m32 -march=k6-3 -Os -fomit-frame-pointer avx2
gcc -m32 -march=k6 -O2 -fomit-frame-pointer avx2
gcc -m32 -march=k6 -O3 -fomit-frame-pointer avx2
gcc -m32 -march=k6 -O -fomit-frame-pointer avx2
gcc -m32 -march=k6 -Os -fomit-frame-pointer avx2
gcc -m32 -march=k8 -O2 -fomit-frame-pointer avx2
gcc -m32 -march=k8 -O3 -fomit-frame-pointer avx2
gcc -m32 -march=k8 -O -fomit-frame-pointer avx2
gcc -m32 -march=k8 -Os -fomit-frame-pointer avx2
gcc -m32 -march=nocona -O2 -fomit-frame-pointer avx2
gcc -m32 -march=nocona -O3 -fomit-frame-pointer avx2
gcc -m32 -march=nocona -O -fomit-frame-pointer avx2
gcc -m32 -march=nocona -Os -fomit-frame-pointer avx2
gcc -m32 -march=pentium-m -O2 -fomit-frame-pointer avx2
gcc -m32 -march=pentium-m -O3 -fomit-frame-pointer avx2
gcc -m32 -march=pentium-m -O -fomit-frame-pointer avx2
gcc -m32 -march=pentium-m -Os -fomit-frame-pointer avx2
gcc -m32 -march=pentium-mmx -O2 -fomit-frame-pointer avx2
gcc -m32 -march=pentium-mmx -O3 -fomit-frame-pointer avx2
gcc -m32 -march=pentium-mmx -O -fomit-frame-pointer avx2
gcc -m32 -march=pentium-mmx -Os -fomit-frame-pointer avx2
gcc -m32 -march=pentium2 -O2 -fomit-frame-pointer avx2
gcc -m32 -march=pentium2 -O3 -fomit-frame-pointer avx2
gcc -m32 -march=pentium2 -O -fomit-frame-pointer avx2
gcc -m32 -march=pentium2 -Os -fomit-frame-pointer avx2
gcc -m32 -march=pentium3 -O2 -fomit-frame-pointer avx2
gcc -m32 -march=pentium3 -O3 -fomit-frame-pointer avx2
gcc -m32 -march=pentium3 -O -fomit-frame-pointer avx2
gcc -m32 -march=pentium3 -Os -fomit-frame-pointer avx2
gcc -m32 -march=pentium4 -O2 -fomit-frame-pointer avx2
gcc -m32 -march=pentium4 -O3 -fomit-frame-pointer avx2
gcc -m32 -march=pentium4 -O -fomit-frame-pointer avx2
gcc -m32 -march=pentium4 -Os -fomit-frame-pointer avx2
gcc -m32 -march=pentium -O2 -fomit-frame-pointer avx2
gcc -m32 -march=pentium -O3 -fomit-frame-pointer avx2
gcc -m32 -march=pentium -O -fomit-frame-pointer avx2
gcc -m32 -march=pentium -Os -fomit-frame-pointer avx2
gcc -m32 -march=pentiumpro -O2 -fomit-frame-pointer avx2
gcc -m32 -march=pentiumpro -O3 -fomit-frame-pointer avx2
gcc -m32 -march=pentiumpro -O -fomit-frame-pointer avx2
gcc -m32 -march=pentiumpro -Os -fomit-frame-pointer avx2
gcc -m32 -march=prescott -O2 -fomit-frame-pointer avx2
gcc -m32 -march=prescott -O3 -fomit-frame-pointer avx2
gcc -m32 -march=prescott -O -fomit-frame-pointer avx2
gcc -m32 -march=prescott -Os -fomit-frame-pointer avx2
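
The "target specific option mismatch" errors in the compiler output above are reported because none of these option sets enables AVX2, so GCC refuses to inline the always_inline intrinsic `_mm256_xor_si256`. The following is a minimal sketch, not code from the dilithium2 package, of one way such code can be made to compile without `-mavx2`: a per-function `target("avx2")` attribute combined with a runtime CPU check. It assumes GCC or Clang; the function names `xor4_avx2`, `xor4_portable`, and `xor4` are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#if defined(__i386__) || defined(__x86_64__)
#include <immintrin.h>

/* Assumption: GCC or Clang. The target attribute enables AVX2 for this one
   function only, so _mm256_xor_si256 can be inlined here even when the
   command line does not contain -mavx2. */
__attribute__((target("avx2")))
static void xor4_avx2(uint64_t *state, const uint64_t *data)
{
    __m256i s = _mm256_loadu_si256((const __m256i *)state);
    __m256i d = _mm256_loadu_si256((const __m256i *)data);
    _mm256_storeu_si256((__m256i *)state, _mm256_xor_si256(s, d));
}
#endif

/* Portable fallback: XOR four 64-bit lanes one at a time. */
static void xor4_portable(uint64_t *state, const uint64_t *data)
{
    for (size_t i = 0; i < 4; i++)
        state[i] ^= data[i];
}

/* Hypothetical dispatcher: XOR four 64-bit lanes of data into state,
   taking the AVX2 path only when the running CPU reports AVX2 support. */
void xor4(uint64_t *state, const uint64_t *data)
{
    assert(state != NULL && data != NULL);
#if defined(__i386__) || defined(__x86_64__)
    if (__builtin_cpu_supports("avx2")) {
        xor4_avx2(state, data);
        return;
    }
#endif
    xor4_portable(state, data);
}
```

Passing pointers rather than `__m256i` values across the function boundary also avoids the `-Wpsabi` warning about AVX vector returns shown in the log. Building the whole file with `-mavx2` would be the simpler route when every target machine is known to support AVX2.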

Compiler output

Implementation: crypto_sign/dilithium2/avx2
Compiler: gcc -m32 -march=barcelona -O2 -fomit-frame-pointer
KeccakP-1600-times4-SIMD256.c: KeccakP-1600-times4-SIMD256.c: In function 'KeccakP1600times4_AddLanesAll':
KeccakP-1600-times4-SIMD256.c: KeccakP-1600-times4-SIMD256.c:135:40: warning: AVX vector return without AVX enabled changes the ABI [-Wpsabi]
KeccakP-1600-times4-SIMD256.c: #define Xor_In4( argIndex ) lanes0 = LOAD256u( curData0[argIndex]),\
KeccakP-1600-times4-SIMD256.c: ^
KeccakP-1600-times4-SIMD256.c: KeccakP-1600-times4-SIMD256.c:146:9: note: in expansion of macro 'Xor_In4'
KeccakP-1600-times4-SIMD256.c: Xor_In4( 0 );
KeccakP-1600-times4-SIMD256.c: ^~~~~~~
KeccakP-1600-times4-SIMD256.c: In file included from /usr/lib/gcc/x86_64-linux-gnu/7/include/immintrin.h:43:0,
KeccakP-1600-times4-SIMD256.c: from KeccakP-1600-times4-SIMD256.c:21:
KeccakP-1600-times4-SIMD256.c: /usr/lib/gcc/x86_64-linux-gnu/7/include/avx2intrin.h:913:1: error: inlining failed in call to always_inline '_mm256_xor_si256': target specific option mismatch
KeccakP-1600-times4-SIMD256.c: _mm256_xor_si256 (__m256i __A, __m256i __B)
KeccakP-1600-times4-SIMD256.c: ^~~~~~~~~~~~~~~~
KeccakP-1600-times4-SIMD256.c: KeccakP-1600-times4-SIMD256.c:55:41: note: called from here
KeccakP-1600-times4-SIMD256.c: #define XOReq256(a, b) a = _mm256_xor_si256(a, b)
KeccakP-1600-times4-SIMD256.c: ^~~~~~~~~~~~~~~~~~~~~~
KeccakP-1600-times4-SIMD256.c: KeccakP-1600-times4-SIMD256.c:143:33: note: in expansion of macro 'XOReq256'
KeccakP-1600-times4-SIMD256.c: XOReq256( stateAsLanes[argIndex+3], lanes3 )
KeccakP-1600-times4-SIMD256.c: ^~~~~~~~
KeccakP-1600-times4-SIMD256.c: KeccakP-1600-times4-SIMD256.c:149:9: note: in expansion of macro 'Xor_In4'
KeccakP-1600-times4-SIMD256.c: Xor_In4( 12 );
KeccakP-1600-times4-SIMD256.c: ^~~~~~~
KeccakP-1600-times4-SIMD256.c: In file included from /usr/lib/gcc/x86_64-linux-gnu/7/include/immintrin.h:43:0,
KeccakP-1600-times4-SIMD256.c: from KeccakP-1600-times4-SIMD256.c:21:
KeccakP-1600-times4-SIMD256.c: /usr/lib/gcc/x86_64-linux-gnu/7/include/avx2intrin.h:913:1: error: inlining failed in call to always_inline '_mm256_xor_si256': target specific option mismatch
KeccakP-1600-times4-SIMD256.c: _mm256_xor_si256 (__m256i __A, __m256i __B)
KeccakP-1600-times4-SIMD256.c: ...
KeccakP-1600-times4-SIMD256.c: KeccakP-1600-times4-SIMD256.c: In function 'KeccakP1600times4_AddLanesAll':
KeccakP-1600-times4-SIMD256.c: KeccakP-1600-times4-SIMD256.c:135:40: warning: AVX vector return without AVX enabled changes the ABI [-Wpsabi]
KeccakP-1600-times4-SIMD256.c: #define Xor_In4( argIndex ) lanes0 = LOAD256u( curData0[argIndex]),\
KeccakP-1600-times4-SIMD256.c: ^
KeccakP-1600-times4-SIMD256.c: KeccakP-1600-times4-SIMD256.c:146:9: note: in expansion of macro 'Xor_In4'
KeccakP-1600-times4-SIMD256.c: Xor_In4( 0 );
KeccakP-1600-times4-SIMD256.c: ^~~~~~~
KeccakP-1600-times4-SIMD256.c: In file included from /usr/lib/gcc/x86_64-linux-gnu/7/include/immintrin.h:43:0,
KeccakP-1600-times4-SIMD256.c: from KeccakP-1600-times4-SIMD256.c:21:
KeccakP-1600-times4-SIMD256.c: /usr/lib/gcc/x86_64-linux-gnu/7/include/avx2intrin.h:913:1: error: inlining failed in call to always_inline '_mm256_xor_si256': target specific option mismatch
KeccakP-1600-times4-SIMD256.c: _mm256_xor_si256 (__m256i __A, __m256i __B)
KeccakP-1600-times4-SIMD256.c: ^~~~~~~~~~~~~~~~
KeccakP-1600-times4-SIMD256.c: KeccakP-1600-times4-SIMD256.c:55:41: note: called from here
KeccakP-1600-times4-SIMD256.c: #define XOReq256(a, b) a = _mm256_xor_si256(a, b)
KeccakP-1600-times4-SIMD256.c: ^~~~~~~~~~~~~~~~~~~~~~
KeccakP-1600-times4-SIMD256.c: KeccakP-1600-times4-SIMD256.c:143:33: note: in expansion of macro 'XOReq256'
KeccakP-1600-times4-SIMD256.c: XOReq256( stateAsLanes[argIndex+3], lanes3 )
KeccakP-1600-times4-SIMD256.c: ^~~~~~~~
KeccakP-1600-times4-SIMD256.c: KeccakP-1600-times4-SIMD256.c:149:9: note: in expansion of macro 'Xor_In4'
KeccakP-1600-times4-SIMD256.c: Xor_In4( 12 );
KeccakP-1600-times4-SIMD256.c: ^~~~~~~
KeccakP-1600-times4-SIMD256.c: In file included from /usr/lib/gcc/x86_64-linux-gnu/7/include/immintrin.h:43:0,
KeccakP-1600-times4-SIMD256.c: from KeccakP-1600-times4-SIMD256.c:21:
KeccakP-1600-times4-SIMD256.c: /usr/lib/gcc/x86_64-linux-gnu/7/include/avx2intrin.h:913:1: error: inlining failed in call to always_inline '_mm256_xor_si256': target specific option mismatch
KeccakP-1600-times4-SIMD256.c: _mm256_xor_si256 (__m256i __A, __m256i __B)
KeccakP-1600-times4-SIMD256.c: ...

Number of similar (compiler,implementation) pairs: 4, namely:
Compiler Implementations
gcc -m32 -march=barcelona -O2 -fomit-frame-pointer avx2
gcc -m32 -march=barcelona -O3 -fomit-frame-pointer avx2
gcc -m32 -march=barcelona -O -fomit-frame-pointer avx2
gcc -m32 -march=barcelona -Os -fomit-frame-pointer avx2

Compiler output

Implementation: crypto_sign/dilithium2/avx2
Compiler: gcc -m32 -march=core-avx-i -O2 -fomit-frame-pointer
KeccakP-1600-times4-SIMD256.c: In file included from /usr/lib/gcc/x86_64-linux-gnu/7/include/immintrin.h:43:0,
KeccakP-1600-times4-SIMD256.c: from KeccakP-1600-times4-SIMD256.c:21:
KeccakP-1600-times4-SIMD256.c: KeccakP-1600-times4-SIMD256.c: In function 'KeccakP1600times4_AddLanesAll':
KeccakP-1600-times4-SIMD256.c: /usr/lib/gcc/x86_64-linux-gnu/7/include/avx2intrin.h:913:1: error: inlining failed in call to always_inline '_mm256_xor_si256': target specific option mismatch
KeccakP-1600-times4-SIMD256.c: _mm256_xor_si256 (__m256i __A, __m256i __B)
KeccakP-1600-times4-SIMD256.c: ^~~~~~~~~~~~~~~~
KeccakP-1600-times4-SIMD256.c: KeccakP-1600-times4-SIMD256.c:55:41: note: called from here
KeccakP-1600-times4-SIMD256.c: #define XOReq256(a, b) a = _mm256_xor_si256(a, b)
KeccakP-1600-times4-SIMD256.c: ^~~~~~~~~~~~~~~~~~~~~~
KeccakP-1600-times4-SIMD256.c: KeccakP-1600-times4-SIMD256.c:143:33: note: in expansion of macro 'XOReq256'
KeccakP-1600-times4-SIMD256.c: XOReq256( stateAsLanes[argIndex+3], lanes3 )
KeccakP-1600-times4-SIMD256.c: ^~~~~~~~
KeccakP-1600-times4-SIMD256.c: KeccakP-1600-times4-SIMD256.c:149:9: note: in expansion of macro 'Xor_In4'
KeccakP-1600-times4-SIMD256.c: Xor_In4( 12 );
KeccakP-1600-times4-SIMD256.c: ^~~~~~~
KeccakP-1600-times4-SIMD256.c: In file included from /usr/lib/gcc/x86_64-linux-gnu/7/include/immintrin.h:43:0,
KeccakP-1600-times4-SIMD256.c: from KeccakP-1600-times4-SIMD256.c:21:
KeccakP-1600-times4-SIMD256.c: /usr/lib/gcc/x86_64-linux-gnu/7/include/avx2intrin.h:913:1: error: inlining failed in call to always_inline '_mm256_xor_si256': target specific option mismatch
KeccakP-1600-times4-SIMD256.c: _mm256_xor_si256 (__m256i __A, __m256i __B)
KeccakP-1600-times4-SIMD256.c: ^~~~~~~~~~~~~~~~
KeccakP-1600-times4-SIMD256.c: KeccakP-1600-times4-SIMD256.c:55:41: note: called from here
KeccakP-1600-times4-SIMD256.c: #define XOReq256(a, b) a = _mm256_xor_si256(a, b)
KeccakP-1600-times4-SIMD256.c: ^~~~~~~~~~~~~~~~~~~~~~
KeccakP-1600-times4-SIMD256.c: KeccakP-1600-times4-SIMD256.c:142:33: note: in expansion of macro 'XOReq256'
KeccakP-1600-times4-SIMD256.c: XOReq256( stateAsLanes[argIndex+2], lanes2 ),\
KeccakP-1600-times4-SIMD256.c: ...

Number of similar (compiler,implementation) pairs: 8, namely:
Compiler Implementations
gcc -m32 -march=core-avx-i -O2 -fomit-frame-pointer avx2
gcc -m32 -march=core-avx-i -O3 -fomit-frame-pointer avx2
gcc -m32 -march=core-avx-i -O -fomit-frame-pointer avx2
gcc -m32 -march=core-avx-i -Os -fomit-frame-pointer avx2
gcc -m32 -march=corei7-avx -O2 -fomit-frame-pointer avx2
gcc -m32 -march=corei7-avx -O3 -fomit-frame-pointer avx2
gcc -m32 -march=corei7-avx -O -fomit-frame-pointer avx2
gcc -m32 -march=corei7-avx -Os -fomit-frame-pointer avx2

Compiler output

Implementation: crypto_sign/dilithium2/avx2
Compiler: gcc -m32 -march=core-avx2 -O2 -fomit-frame-pointer
invntt.s: invntt.s: Assembler messages:
invntt.s: invntt.s:47: Error: bad register name `%rip)'
invntt.s: invntt.s:48: Error: bad register name `%rip)'
invntt.s: invntt.s:49: Error: bad register name `%rip)'
invntt.s: invntt.s:52: Error: bad register name `%rsi)'
invntt.s: invntt.s:53: Error: bad register name `%rsi)'
invntt.s: invntt.s:54: Error: bad register name `%rsi)'
invntt.s: invntt.s:55: Error: bad register name `%rsi)'
invntt.s: invntt.s:58: Error: bad register name `%ymm8'
invntt.s: invntt.s:59: Error: bad register name `%ymm10'
invntt.s: invntt.s:59: Error: bad register name `%ymm10'
invntt.s: invntt.s:61: Error: bad register name `%ymm8'
invntt.s: invntt.s:61: Error: bad register name `%ymm8'
invntt.s: invntt.s:62: Error: bad register name `%ymm10'
invntt.s: invntt.s:62: Error: bad register name `%ymm10'
invntt.s: invntt.s:66: Error: bad register name `%ymm8'
invntt.s: invntt.s:67: Error: bad register name `%ymm10'
invntt.s: invntt.s:70: Error: bad register name `%rdx)'
invntt.s: invntt.s:71: Error: bad register name `%rdx)'
invntt.s: invntt.s:72: Error: bad register name `%ymm12'
invntt.s: invntt.s:73: Error: bad register name `%ymm13'
invntt.s: invntt.s:74: Error: bad register name `%ymm8'
invntt.s: invntt.s:76: Error: bad register name `%ymm12'
invntt.s: invntt.s:77: Error: bad register name `%ymm13'
invntt.s: invntt.s:78: Error: bad register name `%ymm9'
invntt.s: ...

Number of similar (compiler,implementation) pairs: 8, namely:
Compiler Implementations
gcc -m32 -march=core-avx2 -O2 -fomit-frame-pointer avx2
gcc -m32 -march=core-avx2 -O3 -fomit-frame-pointer avx2
gcc -m32 -march=core-avx2 -O -fomit-frame-pointer avx2
gcc -m32 -march=core-avx2 -Os -fomit-frame-pointer avx2
gcc -m32 -march=native -mtune=native -O2 -fomit-frame-pointer avx2
gcc -m32 -march=native -mtune=native -O3 -fomit-frame-pointer avx2
gcc -m32 -march=native -mtune=native -O -fomit-frame-pointer avx2
gcc -m32 -march=native -mtune=native -Os -fomit-frame-pointer avx2
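
The assembler errors for invntt.s above are consistent with 64-bit assembly being built with `-m32`: `%rip`-relative addressing, the registers `%rsi` and `%rdx`, and `%ymm8` and higher exist only in 64-bit mode, so enabling AVX2 with `-march=core-avx2` is not enough. As a hedged illustration only, a build could select such an implementation with a compile-time guard. The enum and the function `select_impl` below are hypothetical and are not part of SUPERCOP; `__x86_64__` and `__AVX2__` are the predefined macros that GCC and Clang set for 64-bit x86 targets and for AVX2-enabled builds.

```c
/* Hypothetical labels for the two implementations named in this report. */
enum impl { IMPL_REF = 0, IMPL_AVX2 = 1 };

/* Returns the implementation that is safe for the current build:
   the avx2 assembly needs both 64-bit mode and AVX2 code generation. */
enum impl select_impl(void)
{
#if defined(__x86_64__) && defined(__AVX2__)
    return IMPL_AVX2;
#else
    return IMPL_REF;   /* e.g. any -m32 build, as benchmarked above */
#endif
}
```

With the `-m32` option sets listed in this report, `__x86_64__` is not defined, so this guard would fall back to the ref implementation, which is the only one with timings in the table.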