Implementation notes: x86, samba, crypto_sign/dilithium2aes

Computer: samba
Architecture: x86
CPU ID: GenuineIntel-000506e3-bfebfbff
SUPERCOP version: 20190803
Operation: crypto_sign
Primitive: dilithium2aes
Time  Implementation  Compiler  Benchmark date  SUPERCOP version
3845863  ref  gcc -funroll-loops -m32 -march=athlon -O3 -fomit-frame-pointer  20190805  20190803
3913779  ref  gcc -m32 -march=native -mtune=native -Os -fomit-frame-pointer  20190805  20190803
3919189  ref  gcc -funroll-loops -m32 -march=athlon -O2 -fomit-frame-pointer  20190805  20190803
3919594  ref  gcc -m32 -march=core-avx2 -Os -fomit-frame-pointer  20190805  20190803
3943849  ref  gcc -funroll-loops -m32 -O3 -fomit-frame-pointer  20190805  20190803
3956412  ref  gcc -m32 -march=athlon -O3 -fomit-frame-pointer  20190805  20190803
3960020  ref  gcc -funroll-loops -m32 -march=pentium3 -O -fomit-frame-pointer  20190805  20190803
3966478  ref  gcc -funroll-loops -m32 -O2 -fomit-frame-pointer  20190805  20190803
3966778  ref  gcc -m32 -march=corei7-avx -Os -fomit-frame-pointer  20190805  20190803
3972840  ref  gcc -m32 -march=core-avx-i -Os -fomit-frame-pointer  20190805  20190803
3973901  ref  gcc -funroll-loops -m32 -march=pentiumpro -O -fomit-frame-pointer  20190805  20190803
3982006  ref  gcc -funroll-loops -m32 -march=athlon -O -fomit-frame-pointer  20190805  20190803
3983738  ref  gcc -funroll-loops -m32 -march=pentium-m -Os -fomit-frame-pointer  20190805  20190803
3995522  ref  gcc -funroll-loops -m32 -march=k6 -O3 -fomit-frame-pointer  20190805  20190803
3998305  ref  gcc -funroll-loops -m32 -march=pentium2 -O -fomit-frame-pointer  20190805  20190803
4005972  ref  gcc -funroll-loops -m32 -march=pentiumpro -O3 -fomit-frame-pointer  20190805  20190803
4009308  ref  gcc -funroll-loops -m32 -march=k6-3 -O3 -fomit-frame-pointer  20190805  20190803
4025379  ref  gcc -m32 -march=athlon -O -fomit-frame-pointer  20190805  20190803
4025891  ref  gcc -m32 -march=athlon -O2 -fomit-frame-pointer  20190805  20190803
4032714  ref  gcc -funroll-loops -m32 -march=pentium2 -O3 -fomit-frame-pointer  20190805  20190803
4035189  ref  gcc -funroll-loops -m32 -O -fomit-frame-pointer  20190805  20190803
4041345  ref  gcc -funroll-loops -m32 -march=k6 -O2 -fomit-frame-pointer  20190805  20190803
4041550  ref  gcc -m32 -march=pentium3 -O -fomit-frame-pointer  20190805  20190803
4049095  ref  gcc -funroll-loops -m32 -march=k6-2 -O2 -fomit-frame-pointer  20190805  20190803
4052794  ref  gcc -funroll-loops -m32 -march=pentium2 -O2 -fomit-frame-pointer  20190805  20190803
4052923  ref  gcc -funroll-loops -m32 -march=pentium3 -O3 -fomit-frame-pointer  20190805  20190803
4062409  ref  gcc -m32 -march=pentium2 -O -fomit-frame-pointer  20190805  20190803
4067629  ref  gcc -funroll-loops -m32 -march=pentium3 -O2 -fomit-frame-pointer  20190805  20190803
4093194  ref  gcc -m32 -march=k6-3 -O3 -fomit-frame-pointer  20190805  20190803
4110567  ref  gcc -m32 -march=core2 -msse4 -Os -fomit-frame-pointer  20190805  20190803
4113738  ref  gcc -m32 -march=pentiumpro -O -fomit-frame-pointer  20190805  20190803
4115583  ref  gcc -funroll-loops -m32 -march=k6 -O -fomit-frame-pointer  20190805  20190803
4121764  ref  gcc -funroll-loops -m32 -march=pentiumpro -O2 -fomit-frame-pointer  20190805  20190803
4124845  ref  gcc -funroll-loops -m32 -march=k6-2 -O3 -fomit-frame-pointer  20190805  20190803
4128425  ref  gcc -m32 -O3 -fomit-frame-pointer  20190805  20190803
4128962  ref  gcc -funroll-loops -m32 -march=k6-2 -O -fomit-frame-pointer  20190805  20190803
4132049  ref  gcc -funroll-loops -m32 -march=k6-3 -O2 -fomit-frame-pointer  20190805  20190803
4133676  ref  gcc -m32 -march=core2 -msse4.1 -Os -fomit-frame-pointer  20190805  20190803
4142905  ref  gcc -m32 -march=corei7 -Os -fomit-frame-pointer  20190805  20190803
4146863  ref  gcc -funroll-loops -m32 -march=k6-3 -O -fomit-frame-pointer  20190805  20190803
4147780  ref  gcc -funroll-loops -m32 -march=nocona -Os -fomit-frame-pointer  20190805  20190803
4151964  ref  gcc -m32 -march=pentium3 -O3 -fomit-frame-pointer  20190805  20190803
4152274  ref  gcc -funroll-loops -m32 -march=pentium4 -Os -fomit-frame-pointer  20190805  20190803
4152347  ref  gcc -m32 -march=core2 -Os -fomit-frame-pointer  20190805  20190803
4153565  ref  gcc -funroll-loops -m32 -march=prescott -Os -fomit-frame-pointer  20190805  20190803
4161969  ref  gcc -m32 -march=pentium-m -Os -fomit-frame-pointer  20190805  20190803
4178752  ref  gcc -m32 -O -fomit-frame-pointer  20190805  20190803
4183343  ref  gcc -m32 -march=nocona -Os -fomit-frame-pointer  20190805  20190803
4184466  ref  gcc -m32 -march=k6-2 -O3 -fomit-frame-pointer  20190805  20190803
4197787  ref  gcc -m32 -march=prescott -Os -fomit-frame-pointer  20190805  20190803
4215307  ref  gcc -m32 -march=k6-2 -O -fomit-frame-pointer  20190805  20190803
4216438  ref  gcc -m32 -march=pentiumpro -O2 -fomit-frame-pointer  20190805  20190803
4224086  ref  gcc -m32 -march=k6-3 -O2 -fomit-frame-pointer  20190805  20190803
4226863  ref  gcc -m32 -march=k6-3 -O -fomit-frame-pointer  20190805  20190803
4228362  ref  gcc -m32 -march=k6 -O2 -fomit-frame-pointer  20190805  20190803
4230313  ref  gcc -m32 -march=pentiumpro -O3 -fomit-frame-pointer  20190805  20190803
4230520  ref  gcc -m32 -march=k6 -O -fomit-frame-pointer  20190805  20190803
4244593  ref  gcc -m32 -march=k6-2 -O2 -fomit-frame-pointer  20190805  20190803
4247548  ref  gcc -m32 -march=pentium2 -O3 -fomit-frame-pointer  20190805  20190803
4248387  ref  gcc -m32 -march=pentium2 -O2 -fomit-frame-pointer  20190805  20190803
4249706  ref  gcc -m32 -march=pentium4 -Os -fomit-frame-pointer  20190805  20190803
4258022  ref  gcc -m32 -O2 -fomit-frame-pointer  20190805  20190803
4270754  ref  gcc -m32 -march=k6 -O3 -fomit-frame-pointer  20190805  20190803
4302649  ref  gcc -funroll-loops -m32 -march=k6-3 -Os -fomit-frame-pointer  20190805  20190803
4310082  ref  gcc -m32 -march=k6-2 -Os -fomit-frame-pointer  20190805  20190803
4314406  ref  gcc -m32 -march=k6 -Os -fomit-frame-pointer  20190805  20190803
4315391  ref  gcc -m32 -march=k6-3 -Os -fomit-frame-pointer  20190805  20190803
4317552  ref  gcc -funroll-loops -m32 -march=k6-2 -Os -fomit-frame-pointer  20190805  20190803
4320460  ref  gcc -m32 -march=core2 -msse4 -O3 -fomit-frame-pointer  20190805  20190803
4333191  ref  gcc -funroll-loops -m32 -march=k6 -Os -fomit-frame-pointer  20190805  20190803
4333642  ref  gcc -m32 -march=pentium3 -O2 -fomit-frame-pointer  20190805  20190803
4337546  ref  gcc -funroll-loops -m32 -march=pentium-m -O -fomit-frame-pointer  20190805  20190803
4344309  ref  gcc -funroll-loops -m32 -Os -fomit-frame-pointer  20190805  20190803
4345839  ref  gcc -funroll-loops -m32 -march=pentium3 -Os -fomit-frame-pointer  20190805  20190803
4346169  ref  gcc -m32 -march=core2 -msse4.1 -O3 -fomit-frame-pointer  20190805  20190803
4354936  ref  gcc -funroll-loops -m32 -march=pentium-mmx -Os -fomit-frame-pointer  20190805  20190803
4360371  ref  gcc -funroll-loops -m32 -march=pentium2 -Os -fomit-frame-pointer  20190805  20190803
4360627  ref  gcc -m32 -march=corei7 -O3 -fomit-frame-pointer  20190805  20190803
4366387  ref  gcc -funroll-loops -m32 -march=athlon -Os -fomit-frame-pointer  20190805  20190803
4366710  ref  gcc -m32 -march=pentium-mmx -Os -fomit-frame-pointer  20190805  20190803
4367793  ref  gcc -m32 -march=core2 -O3 -fomit-frame-pointer  20190805  20190803
4373795  ref  gcc -funroll-loops -m32 -march=pentium -Os -fomit-frame-pointer  20190805  20190803
4374272  ref  gcc -funroll-loops -m32 -march=pentiumpro -Os -fomit-frame-pointer  20190805  20190803
4377824  ref  gcc -m32 -march=pentium -Os -fomit-frame-pointer  20190805  20190803
4383805  ref  gcc -m32 -march=k8 -O -fomit-frame-pointer  20190805  20190803
4403291  ref  gcc -funroll-loops -m32 -march=k8 -O -fomit-frame-pointer  20190805  20190803
4403714  ref  gcc -funroll-loops -m32 -march=barcelona -O -fomit-frame-pointer  20190805  20190803
4414213  ref  gcc -m32 -march=pentium4 -O -fomit-frame-pointer  20190805  20190803
4421440  ref  gcc -funroll-loops -m32 -march=pentium4 -O -fomit-frame-pointer  20190805  20190803
4422063  ref  gcc -m32 -march=pentium-m -O -fomit-frame-pointer  20190805  20190803
4434861  ref  gcc -m32 -march=pentiumpro -Os -fomit-frame-pointer  20190805  20190803
4437821  ref  gcc -m32 -march=pentium2 -Os -fomit-frame-pointer  20190805  20190803
4437861  ref  gcc -m32 -march=athlon -Os -fomit-frame-pointer  20190805  20190803
4441201  ref  gcc -m32 -march=pentium3 -Os -fomit-frame-pointer  20190805  20190803
4443643  ref  gcc -funroll-loops -m32 -march=i486 -O3 -fomit-frame-pointer  20190805  20190803
4449591  ref  gcc -m32 -Os -fomit-frame-pointer  20190805  20190803
4451734  ref  gcc -funroll-loops -m32 -march=pentium -O -fomit-frame-pointer  20190805  20190803
4452923  ref  gcc -m32 -march=corei7-avx -O -fomit-frame-pointer  20190805  20190803
4453687  ref  gcc -m32 -march=barcelona -O -fomit-frame-pointer  20190805  20190803
4456052  ref  gcc -m32 -march=core-avx2 -O -fomit-frame-pointer  20190805  20190803
4463381  ref  gcc -funroll-loops -m32 -march=pentium-mmx -O -fomit-frame-pointer  20190805  20190803
4464577  ref  gcc -funroll-loops -m32 -march=pentium4 -O3 -fomit-frame-pointer  20190805  20190803
4464810  ref  gcc -funroll-loops -m32 -march=i386 -Os -fomit-frame-pointer  20190805  20190803
4465682  ref  gcc -m32 -march=core2 -msse4.1 -O -fomit-frame-pointer  20190805  20190803
4469403  ref  gcc -funroll-loops -m32 -march=i486 -Os -fomit-frame-pointer  20190805  20190803
4479201  ref  gcc -funroll-loops -m32 -march=i486 -O -fomit-frame-pointer  20190805  20190803
4481582  ref  gcc -m32 -march=i486 -Os -fomit-frame-pointer  20190805  20190803
4487004  ref  gcc -m32 -march=i386 -Os -fomit-frame-pointer  20190805  20190803
4491669  ref  gcc -m32 -march=native -mtune=native -O -fomit-frame-pointer  20190805  20190803
4492452  ref  gcc -m32 -march=corei7 -O -fomit-frame-pointer  20190805  20190803
4497609  ref  gcc -m32 -march=core2 -msse4 -O -fomit-frame-pointer  20190805  20190803
4498363  ref  gcc -m32 -march=corei7 -O2 -fomit-frame-pointer  20190805  20190803
4531234  ref  gcc -m32 -march=core2 -O -fomit-frame-pointer  20190805  20190803
4531742  ref  gcc -m32 -march=core2 -msse4.1 -O2 -fomit-frame-pointer  20190805  20190803
4535083  ref  gcc -m32 -march=core-avx-i -O -fomit-frame-pointer  20190805  20190803
4545518  ref  gcc -funroll-loops -m32 -march=pentium -O3 -fomit-frame-pointer  20190805  20190803
4553235  ref  gcc -funroll-loops -m32 -march=nocona -O -fomit-frame-pointer  20190805  20190803
4553694  ref  gcc -funroll-loops -m32 -march=i486 -O2 -fomit-frame-pointer  20190805  20190803
4560093  ref  gcc -funroll-loops -m32 -march=pentium-mmx -O3 -fomit-frame-pointer  20190805  20190803
4569088  ref  gcc -m32 -march=core2 -O2 -fomit-frame-pointer  20190805  20190803
4575423  ref  gcc -m32 -march=pentium4 -O3 -fomit-frame-pointer  20190805  20190803
4575498  ref  gcc -m32 -march=prescott -O -fomit-frame-pointer  20190805  20190803
4579771  ref  gcc -m32 -march=i486 -O -fomit-frame-pointer  20190805  20190803
4579943  ref  gcc -funroll-loops -m32 -march=i386 -O -fomit-frame-pointer  20190805  20190803
4588300  ref  gcc -m32 -march=core2 -msse4 -O2 -fomit-frame-pointer  20190805  20190803
4589780  ref  gcc -m32 -march=pentium -O -fomit-frame-pointer  20190805  20190803
4593603  ref  gcc -m32 -march=i486 -O3 -fomit-frame-pointer  20190805  20190803
4595016  ref  gcc -funroll-loops -m32 -march=pentium4 -O2 -fomit-frame-pointer  20190805  20190803
4595659  ref  gcc -funroll-loops -m32 -march=prescott -O -fomit-frame-pointer  20190805  20190803
4602482  ref  gcc -m32 -march=pentium-mmx -O -fomit-frame-pointer  20190805  20190803
4603488  ref  gcc -m32 -march=nocona -O -fomit-frame-pointer  20190805  20190803
4642766  ref  gcc -m32 -march=core-avx2 -O2 -fomit-frame-pointer  20190805  20190803
4644772  ref  gcc -funroll-loops -m32 -march=pentium -O2 -fomit-frame-pointer  20190805  20190803
4649949  ref  gcc -m32 -march=i486 -O2 -fomit-frame-pointer  20190805  20190803
4653269  ref  gcc -m32 -march=corei7-avx -O2 -fomit-frame-pointer  20190805  20190803
4669794  ref  gcc -funroll-loops -m32 -march=pentium-mmx -O2 -fomit-frame-pointer  20190805  20190803
4674018  ref  gcc -m32 -march=core-avx-i -O2 -fomit-frame-pointer  20190805  20190803
4698897  ref  gcc -m32 -march=native -mtune=native -O3 -fomit-frame-pointer  20190805  20190803
4716228  ref  gcc -m32 -march=native -mtune=native -O2 -fomit-frame-pointer  20190805  20190803
4722064  ref  gcc -m32 -march=i386 -O -fomit-frame-pointer  20190805  20190803
4729020  ref  gcc -funroll-loops -m32 -march=prescott -O3 -fomit-frame-pointer  20190805  20190803
4736250  ref  gcc -funroll-loops -m32 -march=nocona -O3 -fomit-frame-pointer  20190805  20190803
4770331  ref  gcc -m32 -march=core-avx-i -O3 -fomit-frame-pointer  20190805  20190803
4770750  ref  gcc -m32 -march=core-avx2 -O3 -fomit-frame-pointer  20190805  20190803
4795091  ref  gcc -m32 -march=pentium4 -O2 -fomit-frame-pointer  20190805  20190803
4796313  ref  gcc -m32 -march=corei7-avx -O3 -fomit-frame-pointer  20190805  20190803
4800337  ref  gcc -m32 -march=pentium-mmx -O2 -fomit-frame-pointer  20190805  20190803
4819702  ref  gcc -funroll-loops -m32 -march=i386 -O3 -fomit-frame-pointer  20190805  20190803
4823480  ref  gcc -m32 -march=pentium-mmx -O3 -fomit-frame-pointer  20190805  20190803
4836562  ref  gcc -m32 -march=prescott -O3 -fomit-frame-pointer  20190805  20190803
4837855  ref  gcc -m32 -march=pentium -O2 -fomit-frame-pointer  20190805  20190803
4843157  ref  gcc -m32 -march=pentium -O3 -fomit-frame-pointer  20190805  20190803
4845597  ref  gcc -funroll-loops -m32 -march=i386 -O2 -fomit-frame-pointer  20190805  20190803
4850683  ref  gcc -m32 -march=nocona -O3 -fomit-frame-pointer  20190805  20190803
4856839  ref  gcc -funroll-loops -m32 -march=prescott -O2 -fomit-frame-pointer  20190805  20190803
4892151  ref  gcc -funroll-loops -m32 -march=nocona -O2 -fomit-frame-pointer  20190805  20190803
4954340  ref  gcc -m32 -march=prescott -O2 -fomit-frame-pointer  20190805  20190803
4963232  ref  gcc -m32 -march=nocona -O2 -fomit-frame-pointer  20190805  20190803
5005762  ref  gcc -m32 -march=i386 -O3 -fomit-frame-pointer  20190805  20190803
5071316  ref  gcc -m32 -march=i386 -O2 -fomit-frame-pointer  20190805  20190803
5891592  ref  gcc -m32 -march=k8 -O2 -fomit-frame-pointer  20190805  20190803
5913097  ref  gcc -funroll-loops -m32 -march=barcelona -Os -fomit-frame-pointer  20190805  20190803
6040470  ref  gcc -m32 -march=barcelona -Os -fomit-frame-pointer  20190805  20190803
6068724  ref  gcc -funroll-loops -m32 -march=k8 -Os -fomit-frame-pointer  20190805  20190803
6188117  ref  gcc -m32 -march=k8 -Os -fomit-frame-pointer  20190805  20190803
6636907  ref  gcc -funroll-loops -m32 -march=k8 -O3 -fomit-frame-pointer  20190805  20190803
6713829  ref  gcc -m32 -march=k8 -O3 -fomit-frame-pointer  20190805  20190803
6747924  ref  gcc -funroll-loops -m32 -march=k8 -O2 -fomit-frame-pointer  20190805  20190803
6983776  ref  gcc -funroll-loops -m32 -march=pentium-m -O3 -fomit-frame-pointer  20190805  20190803
6988113  ref  gcc -m32 -march=pentium-m -O3 -fomit-frame-pointer  20190805  20190803
7145346  ref  gcc -funroll-loops -m32 -march=pentium-m -O2 -fomit-frame-pointer  20190805  20190803
7593879  ref  gcc -funroll-loops -m32 -march=barcelona -O3 -fomit-frame-pointer  20190805  20190803
7598046  ref  gcc -m32 -march=barcelona -O3 -fomit-frame-pointer  20190805  20190803
7818189  ref  gcc -m32 -march=pentium-m -O2 -fomit-frame-pointer  20190805  20190803
7975452  ref  gcc -funroll-loops -m32 -march=barcelona -O2 -fomit-frame-pointer  20190805  20190803
8269828  ref  gcc -m32 -march=barcelona -O2 -fomit-frame-pointer  20190805  20190803
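
The rows of the table above can be split back into their five fields mechanically. The following is a minimal Python sketch, not part of SUPERCOP; the names `Row`, `ROW_RE` and `parse_row` are hypothetical, and the pattern assumes that every compiler string starts with `gcc` or `clang` and that both trailing fields are 8-digit dates. It accepts the fields with or without separating whitespace. The page does not state the unit of the Time column, so the field is simply called `time`.

```python
import re
from typing import NamedTuple, Optional


class Row(NamedTuple):
    time: int               # value of the "Time" column (unit not stated on the page)
    implementation: str     # e.g. "ref"
    compiler: str           # full compiler command line
    benchmark_date: str     # YYYYMMDD
    supercop_version: str   # YYYYMMDD


# Fields may be fused ("3845863refgcc ...") or separated by whitespace.
ROW_RE = re.compile(
    r"^(\d+)\s*([a-z][a-z0-9_]*?)\s*((?:gcc|clang)\b.*?)\s*(\d{8})\s*(\d{8})$"
)


def parse_row(line: str) -> Optional[Row]:
    """Return the parsed row, or None if the line is not a table row."""
    m = ROW_RE.match(line.strip())
    if m is None:
        return None
    time, impl, compiler, date, version = m.groups()
    return Row(int(time), impl, compiler, date, version)


# First and last rows of the table, copied in their fused form.
fastest = parse_row(
    "3845863refgcc -funroll-loops -m32 -march=athlon -O3 "
    "-fomit-frame-pointer2019080520190803"
)
slowest = parse_row(
    "8269828refgcc -m32 -march=barcelona -O2 "
    "-fomit-frame-pointer2019080520190803"
)
ratio = slowest.time / fastest.time
print(f"slowest/fastest time ratio for ref: {ratio:.2f}")
```

With the two rows shown, the ratio comes out a little above 2, which illustrates how much the choice of compiler options alone changes the ref timing on this machine.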

Compiler output

Implementation: crypto_sign/dilithium2aes/avx2
Compiler: gcc -funroll-loops -m32 -O2 -fomit-frame-pointer
KeccakP-1600-times4-SIMD256.c: KeccakP-1600-times4-SIMD256.c: In function 'KeccakP1600times4_AddLanesAll':
KeccakP-1600-times4-SIMD256.c: KeccakP-1600-times4-SIMD256.c:135:40: warning: AVX vector return without AVX enabled changes the ABI [-Wpsabi]
KeccakP-1600-times4-SIMD256.c: #define Xor_In4( argIndex ) lanes0 = LOAD256u( curData0[argIndex]),\
KeccakP-1600-times4-SIMD256.c: ^
KeccakP-1600-times4-SIMD256.c: KeccakP-1600-times4-SIMD256.c:146:9: note: in expansion of macro 'Xor_In4'
KeccakP-1600-times4-SIMD256.c: Xor_In4( 0 );
KeccakP-1600-times4-SIMD256.c: ^~~~~~~
KeccakP-1600-times4-SIMD256.c: In file included from /usr/lib/gcc/x86_64-linux-gnu/7/include/immintrin.h:43:0,
KeccakP-1600-times4-SIMD256.c: from KeccakP-1600-times4-SIMD256.c:21:
KeccakP-1600-times4-SIMD256.c: /usr/lib/gcc/x86_64-linux-gnu/7/include/avx2intrin.h:913:1: error: inlining failed in call to always_inline '_mm256_xor_si256': target specific option mismatch
KeccakP-1600-times4-SIMD256.c: _mm256_xor_si256 (__m256i __A, __m256i __B)
KeccakP-1600-times4-SIMD256.c: ^~~~~~~~~~~~~~~~
KeccakP-1600-times4-SIMD256.c: KeccakP-1600-times4-SIMD256.c:55:41: note: called from here
KeccakP-1600-times4-SIMD256.c: #define XOReq256(a, b) a = _mm256_xor_si256(a, b)
KeccakP-1600-times4-SIMD256.c: ^~~~~~~~~~~~~~~~~~~~~~
KeccakP-1600-times4-SIMD256.c: KeccakP-1600-times4-SIMD256.c:143:33: note: in expansion of macro 'XOReq256'
KeccakP-1600-times4-SIMD256.c: XOReq256( stateAsLanes[argIndex+3], lanes3 )
KeccakP-1600-times4-SIMD256.c: ^~~~~~~~
KeccakP-1600-times4-SIMD256.c: KeccakP-1600-times4-SIMD256.c:149:9: note: in expansion of macro 'Xor_In4'
KeccakP-1600-times4-SIMD256.c: Xor_In4( 12 );
KeccakP-1600-times4-SIMD256.c: ^~~~~~~
KeccakP-1600-times4-SIMD256.c: In file included from /usr/lib/gcc/x86_64-linux-gnu/7/include/immintrin.h:43:0,
KeccakP-1600-times4-SIMD256.c: from KeccakP-1600-times4-SIMD256.c:21:
KeccakP-1600-times4-SIMD256.c: /usr/lib/gcc/x86_64-linux-gnu/7/include/avx2intrin.h:913:1: error: inlining failed in call to always_inline '_mm256_xor_si256': target specific option mismatch
KeccakP-1600-times4-SIMD256.c: _mm256_xor_si256 (__m256i __A, __m256i __B)
KeccakP-1600-times4-SIMD256.c: ...
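
The `always_inline` error above is what GCC reports when an AVX2 intrinsic such as `_mm256_xor_si256` is called from code compiled without AVX2 enabled, and the `-Wpsabi` warning appears when not even AVX is enabled. The Python sketch below is an illustration only: it maps the compiler strings listed on this page to the C diagnostics one would expect under that reading. The names `MARCH_AVX_LEVEL`, `avx_level` and `expected_c_diagnostics` are hypothetical, and treating `-march=native` as AVX2-capable is an assumption about the benchmark host.

```python
# -march values on this page that enable AVX (1) or AVX2 (2).
# Every other -march value used here enables neither (0).
MARCH_AVX_LEVEL = {"corei7-avx": 1, "core-avx-i": 1, "core-avx2": 2}


def avx_level(compiler: str, native_level: int = 2) -> int:
    """Return 0 (no AVX), 1 (AVX) or 2 (AVX2) for a compiler command line.

    native_level is an assumption: what -march=native resolves to on the host.
    """
    level = 0
    for flag in compiler.split():
        if flag == "-mavx2":
            level = max(level, 2)
        elif flag == "-mavx":
            level = max(level, 1)
        elif flag.startswith("-march="):
            march = flag[len("-march="):]
            if march == "native":
                level = max(level, native_level)
            else:
                level = max(level, MARCH_AVX_LEVEL.get(march, 0))
    return level


def expected_c_diagnostics(compiler: str) -> list:
    """Diagnostics expected when compiling AVX2 intrinsics with these flags."""
    level = avx_level(compiler)
    if level == 0:
        return ["-Wpsabi warning", "always_inline error"]
    if level == 1:
        return ["always_inline error"]
    return []  # the C file is expected to compile


print(expected_c_diagnostics("gcc -funroll-loops -m32 -O2 -fomit-frame-pointer"))
print(expected_c_diagnostics("gcc -m32 -march=core-avx-i -O2 -fomit-frame-pointer"))
print(expected_c_diagnostics("gcc -m32 -march=core-avx2 -O2 -fomit-frame-pointer"))
```

An empty list only means the C sources get past this particular check; it says nothing about whether the rest of the implementation builds with `-m32`.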

Number of similar (compiler,implementation) pairs: 156, namely:
Compiler  Implementations
gcc -funroll-loops -m32 -O2 -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -O3 -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -O -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -Os -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=athlon -O2 -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=athlon -O3 -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=athlon -O -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=athlon -Os -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=barcelona -O2 -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=barcelona -O3 -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=barcelona -O -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=barcelona -Os -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=i386 -O2 -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=i386 -O3 -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=i386 -O -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=i386 -Os -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=i486 -O2 -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=i486 -O3 -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=i486 -O -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=i486 -Os -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=k6-2 -O2 -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=k6-2 -O3 -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=k6-2 -O -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=k6-2 -Os -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=k6-3 -O2 -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=k6-3 -O3 -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=k6-3 -O -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=k6-3 -Os -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=k6 -O2 -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=k6 -O3 -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=k6 -O -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=k6 -Os -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=k8 -O2 -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=k8 -O3 -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=k8 -O -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=k8 -Os -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=nocona -O2 -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=nocona -O3 -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=nocona -O -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=nocona -Os -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=pentium-m -O2 -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=pentium-m -O3 -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=pentium-m -O -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=pentium-m -Os -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=pentium-mmx -O2 -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=pentium-mmx -O3 -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=pentium-mmx -O -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=pentium-mmx -Os -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=pentium2 -O2 -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=pentium2 -O3 -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=pentium2 -O -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=pentium2 -Os -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=pentium3 -O2 -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=pentium3 -O3 -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=pentium3 -O -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=pentium3 -Os -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=pentium4 -O2 -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=pentium4 -O3 -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=pentium4 -O -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=pentium4 -Os -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=pentium -O2 -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=pentium -O3 -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=pentium -O -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=pentium -Os -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=pentiumpro -O2 -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=pentiumpro -O3 -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=pentiumpro -O -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=pentiumpro -Os -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=prescott -O2 -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=prescott -O3 -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=prescott -O -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=prescott -Os -fomit-frame-pointer avx2
gcc -m32 -O2 -fomit-frame-pointer avx2
gcc -m32 -O3 -fomit-frame-pointer avx2
gcc -m32 -O -fomit-frame-pointer avx2
gcc -m32 -Os -fomit-frame-pointer avx2
gcc -m32 -march=athlon -O2 -fomit-frame-pointer avx2
gcc -m32 -march=athlon -O3 -fomit-frame-pointer avx2
gcc -m32 -march=athlon -O -fomit-frame-pointer avx2
gcc -m32 -march=athlon -Os -fomit-frame-pointer avx2
gcc -m32 -march=core2 -O2 -fomit-frame-pointer avx2
gcc -m32 -march=core2 -O3 -fomit-frame-pointer avx2
gcc -m32 -march=core2 -O -fomit-frame-pointer avx2
gcc -m32 -march=core2 -Os -fomit-frame-pointer avx2
gcc -m32 -march=core2 -msse4.1 -O2 -fomit-frame-pointer avx2
gcc -m32 -march=core2 -msse4.1 -O3 -fomit-frame-pointer avx2
gcc -m32 -march=core2 -msse4.1 -O -fomit-frame-pointer avx2
gcc -m32 -march=core2 -msse4.1 -Os -fomit-frame-pointer avx2
gcc -m32 -march=core2 -msse4 -O2 -fomit-frame-pointer avx2
gcc -m32 -march=core2 -msse4 -O3 -fomit-frame-pointer avx2
gcc -m32 -march=core2 -msse4 -O -fomit-frame-pointer avx2
gcc -m32 -march=core2 -msse4 -Os -fomit-frame-pointer avx2
gcc -m32 -march=corei7 -O2 -fomit-frame-pointer avx2
gcc -m32 -march=corei7 -O3 -fomit-frame-pointer avx2
gcc -m32 -march=corei7 -O -fomit-frame-pointer avx2
gcc -m32 -march=corei7 -Os -fomit-frame-pointer avx2
gcc -m32 -march=i386 -O2 -fomit-frame-pointer avx2
gcc -m32 -march=i386 -O3 -fomit-frame-pointer avx2
gcc -m32 -march=i386 -O -fomit-frame-pointer avx2
gcc -m32 -march=i386 -Os -fomit-frame-pointer avx2
gcc -m32 -march=i486 -O2 -fomit-frame-pointer avx2
gcc -m32 -march=i486 -O3 -fomit-frame-pointer avx2
gcc -m32 -march=i486 -O -fomit-frame-pointer avx2
gcc -m32 -march=i486 -Os -fomit-frame-pointer avx2
gcc -m32 -march=k6-2 -O2 -fomit-frame-pointer avx2
gcc -m32 -march=k6-2 -O3 -fomit-frame-pointer avx2
gcc -m32 -march=k6-2 -O -fomit-frame-pointer avx2
gcc -m32 -march=k6-2 -Os -fomit-frame-pointer avx2
gcc -m32 -march=k6-3 -O2 -fomit-frame-pointer avx2
gcc -m32 -march=k6-3 -O3 -fomit-frame-pointer avx2
gcc -m32 -march=k6-3 -O -fomit-frame-pointer avx2
gcc -m32 -march=k6-3 -Os -fomit-frame-pointer avx2
gcc -m32 -march=k6 -O2 -fomit-frame-pointer avx2
gcc -m32 -march=k6 -O3 -fomit-frame-pointer avx2
gcc -m32 -march=k6 -O -fomit-frame-pointer avx2
gcc -m32 -march=k6 -Os -fomit-frame-pointer avx2
gcc -m32 -march=k8 -O2 -fomit-frame-pointer avx2
gcc -m32 -march=k8 -O3 -fomit-frame-pointer avx2
gcc -m32 -march=k8 -O -fomit-frame-pointer avx2
gcc -m32 -march=k8 -Os -fomit-frame-pointer avx2
gcc -m32 -march=nocona -O2 -fomit-frame-pointer avx2
gcc -m32 -march=nocona -O3 -fomit-frame-pointer avx2
gcc -m32 -march=nocona -O -fomit-frame-pointer avx2
gcc -m32 -march=nocona -Os -fomit-frame-pointer avx2
gcc -m32 -march=pentium-m -O2 -fomit-frame-pointer avx2
gcc -m32 -march=pentium-m -O3 -fomit-frame-pointer avx2
gcc -m32 -march=pentium-m -O -fomit-frame-pointer avx2
gcc -m32 -march=pentium-m -Os -fomit-frame-pointer avx2
gcc -m32 -march=pentium-mmx -O2 -fomit-frame-pointer avx2
gcc -m32 -march=pentium-mmx -O3 -fomit-frame-pointer avx2
gcc -m32 -march=pentium-mmx -O -fomit-frame-pointer avx2
gcc -m32 -march=pentium-mmx -Os -fomit-frame-pointer avx2
gcc -m32 -march=pentium2 -O2 -fomit-frame-pointer avx2
gcc -m32 -march=pentium2 -O3 -fomit-frame-pointer avx2
gcc -m32 -march=pentium2 -O -fomit-frame-pointer avx2
gcc -m32 -march=pentium2 -Os -fomit-frame-pointer avx2
gcc -m32 -march=pentium3 -O2 -fomit-frame-pointer avx2
gcc -m32 -march=pentium3 -O3 -fomit-frame-pointer avx2
gcc -m32 -march=pentium3 -O -fomit-frame-pointer avx2
gcc -m32 -march=pentium3 -Os -fomit-frame-pointer avx2
gcc -m32 -march=pentium4 -O2 -fomit-frame-pointer avx2
gcc -m32 -march=pentium4 -O3 -fomit-frame-pointer avx2
gcc -m32 -march=pentium4 -O -fomit-frame-pointer avx2
gcc -m32 -march=pentium4 -Os -fomit-frame-pointer avx2
gcc -m32 -march=pentium -O2 -fomit-frame-pointer avx2
gcc -m32 -march=pentium -O3 -fomit-frame-pointer avx2
gcc -m32 -march=pentium -O -fomit-frame-pointer avx2
gcc -m32 -march=pentium -Os -fomit-frame-pointer avx2
gcc -m32 -march=pentiumpro -O2 -fomit-frame-pointer avx2
gcc -m32 -march=pentiumpro -O3 -fomit-frame-pointer avx2
gcc -m32 -march=pentiumpro -O -fomit-frame-pointer avx2
gcc -m32 -march=pentiumpro -Os -fomit-frame-pointer avx2
gcc -m32 -march=prescott -O2 -fomit-frame-pointer avx2
gcc -m32 -march=prescott -O3 -fomit-frame-pointer avx2
gcc -m32 -march=prescott -O -fomit-frame-pointer avx2
gcc -m32 -march=prescott -Os -fomit-frame-pointer avx2

Compiler output

Implementation: crypto_sign/dilithium2aes/avx2
Compiler: gcc -m32 -march=barcelona -O2 -fomit-frame-pointer
KeccakP-1600-times4-SIMD256.c: KeccakP-1600-times4-SIMD256.c: In function 'KeccakP1600times4_AddLanesAll':
KeccakP-1600-times4-SIMD256.c: KeccakP-1600-times4-SIMD256.c:135:40: warning: AVX vector return without AVX enabled changes the ABI [-Wpsabi]
KeccakP-1600-times4-SIMD256.c: #define Xor_In4( argIndex ) lanes0 = LOAD256u( curData0[argIndex]),\
KeccakP-1600-times4-SIMD256.c: ^
KeccakP-1600-times4-SIMD256.c: KeccakP-1600-times4-SIMD256.c:146:9: note: in expansion of macro 'Xor_In4'
KeccakP-1600-times4-SIMD256.c: Xor_In4( 0 );
KeccakP-1600-times4-SIMD256.c: ^~~~~~~
KeccakP-1600-times4-SIMD256.c: In file included from /usr/lib/gcc/x86_64-linux-gnu/7/include/immintrin.h:43:0,
KeccakP-1600-times4-SIMD256.c: from KeccakP-1600-times4-SIMD256.c:21:
KeccakP-1600-times4-SIMD256.c: /usr/lib/gcc/x86_64-linux-gnu/7/include/avx2intrin.h:913:1: error: inlining failed in call to always_inline '_mm256_xor_si256': target specific option mismatch
KeccakP-1600-times4-SIMD256.c: _mm256_xor_si256 (__m256i __A, __m256i __B)
KeccakP-1600-times4-SIMD256.c: ^~~~~~~~~~~~~~~~
KeccakP-1600-times4-SIMD256.c: KeccakP-1600-times4-SIMD256.c:55:41: note: called from here
KeccakP-1600-times4-SIMD256.c: #define XOReq256(a, b) a = _mm256_xor_si256(a, b)
KeccakP-1600-times4-SIMD256.c: ^~~~~~~~~~~~~~~~~~~~~~
KeccakP-1600-times4-SIMD256.c: KeccakP-1600-times4-SIMD256.c:143:33: note: in expansion of macro 'XOReq256'
KeccakP-1600-times4-SIMD256.c: XOReq256( stateAsLanes[argIndex+3], lanes3 )
KeccakP-1600-times4-SIMD256.c: ^~~~~~~~
KeccakP-1600-times4-SIMD256.c: KeccakP-1600-times4-SIMD256.c:149:9: note: in expansion of macro 'Xor_In4'
KeccakP-1600-times4-SIMD256.c: Xor_In4( 12 );
KeccakP-1600-times4-SIMD256.c: ^~~~~~~
KeccakP-1600-times4-SIMD256.c: In file included from /usr/lib/gcc/x86_64-linux-gnu/7/include/immintrin.h:43:0,
KeccakP-1600-times4-SIMD256.c: from KeccakP-1600-times4-SIMD256.c:21:
KeccakP-1600-times4-SIMD256.c: /usr/lib/gcc/x86_64-linux-gnu/7/include/avx2intrin.h:913:1: error: inlining failed in call to always_inline '_mm256_xor_si256': target specific option mismatch
KeccakP-1600-times4-SIMD256.c: _mm256_xor_si256 (__m256i __A, __m256i __B)
KeccakP-1600-times4-SIMD256.c: ...
KeccakP-1600-times4-SIMD256.c: KeccakP-1600-times4-SIMD256.c: In function 'KeccakP1600times4_AddLanesAll':
KeccakP-1600-times4-SIMD256.c: KeccakP-1600-times4-SIMD256.c:135:40: warning: AVX vector return without AVX enabled changes the ABI [-Wpsabi]
KeccakP-1600-times4-SIMD256.c: #define Xor_In4( argIndex ) lanes0 = LOAD256u( curData0[argIndex]),\
KeccakP-1600-times4-SIMD256.c: ^
KeccakP-1600-times4-SIMD256.c: KeccakP-1600-times4-SIMD256.c:146:9: note: in expansion of macro 'Xor_In4'
KeccakP-1600-times4-SIMD256.c: Xor_In4( 0 );
KeccakP-1600-times4-SIMD256.c: ^~~~~~~
KeccakP-1600-times4-SIMD256.c: In file included from /usr/lib/gcc/x86_64-linux-gnu/7/include/immintrin.h:43:0,
KeccakP-1600-times4-SIMD256.c: from KeccakP-1600-times4-SIMD256.c:21:
KeccakP-1600-times4-SIMD256.c: /usr/lib/gcc/x86_64-linux-gnu/7/include/avx2intrin.h:913:1: error: inlining failed in call to always_inline '_mm256_xor_si256': target specific option mismatch
KeccakP-1600-times4-SIMD256.c: _mm256_xor_si256 (__m256i __A, __m256i __B)
KeccakP-1600-times4-SIMD256.c: ^~~~~~~~~~~~~~~~
KeccakP-1600-times4-SIMD256.c: KeccakP-1600-times4-SIMD256.c:55:41: note: called from here
KeccakP-1600-times4-SIMD256.c: #define XOReq256(a, b) a = _mm256_xor_si256(a, b)
KeccakP-1600-times4-SIMD256.c: ^~~~~~~~~~~~~~~~~~~~~~
KeccakP-1600-times4-SIMD256.c: KeccakP-1600-times4-SIMD256.c:143:33: note: in expansion of macro 'XOReq256'
KeccakP-1600-times4-SIMD256.c: XOReq256( stateAsLanes[argIndex+3], lanes3 )
KeccakP-1600-times4-SIMD256.c: ^~~~~~~~
KeccakP-1600-times4-SIMD256.c: KeccakP-1600-times4-SIMD256.c:149:9: note: in expansion of macro 'Xor_In4'
KeccakP-1600-times4-SIMD256.c: Xor_In4( 12 );
KeccakP-1600-times4-SIMD256.c: ^~~~~~~
KeccakP-1600-times4-SIMD256.c: In file included from /usr/lib/gcc/x86_64-linux-gnu/7/include/immintrin.h:43:0,
KeccakP-1600-times4-SIMD256.c: from KeccakP-1600-times4-SIMD256.c:21:
KeccakP-1600-times4-SIMD256.c: /usr/lib/gcc/x86_64-linux-gnu/7/include/avx2intrin.h:913:1: error: inlining failed in call to always_inline '_mm256_xor_si256': target specific option mismatch
KeccakP-1600-times4-SIMD256.c: _mm256_xor_si256 (__m256i __A, __m256i __B)
KeccakP-1600-times4-SIMD256.c: ...

Number of similar (compiler,implementation) pairs: 4, namely:
Compiler  Implementations
gcc -m32 -march=barcelona -O2 -fomit-frame-pointer avx2
gcc -m32 -march=barcelona -O3 -fomit-frame-pointer avx2
gcc -m32 -march=barcelona -O -fomit-frame-pointer avx2
gcc -m32 -march=barcelona -Os -fomit-frame-pointer avx2

Compiler output

Implementation: crypto_sign/dilithium2aes/avx2
Compiler: gcc -m32 -march=core-avx-i -O2 -fomit-frame-pointer
KeccakP-1600-times4-SIMD256.c: In file included from /usr/lib/gcc/x86_64-linux-gnu/7/include/immintrin.h:43:0,
KeccakP-1600-times4-SIMD256.c: from KeccakP-1600-times4-SIMD256.c:21:
KeccakP-1600-times4-SIMD256.c: KeccakP-1600-times4-SIMD256.c: In function 'KeccakP1600times4_AddLanesAll':
KeccakP-1600-times4-SIMD256.c: /usr/lib/gcc/x86_64-linux-gnu/7/include/avx2intrin.h:913:1: error: inlining failed in call to always_inline '_mm256_xor_si256': target specific option mismatch
KeccakP-1600-times4-SIMD256.c: _mm256_xor_si256 (__m256i __A, __m256i __B)
KeccakP-1600-times4-SIMD256.c: ^~~~~~~~~~~~~~~~
KeccakP-1600-times4-SIMD256.c: KeccakP-1600-times4-SIMD256.c:55:41: note: called from here
KeccakP-1600-times4-SIMD256.c: #define XOReq256(a, b) a = _mm256_xor_si256(a, b)
KeccakP-1600-times4-SIMD256.c: ^~~~~~~~~~~~~~~~~~~~~~
KeccakP-1600-times4-SIMD256.c: KeccakP-1600-times4-SIMD256.c:143:33: note: in expansion of macro 'XOReq256'
KeccakP-1600-times4-SIMD256.c: XOReq256( stateAsLanes[argIndex+3], lanes3 )
KeccakP-1600-times4-SIMD256.c: ^~~~~~~~
KeccakP-1600-times4-SIMD256.c: KeccakP-1600-times4-SIMD256.c:149:9: note: in expansion of macro 'Xor_In4'
KeccakP-1600-times4-SIMD256.c: Xor_In4( 12 );
KeccakP-1600-times4-SIMD256.c: ^~~~~~~
KeccakP-1600-times4-SIMD256.c: In file included from /usr/lib/gcc/x86_64-linux-gnu/7/include/immintrin.h:43:0,
KeccakP-1600-times4-SIMD256.c: from KeccakP-1600-times4-SIMD256.c:21:
KeccakP-1600-times4-SIMD256.c: /usr/lib/gcc/x86_64-linux-gnu/7/include/avx2intrin.h:913:1: error: inlining failed in call to always_inline '_mm256_xor_si256': target specific option mismatch
KeccakP-1600-times4-SIMD256.c: _mm256_xor_si256 (__m256i __A, __m256i __B)
KeccakP-1600-times4-SIMD256.c: ^~~~~~~~~~~~~~~~
KeccakP-1600-times4-SIMD256.c: KeccakP-1600-times4-SIMD256.c:55:41: note: called from here
KeccakP-1600-times4-SIMD256.c: #define XOReq256(a, b) a = _mm256_xor_si256(a, b)
KeccakP-1600-times4-SIMD256.c: ^~~~~~~~~~~~~~~~~~~~~~
KeccakP-1600-times4-SIMD256.c: KeccakP-1600-times4-SIMD256.c:142:33: note: in expansion of macro 'XOReq256'
KeccakP-1600-times4-SIMD256.c: XOReq256( stateAsLanes[argIndex+2], lanes2 ),\
KeccakP-1600-times4-SIMD256.c: ...

Number of similar (compiler,implementation) pairs: 8, namely:
Compiler  Implementations
gcc -m32 -march=core-avx-i -O2 -fomit-frame-pointer avx2
gcc -m32 -march=core-avx-i -O3 -fomit-frame-pointer avx2
gcc -m32 -march=core-avx-i -O -fomit-frame-pointer avx2
gcc -m32 -march=core-avx-i -Os -fomit-frame-pointer avx2
gcc -m32 -march=corei7-avx -O2 -fomit-frame-pointer avx2
gcc -m32 -march=corei7-avx -O3 -fomit-frame-pointer avx2
gcc -m32 -march=corei7-avx -O -fomit-frame-pointer avx2
gcc -m32 -march=corei7-avx -Os -fomit-frame-pointer avx2

Compiler output

Implementation: crypto_sign/dilithium2aes/avx2
Compiler: gcc -m32 -march=core-avx2 -O2 -fomit-frame-pointer
invntt.s: invntt.s: Assembler messages:
invntt.s: invntt.s:47: Error: bad register name `%rip)'
invntt.s: invntt.s:48: Error: bad register name `%rip)'
invntt.s: invntt.s:49: Error: bad register name `%rip)'
invntt.s: invntt.s:52: Error: bad register name `%rsi)'
invntt.s: invntt.s:53: Error: bad register name `%rsi)'
invntt.s: invntt.s:54: Error: bad register name `%rsi)'
invntt.s: invntt.s:55: Error: bad register name `%rsi)'
invntt.s: invntt.s:58: Error: bad register name `%ymm8'
invntt.s: invntt.s:59: Error: bad register name `%ymm10'
invntt.s: invntt.s:59: Error: bad register name `%ymm10'
invntt.s: invntt.s:61: Error: bad register name `%ymm8'
invntt.s: invntt.s:61: Error: bad register name `%ymm8'
invntt.s: invntt.s:62: Error: bad register name `%ymm10'
invntt.s: invntt.s:62: Error: bad register name `%ymm10'
invntt.s: invntt.s:66: Error: bad register name `%ymm8'
invntt.s: invntt.s:67: Error: bad register name `%ymm10'
invntt.s: invntt.s:70: Error: bad register name `%rdx)'
invntt.s: invntt.s:71: Error: bad register name `%rdx)'
invntt.s: invntt.s:72: Error: bad register name `%ymm12'
invntt.s: invntt.s:73: Error: bad register name `%ymm13'
invntt.s: invntt.s:74: Error: bad register name `%ymm8'
invntt.s: invntt.s:76: Error: bad register name `%ymm12'
invntt.s: invntt.s:77: Error: bad register name `%ymm13'
invntt.s: invntt.s:78: Error: bad register name `%ymm9'
invntt.s: ...
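
The registers rejected above (`%rip`, `%rsi`, `%rdx`, `%ymm8` and higher) exist only in 64-bit mode, so these messages are what one would expect when 64-bit assembly is assembled with `-m32`. A small Python sketch of a pre-check for this follows. The name `find_64bit_registers` is hypothetical, the sample lines are made up for illustration and are not copied from invntt.s, and the pattern covers only AT&T-syntax names starting with `%r` plus `%xmm8`-`%xmm15` and `%ymm8`-`%ymm15`.

```python
import re

# AT&T-syntax register names that are unavailable in 32-bit mode:
# anything starting with %r (rax, rsi, rip, r8, ...) and xmm/ymm 8..15.
ONLY_64BIT = re.compile(r"%(r[a-z0-9]+|[xy]mm(?:[89]|1[0-5]))\b")


def find_64bit_registers(asm_text: str) -> list:
    """Return (line number, register) pairs for 64-bit-only registers."""
    hits = []
    for number, line in enumerate(asm_text.splitlines(), start=1):
        code = line.split("#", 1)[0]  # ignore trailing comments
        for match in ONLY_64BIT.finditer(code):
            hits.append((number, match.group(1)))
    return hits


# Hypothetical sample, not taken from invntt.s.
SAMPLE = """\
vmovdqa table(%rip), %ymm0
vmovdqa (%rsi), %ymm8
vpaddd %ymm1, %ymm2, %ymm3
movl (%esi), %eax
"""

for number, register in find_64bit_registers(SAMPLE):
    print(f"line {number}: %{register} is not available with -m32")
```

Running such a check over an assembly file before a 32-bit build would flag it as 64-bit only without invoking the assembler.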

Number of similar (compiler,implementation) pairs: 8, namely:
Compiler  Implementations
gcc -m32 -march=core-avx2 -O2 -fomit-frame-pointer avx2
gcc -m32 -march=core-avx2 -O3 -fomit-frame-pointer avx2
gcc -m32 -march=core-avx2 -O -fomit-frame-pointer avx2
gcc -m32 -march=core-avx2 -Os -fomit-frame-pointer avx2
gcc -m32 -march=native -mtune=native -O2 -fomit-frame-pointer avx2
gcc -m32 -march=native -mtune=native -O3 -fomit-frame-pointer avx2
gcc -m32 -march=native -mtune=native -O -fomit-frame-pointer avx2
gcc -m32 -march=native -mtune=native -Os -fomit-frame-pointer avx2