Implementation notes: x86, samba, crypto_sign/dilithium3

Computer: samba
Architecture: x86
CPU ID: GenuineIntel-000506e3-bfebfbff
SUPERCOP version: 20190803
Operation: crypto_sign
Primitive: dilithium3
Time  Implementation  Compiler  Benchmark date  SUPERCOP version
4258533  ref  gcc -funroll-loops -m32 -march=pentium4 -O3 -fomit-frame-pointer  20190805  20190803
4309314  ref  gcc -m32 -march=corei7 -O3 -fomit-frame-pointer  20190805  20190803
4324778  ref  gcc -m32 -march=core2 -O3 -fomit-frame-pointer  20190805  20190803
4328442  ref  gcc -m32 -march=core2 -msse4 -O3 -fomit-frame-pointer  20190805  20190803
4368031  ref  gcc -m32 -march=core2 -msse4.1 -O3 -fomit-frame-pointer  20190805  20190803
4370534  ref  gcc -m32 -march=core-avx2 -O3 -fomit-frame-pointer  20190805  20190803
4390623  ref  gcc -funroll-loops -m32 -march=pentium4 -O2 -fomit-frame-pointer  20190805  20190803
4410450  ref  gcc -m32 -march=native -mtune=native -O3 -fomit-frame-pointer  20190805  20190803
4411911  ref  gcc -m32 -march=pentium4 -O3 -fomit-frame-pointer  20190805  20190803
4446994  ref  gcc -funroll-loops -m32 -march=pentium4 -O -fomit-frame-pointer  20190805  20190803
4450084  ref  gcc -funroll-loops -m32 -march=k8 -O -fomit-frame-pointer  20190805  20190803
4454750  ref  gcc -funroll-loops -m32 -march=pentium-m -O -fomit-frame-pointer  20190805  20190803
4483163  ref  gcc -m32 -march=corei7-avx -O2 -fomit-frame-pointer  20190805  20190803
4490246  ref  gcc -funroll-loops -m32 -march=barcelona -O -fomit-frame-pointer  20190805  20190803
4494901  ref  gcc -funroll-loops -m32 -march=athlon -O3 -fomit-frame-pointer  20190805  20190803
4505522  ref  gcc -m32 -march=core-avx-i -O2 -fomit-frame-pointer  20190805  20190803
4506433  ref  gcc -m32 -march=core-avx2 -O2 -fomit-frame-pointer  20190805  20190803
4507326  ref  gcc -m32 -march=native -mtune=native -O2 -fomit-frame-pointer  20190805  20190803
4516653  ref  gcc -m32 -march=core-avx-i -O3 -fomit-frame-pointer  20190805  20190803
4527303  ref  gcc -m32 -march=core2 -msse4 -O2 -fomit-frame-pointer  20190805  20190803
4527625  ref  gcc -m32 -march=corei7 -O2 -fomit-frame-pointer  20190805  20190803
4546117  ref  gcc -m32 -march=core2 -msse4.1 -O2 -fomit-frame-pointer  20190805  20190803
4551266  ref  gcc -m32 -march=athlon -O3 -fomit-frame-pointer  20190805  20190803
4563326  ref  gcc -funroll-loops -m32 -march=athlon -O2 -fomit-frame-pointer  20190805  20190803
4567149  ref  gcc -m32 -march=corei7-avx -O3 -fomit-frame-pointer  20190805  20190803
4627686  ref  gcc -funroll-loops -m32 -O3 -fomit-frame-pointer  20190805  20190803
4633438  ref  gcc -funroll-loops -m32 -march=pentiumpro -O -fomit-frame-pointer  20190805  20190803
4638385  ref  gcc -funroll-loops -m32 -march=k6 -O3 -fomit-frame-pointer  20190805  20190803
4646634  ref  gcc -funroll-loops -m32 -march=pentium2 -O -fomit-frame-pointer  20190805  20190803
4667968  ref  gcc -funroll-loops -m32 -march=pentium3 -O3 -fomit-frame-pointer  20190805  20190803
4675924  ref  gcc -funroll-loops -m32 -march=k6-2 -O3 -fomit-frame-pointer  20190805  20190803
4677806  ref  gcc -funroll-loops -m32 -march=prescott -O2 -fomit-frame-pointer  20190805  20190803
4678177  ref  gcc -m32 -march=core2 -O2 -fomit-frame-pointer  20190805  20190803
4683510  ref  gcc -m32 -march=pentium4 -O2 -fomit-frame-pointer  20190805  20190803
4689217  ref  gcc -funroll-loops -m32 -march=pentium2 -O3 -fomit-frame-pointer  20190805  20190803
4689647  ref  gcc -m32 -march=k6-2 -O3 -fomit-frame-pointer  20190805  20190803
4692672  ref  gcc -funroll-loops -m32 -march=athlon -O -fomit-frame-pointer  20190805  20190803
4693657  ref  gcc -m32 -march=k6-3 -O3 -fomit-frame-pointer  20190805  20190803
4694992  ref  gcc -funroll-loops -m32 -march=nocona -O2 -fomit-frame-pointer  20190805  20190803
4696899  ref  gcc -funroll-loops -m32 -march=prescott -O3 -fomit-frame-pointer  20190805  20190803
4700941  ref  gcc -funroll-loops -m32 -march=pentiumpro -O3 -fomit-frame-pointer  20190805  20190803
4702345  ref  gcc -funroll-loops -m32 -O2 -fomit-frame-pointer  20190805  20190803
4703274  ref  gcc -m32 -march=core-avx2 -Os -fomit-frame-pointer  20190805  20190803
4709346  ref  gcc -m32 -march=native -mtune=native -Os -fomit-frame-pointer  20190805  20190803
4715724  ref  gcc -m32 -march=core-avx-i -Os -fomit-frame-pointer  20190805  20190803
4716911  ref  gcc -funroll-loops -m32 -march=pentium2 -O2 -fomit-frame-pointer  20190805  20190803
4724424  ref  gcc -funroll-loops -m32 -march=k6-2 -O2 -fomit-frame-pointer  20190805  20190803
4727478  ref  gcc -funroll-loops -m32 -march=pentium3 -O2 -fomit-frame-pointer  20190805  20190803
4729393  ref  gcc -funroll-loops -m32 -march=pentium3 -O -fomit-frame-pointer  20190805  20190803
4741570  ref  gcc -funroll-loops -m32 -march=k6-3 -O3 -fomit-frame-pointer  20190805  20190803
4753637  ref  gcc -funroll-loops -m32 -march=k6 -O2 -fomit-frame-pointer  20190805  20190803
4753895  ref  gcc -funroll-loops -m32 -march=k6-3 -O2 -fomit-frame-pointer  20190805  20190803
4754912  ref  gcc -m32 -march=prescott -O3 -fomit-frame-pointer  20190805  20190803
4757078  ref  gcc -m32 -march=core2 -msse4.1 -Os -fomit-frame-pointer  20190805  20190803
4761776  ref  gcc -m32 -march=corei7-avx -Os -fomit-frame-pointer  20190805  20190803
4769581  ref  gcc -funroll-loops -m32 -march=prescott -O -fomit-frame-pointer  20190805  20190803
4776012  ref  gcc -m32 -march=corei7 -Os -fomit-frame-pointer  20190805  20190803
4787480  ref  gcc -m32 -march=core2 -msse4 -Os -fomit-frame-pointer  20190805  20190803
4787559  ref  gcc -funroll-loops -m32 -march=nocona -O -fomit-frame-pointer  20190805  20190803
4796322  ref  gcc -m32 -march=prescott -Os -fomit-frame-pointer  20190805  20190803
4799117  ref  gcc -funroll-loops -m32 -march=pentiumpro -O2 -fomit-frame-pointer  20190805  20190803
4801744  ref  gcc -funroll-loops -m32 -O -fomit-frame-pointer  20190805  20190803
4802441  ref  gcc -m32 -march=nocona -Os -fomit-frame-pointer  20190805  20190803
4803553  ref  gcc -m32 -march=core2 -Os -fomit-frame-pointer  20190805  20190803
4810518  ref  gcc -funroll-loops -m32 -march=nocona -O3 -fomit-frame-pointer  20190805  20190803
4817419  ref  gcc -m32 -march=nocona -O3 -fomit-frame-pointer  20190805  20190803
4837298  ref  gcc -m32 -march=pentium4 -Os -fomit-frame-pointer  20190805  20190803
4843712  ref  gcc -m32 -march=pentium3 -O3 -fomit-frame-pointer  20190805  20190803
4849815  ref  gcc -m32 -O3 -fomit-frame-pointer  20190805  20190803
4859835  ref  gcc -m32 -march=pentiumpro -O3 -fomit-frame-pointer  20190805  20190803
4862095  ref  gcc -funroll-loops -m32 -march=nocona -Os -fomit-frame-pointer  20190805  20190803
4866258  ref  gcc -m32 -march=pentium2 -O3 -fomit-frame-pointer  20190805  20190803
4894191  ref  gcc -funroll-loops -m32 -march=prescott -Os -fomit-frame-pointer  20190805  20190803
4894379  ref  gcc -funroll-loops -m32 -march=k6 -O -fomit-frame-pointer  20190805  20190803
4895273  ref  gcc -funroll-loops -m32 -march=pentium-m -Os -fomit-frame-pointer  20190805  20190803
4899150  ref  gcc -funroll-loops -m32 -march=pentium4 -Os -fomit-frame-pointer  20190805  20190803
4908742  ref  gcc -m32 -march=pentium-m -Os -fomit-frame-pointer  20190805  20190803
4909494  ref  gcc -funroll-loops -m32 -march=k6-3 -O -fomit-frame-pointer  20190805  20190803
4909661  ref  gcc -m32 -march=k8 -O -fomit-frame-pointer  20190805  20190803
4910613  ref  gcc -m32 -march=core-avx2 -O -fomit-frame-pointer  20190805  20190803
4911645  ref  gcc -m32 -march=pentium-m -O -fomit-frame-pointer  20190805  20190803
4938404  ref  gcc -funroll-loops -m32 -march=k6-2 -O -fomit-frame-pointer  20190805  20190803
4947039  ref  gcc -m32 -march=k6 -O3 -fomit-frame-pointer  20190805  20190803
4953078  ref  gcc -m32 -march=athlon -O2 -fomit-frame-pointer  20190805  20190803
4972191  ref  gcc -m32 -march=pentium2 -O -fomit-frame-pointer  20190805  20190803
4975320  ref  gcc -m32 -march=native -mtune=native -O -fomit-frame-pointer  20190805  20190803
4992147  ref  gcc -m32 -march=pentiumpro -O -fomit-frame-pointer  20190805  20190803
5026016  ref  gcc -m32 -march=pentium4 -O -fomit-frame-pointer  20190805  20190803
5026426  ref  gcc -m32 -march=pentium3 -O -fomit-frame-pointer  20190805  20190803
5061805  ref  gcc -m32 -march=athlon -O -fomit-frame-pointer  20190805  20190803
5078149  ref  gcc -m32 -march=core-avx-i -O -fomit-frame-pointer  20190805  20190803
5081307  ref  gcc -m32 -march=core2 -msse4.1 -O -fomit-frame-pointer  20190805  20190803
5093440  ref  gcc -funroll-loops -m32 -march=k6-2 -Os -fomit-frame-pointer  20190805  20190803
5093605  ref  gcc -funroll-loops -m32 -march=k6 -Os -fomit-frame-pointer  20190805  20190803
5096498  ref  gcc -m32 -march=corei7-avx -O -fomit-frame-pointer  20190805  20190803
5102534  ref  gcc -m32 -march=barcelona -O -fomit-frame-pointer  20190805  20190803
5103941  ref  gcc -m32 -march=prescott -O2 -fomit-frame-pointer  20190805  20190803
5104080  ref  gcc -m32 -march=nocona -O2 -fomit-frame-pointer  20190805  20190803
5107278  ref  gcc -funroll-loops -m32 -march=k6-3 -Os -fomit-frame-pointer  20190805  20190803
5124722  ref  gcc -m32 -march=core2 -O -fomit-frame-pointer  20190805  20190803
5126572  ref  gcc -m32 -march=core2 -msse4 -O -fomit-frame-pointer  20190805  20190803
5135191  ref  gcc -m32 -march=corei7 -O -fomit-frame-pointer  20190805  20190803
5135990  ref  gcc -m32 -march=pentiumpro -O2 -fomit-frame-pointer  20190805  20190803
5140930  ref  gcc -m32 -march=pentium3 -O2 -fomit-frame-pointer  20190805  20190803
5163855  ref  gcc -m32 -march=pentium2 -O2 -fomit-frame-pointer  20190805  20190803
5166944  ref  gcc -m32 -march=k6 -Os -fomit-frame-pointer  20190805  20190803
5168833  ref  gcc -m32 -march=k6-2 -Os -fomit-frame-pointer  20190805  20190803
5179589  ref  gcc -m32 -march=k6-3 -Os -fomit-frame-pointer  20190805  20190803
5198845  ref  gcc -m32 -march=pentium-mmx -Os -fomit-frame-pointer  20190805  20190803
5199528  ref  gcc -m32 -march=pentium -Os -fomit-frame-pointer  20190805  20190803
5219130  ref  gcc -funroll-loops -m32 -march=i486 -Os -fomit-frame-pointer  20190805  20190803
5220934  ref  gcc -funroll-loops -m32 -march=i386 -Os -fomit-frame-pointer  20190805  20190803
5227218  ref  gcc -m32 -march=athlon -Os -fomit-frame-pointer  20190805  20190803
5228312  ref  gcc -funroll-loops -m32 -Os -fomit-frame-pointer  20190805  20190803
5229637  ref  gcc -m32 -march=nocona -O -fomit-frame-pointer  20190805  20190803
5233012  ref  gcc -funroll-loops -m32 -march=pentium -Os -fomit-frame-pointer  20190805  20190803
5236669  ref  gcc -m32 -Os -fomit-frame-pointer  20190805  20190803
5244681  ref  gcc -funroll-loops -m32 -march=athlon -Os -fomit-frame-pointer  20190805  20190803
5254166  ref  gcc -funroll-loops -m32 -march=pentium-mmx -Os -fomit-frame-pointer  20190805  20190803
5254738  ref  gcc -m32 -march=i386 -Os -fomit-frame-pointer  20190805  20190803
5258232  ref  gcc -m32 -march=i486 -Os -fomit-frame-pointer  20190805  20190803
5258924  ref  gcc -m32 -march=k6-3 -O2 -fomit-frame-pointer  20190805  20190803
5283121  ref  gcc -m32 -march=prescott -O -fomit-frame-pointer  20190805  20190803
5290659  ref  gcc -m32 -march=pentiumpro -Os -fomit-frame-pointer  20190805  20190803
5291580  ref  gcc -m32 -march=pentium2 -Os -fomit-frame-pointer  20190805  20190803
5293306  ref  gcc -m32 -O2 -fomit-frame-pointer  20190805  20190803
5296399  ref  gcc -m32 -march=pentium3 -Os -fomit-frame-pointer  20190805  20190803
5300708  ref  gcc -m32 -march=k6 -O2 -fomit-frame-pointer  20190805  20190803
5305354  ref  gcc -funroll-loops -m32 -march=i486 -O3 -fomit-frame-pointer  20190805  20190803
5308750  ref  gcc -funroll-loops -m32 -march=pentiumpro -Os -fomit-frame-pointer  20190805  20190803
5316294  ref  gcc -m32 -march=k6-2 -O2 -fomit-frame-pointer  20190805  20190803
5333790  ref  gcc -funroll-loops -m32 -march=pentium3 -Os -fomit-frame-pointer  20190805  20190803
5347546  ref  gcc -m32 -march=k6-2 -O -fomit-frame-pointer  20190805  20190803
5349853  ref  gcc -m32 -march=k6 -O -fomit-frame-pointer  20190805  20190803
5350871  ref  gcc -funroll-loops -m32 -march=pentium2 -Os -fomit-frame-pointer  20190805  20190803
5352500  ref  gcc -funroll-loops -m32 -march=i486 -O2 -fomit-frame-pointer  20190805  20190803
5372666  ref  gcc -m32 -O -fomit-frame-pointer  20190805  20190803
5410770  ref  gcc -m32 -march=k6-3 -O -fomit-frame-pointer  20190805  20190803
5467789  ref  gcc -funroll-loops -m32 -march=pentium-mmx -O2 -fomit-frame-pointer  20190805  20190803
5501046  ref  gcc -funroll-loops -m32 -march=pentium -O3 -fomit-frame-pointer  20190805  20190803
5517055  ref  gcc -funroll-loops -m32 -march=pentium-mmx -O3 -fomit-frame-pointer  20190805  20190803
5521032  ref  gcc -funroll-loops -m32 -march=pentium-mmx -O -fomit-frame-pointer  20190805  20190803
5540283  ref  gcc -funroll-loops -m32 -march=pentium -O2 -fomit-frame-pointer  20190805  20190803
5542990  ref  gcc -funroll-loops -m32 -march=pentium -O -fomit-frame-pointer  20190805  20190803
5548863  ref  gcc -funroll-loops -m32 -march=i486 -O -fomit-frame-pointer  20190805  20190803
5562508  ref  gcc -m32 -march=i486 -O3 -fomit-frame-pointer  20190805  20190803
5769619  ref  gcc -funroll-loops -m32 -march=i386 -O -fomit-frame-pointer  20190805  20190803
5782189  ref  gcc -m32 -march=pentium-mmx -O3 -fomit-frame-pointer  20190805  20190803
5790203  ref  gcc -m32 -march=i486 -O2 -fomit-frame-pointer  20190805  20190803
5842859  ref  gcc -m32 -march=pentium -O3 -fomit-frame-pointer  20190805  20190803
5910575  ref  gcc -m32 -march=i486 -O -fomit-frame-pointer  20190805  20190803
5973195  ref  gcc -m32 -march=pentium -O -fomit-frame-pointer  20190805  20190803
5984004  ref  gcc -m32 -march=pentium-mmx -O -fomit-frame-pointer  20190805  20190803
6011813  ref  gcc -funroll-loops -m32 -march=i386 -O2 -fomit-frame-pointer  20190805  20190803
6020426  ref  gcc -funroll-loops -m32 -march=i386 -O3 -fomit-frame-pointer  20190805  20190803
6130722  ref  gcc -m32 -march=pentium-mmx -O2 -fomit-frame-pointer  20190805  20190803
6143020  ref  gcc -m32 -march=pentium -O2 -fomit-frame-pointer  20190805  20190803
6227825  ref  gcc -m32 -march=i386 -O -fomit-frame-pointer  20190805  20190803
6273932  ref  gcc -m32 -march=i386 -O3 -fomit-frame-pointer  20190805  20190803
6558288  ref  gcc -m32 -march=i386 -O2 -fomit-frame-pointer  20190805  20190803
8117313  ref  gcc -funroll-loops -m32 -march=barcelona -O3 -fomit-frame-pointer  20190805  20190803
8267596  ref  gcc -funroll-loops -m32 -march=barcelona -O2 -fomit-frame-pointer  20190805  20190803
8299976  ref  gcc -m32 -march=barcelona -O3 -fomit-frame-pointer  20190805  20190803
8351335  ref  gcc -funroll-loops -m32 -march=k8 -O2 -fomit-frame-pointer  20190805  20190803
8624399  ref  gcc -funroll-loops -m32 -march=barcelona -Os -fomit-frame-pointer  20190805  20190803
8640953  ref  gcc -m32 -march=k8 -O3 -fomit-frame-pointer  20190805  20190803
8689228  ref  gcc -m32 -march=barcelona -Os -fomit-frame-pointer  20190805  20190803
8691901  ref  gcc -funroll-loops -m32 -march=k8 -O3 -fomit-frame-pointer  20190805  20190803
8786063  ref  gcc -m32 -march=barcelona -O2 -fomit-frame-pointer  20190805  20190803
8925512  ref  gcc -m32 -march=k8 -Os -fomit-frame-pointer  20190805  20190803
8937574  ref  gcc -m32 -march=k8 -O2 -fomit-frame-pointer  20190805  20190803
8977819  ref  gcc -funroll-loops -m32 -march=k8 -Os -fomit-frame-pointer  20190805  20190803
9018385  ref  gcc -funroll-loops -m32 -march=pentium-m -O3 -fomit-frame-pointer  20190805  20190803
9065522  ref  gcc -funroll-loops -m32 -march=pentium-m -O2 -fomit-frame-pointer  20190805  20190803
9123863  ref  gcc -m32 -march=pentium-m -O3 -fomit-frame-pointer  20190805  20190803
9270215  ref  gcc -m32 -march=pentium-m -O2 -fomit-frame-pointer  20190805  20190803
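
The rows of the timing table above follow one fixed field order (time, implementation, compiler, benchmark date, SUPERCOP version), so they can be split mechanically. The following is a minimal sketch, not part of SUPERCOP: the pattern `ROW` and the helper `parse_row` are hypothetical names, and the pattern assumes every compiler command starts with `gcc` and that both trailing fields are 8-digit dates, as they are in this table. The unit of the Time column is not stated in this report, so the sketch only compares values.

```python
import re

# One timing row: time, implementation, compiler command line,
# benchmark date, SUPERCOP version. The \s* between fields lets the
# pattern accept rows with or without whitespace separating the fields.
ROW = re.compile(
    r"^(?P<time>\d+)\s*(?P<impl>[a-z0-9]+?)\s*(?P<compiler>gcc\b.*?)\s*"
    r"(?P<date>\d{8})\s*(?P<version>\d{8})$"
)


def parse_row(line):
    """Split one timing row into a dict; raise ValueError if it does not fit."""
    m = ROW.match(line.strip())
    if m is None:
        raise ValueError(f"unrecognised row: {line!r}")
    row = m.groupdict()
    row["time"] = int(row["time"])
    return row


# First and last rows of the table, one without and one with field spacing.
fastest = parse_row(
    "4258533refgcc -funroll-loops -m32 -march=pentium4 -O3 "
    "-fomit-frame-pointer2019080520190803"
)
slowest = parse_row(
    "9270215  ref  gcc -m32 -march=pentium-m -O2 -fomit-frame-pointer"
    "  20190805  20190803"
)

print(fastest["compiler"])
print(round(slowest["time"] / fastest["time"], 2))
```

Comparing the two parsed rows gives the spread between the slowest and fastest compiler settings for the ref implementation.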

Compiler output

Implementation: crypto_sign/dilithium3/avx2
Compiler: gcc -funroll-loops -m32 -O2 -fomit-frame-pointer
KeccakP-1600-times4-SIMD256.c: KeccakP-1600-times4-SIMD256.c: In function 'KeccakP1600times4_AddLanesAll':
KeccakP-1600-times4-SIMD256.c: KeccakP-1600-times4-SIMD256.c:135:40: warning: AVX vector return without AVX enabled changes the ABI [-Wpsabi]
KeccakP-1600-times4-SIMD256.c: #define Xor_In4( argIndex ) lanes0 = LOAD256u( curData0[argIndex]),\
KeccakP-1600-times4-SIMD256.c: ^
KeccakP-1600-times4-SIMD256.c: KeccakP-1600-times4-SIMD256.c:146:9: note: in expansion of macro 'Xor_In4'
KeccakP-1600-times4-SIMD256.c: Xor_In4( 0 );
KeccakP-1600-times4-SIMD256.c: ^~~~~~~
KeccakP-1600-times4-SIMD256.c: In file included from /usr/lib/gcc/x86_64-linux-gnu/7/include/immintrin.h:43:0,
KeccakP-1600-times4-SIMD256.c: from KeccakP-1600-times4-SIMD256.c:21:
KeccakP-1600-times4-SIMD256.c: /usr/lib/gcc/x86_64-linux-gnu/7/include/avx2intrin.h:913:1: error: inlining failed in call to always_inline '_mm256_xor_si256': target specific option mismatch
KeccakP-1600-times4-SIMD256.c: _mm256_xor_si256 (__m256i __A, __m256i __B)
KeccakP-1600-times4-SIMD256.c: ^~~~~~~~~~~~~~~~
KeccakP-1600-times4-SIMD256.c: KeccakP-1600-times4-SIMD256.c:55:41: note: called from here
KeccakP-1600-times4-SIMD256.c: #define XOReq256(a, b) a = _mm256_xor_si256(a, b)
KeccakP-1600-times4-SIMD256.c: ^~~~~~~~~~~~~~~~~~~~~~
KeccakP-1600-times4-SIMD256.c: KeccakP-1600-times4-SIMD256.c:143:33: note: in expansion of macro 'XOReq256'
KeccakP-1600-times4-SIMD256.c: XOReq256( stateAsLanes[argIndex+3], lanes3 )
KeccakP-1600-times4-SIMD256.c: ^~~~~~~~
KeccakP-1600-times4-SIMD256.c: KeccakP-1600-times4-SIMD256.c:149:9: note: in expansion of macro 'Xor_In4'
KeccakP-1600-times4-SIMD256.c: Xor_In4( 12 );
KeccakP-1600-times4-SIMD256.c: ^~~~~~~
KeccakP-1600-times4-SIMD256.c: In file included from /usr/lib/gcc/x86_64-linux-gnu/7/include/immintrin.h:43:0,
KeccakP-1600-times4-SIMD256.c: from KeccakP-1600-times4-SIMD256.c:21:
KeccakP-1600-times4-SIMD256.c: /usr/lib/gcc/x86_64-linux-gnu/7/include/avx2intrin.h:913:1: error: inlining failed in call to always_inline '_mm256_xor_si256': target specific option mismatch
KeccakP-1600-times4-SIMD256.c: _mm256_xor_si256 (__m256i __A, __m256i __B)
KeccakP-1600-times4-SIMD256.c: ...

Number of similar (compiler,implementation) pairs: 156, namely:
Compiler Implementations
gcc -funroll-loops -m32 -O2 -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -O3 -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -O -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -Os -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=athlon -O2 -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=athlon -O3 -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=athlon -O -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=athlon -Os -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=barcelona -O2 -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=barcelona -O3 -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=barcelona -O -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=barcelona -Os -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=i386 -O2 -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=i386 -O3 -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=i386 -O -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=i386 -Os -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=i486 -O2 -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=i486 -O3 -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=i486 -O -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=i486 -Os -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=k6-2 -O2 -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=k6-2 -O3 -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=k6-2 -O -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=k6-2 -Os -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=k6-3 -O2 -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=k6-3 -O3 -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=k6-3 -O -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=k6-3 -Os -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=k6 -O2 -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=k6 -O3 -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=k6 -O -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=k6 -Os -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=k8 -O2 -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=k8 -O3 -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=k8 -O -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=k8 -Os -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=nocona -O2 -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=nocona -O3 -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=nocona -O -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=nocona -Os -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=pentium-m -O2 -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=pentium-m -O3 -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=pentium-m -O -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=pentium-m -Os -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=pentium-mmx -O2 -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=pentium-mmx -O3 -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=pentium-mmx -O -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=pentium-mmx -Os -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=pentium2 -O2 -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=pentium2 -O3 -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=pentium2 -O -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=pentium2 -Os -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=pentium3 -O2 -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=pentium3 -O3 -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=pentium3 -O -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=pentium3 -Os -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=pentium4 -O2 -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=pentium4 -O3 -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=pentium4 -O -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=pentium4 -Os -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=pentium -O2 -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=pentium -O3 -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=pentium -O -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=pentium -Os -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=pentiumpro -O2 -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=pentiumpro -O3 -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=pentiumpro -O -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=pentiumpro -Os -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=prescott -O2 -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=prescott -O3 -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=prescott -O -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=prescott -Os -fomit-frame-pointer avx2
gcc -m32 -O2 -fomit-frame-pointer avx2
gcc -m32 -O3 -fomit-frame-pointer avx2
gcc -m32 -O -fomit-frame-pointer avx2
gcc -m32 -Os -fomit-frame-pointer avx2
gcc -m32 -march=athlon -O2 -fomit-frame-pointer avx2
gcc -m32 -march=athlon -O3 -fomit-frame-pointer avx2
gcc -m32 -march=athlon -O -fomit-frame-pointer avx2
gcc -m32 -march=athlon -Os -fomit-frame-pointer avx2
gcc -m32 -march=core2 -O2 -fomit-frame-pointer avx2
gcc -m32 -march=core2 -O3 -fomit-frame-pointer avx2
gcc -m32 -march=core2 -O -fomit-frame-pointer avx2
gcc -m32 -march=core2 -Os -fomit-frame-pointer avx2
gcc -m32 -march=core2 -msse4.1 -O2 -fomit-frame-pointer avx2
gcc -m32 -march=core2 -msse4.1 -O3 -fomit-frame-pointer avx2
gcc -m32 -march=core2 -msse4.1 -O -fomit-frame-pointer avx2
gcc -m32 -march=core2 -msse4.1 -Os -fomit-frame-pointer avx2
gcc -m32 -march=core2 -msse4 -O2 -fomit-frame-pointer avx2
gcc -m32 -march=core2 -msse4 -O3 -fomit-frame-pointer avx2
gcc -m32 -march=core2 -msse4 -O -fomit-frame-pointer avx2
gcc -m32 -march=core2 -msse4 -Os -fomit-frame-pointer avx2
gcc -m32 -march=corei7 -O2 -fomit-frame-pointer avx2
gcc -m32 -march=corei7 -O3 -fomit-frame-pointer avx2
gcc -m32 -march=corei7 -O -fomit-frame-pointer avx2
gcc -m32 -march=corei7 -Os -fomit-frame-pointer avx2
gcc -m32 -march=i386 -O2 -fomit-frame-pointer avx2
gcc -m32 -march=i386 -O3 -fomit-frame-pointer avx2
gcc -m32 -march=i386 -O -fomit-frame-pointer avx2
gcc -m32 -march=i386 -Os -fomit-frame-pointer avx2
gcc -m32 -march=i486 -O2 -fomit-frame-pointer avx2
gcc -m32 -march=i486 -O3 -fomit-frame-pointer avx2
gcc -m32 -march=i486 -O -fomit-frame-pointer avx2
gcc -m32 -march=i486 -Os -fomit-frame-pointer avx2
gcc -m32 -march=k6-2 -O2 -fomit-frame-pointer avx2
gcc -m32 -march=k6-2 -O3 -fomit-frame-pointer avx2
gcc -m32 -march=k6-2 -O -fomit-frame-pointer avx2
gcc -m32 -march=k6-2 -Os -fomit-frame-pointer avx2
gcc -m32 -march=k6-3 -O2 -fomit-frame-pointer avx2
gcc -m32 -march=k6-3 -O3 -fomit-frame-pointer avx2
gcc -m32 -march=k6-3 -O -fomit-frame-pointer avx2
gcc -m32 -march=k6-3 -Os -fomit-frame-pointer avx2
gcc -m32 -march=k6 -O2 -fomit-frame-pointer avx2
gcc -m32 -march=k6 -O3 -fomit-frame-pointer avx2
gcc -m32 -march=k6 -O -fomit-frame-pointer avx2
gcc -m32 -march=k6 -Os -fomit-frame-pointer avx2
gcc -m32 -march=k8 -O2 -fomit-frame-pointer avx2
gcc -m32 -march=k8 -O3 -fomit-frame-pointer avx2
gcc -m32 -march=k8 -O -fomit-frame-pointer avx2
gcc -m32 -march=k8 -Os -fomit-frame-pointer avx2
gcc -m32 -march=nocona -O2 -fomit-frame-pointer avx2
gcc -m32 -march=nocona -O3 -fomit-frame-pointer avx2
gcc -m32 -march=nocona -O -fomit-frame-pointer avx2
gcc -m32 -march=nocona -Os -fomit-frame-pointer avx2
gcc -m32 -march=pentium-m -O2 -fomit-frame-pointer avx2
gcc -m32 -march=pentium-m -O3 -fomit-frame-pointer avx2
gcc -m32 -march=pentium-m -O -fomit-frame-pointer avx2
gcc -m32 -march=pentium-m -Os -fomit-frame-pointer avx2
gcc -m32 -march=pentium-mmx -O2 -fomit-frame-pointer avx2
gcc -m32 -march=pentium-mmx -O3 -fomit-frame-pointer avx2
gcc -m32 -march=pentium-mmx -O -fomit-frame-pointer avx2
gcc -m32 -march=pentium-mmx -Os -fomit-frame-pointer avx2
gcc -m32 -march=pentium2 -O2 -fomit-frame-pointer avx2
gcc -m32 -march=pentium2 -O3 -fomit-frame-pointer avx2
gcc -m32 -march=pentium2 -O -fomit-frame-pointer avx2
gcc -m32 -march=pentium2 -Os -fomit-frame-pointer avx2
gcc -m32 -march=pentium3 -O2 -fomit-frame-pointer avx2
gcc -m32 -march=pentium3 -O3 -fomit-frame-pointer avx2
gcc -m32 -march=pentium3 -O -fomit-frame-pointer avx2
gcc -m32 -march=pentium3 -Os -fomit-frame-pointer avx2
gcc -m32 -march=pentium4 -O2 -fomit-frame-pointer avx2
gcc -m32 -march=pentium4 -O3 -fomit-frame-pointer avx2
gcc -m32 -march=pentium4 -O -fomit-frame-pointer avx2
gcc -m32 -march=pentium4 -Os -fomit-frame-pointer avx2
gcc -m32 -march=pentium -O2 -fomit-frame-pointer avx2
gcc -m32 -march=pentium -O3 -fomit-frame-pointer avx2
gcc -m32 -march=pentium -O -fomit-frame-pointer avx2
gcc -m32 -march=pentium -Os -fomit-frame-pointer avx2
gcc -m32 -march=pentiumpro -O2 -fomit-frame-pointer avx2
gcc -m32 -march=pentiumpro -O3 -fomit-frame-pointer avx2
gcc -m32 -march=pentiumpro -O -fomit-frame-pointer avx2
gcc -m32 -march=pentiumpro -Os -fomit-frame-pointer avx2
gcc -m32 -march=prescott -O2 -fomit-frame-pointer avx2
gcc -m32 -march=prescott -O3 -fomit-frame-pointer avx2
gcc -m32 -march=prescott -O -fomit-frame-pointer avx2
gcc -m32 -march=prescott -Os -fomit-frame-pointer avx2

Compiler output

Implementation: crypto_sign/dilithium3/avx2
Compiler: gcc -m32 -march=barcelona -O2 -fomit-frame-pointer
KeccakP-1600-times4-SIMD256.c: KeccakP-1600-times4-SIMD256.c: In function 'KeccakP1600times4_AddLanesAll':
KeccakP-1600-times4-SIMD256.c: KeccakP-1600-times4-SIMD256.c:135:40: warning: AVX vector return without AVX enabled changes the ABI [-Wpsabi]
KeccakP-1600-times4-SIMD256.c: #define Xor_In4( argIndex ) lanes0 = LOAD256u( curData0[argIndex]),\
KeccakP-1600-times4-SIMD256.c: ^
KeccakP-1600-times4-SIMD256.c: KeccakP-1600-times4-SIMD256.c:146:9: note: in expansion of macro 'Xor_In4'
KeccakP-1600-times4-SIMD256.c: Xor_In4( 0 );
KeccakP-1600-times4-SIMD256.c: ^~~~~~~
KeccakP-1600-times4-SIMD256.c: In file included from /usr/lib/gcc/x86_64-linux-gnu/7/include/immintrin.h:43:0,
KeccakP-1600-times4-SIMD256.c: from KeccakP-1600-times4-SIMD256.c:21:
KeccakP-1600-times4-SIMD256.c: /usr/lib/gcc/x86_64-linux-gnu/7/include/avx2intrin.h:913:1: error: inlining failed in call to always_inline '_mm256_xor_si256': target specific option mismatch
KeccakP-1600-times4-SIMD256.c: _mm256_xor_si256 (__m256i __A, __m256i __B)
KeccakP-1600-times4-SIMD256.c: ^~~~~~~~~~~~~~~~
KeccakP-1600-times4-SIMD256.c: KeccakP-1600-times4-SIMD256.c:55:41: note: called from here
KeccakP-1600-times4-SIMD256.c: #define XOReq256(a, b) a = _mm256_xor_si256(a, b)
KeccakP-1600-times4-SIMD256.c: ^~~~~~~~~~~~~~~~~~~~~~
KeccakP-1600-times4-SIMD256.c: KeccakP-1600-times4-SIMD256.c:143:33: note: in expansion of macro 'XOReq256'
KeccakP-1600-times4-SIMD256.c: XOReq256( stateAsLanes[argIndex+3], lanes3 )
KeccakP-1600-times4-SIMD256.c: ^~~~~~~~
KeccakP-1600-times4-SIMD256.c: KeccakP-1600-times4-SIMD256.c:149:9: note: in expansion of macro 'Xor_In4'
KeccakP-1600-times4-SIMD256.c: Xor_In4( 12 );
KeccakP-1600-times4-SIMD256.c: ^~~~~~~
KeccakP-1600-times4-SIMD256.c: In file included from /usr/lib/gcc/x86_64-linux-gnu/7/include/immintrin.h:43:0,
KeccakP-1600-times4-SIMD256.c: from KeccakP-1600-times4-SIMD256.c:21:
KeccakP-1600-times4-SIMD256.c: /usr/lib/gcc/x86_64-linux-gnu/7/include/avx2intrin.h:913:1: error: inlining failed in call to always_inline '_mm256_xor_si256': target specific option mismatch
KeccakP-1600-times4-SIMD256.c: _mm256_xor_si256 (__m256i __A, __m256i __B)
KeccakP-1600-times4-SIMD256.c: ...
KeccakP-1600-times4-SIMD256.c: KeccakP-1600-times4-SIMD256.c: In function 'KeccakP1600times4_AddLanesAll':
KeccakP-1600-times4-SIMD256.c: KeccakP-1600-times4-SIMD256.c:135:40: warning: AVX vector return without AVX enabled changes the ABI [-Wpsabi]
KeccakP-1600-times4-SIMD256.c: #define Xor_In4( argIndex ) lanes0 = LOAD256u( curData0[argIndex]),\
KeccakP-1600-times4-SIMD256.c: ^
KeccakP-1600-times4-SIMD256.c: KeccakP-1600-times4-SIMD256.c:146:9: note: in expansion of macro 'Xor_In4'
KeccakP-1600-times4-SIMD256.c: Xor_In4( 0 );
KeccakP-1600-times4-SIMD256.c: ^~~~~~~
KeccakP-1600-times4-SIMD256.c: In file included from /usr/lib/gcc/x86_64-linux-gnu/7/include/immintrin.h:43:0,
KeccakP-1600-times4-SIMD256.c: from KeccakP-1600-times4-SIMD256.c:21:
KeccakP-1600-times4-SIMD256.c: /usr/lib/gcc/x86_64-linux-gnu/7/include/avx2intrin.h:913:1: error: inlining failed in call to always_inline '_mm256_xor_si256': target specific option mismatch
KeccakP-1600-times4-SIMD256.c: _mm256_xor_si256 (__m256i __A, __m256i __B)
KeccakP-1600-times4-SIMD256.c: ^~~~~~~~~~~~~~~~
KeccakP-1600-times4-SIMD256.c: KeccakP-1600-times4-SIMD256.c:55:41: note: called from here
KeccakP-1600-times4-SIMD256.c: #define XOReq256(a, b) a = _mm256_xor_si256(a, b)
KeccakP-1600-times4-SIMD256.c: ^~~~~~~~~~~~~~~~~~~~~~
KeccakP-1600-times4-SIMD256.c: KeccakP-1600-times4-SIMD256.c:143:33: note: in expansion of macro 'XOReq256'
KeccakP-1600-times4-SIMD256.c: XOReq256( stateAsLanes[argIndex+3], lanes3 )
KeccakP-1600-times4-SIMD256.c: ^~~~~~~~
KeccakP-1600-times4-SIMD256.c: KeccakP-1600-times4-SIMD256.c:149:9: note: in expansion of macro 'Xor_In4'
KeccakP-1600-times4-SIMD256.c: Xor_In4( 12 );
KeccakP-1600-times4-SIMD256.c: ^~~~~~~
KeccakP-1600-times4-SIMD256.c: In file included from /usr/lib/gcc/x86_64-linux-gnu/7/include/immintrin.h:43:0,
KeccakP-1600-times4-SIMD256.c: from KeccakP-1600-times4-SIMD256.c:21:
KeccakP-1600-times4-SIMD256.c: /usr/lib/gcc/x86_64-linux-gnu/7/include/avx2intrin.h:913:1: error: inlining failed in call to always_inline '_mm256_xor_si256': target specific option mismatch
KeccakP-1600-times4-SIMD256.c: _mm256_xor_si256 (__m256i __A, __m256i __B)
KeccakP-1600-times4-SIMD256.c: ...

Number of similar (compiler,implementation) pairs: 4, namely:
Compiler Implementations
gcc -m32 -march=barcelona -O2 -fomit-frame-pointer avx2
gcc -m32 -march=barcelona -O3 -fomit-frame-pointer avx2
gcc -m32 -march=barcelona -O -fomit-frame-pointer avx2
gcc -m32 -march=barcelona -Os -fomit-frame-pointer avx2

Compiler output

Implementation: crypto_sign/dilithium3/avx2
Compiler: gcc -m32 -march=core-avx-i -O2 -fomit-frame-pointer
KeccakP-1600-times4-SIMD256.c: In file included from /usr/lib/gcc/x86_64-linux-gnu/7/include/immintrin.h:43:0,
KeccakP-1600-times4-SIMD256.c: from KeccakP-1600-times4-SIMD256.c:21:
KeccakP-1600-times4-SIMD256.c: KeccakP-1600-times4-SIMD256.c: In function 'KeccakP1600times4_AddLanesAll':
KeccakP-1600-times4-SIMD256.c: /usr/lib/gcc/x86_64-linux-gnu/7/include/avx2intrin.h:913:1: error: inlining failed in call to always_inline '_mm256_xor_si256': target specific option mismatch
KeccakP-1600-times4-SIMD256.c: _mm256_xor_si256 (__m256i __A, __m256i __B)
KeccakP-1600-times4-SIMD256.c: ^~~~~~~~~~~~~~~~
KeccakP-1600-times4-SIMD256.c: KeccakP-1600-times4-SIMD256.c:55:41: note: called from here
KeccakP-1600-times4-SIMD256.c: #define XOReq256(a, b) a = _mm256_xor_si256(a, b)
KeccakP-1600-times4-SIMD256.c: ^~~~~~~~~~~~~~~~~~~~~~
KeccakP-1600-times4-SIMD256.c: KeccakP-1600-times4-SIMD256.c:143:33: note: in expansion of macro 'XOReq256'
KeccakP-1600-times4-SIMD256.c: XOReq256( stateAsLanes[argIndex+3], lanes3 )
KeccakP-1600-times4-SIMD256.c: ^~~~~~~~
KeccakP-1600-times4-SIMD256.c: KeccakP-1600-times4-SIMD256.c:149:9: note: in expansion of macro 'Xor_In4'
KeccakP-1600-times4-SIMD256.c: Xor_In4( 12 );
KeccakP-1600-times4-SIMD256.c: ^~~~~~~
KeccakP-1600-times4-SIMD256.c: In file included from /usr/lib/gcc/x86_64-linux-gnu/7/include/immintrin.h:43:0,
KeccakP-1600-times4-SIMD256.c: from KeccakP-1600-times4-SIMD256.c:21:
KeccakP-1600-times4-SIMD256.c: /usr/lib/gcc/x86_64-linux-gnu/7/include/avx2intrin.h:913:1: error: inlining failed in call to always_inline '_mm256_xor_si256': target specific option mismatch
KeccakP-1600-times4-SIMD256.c: _mm256_xor_si256 (__m256i __A, __m256i __B)
KeccakP-1600-times4-SIMD256.c: ^~~~~~~~~~~~~~~~
KeccakP-1600-times4-SIMD256.c: KeccakP-1600-times4-SIMD256.c:55:41: note: called from here
KeccakP-1600-times4-SIMD256.c: #define XOReq256(a, b) a = _mm256_xor_si256(a, b)
KeccakP-1600-times4-SIMD256.c: ^~~~~~~~~~~~~~~~~~~~~~
KeccakP-1600-times4-SIMD256.c: KeccakP-1600-times4-SIMD256.c:142:33: note: in expansion of macro 'XOReq256'
KeccakP-1600-times4-SIMD256.c: XOReq256( stateAsLanes[argIndex+2], lanes2 ),\
KeccakP-1600-times4-SIMD256.c: ...

Number of similar (compiler,implementation) pairs: 8, namely:
Compiler Implementations
gcc -m32 -march=core-avx-i -O2 -fomit-frame-pointer avx2
gcc -m32 -march=core-avx-i -O3 -fomit-frame-pointer avx2
gcc -m32 -march=core-avx-i -O -fomit-frame-pointer avx2
gcc -m32 -march=core-avx-i -Os -fomit-frame-pointer avx2
gcc -m32 -march=corei7-avx -O2 -fomit-frame-pointer avx2
gcc -m32 -march=corei7-avx -O3 -fomit-frame-pointer avx2
gcc -m32 -march=corei7-avx -O -fomit-frame-pointer avx2
gcc -m32 -march=corei7-avx -Os -fomit-frame-pointer avx2

Compiler output

Implementation: crypto_sign/dilithium3/avx2
Compiler: gcc -m32 -march=core-avx2 -O2 -fomit-frame-pointer
invntt.s: invntt.s: Assembler messages:
invntt.s: invntt.s:47: Error: bad register name `%rip)'
invntt.s: invntt.s:48: Error: bad register name `%rip)'
invntt.s: invntt.s:49: Error: bad register name `%rip)'
invntt.s: invntt.s:52: Error: bad register name `%rsi)'
invntt.s: invntt.s:53: Error: bad register name `%rsi)'
invntt.s: invntt.s:54: Error: bad register name `%rsi)'
invntt.s: invntt.s:55: Error: bad register name `%rsi)'
invntt.s: invntt.s:58: Error: bad register name `%ymm8'
invntt.s: invntt.s:59: Error: bad register name `%ymm10'
invntt.s: invntt.s:59: Error: bad register name `%ymm10'
invntt.s: invntt.s:61: Error: bad register name `%ymm8'
invntt.s: invntt.s:61: Error: bad register name `%ymm8'
invntt.s: invntt.s:62: Error: bad register name `%ymm10'
invntt.s: invntt.s:62: Error: bad register name `%ymm10'
invntt.s: invntt.s:66: Error: bad register name `%ymm8'
invntt.s: invntt.s:67: Error: bad register name `%ymm10'
invntt.s: invntt.s:70: Error: bad register name `%rdx)'
invntt.s: invntt.s:71: Error: bad register name `%rdx)'
invntt.s: invntt.s:72: Error: bad register name `%ymm12'
invntt.s: invntt.s:73: Error: bad register name `%ymm13'
invntt.s: invntt.s:74: Error: bad register name `%ymm8'
invntt.s: invntt.s:76: Error: bad register name `%ymm12'
invntt.s: invntt.s:77: Error: bad register name `%ymm13'
invntt.s: invntt.s:78: Error: bad register name `%ymm9'
invntt.s: ...
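
Note on the assembler messages above: `%rip`, `%rsi`, `%rdx` and the vector registers `%ymm8` to `%ymm13` exist only in 64-bit mode, which is why an assembler run under `-m32` reports them as bad register names. The sketch below shows one way to flag such operands in AT&T-syntax assembly. It is an illustration only: the pattern `ONLY_64BIT`, the helper `registers_needing_64bit` and the sample instructions are hypothetical and are not taken from invntt.s, and the register list is not exhaustive.

```python
import re

# Registers that exist only in 64-bit mode: the r-prefixed general-purpose
# registers (including %rip and %r8..%r15 with their sub-registers) and
# the vector registers numbered 8 to 15. Not an exhaustive list.
ONLY_64BIT = re.compile(
    r"%(?:r(?:ax|bx|cx|dx|si|di|bp|sp|ip|(?:[89]|1[0-5])[dwb]?)"
    r"|[xy]mm(?:[89]|1[0-5]))\b"
)


def registers_needing_64bit(asm_line):
    """Return the 64-bit-only register operands found in one assembly line."""
    return ONLY_64BIT.findall(asm_line)


# Made-up sample instructions, for illustration only.
samples = [
    "vmovdqa some_constant(%rip), %ymm0",
    "vpmuludq %ymm8, %ymm10, %ymm10",
    "vpaddd %ymm1, %ymm2, %ymm3",
    "movl 4(%esp), %eax",
]
for line in samples:
    print(line, "->", registers_needing_64bit(line))
```

Lines for which the helper returns an empty list use only registers that are also available in 32-bit mode.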

Number of similar (compiler,implementation) pairs: 8, namely:
Compiler Implementations
gcc -m32 -march=core-avx2 -O2 -fomit-frame-pointer avx2
gcc -m32 -march=core-avx2 -O3 -fomit-frame-pointer avx2
gcc -m32 -march=core-avx2 -O -fomit-frame-pointer avx2
gcc -m32 -march=core-avx2 -Os -fomit-frame-pointer avx2
gcc -m32 -march=native -mtune=native -O2 -fomit-frame-pointer avx2
gcc -m32 -march=native -mtune=native -O3 -fomit-frame-pointer avx2
gcc -m32 -march=native -mtune=native -O -fomit-frame-pointer avx2
gcc -m32 -march=native -mtune=native -Os -fomit-frame-pointer avx2
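
The failure groups above line up with the instruction set that each `-march` value enables: targets without AVX show the `-Wpsabi` warning and the `_mm256_xor_si256` inlining error, AVX-only targets show the inlining error alone, and AVX2 targets get past the C files but fail in the assembler. The sketch below encodes that reading of the logs. It is an interpretation, not SUPERCOP behaviour: the function `expected_failure` and the two sets are hypothetical, and treating `-march=native` as AVX2-capable is an assumption based on its being listed together with `core-avx2` for this machine.

```python
import re

# -march values from this report that enable AVX2, or AVX without AVX2.
# Assumption: "native" resolves to an AVX2-capable target on this machine.
AVX2_MARCH = {"core-avx2", "native"}
AVX_ONLY_MARCH = {"core-avx-i", "corei7-avx"}


def expected_failure(compiler):
    """Guess which avx2 build failure a compiler command line runs into."""
    m = re.search(r"-march=(\S+)", compiler)
    march = m.group(1) if m else None  # no -march: baseline 32-bit target
    if march in AVX2_MARCH:
        return "assembler rejects 64-bit registers"
    if march in AVX_ONLY_MARCH:
        return "AVX2 intrinsic cannot be inlined"
    return "AVX2 intrinsic cannot be inlined, plus -Wpsabi warning"


for cmd in (
    "gcc -funroll-loops -m32 -O2 -fomit-frame-pointer",
    "gcc -m32 -march=barcelona -O2 -fomit-frame-pointer",
    "gcc -m32 -march=core-avx-i -O2 -fomit-frame-pointer",
    "gcc -m32 -march=core-avx2 -O2 -fomit-frame-pointer",
):
    print(cmd, "->", expected_failure(cmd))
```

Under this reading, the `barcelona` group differs from the first group only in that its log repeats the same diagnostics twice.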