Implementation notes: x86, titan0, crypto_sign/dilithium3

Computer: titan0
Architecture: x86
CPU ID: GenuineIntel-000306c3-bfebfbff
SUPERCOP version: 20190803
Operation: crypto_sign
Primitive: dilithium3
Time  Implementation  Compiler  Benchmark date  SUPERCOP version
4808588  ref  gcc -funroll-loops -m32 -march=pentium4 -O3 -fomit-frame-pointer  20190805  20190803
4837880  ref  gcc -m32 -march=core2 -msse4 -O3 -fomit-frame-pointer  20190805  20190803
4871164  ref  gcc -m32 -march=corei7 -O3 -fomit-frame-pointer  20190805  20190803
4880940  ref  gcc -m32 -march=core2 -msse4.1 -O3 -fomit-frame-pointer  20190805  20190803
4882448  ref  gcc -funroll-loops -m32 -march=pentium4 -O2 -fomit-frame-pointer  20190805  20190803
4920168  ref  gcc -m32 -march=core2 -O3 -fomit-frame-pointer  20190805  20190803
4944064  ref  gcc -funroll-loops -m32 -march=athlon -O3 -fomit-frame-pointer  20190805  20190803
4955400  ref  gcc -m32 -march=native -mtune=native -O3 -fomit-frame-pointer  20190805  20190803
4970988  ref  gcc -m32 -march=athlon -O3 -fomit-frame-pointer  20190805  20190803
4975204  ref  gcc -m32 -march=pentium4 -O3 -fomit-frame-pointer  20190805  20190803
4984908  ref  gcc -m32 -march=core-avx2 -O3 -fomit-frame-pointer  20190805  20190803
5024276  ref  gcc -funroll-loops -m32 -march=pentium4 -O -fomit-frame-pointer  20190805  20190803
5030240  ref  gcc -funroll-loops -m32 -march=athlon -O2 -fomit-frame-pointer  20190805  20190803
5063608  ref  gcc -funroll-loops -m32 -march=k8 -O -fomit-frame-pointer  20190805  20190803
5076740  ref  gcc -funroll-loops -m32 -march=pentium-m -O -fomit-frame-pointer  20190805  20190803
5113356  ref  gcc -funroll-loops -m32 -march=barcelona -O -fomit-frame-pointer  20190805  20190803
5140120  ref  gcc -funroll-loops -m32 -O3 -fomit-frame-pointer  20190805  20190803
5156092  ref  gcc -m32 -march=core-avx2 -O2 -fomit-frame-pointer  20190805  20190803
5183316  ref  gcc -m32 -march=corei7-avx -O3 -fomit-frame-pointer  20190805  20190803
5188564  ref  gcc -m32 -march=native -mtune=native -O2 -fomit-frame-pointer  20190805  20190803
5193660  ref  gcc -m32 -march=core-avx-i -O3 -fomit-frame-pointer  20190805  20190803
5212932  ref  gcc -m32 -march=core-avx-i -O2 -fomit-frame-pointer  20190805  20190803
5213476  ref  gcc -funroll-loops -m32 -O2 -fomit-frame-pointer  20190805  20190803
5220884  ref  gcc -m32 -march=corei7-avx -O2 -fomit-frame-pointer  20190805  20190803
5231688  ref  gcc -m32 -march=k6-2 -O3 -fomit-frame-pointer  20190805  20190803
5232332  ref  gcc -funroll-loops -m32 -march=pentium2 -O3 -fomit-frame-pointer  20190805  20190803
5232464  ref  gcc -funroll-loops -m32 -march=athlon -O -fomit-frame-pointer  20190805  20190803
5232748  ref  gcc -funroll-loops -m32 -march=k6 -O3 -fomit-frame-pointer  20190805  20190803
5235988  ref  gcc -funroll-loops -m32 -march=pentiumpro -O -fomit-frame-pointer  20190805  20190803
5241384  ref  gcc -m32 -march=k6-3 -O3 -fomit-frame-pointer  20190805  20190803
5246688  ref  gcc -funroll-loops -m32 -march=pentiumpro -O3 -fomit-frame-pointer  20190805  20190803
5250596  ref  gcc -m32 -march=core2 -msse4 -O2 -fomit-frame-pointer  20190805  20190803
5254236  ref  gcc -m32 -march=core2 -msse4.1 -O2 -fomit-frame-pointer  20190805  20190803
5254656  ref  gcc -funroll-loops -m32 -march=pentium2 -O -fomit-frame-pointer  20190805  20190803
5256060  ref  gcc -m32 -march=corei7 -O2 -fomit-frame-pointer  20190805  20190803
5257120  ref  gcc -funroll-loops -m32 -march=pentium3 -O -fomit-frame-pointer  20190805  20190803
5275432  ref  gcc -funroll-loops -m32 -march=pentium3 -O3 -fomit-frame-pointer  20190805  20190803
5281508  ref  gcc -funroll-loops -m32 -march=pentium2 -O2 -fomit-frame-pointer  20190805  20190803
5282348  ref  gcc -funroll-loops -m32 -march=k6-2 -O2 -fomit-frame-pointer  20190805  20190803
5288308  ref  gcc -funroll-loops -m32 -march=pentium3 -O2 -fomit-frame-pointer  20190805  20190803
5290272  ref  gcc -funroll-loops -m32 -march=pentiumpro -O2 -fomit-frame-pointer  20190805  20190803
5293784  ref  gcc -funroll-loops -m32 -march=k6-2 -O3 -fomit-frame-pointer  20190805  20190803
5297876  ref  gcc -funroll-loops -m32 -march=k6-3 -O2 -fomit-frame-pointer  20190805  20190803
5305324  ref  gcc -funroll-loops -m32 -march=k6-3 -O3 -fomit-frame-pointer  20190805  20190803
5308872  ref  gcc -m32 -march=pentium3 -O3 -fomit-frame-pointer  20190805  20190803
5323696  ref  gcc -m32 -march=pentium4 -O2 -fomit-frame-pointer  20190805  20190803
5324180  ref  gcc -m32 -march=pentium2 -O3 -fomit-frame-pointer  20190805  20190803
5325676  ref  gcc -m32 -O3 -fomit-frame-pointer  20190805  20190803
5331248  ref  gcc -funroll-loops -m32 -march=k6 -O2 -fomit-frame-pointer  20190805  20190803
5342844  ref  gcc -m32 -march=pentiumpro -O3 -fomit-frame-pointer  20190805  20190803
5345572  ref  gcc -m32 -march=core2 -O2 -fomit-frame-pointer  20190805  20190803
5416584  ref  gcc -funroll-loops -m32 -O -fomit-frame-pointer  20190805  20190803
5425048  ref  gcc -funroll-loops -m32 -march=prescott -O2 -fomit-frame-pointer  20190805  20190803
5440328  ref  gcc -m32 -march=core-avx2 -Os -fomit-frame-pointer  20190805  20190803
5450536  ref  gcc -m32 -march=native -mtune=native -Os -fomit-frame-pointer  20190805  20190803
5451536  ref  gcc -funroll-loops -m32 -march=nocona -O2 -fomit-frame-pointer  20190805  20190803
5462028  ref  gcc -m32 -march=core-avx-i -Os -fomit-frame-pointer  20190805  20190803
5464684  ref  gcc -m32 -march=prescott -O3 -fomit-frame-pointer  20190805  20190803
5470020  ref  gcc -m32 -march=corei7-avx -Os -fomit-frame-pointer  20190805  20190803
5471980  ref  gcc -m32 -march=pentium-m -O -fomit-frame-pointer  20190805  20190803
5472520  ref  gcc -m32 -march=pentium4 -O -fomit-frame-pointer  20190805  20190803
5472912  ref  gcc -funroll-loops -m32 -march=prescott -O3 -fomit-frame-pointer  20190805  20190803
5473300  ref  gcc -m32 -march=nocona -O3 -fomit-frame-pointer  20190805  20190803
5485780  ref  gcc -m32 -march=k6 -O3 -fomit-frame-pointer  20190805  20190803
5493816  ref  gcc -funroll-loops -m32 -march=nocona -O3 -fomit-frame-pointer  20190805  20190803
5493968  ref  gcc -funroll-loops -m32 -march=k6-3 -O -fomit-frame-pointer  20190805  20190803
5497720  ref  gcc -funroll-loops -m32 -march=k6 -O -fomit-frame-pointer  20190805  20190803
5501600  ref  gcc -funroll-loops -m32 -march=k6-2 -O -fomit-frame-pointer  20190805  20190803
5568924  ref  gcc -m32 -march=core-avx2 -O -fomit-frame-pointer  20190805  20190803
5574896  ref  gcc -m32 -march=native -mtune=native -O -fomit-frame-pointer  20190805  20190803
5589240  ref  gcc -m32 -march=pentium2 -O -fomit-frame-pointer  20190805  20190803
5600200  ref  gcc -m32 -march=pentiumpro -O -fomit-frame-pointer  20190805  20190803
5601164  ref  gcc -funroll-loops -m32 -march=prescott -O -fomit-frame-pointer  20190805  20190803
5605072  ref  gcc -m32 -march=pentium3 -O -fomit-frame-pointer  20190805  20190803
5608832  ref  gcc -funroll-loops -m32 -march=nocona -O -fomit-frame-pointer  20190805  20190803
5618396  ref  gcc -m32 -march=k8 -O -fomit-frame-pointer  20190805  20190803
5632080  ref  gcc -m32 -march=athlon -O2 -fomit-frame-pointer  20190805  20190803
5638156  ref  gcc -m32 -march=pentiumpro -O2 -fomit-frame-pointer  20190805  20190803
5654812  ref  gcc -m32 -march=pentium2 -O2 -fomit-frame-pointer  20190805  20190803
5673304  ref  gcc -m32 -march=pentium3 -O2 -fomit-frame-pointer  20190805  20190803
5691672  ref  gcc -m32 -march=barcelona -O -fomit-frame-pointer  20190805  20190803
5701184  ref  gcc -m32 -march=athlon -O -fomit-frame-pointer  20190805  20190803
5783340  ref  gcc -m32 -march=core2 -msse4.1 -O -fomit-frame-pointer  20190805  20190803
5787180  ref  gcc -m32 -march=corei7 -O -fomit-frame-pointer  20190805  20190803
5788840  ref  gcc -m32 -march=core-avx-i -O -fomit-frame-pointer  20190805  20190803
5794244  ref  gcc -m32 -march=core2 -O -fomit-frame-pointer  20190805  20190803
5794900  ref  gcc -m32 -march=corei7-avx -O -fomit-frame-pointer  20190805  20190803
5813572  ref  gcc -m32 -march=core2 -msse4 -O -fomit-frame-pointer  20190805  20190803
5826492  ref  gcc -funroll-loops -m32 -march=k6-3 -Os -fomit-frame-pointer  20190805  20190803
5842016  ref  gcc -funroll-loops -m32 -march=k6 -Os -fomit-frame-pointer  20190805  20190803
5844836  ref  gcc -funroll-loops -m32 -march=k6-2 -Os -fomit-frame-pointer  20190805  20190803
5905004  ref  gcc -m32 -march=k6-2 -Os -fomit-frame-pointer  20190805  20190803
5908144  ref  gcc -m32 -march=k6 -Os -fomit-frame-pointer  20190805  20190803
5915188  ref  gcc -m32 -O2 -fomit-frame-pointer  20190805  20190803
5921540  ref  gcc -m32 -march=k6-3 -Os -fomit-frame-pointer  20190805  20190803
5928816  ref  gcc -m32 -march=k6-3 -O2 -fomit-frame-pointer  20190805  20190803
5932820  ref  gcc -m32 -march=k6-2 -O2 -fomit-frame-pointer  20190805  20190803
5955076  ref  gcc -m32 -march=core2 -Os -fomit-frame-pointer  20190805  20190803
5956340  ref  gcc -funroll-loops -m32 -march=pentium -Os -fomit-frame-pointer  20190805  20190803
5956464  ref  gcc -m32 -march=k6 -O -fomit-frame-pointer  20190805  20190803
5957756  ref  gcc -funroll-loops -m32 -march=pentium-mmx -Os -fomit-frame-pointer  20190805  20190803
5958620  ref  gcc -m32 -march=k6 -O2 -fomit-frame-pointer  20190805  20190803
5962176  ref  gcc -m32 -march=k6-2 -O -fomit-frame-pointer  20190805  20190803
5978360  ref  gcc -m32 -O -fomit-frame-pointer  20190805  20190803
5980396  ref  gcc -m32 -march=k6-3 -O -fomit-frame-pointer  20190805  20190803
5987616  ref  gcc -m32 -march=nocona -O2 -fomit-frame-pointer  20190805  20190803
5996684  ref  gcc -funroll-loops -m32 -march=i486 -Os -fomit-frame-pointer  20190805  20190803
5998676  ref  gcc -funroll-loops -m32 -march=i386 -Os -fomit-frame-pointer  20190805  20190803
6002892  ref  gcc -m32 -march=prescott -O2 -fomit-frame-pointer  20190805  20190803
6007400  ref  gcc -m32 -march=pentium -Os -fomit-frame-pointer  20190805  20190803
6008524  ref  gcc -m32 -march=core2 -msse4 -Os -fomit-frame-pointer  20190805  20190803
6009496  ref  gcc -m32 -march=pentium-mmx -Os -fomit-frame-pointer  20190805  20190803
6010884  ref  gcc -m32 -Os -fomit-frame-pointer  20190805  20190803
6011968  ref  gcc -m32 -march=corei7 -Os -fomit-frame-pointer  20190805  20190803
6019976  ref  gcc -m32 -march=athlon -Os -fomit-frame-pointer  20190805  20190803
6028988  ref  gcc -m32 -march=core2 -msse4.1 -Os -fomit-frame-pointer  20190805  20190803
6043944  ref  gcc -m32 -march=i486 -Os -fomit-frame-pointer  20190805  20190803
6044860  ref  gcc -m32 -march=i386 -Os -fomit-frame-pointer  20190805  20190803
6052332  ref  gcc -funroll-loops -m32 -march=pentium-m -Os -fomit-frame-pointer  20190805  20190803
6054864  ref  gcc -funroll-loops -m32 -march=pentiumpro -Os -fomit-frame-pointer  20190805  20190803
6061748  ref  gcc -funroll-loops -m32 -march=pentium2 -Os -fomit-frame-pointer  20190805  20190803
6063036  ref  gcc -funroll-loops -m32 -Os -fomit-frame-pointer  20190805  20190803
6067488  ref  gcc -m32 -march=pentium2 -Os -fomit-frame-pointer  20190805  20190803
6067604  ref  gcc -m32 -march=pentiumpro -Os -fomit-frame-pointer  20190805  20190803
6068876  ref  gcc -m32 -march=nocona -Os -fomit-frame-pointer  20190805  20190803
6068884  ref  gcc -m32 -march=prescott -Os -fomit-frame-pointer  20190805  20190803
6070312  ref  gcc -funroll-loops -m32 -march=pentium3 -Os -fomit-frame-pointer  20190805  20190803
6073536  ref  gcc -m32 -march=prescott -O -fomit-frame-pointer  20190805  20190803
6076548  ref  gcc -m32 -march=pentium3 -Os -fomit-frame-pointer  20190805  20190803
6078208  ref  gcc -m32 -march=pentium4 -Os -fomit-frame-pointer  20190805  20190803
6080940  ref  gcc -m32 -march=nocona -O -fomit-frame-pointer  20190805  20190803
6086996  ref  gcc -funroll-loops -m32 -march=athlon -Os -fomit-frame-pointer  20190805  20190803
6103172  ref  gcc -funroll-loops -m32 -march=nocona -Os -fomit-frame-pointer  20190805  20190803
6105988  ref  gcc -funroll-loops -m32 -march=prescott -Os -fomit-frame-pointer  20190805  20190803
6117212  ref  gcc -funroll-loops -m32 -march=pentium4 -Os -fomit-frame-pointer  20190805  20190803
6129112  ref  gcc -m32 -march=pentium-m -Os -fomit-frame-pointer  20190805  20190803
6180976  ref  gcc -funroll-loops -m32 -march=i486 -O3 -fomit-frame-pointer  20190805  20190803
6203440  ref  gcc -funroll-loops -m32 -march=i486 -O2 -fomit-frame-pointer  20190805  20190803
6263892  ref  gcc -funroll-loops -m32 -march=pentium-mmx -O2 -fomit-frame-pointer  20190805  20190803
6295168  ref  gcc -funroll-loops -m32 -march=pentium -O2 -fomit-frame-pointer  20190805  20190803
6304312  ref  gcc -funroll-loops -m32 -march=pentium -O3 -fomit-frame-pointer  20190805  20190803
6329620  ref  gcc -funroll-loops -m32 -march=pentium-mmx -O3 -fomit-frame-pointer  20190805  20190803
6411872  ref  gcc -funroll-loops -m32 -march=pentium-mmx -O -fomit-frame-pointer  20190805  20190803
6418756  ref  gcc -funroll-loops -m32 -march=pentium -O -fomit-frame-pointer  20190805  20190803
6425804  ref  gcc -funroll-loops -m32 -march=i486 -O -fomit-frame-pointer  20190805  20190803
6465488  ref  gcc -m32 -march=i486 -O3 -fomit-frame-pointer  20190805  20190803
6605828  ref  gcc -m32 -march=pentium-mmx -O3 -fomit-frame-pointer  20190805  20190803
6614480  ref  gcc -m32 -march=pentium -O3 -fomit-frame-pointer  20190805  20190803
6684036  ref  gcc -m32 -march=i486 -O2 -fomit-frame-pointer  20190805  20190803
6707472  ref  gcc -funroll-loops -m32 -march=i386 -O -fomit-frame-pointer  20190805  20190803
6849644  ref  gcc -m32 -march=i486 -O -fomit-frame-pointer  20190805  20190803
6951080  ref  gcc -m32 -march=pentium-mmx -O2 -fomit-frame-pointer  20190805  20190803
6956272  ref  gcc -m32 -march=pentium -O2 -fomit-frame-pointer  20190805  20190803
6964940  ref  gcc -m32 -march=pentium-mmx -O -fomit-frame-pointer  20190805  20190803
6969108  ref  gcc -m32 -march=pentium -O -fomit-frame-pointer  20190805  20190803
6998748  ref  gcc -funroll-loops -m32 -march=i386 -O2 -fomit-frame-pointer  20190805  20190803
7040892  ref  gcc -funroll-loops -m32 -march=i386 -O3 -fomit-frame-pointer  20190805  20190803
7137996  ref  gcc -m32 -march=i386 -O -fomit-frame-pointer  20190805  20190803
7183416  ref  gcc -m32 -march=i386 -O3 -fomit-frame-pointer  20190805  20190803
7491812  ref  gcc -m32 -march=i386 -O2 -fomit-frame-pointer  20190805  20190803
8452224  ref  gcc -m32 -march=pentium-m -O3 -fomit-frame-pointer  20190805  20190803
8462816  ref  gcc -funroll-loops -m32 -march=pentium-m -O3 -fomit-frame-pointer  20190805  20190803
8526472  ref  gcc -funroll-loops -m32 -march=pentium-m -O2 -fomit-frame-pointer  20190805  20190803
8610120  ref  gcc -funroll-loops -m32 -march=barcelona -O3 -fomit-frame-pointer  20190805  20190803
8725624  ref  gcc -funroll-loops -m32 -march=barcelona -O2 -fomit-frame-pointer  20190805  20190803
8731232  ref  gcc -m32 -march=pentium-m -O2 -fomit-frame-pointer  20190805  20190803
8775260  ref  gcc -m32 -march=barcelona -O3 -fomit-frame-pointer  20190805  20190803
8902348  ref  gcc -funroll-loops -m32 -march=k8 -O2 -fomit-frame-pointer  20190805  20190803
9123008  ref  gcc -funroll-loops -m32 -march=k8 -O3 -fomit-frame-pointer  20190805  20190803
9184584  ref  gcc -m32 -march=k8 -O3 -fomit-frame-pointer  20190805  20190803
9283048  ref  gcc -funroll-loops -m32 -march=barcelona -Os -fomit-frame-pointer  20190805  20190803
9353260  ref  gcc -m32 -march=barcelona -O2 -fomit-frame-pointer  20190805  20190803
9375376  ref  gcc -m32 -march=k8 -O2 -fomit-frame-pointer  20190805  20190803
9441140  ref  gcc -m32 -march=barcelona -Os -fomit-frame-pointer  20190805  20190803
9546388  ref  gcc -funroll-loops -m32 -march=k8 -Os -fomit-frame-pointer  20190805  20190803
9571312  ref  gcc -m32 -march=k8 -Os -fomit-frame-pointer  20190805  20190803
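
The rows above can be summarised programmatically. What follows is a minimal Python sketch, not part of SUPERCOP; the name `parse_row` and the regular expression are assumptions made for illustration. It splits one row into its five columns (time, implementation, compiler, benchmark date, SUPERCOP version), tolerating rows whose columns run together, and reports the spread between the fastest and slowest of two sample rows copied from the table.

```python
import re

# Hypothetical helper, not part of SUPERCOP: split one results row into its
# five columns. Whitespace between columns is optional, so rows whose
# columns run together still parse.
ROW = re.compile(
    r"^(?P<time>\d+)\s*"
    r"(?P<impl>[a-z]\w*?)\s*"
    r"(?P<compiler>gcc\b.*?)\s*"
    r"(?P<date>\d{8})\s*"
    r"(?P<version>\d{8})$"
)

def parse_row(line):
    """Return a dict with time (int), impl, compiler, date and version."""
    m = ROW.match(line.strip())
    if m is None:
        raise ValueError(f"unrecognised row: {line!r}")
    row = m.groupdict()
    row["time"] = int(row["time"])
    return row

# First and last rows of the table above.
rows = [parse_row(s) for s in (
    "4808588  ref  gcc -funroll-loops -m32 -march=pentium4 -O3 -fomit-frame-pointer  20190805  20190803",
    "9571312  ref  gcc -m32 -march=k8 -Os -fomit-frame-pointer  20190805  20190803",
)]
fastest = min(rows, key=lambda r: r["time"])
slowest = max(rows, key=lambda r: r["time"])
print(fastest["compiler"])
print(f"slowest/fastest = {slowest['time'] / fastest['time']:.2f}")
```

Feeding every row of the table through `parse_row` in the same way would allow grouping by flag, for example comparing entries with and without -funroll-loops.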

Compiler output

Implementation: crypto_sign/dilithium3/avx2
Compiler: gcc -funroll-loops -m32 -O2 -fomit-frame-pointer
KeccakP-1600-times4-SIMD256.c: KeccakP-1600-times4-SIMD256.c: In function 'KeccakP1600times4_AddLanesAll':
KeccakP-1600-times4-SIMD256.c: KeccakP-1600-times4-SIMD256.c:135:40: warning: AVX vector return without AVX enabled changes the ABI [-Wpsabi]
KeccakP-1600-times4-SIMD256.c: #define Xor_In4( argIndex ) lanes0 = LOAD256u( curData0[argIndex]),\
KeccakP-1600-times4-SIMD256.c: ^
KeccakP-1600-times4-SIMD256.c: KeccakP-1600-times4-SIMD256.c:146:9: note: in expansion of macro 'Xor_In4'
KeccakP-1600-times4-SIMD256.c: Xor_In4( 0 );
KeccakP-1600-times4-SIMD256.c: ^~~~~~~
KeccakP-1600-times4-SIMD256.c: In file included from /usr/lib/gcc/x86_64-linux-gnu/7/include/immintrin.h:43:0,
KeccakP-1600-times4-SIMD256.c: from KeccakP-1600-times4-SIMD256.c:21:
KeccakP-1600-times4-SIMD256.c: /usr/lib/gcc/x86_64-linux-gnu/7/include/avx2intrin.h:913:1: error: inlining failed in call to always_inline '_mm256_xor_si256': target specific option mismatch
KeccakP-1600-times4-SIMD256.c: _mm256_xor_si256 (__m256i __A, __m256i __B)
KeccakP-1600-times4-SIMD256.c: ^~~~~~~~~~~~~~~~
KeccakP-1600-times4-SIMD256.c: KeccakP-1600-times4-SIMD256.c:55:41: note: called from here
KeccakP-1600-times4-SIMD256.c: #define XOReq256(a, b) a = _mm256_xor_si256(a, b)
KeccakP-1600-times4-SIMD256.c: ^~~~~~~~~~~~~~~~~~~~~~
KeccakP-1600-times4-SIMD256.c: KeccakP-1600-times4-SIMD256.c:143:33: note: in expansion of macro 'XOReq256'
KeccakP-1600-times4-SIMD256.c: XOReq256( stateAsLanes[argIndex+3], lanes3 )
KeccakP-1600-times4-SIMD256.c: ^~~~~~~~
KeccakP-1600-times4-SIMD256.c: KeccakP-1600-times4-SIMD256.c:149:9: note: in expansion of macro 'Xor_In4'
KeccakP-1600-times4-SIMD256.c: Xor_In4( 12 );
KeccakP-1600-times4-SIMD256.c: ^~~~~~~
KeccakP-1600-times4-SIMD256.c: In file included from /usr/lib/gcc/x86_64-linux-gnu/7/include/immintrin.h:43:0,
KeccakP-1600-times4-SIMD256.c: from KeccakP-1600-times4-SIMD256.c:21:
KeccakP-1600-times4-SIMD256.c: /usr/lib/gcc/x86_64-linux-gnu/7/include/avx2intrin.h:913:1: error: inlining failed in call to always_inline '_mm256_xor_si256': target specific option mismatch
KeccakP-1600-times4-SIMD256.c: _mm256_xor_si256 (__m256i __A, __m256i __B)
KeccakP-1600-times4-SIMD256.c: ...

Number of similar (compiler,implementation) pairs: 156, namely:
Compiler Implementations
gcc -funroll-loops -m32 -O2 -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -O3 -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -O -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -Os -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=athlon -O2 -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=athlon -O3 -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=athlon -O -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=athlon -Os -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=barcelona -O2 -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=barcelona -O3 -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=barcelona -O -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=barcelona -Os -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=i386 -O2 -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=i386 -O3 -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=i386 -O -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=i386 -Os -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=i486 -O2 -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=i486 -O3 -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=i486 -O -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=i486 -Os -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=k6-2 -O2 -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=k6-2 -O3 -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=k6-2 -O -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=k6-2 -Os -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=k6-3 -O2 -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=k6-3 -O3 -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=k6-3 -O -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=k6-3 -Os -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=k6 -O2 -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=k6 -O3 -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=k6 -O -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=k6 -Os -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=k8 -O2 -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=k8 -O3 -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=k8 -O -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=k8 -Os -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=nocona -O2 -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=nocona -O3 -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=nocona -O -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=nocona -Os -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=pentium-m -O2 -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=pentium-m -O3 -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=pentium-m -O -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=pentium-m -Os -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=pentium-mmx -O2 -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=pentium-mmx -O3 -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=pentium-mmx -O -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=pentium-mmx -Os -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=pentium2 -O2 -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=pentium2 -O3 -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=pentium2 -O -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=pentium2 -Os -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=pentium3 -O2 -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=pentium3 -O3 -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=pentium3 -O -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=pentium3 -Os -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=pentium4 -O2 -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=pentium4 -O3 -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=pentium4 -O -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=pentium4 -Os -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=pentium -O2 -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=pentium -O3 -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=pentium -O -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=pentium -Os -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=pentiumpro -O2 -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=pentiumpro -O3 -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=pentiumpro -O -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=pentiumpro -Os -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=prescott -O2 -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=prescott -O3 -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=prescott -O -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=prescott -Os -fomit-frame-pointer avx2
gcc -m32 -O2 -fomit-frame-pointer avx2
gcc -m32 -O3 -fomit-frame-pointer avx2
gcc -m32 -O -fomit-frame-pointer avx2
gcc -m32 -Os -fomit-frame-pointer avx2
gcc -m32 -march=athlon -O2 -fomit-frame-pointer avx2
gcc -m32 -march=athlon -O3 -fomit-frame-pointer avx2
gcc -m32 -march=athlon -O -fomit-frame-pointer avx2
gcc -m32 -march=athlon -Os -fomit-frame-pointer avx2
gcc -m32 -march=core2 -O2 -fomit-frame-pointer avx2
gcc -m32 -march=core2 -O3 -fomit-frame-pointer avx2
gcc -m32 -march=core2 -O -fomit-frame-pointer avx2
gcc -m32 -march=core2 -Os -fomit-frame-pointer avx2
gcc -m32 -march=core2 -msse4.1 -O2 -fomit-frame-pointer avx2
gcc -m32 -march=core2 -msse4.1 -O3 -fomit-frame-pointer avx2
gcc -m32 -march=core2 -msse4.1 -O -fomit-frame-pointer avx2
gcc -m32 -march=core2 -msse4.1 -Os -fomit-frame-pointer avx2
gcc -m32 -march=core2 -msse4 -O2 -fomit-frame-pointer avx2
gcc -m32 -march=core2 -msse4 -O3 -fomit-frame-pointer avx2
gcc -m32 -march=core2 -msse4 -O -fomit-frame-pointer avx2
gcc -m32 -march=core2 -msse4 -Os -fomit-frame-pointer avx2
gcc -m32 -march=corei7 -O2 -fomit-frame-pointer avx2
gcc -m32 -march=corei7 -O3 -fomit-frame-pointer avx2
gcc -m32 -march=corei7 -O -fomit-frame-pointer avx2
gcc -m32 -march=corei7 -Os -fomit-frame-pointer avx2
gcc -m32 -march=i386 -O2 -fomit-frame-pointer avx2
gcc -m32 -march=i386 -O3 -fomit-frame-pointer avx2
gcc -m32 -march=i386 -O -fomit-frame-pointer avx2
gcc -m32 -march=i386 -Os -fomit-frame-pointer avx2
gcc -m32 -march=i486 -O2 -fomit-frame-pointer avx2
gcc -m32 -march=i486 -O3 -fomit-frame-pointer avx2
gcc -m32 -march=i486 -O -fomit-frame-pointer avx2
gcc -m32 -march=i486 -Os -fomit-frame-pointer avx2
gcc -m32 -march=k6-2 -O2 -fomit-frame-pointer avx2
gcc -m32 -march=k6-2 -O3 -fomit-frame-pointer avx2
gcc -m32 -march=k6-2 -O -fomit-frame-pointer avx2
gcc -m32 -march=k6-2 -Os -fomit-frame-pointer avx2
gcc -m32 -march=k6-3 -O2 -fomit-frame-pointer avx2
gcc -m32 -march=k6-3 -O3 -fomit-frame-pointer avx2
gcc -m32 -march=k6-3 -O -fomit-frame-pointer avx2
gcc -m32 -march=k6-3 -Os -fomit-frame-pointer avx2
gcc -m32 -march=k6 -O2 -fomit-frame-pointer avx2
gcc -m32 -march=k6 -O3 -fomit-frame-pointer avx2
gcc -m32 -march=k6 -O -fomit-frame-pointer avx2
gcc -m32 -march=k6 -Os -fomit-frame-pointer avx2
gcc -m32 -march=k8 -O2 -fomit-frame-pointer avx2
gcc -m32 -march=k8 -O3 -fomit-frame-pointer avx2
gcc -m32 -march=k8 -O -fomit-frame-pointer avx2
gcc -m32 -march=k8 -Os -fomit-frame-pointer avx2
gcc -m32 -march=nocona -O2 -fomit-frame-pointer avx2
gcc -m32 -march=nocona -O3 -fomit-frame-pointer avx2
gcc -m32 -march=nocona -O -fomit-frame-pointer avx2
gcc -m32 -march=nocona -Os -fomit-frame-pointer avx2
gcc -m32 -march=pentium-m -O2 -fomit-frame-pointer avx2
gcc -m32 -march=pentium-m -O3 -fomit-frame-pointer avx2
gcc -m32 -march=pentium-m -O -fomit-frame-pointer avx2
gcc -m32 -march=pentium-m -Os -fomit-frame-pointer avx2
gcc -m32 -march=pentium-mmx -O2 -fomit-frame-pointer avx2
gcc -m32 -march=pentium-mmx -O3 -fomit-frame-pointer avx2
gcc -m32 -march=pentium-mmx -O -fomit-frame-pointer avx2
gcc -m32 -march=pentium-mmx -Os -fomit-frame-pointer avx2
gcc -m32 -march=pentium2 -O2 -fomit-frame-pointer avx2
gcc -m32 -march=pentium2 -O3 -fomit-frame-pointer avx2
gcc -m32 -march=pentium2 -O -fomit-frame-pointer avx2
gcc -m32 -march=pentium2 -Os -fomit-frame-pointer avx2
gcc -m32 -march=pentium3 -O2 -fomit-frame-pointer avx2
gcc -m32 -march=pentium3 -O3 -fomit-frame-pointer avx2
gcc -m32 -march=pentium3 -O -fomit-frame-pointer avx2
gcc -m32 -march=pentium3 -Os -fomit-frame-pointer avx2
gcc -m32 -march=pentium4 -O2 -fomit-frame-pointer avx2
gcc -m32 -march=pentium4 -O3 -fomit-frame-pointer avx2
gcc -m32 -march=pentium4 -O -fomit-frame-pointer avx2
gcc -m32 -march=pentium4 -Os -fomit-frame-pointer avx2
gcc -m32 -march=pentium -O2 -fomit-frame-pointer avx2
gcc -m32 -march=pentium -O3 -fomit-frame-pointer avx2
gcc -m32 -march=pentium -O -fomit-frame-pointer avx2
gcc -m32 -march=pentium -Os -fomit-frame-pointer avx2
gcc -m32 -march=pentiumpro -O2 -fomit-frame-pointer avx2
gcc -m32 -march=pentiumpro -O3 -fomit-frame-pointer avx2
gcc -m32 -march=pentiumpro -O -fomit-frame-pointer avx2
gcc -m32 -march=pentiumpro -Os -fomit-frame-pointer avx2
gcc -m32 -march=prescott -O2 -fomit-frame-pointer avx2
gcc -m32 -march=prescott -O3 -fomit-frame-pointer avx2
gcc -m32 -march=prescott -O -fomit-frame-pointer avx2
gcc -m32 -march=prescott -Os -fomit-frame-pointer avx2

Compiler output

Implementation: crypto_sign/dilithium3/avx2
Compiler: gcc -m32 -march=barcelona -O2 -fomit-frame-pointer
KeccakP-1600-times4-SIMD256.c: KeccakP-1600-times4-SIMD256.c: In function 'KeccakP1600times4_AddLanesAll':
KeccakP-1600-times4-SIMD256.c: KeccakP-1600-times4-SIMD256.c:135:40: warning: AVX vector return without AVX enabled changes the ABI [-Wpsabi]
KeccakP-1600-times4-SIMD256.c: #define Xor_In4( argIndex ) lanes0 = LOAD256u( curData0[argIndex]),\
KeccakP-1600-times4-SIMD256.c: ^
KeccakP-1600-times4-SIMD256.c: KeccakP-1600-times4-SIMD256.c:146:9: note: in expansion of macro 'Xor_In4'
KeccakP-1600-times4-SIMD256.c: Xor_In4( 0 );
KeccakP-1600-times4-SIMD256.c: ^~~~~~~
KeccakP-1600-times4-SIMD256.c: In file included from /usr/lib/gcc/x86_64-linux-gnu/7/include/immintrin.h:43:0,
KeccakP-1600-times4-SIMD256.c: from KeccakP-1600-times4-SIMD256.c:21:
KeccakP-1600-times4-SIMD256.c: /usr/lib/gcc/x86_64-linux-gnu/7/include/avx2intrin.h:913:1: error: inlining failed in call to always_inline '_mm256_xor_si256': target specific option mismatch
KeccakP-1600-times4-SIMD256.c: _mm256_xor_si256 (__m256i __A, __m256i __B)
KeccakP-1600-times4-SIMD256.c: ^~~~~~~~~~~~~~~~
KeccakP-1600-times4-SIMD256.c: KeccakP-1600-times4-SIMD256.c:55:41: note: called from here
KeccakP-1600-times4-SIMD256.c: #define XOReq256(a, b) a = _mm256_xor_si256(a, b)
KeccakP-1600-times4-SIMD256.c: ^~~~~~~~~~~~~~~~~~~~~~
KeccakP-1600-times4-SIMD256.c: KeccakP-1600-times4-SIMD256.c:143:33: note: in expansion of macro 'XOReq256'
KeccakP-1600-times4-SIMD256.c: XOReq256( stateAsLanes[argIndex+3], lanes3 )
KeccakP-1600-times4-SIMD256.c: ^~~~~~~~
KeccakP-1600-times4-SIMD256.c: KeccakP-1600-times4-SIMD256.c:149:9: note: in expansion of macro 'Xor_In4'
KeccakP-1600-times4-SIMD256.c: Xor_In4( 12 );
KeccakP-1600-times4-SIMD256.c: ^~~~~~~
KeccakP-1600-times4-SIMD256.c: In file included from /usr/lib/gcc/x86_64-linux-gnu/7/include/immintrin.h:43:0,
KeccakP-1600-times4-SIMD256.c: from KeccakP-1600-times4-SIMD256.c:21:
KeccakP-1600-times4-SIMD256.c: /usr/lib/gcc/x86_64-linux-gnu/7/include/avx2intrin.h:913:1: error: inlining failed in call to always_inline '_mm256_xor_si256': target specific option mismatch
KeccakP-1600-times4-SIMD256.c: _mm256_xor_si256 (__m256i __A, __m256i __B)
KeccakP-1600-times4-SIMD256.c: ...
KeccakP-1600-times4-SIMD256.c: KeccakP-1600-times4-SIMD256.c: In function 'KeccakP1600times4_AddLanesAll':
KeccakP-1600-times4-SIMD256.c: KeccakP-1600-times4-SIMD256.c:135:40: warning: AVX vector return without AVX enabled changes the ABI [-Wpsabi]
KeccakP-1600-times4-SIMD256.c: #define Xor_In4( argIndex ) lanes0 = LOAD256u( curData0[argIndex]),\
KeccakP-1600-times4-SIMD256.c: ^
KeccakP-1600-times4-SIMD256.c: KeccakP-1600-times4-SIMD256.c:146:9: note: in expansion of macro 'Xor_In4'
KeccakP-1600-times4-SIMD256.c: Xor_In4( 0 );
KeccakP-1600-times4-SIMD256.c: ^~~~~~~
KeccakP-1600-times4-SIMD256.c: In file included from /usr/lib/gcc/x86_64-linux-gnu/7/include/immintrin.h:43:0,
KeccakP-1600-times4-SIMD256.c: from KeccakP-1600-times4-SIMD256.c:21:
KeccakP-1600-times4-SIMD256.c: /usr/lib/gcc/x86_64-linux-gnu/7/include/avx2intrin.h:913:1: error: inlining failed in call to always_inline '_mm256_xor_si256': target specific option mismatch
KeccakP-1600-times4-SIMD256.c: _mm256_xor_si256 (__m256i __A, __m256i __B)
KeccakP-1600-times4-SIMD256.c: ^~~~~~~~~~~~~~~~
KeccakP-1600-times4-SIMD256.c: KeccakP-1600-times4-SIMD256.c:55:41: note: called from here
KeccakP-1600-times4-SIMD256.c: #define XOReq256(a, b) a = _mm256_xor_si256(a, b)
KeccakP-1600-times4-SIMD256.c: ^~~~~~~~~~~~~~~~~~~~~~
KeccakP-1600-times4-SIMD256.c: KeccakP-1600-times4-SIMD256.c:143:33: note: in expansion of macro 'XOReq256'
KeccakP-1600-times4-SIMD256.c: XOReq256( stateAsLanes[argIndex+3], lanes3 )
KeccakP-1600-times4-SIMD256.c: ^~~~~~~~
KeccakP-1600-times4-SIMD256.c: KeccakP-1600-times4-SIMD256.c:149:9: note: in expansion of macro 'Xor_In4'
KeccakP-1600-times4-SIMD256.c: Xor_In4( 12 );
KeccakP-1600-times4-SIMD256.c: ^~~~~~~
KeccakP-1600-times4-SIMD256.c: In file included from /usr/lib/gcc/x86_64-linux-gnu/7/include/immintrin.h:43:0,
KeccakP-1600-times4-SIMD256.c: from KeccakP-1600-times4-SIMD256.c:21:
KeccakP-1600-times4-SIMD256.c: /usr/lib/gcc/x86_64-linux-gnu/7/include/avx2intrin.h:913:1: error: inlining failed in call to always_inline '_mm256_xor_si256': target specific option mismatch
KeccakP-1600-times4-SIMD256.c: _mm256_xor_si256 (__m256i __A, __m256i __B)
KeccakP-1600-times4-SIMD256.c: ...

Number of similar (compiler,implementation) pairs: 4, namely:
Compiler Implementations
gcc -m32 -march=barcelona -O2 -fomit-frame-pointer avx2
gcc -m32 -march=barcelona -O3 -fomit-frame-pointer avx2
gcc -m32 -march=barcelona -O -fomit-frame-pointer avx2
gcc -m32 -march=barcelona -Os -fomit-frame-pointer avx2

Compiler output

Implementation: crypto_sign/dilithium3/avx2
Compiler: gcc -m32 -march=core-avx-i -O2 -fomit-frame-pointer
KeccakP-1600-times4-SIMD256.c: In file included from /usr/lib/gcc/x86_64-linux-gnu/7/include/immintrin.h:43:0,
KeccakP-1600-times4-SIMD256.c: from KeccakP-1600-times4-SIMD256.c:21:
KeccakP-1600-times4-SIMD256.c: KeccakP-1600-times4-SIMD256.c: In function 'KeccakP1600times4_AddLanesAll':
KeccakP-1600-times4-SIMD256.c: /usr/lib/gcc/x86_64-linux-gnu/7/include/avx2intrin.h:913:1: error: inlining failed in call to always_inline '_mm256_xor_si256': target specific option mismatch
KeccakP-1600-times4-SIMD256.c: _mm256_xor_si256 (__m256i __A, __m256i __B)
KeccakP-1600-times4-SIMD256.c: ^~~~~~~~~~~~~~~~
KeccakP-1600-times4-SIMD256.c: KeccakP-1600-times4-SIMD256.c:55:41: note: called from here
KeccakP-1600-times4-SIMD256.c: #define XOReq256(a, b) a = _mm256_xor_si256(a, b)
KeccakP-1600-times4-SIMD256.c: ^~~~~~~~~~~~~~~~~~~~~~
KeccakP-1600-times4-SIMD256.c: KeccakP-1600-times4-SIMD256.c:143:33: note: in expansion of macro 'XOReq256'
KeccakP-1600-times4-SIMD256.c: XOReq256( stateAsLanes[argIndex+3], lanes3 )
KeccakP-1600-times4-SIMD256.c: ^~~~~~~~
KeccakP-1600-times4-SIMD256.c: KeccakP-1600-times4-SIMD256.c:149:9: note: in expansion of macro 'Xor_In4'
KeccakP-1600-times4-SIMD256.c: Xor_In4( 12 );
KeccakP-1600-times4-SIMD256.c: ^~~~~~~
KeccakP-1600-times4-SIMD256.c: In file included from /usr/lib/gcc/x86_64-linux-gnu/7/include/immintrin.h:43:0,
KeccakP-1600-times4-SIMD256.c: from KeccakP-1600-times4-SIMD256.c:21:
KeccakP-1600-times4-SIMD256.c: /usr/lib/gcc/x86_64-linux-gnu/7/include/avx2intrin.h:913:1: error: inlining failed in call to always_inline '_mm256_xor_si256': target specific option mismatch
KeccakP-1600-times4-SIMD256.c: _mm256_xor_si256 (__m256i __A, __m256i __B)
KeccakP-1600-times4-SIMD256.c: ^~~~~~~~~~~~~~~~
KeccakP-1600-times4-SIMD256.c: KeccakP-1600-times4-SIMD256.c:55:41: note: called from here
KeccakP-1600-times4-SIMD256.c: #define XOReq256(a, b) a = _mm256_xor_si256(a, b)
KeccakP-1600-times4-SIMD256.c: ^~~~~~~~~~~~~~~~~~~~~~
KeccakP-1600-times4-SIMD256.c: KeccakP-1600-times4-SIMD256.c:142:33: note: in expansion of macro 'XOReq256'
KeccakP-1600-times4-SIMD256.c: XOReq256( stateAsLanes[argIndex+2], lanes2 ),\
KeccakP-1600-times4-SIMD256.c: ...

Number of similar (compiler,implementation) pairs: 8, namely:
Compiler Implementations
gcc -m32 -march=core-avx-i -O2 -fomit-frame-pointer avx2
gcc -m32 -march=core-avx-i -O3 -fomit-frame-pointer avx2
gcc -m32 -march=core-avx-i -O -fomit-frame-pointer avx2
gcc -m32 -march=core-avx-i -Os -fomit-frame-pointer avx2
gcc -m32 -march=corei7-avx -O2 -fomit-frame-pointer avx2
gcc -m32 -march=corei7-avx -O3 -fomit-frame-pointer avx2
gcc -m32 -march=corei7-avx -O -fomit-frame-pointer avx2
gcc -m32 -march=corei7-avx -Os -fomit-frame-pointer avx2
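
The three groups of C compiler errors above appear to differ only in which vector extensions the -march value enables. Without AVX, GCC prints the -Wpsabi warning and then rejects the always_inline intrinsic `_mm256_xor_si256`; with AVX but not AVX2 (core-avx-i, corei7-avx) the warning disappears but the intrinsic is still rejected. The Python sketch below encodes that reading. It is an illustration written for this note, not SUPERCOP code: the sets of -march values are an assumption limited to flags that appear in this log, and `host_has_avx2` is a hypothetical parameter standing in for whatever -march=native resolves to on the build machine.

```python
# Assumption: which -march values seen in this log enable AVX or AVX2 in GCC.
# Any value not listed is treated as enabling neither. -march=native on a
# host with AVX but without AVX2 is not modelled.
AVX_ONLY = {"corei7-avx", "core-avx-i"}
AVX2 = {"core-avx2"}

def vector_level(compiler, host_has_avx2=False):
    """Return 'avx2', 'avx' or 'none' for a gcc command line."""
    flags = compiler.split()
    if "-mavx2" in flags:
        return "avx2"
    level = "avx" if "-mavx" in flags else "none"
    for flag in flags:
        if flag.startswith("-march="):
            arch = flag[len("-march="):]
            if arch in AVX2 or (arch == "native" and host_has_avx2):
                return "avx2"
            if arch in AVX_ONLY:
                level = "avx"
    return level

def expected_diagnostics(compiler, host_has_avx2=False):
    """Diagnostics expected for a C file that calls _mm256_xor_si256."""
    level = vector_level(compiler, host_has_avx2)
    if level == "none":
        return ["-Wpsabi warning", "always_inline error"]
    if level == "avx":
        return ["always_inline error"]
    return []

for cc in ("gcc -m32 -O2 -fomit-frame-pointer",
           "gcc -m32 -march=core-avx-i -O2 -fomit-frame-pointer",
           "gcc -m32 -march=core-avx2 -O2 -fomit-frame-pointer"):
    print(cc, "->", expected_diagnostics(cc))
```

An empty list only means the intrinsic itself is accepted; other parts of the build can still fail for unrelated reasons.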

Compiler output

Implementation: crypto_sign/dilithium3/avx2
Compiler: gcc -m32 -march=core-avx2 -O2 -fomit-frame-pointer
invntt.s: invntt.s: Assembler messages:
invntt.s: invntt.s:47: Error: bad register name `%rip)'
invntt.s: invntt.s:48: Error: bad register name `%rip)'
invntt.s: invntt.s:49: Error: bad register name `%rip)'
invntt.s: invntt.s:52: Error: bad register name `%rsi)'
invntt.s: invntt.s:53: Error: bad register name `%rsi)'
invntt.s: invntt.s:54: Error: bad register name `%rsi)'
invntt.s: invntt.s:55: Error: bad register name `%rsi)'
invntt.s: invntt.s:58: Error: bad register name `%ymm8'
invntt.s: invntt.s:59: Error: bad register name `%ymm10'
invntt.s: invntt.s:59: Error: bad register name `%ymm10'
invntt.s: invntt.s:61: Error: bad register name `%ymm8'
invntt.s: invntt.s:61: Error: bad register name `%ymm8'
invntt.s: invntt.s:62: Error: bad register name `%ymm10'
invntt.s: invntt.s:62: Error: bad register name `%ymm10'
invntt.s: invntt.s:66: Error: bad register name `%ymm8'
invntt.s: invntt.s:67: Error: bad register name `%ymm10'
invntt.s: invntt.s:70: Error: bad register name `%rdx)'
invntt.s: invntt.s:71: Error: bad register name `%rdx)'
invntt.s: invntt.s:72: Error: bad register name `%ymm12'
invntt.s: invntt.s:73: Error: bad register name `%ymm13'
invntt.s: invntt.s:74: Error: bad register name `%ymm8'
invntt.s: invntt.s:76: Error: bad register name `%ymm12'
invntt.s: invntt.s:77: Error: bad register name `%ymm13'
invntt.s: invntt.s:78: Error: bad register name `%ymm9'
invntt.s: ...

Number of similar (compiler,implementation) pairs: 8, namely:
Compiler Implementations
gcc -m32 -march=core-avx2 -O2 -fomit-frame-pointer avx2
gcc -m32 -march=core-avx2 -O3 -fomit-frame-pointer avx2
gcc -m32 -march=core-avx2 -O -fomit-frame-pointer avx2
gcc -m32 -march=core-avx2 -Os -fomit-frame-pointer avx2
gcc -m32 -march=native -mtune=native -O2 -fomit-frame-pointer avx2
gcc -m32 -march=native -mtune=native -O3 -fomit-frame-pointer avx2
gcc -m32 -march=native -mtune=native -O -fomit-frame-pointer avx2
gcc -m32 -march=native -mtune=native -Os -fomit-frame-pointer avx2
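
The assembler messages above name registers that exist only in 64-bit mode (`%rip`, `%rsi`, `%rdx`) or that lie beyond the eight vector registers available to 32-bit code (`%ymm8` and higher), which is consistent with invntt.s being written for x86-64 and assembled under -m32. A minimal Python sketch of a pre-check for this situation follows. The function name and the sample assembly lines are hypothetical (the contents of invntt.s are not shown in this log), and the pattern covers only common register names.

```python
import re

# Registers that a 32-bit x86 assembler rejects: 64-bit general-purpose
# registers (%rax..%r15, %rip) and vector registers numbered 8 or higher.
X86_64_ONLY = re.compile(
    r"%(?:r(?:ax|bx|cx|dx|si|di|bp|sp|ip|(?:[89]|1[0-5])[bwd]?)"
    r"|[xyz]mm(?:[89]|[1-3]\d))\b"
)

def x86_64_only_registers(asm_text):
    """Return (line number, register) pairs that need 64-bit mode."""
    hits = []
    for number, line in enumerate(asm_text.splitlines(), start=1):
        code = line.split("#", 1)[0]  # ignore trailing comments
        for match in X86_64_ONLY.finditer(code):
            hits.append((number, match.group(0)))
    return hits

# Hypothetical AT&T-syntax lines, for illustration only.
sample = """\
vmovdqa table(%rip), %ymm0
vmovdqa (%rsi), %ymm4
vpmuludq %ymm8, %ymm10, %ymm10
vpaddd %ymm1, %ymm2, %ymm3
"""
for number, register in x86_64_only_registers(sample):
    print(number, register)
```

A build script could run such a check before attempting a -m32 build and skip implementations that cannot assemble, rather than recording one failure per compiler.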