Implementation notes: x86, samba, crypto_sign/dilithium2

Computer: samba
Architecture: x86
CPU ID: GenuineIntel-000506e3-bfebfbff
SUPERCOP version: 20190803
Operation: crypto_sign
Primitive: dilithium2
Time  Implementation  Compiler  Benchmark date  SUPERCOP version
4264613  ref  gcc -funroll-loops -m32 -march=pentium4 -O3 -fomit-frame-pointer  20190805  20190803
4266486  ref  gcc -funroll-loops -m32 -march=pentium4 -O2 -fomit-frame-pointer  20190805  20190803
4384867  ref  gcc -funroll-loops -m32 -march=athlon -O3 -fomit-frame-pointer  20190805  20190803
4410555  ref  gcc -funroll-loops -m32 -march=pentium4 -O -fomit-frame-pointer  20190805  20190803
4410700  ref  gcc -m32 -march=native -mtune=native -O3 -fomit-frame-pointer  20190805  20190803
4428156  ref  gcc -funroll-loops -m32 -march=athlon -O2 -fomit-frame-pointer  20190805  20190803
4430403  ref  gcc -m32 -march=corei7 -O3 -fomit-frame-pointer  20190805  20190803
4436149  ref  gcc -m32 -march=pentium4 -O3 -fomit-frame-pointer  20190805  20190803
4443919  ref  gcc -m32 -march=core2 -msse4.1 -O3 -fomit-frame-pointer  20190805  20190803
4445712  ref  gcc -m32 -march=core2 -msse4 -O3 -fomit-frame-pointer  20190805  20190803
4456244  ref  gcc -m32 -march=core2 -O3 -fomit-frame-pointer  20190805  20190803
4460536  ref  gcc -funroll-loops -m32 -march=pentium-m -O -fomit-frame-pointer  20190805  20190803
4479103  ref  gcc -m32 -march=core-avx2 -O3 -fomit-frame-pointer  20190805  20190803
4481603  ref  gcc -m32 -march=athlon -O3 -fomit-frame-pointer  20190805  20190803
4491933  ref  gcc -m32 -march=core-avx2 -O2 -fomit-frame-pointer  20190805  20190803
4494651  ref  gcc -funroll-loops -m32 -march=k8 -O -fomit-frame-pointer  20190805  20190803
4503657  ref  gcc -m32 -march=core-avx-i -O2 -fomit-frame-pointer  20190805  20190803
4506623  ref  gcc -m32 -march=native -mtune=native -O2 -fomit-frame-pointer  20190805  20190803
4516434  ref  gcc -funroll-loops -m32 -march=barcelona -O -fomit-frame-pointer  20190805  20190803
4527254  ref  gcc -m32 -march=corei7 -O2 -fomit-frame-pointer  20190805  20190803
4548852  ref  gcc -m32 -march=corei7-avx -O2 -fomit-frame-pointer  20190805  20190803
4549077  ref  gcc -m32 -march=core2 -msse4.1 -O2 -fomit-frame-pointer  20190805  20190803
4554070  ref  gcc -funroll-loops -m32 -march=pentiumpro -O3 -fomit-frame-pointer  20190805  20190803
4556581  ref  gcc -funroll-loops -m32 -march=k6 -O3 -fomit-frame-pointer  20190805  20190803
4567357  ref  gcc -funroll-loops -m32 -march=pentium3 -O3 -fomit-frame-pointer  20190805  20190803
4574208  ref  gcc -funroll-loops -m32 -march=pentium3 -O -fomit-frame-pointer  20190805  20190803
4577109  ref  gcc -funroll-loops -m32 -march=pentiumpro -O -fomit-frame-pointer  20190805  20190803
4580429  ref  gcc -m32 -march=core2 -msse4 -O2 -fomit-frame-pointer  20190805  20190803
4580595  ref  gcc -funroll-loops -m32 -march=pentiumpro -O2 -fomit-frame-pointer  20190805  20190803
4587158  ref  gcc -funroll-loops -m32 -O3 -fomit-frame-pointer  20190805  20190803
4587443  ref  gcc -funroll-loops -m32 -O2 -fomit-frame-pointer  20190805  20190803
4592135  ref  gcc -funroll-loops -m32 -march=k6-2 -O2 -fomit-frame-pointer  20190805  20190803
4599752  ref  gcc -funroll-loops -m32 -march=k6-3 -O3 -fomit-frame-pointer  20190805  20190803
4603800  ref  gcc -funroll-loops -m32 -march=pentium2 -O2 -fomit-frame-pointer  20190805  20190803
4612813  ref  gcc -funroll-loops -m32 -march=athlon -O -fomit-frame-pointer  20190805  20190803
4621852  ref  gcc -m32 -march=k6-3 -O3 -fomit-frame-pointer  20190805  20190803
4625646  ref  gcc -funroll-loops -m32 -march=pentium2 -O -fomit-frame-pointer  20190805  20190803
4629873  ref  gcc -funroll-loops -m32 -march=k6 -O2 -fomit-frame-pointer  20190805  20190803
4636671  ref  gcc -funroll-loops -m32 -march=pentium2 -O3 -fomit-frame-pointer  20190805  20190803
4649024  ref  gcc -m32 -march=k6-2 -O3 -fomit-frame-pointer  20190805  20190803
4651466  ref  gcc -m32 -march=core-avx-i -O3 -fomit-frame-pointer  20190805  20190803
4656810  ref  gcc -m32 -march=pentium4 -O2 -fomit-frame-pointer  20190805  20190803
4662170  ref  gcc -funroll-loops -m32 -march=k6-2 -O3 -fomit-frame-pointer  20190805  20190803
4662268  ref  gcc -m32 -march=pentium3 -O3 -fomit-frame-pointer  20190805  20190803
4667891  ref  gcc -m32 -march=core2 -O2 -fomit-frame-pointer  20190805  20190803
4670081  ref  gcc -funroll-loops -m32 -march=pentium3 -O2 -fomit-frame-pointer  20190805  20190803
4671156  ref  gcc -m32 -march=corei7-avx -O3 -fomit-frame-pointer  20190805  20190803
4699827  ref  gcc -funroll-loops -m32 -march=k6-3 -O2 -fomit-frame-pointer  20190805  20190803
4720340  ref  gcc -funroll-loops -m32 -march=prescott -O2 -fomit-frame-pointer  20190805  20190803
4722894  ref  gcc -m32 -march=pentiumpro -O3 -fomit-frame-pointer  20190805  20190803
4760176  ref  gcc -funroll-loops -m32 -march=prescott -O3 -fomit-frame-pointer  20190805  20190803
4788187  ref  gcc -m32 -march=core-avx2 -Os -fomit-frame-pointer  20190805  20190803
4788202  ref  gcc -funroll-loops -m32 -O -fomit-frame-pointer  20190805  20190803
4795056  ref  gcc -funroll-loops -m32 -march=nocona -O2 -fomit-frame-pointer  20190805  20190803
4800431  ref  gcc -m32 -march=core-avx-i -Os -fomit-frame-pointer  20190805  20190803
4808948  ref  gcc -m32 -march=athlon -O2 -fomit-frame-pointer  20190805  20190803
4814976  ref  gcc -m32 -march=native -mtune=native -Os -fomit-frame-pointer  20190805  20190803
4815410  ref  gcc -funroll-loops -m32 -march=k6-2 -O -fomit-frame-pointer  20190805  20190803
4818796  ref  gcc -m32 -march=pentium2 -O -fomit-frame-pointer  20190805  20190803
4819423  ref  gcc -m32 -march=corei7-avx -Os -fomit-frame-pointer  20190805  20190803
4824648  ref  gcc -m32 -O3 -fomit-frame-pointer  20190805  20190803
4824980  ref  gcc -m32 -march=corei7 -Os -fomit-frame-pointer  20190805  20190803
4826072  ref  gcc -m32 -march=core2 -msse4.1 -Os -fomit-frame-pointer  20190805  20190803
4834294  ref  gcc -m32 -march=pentium2 -O3 -fomit-frame-pointer  20190805  20190803
4835706  ref  gcc -m32 -march=core2 -msse4 -Os -fomit-frame-pointer  20190805  20190803
4836570  ref  gcc -m32 -march=pentium3 -O -fomit-frame-pointer  20190805  20190803
4839377  ref  gcc -funroll-loops -m32 -march=k6-3 -O -fomit-frame-pointer  20190805  20190803
4846299  ref  gcc -funroll-loops -m32 -march=nocona -O -fomit-frame-pointer  20190805  20190803
4856534  ref  gcc -m32 -march=core2 -Os -fomit-frame-pointer  20190805  20190803
4858117  ref  gcc -m32 -march=k8 -O -fomit-frame-pointer  20190805  20190803
4870367  ref  gcc -funroll-loops -m32 -march=k6 -O -fomit-frame-pointer  20190805  20190803
4870855  ref  gcc -m32 -march=prescott -Os -fomit-frame-pointer  20190805  20190803
4880953  ref  gcc -funroll-loops -m32 -march=nocona -O3 -fomit-frame-pointer  20190805  20190803
4881157  ref  gcc -m32 -march=pentium-m -O -fomit-frame-pointer  20190805  20190803
4881474  ref  gcc -m32 -march=nocona -Os -fomit-frame-pointer  20190805  20190803
4883075  ref  gcc -m32 -march=pentiumpro -O -fomit-frame-pointer  20190805  20190803
4887456  ref  gcc -m32 -march=nocona -O3 -fomit-frame-pointer  20190805  20190803
4893435  ref  gcc -m32 -march=core-avx2 -O -fomit-frame-pointer  20190805  20190803
4904008  ref  gcc -m32 -march=k6 -O3 -fomit-frame-pointer  20190805  20190803
4907150  ref  gcc -m32 -march=prescott -O3 -fomit-frame-pointer  20190805  20190803
4910000  ref  gcc -funroll-loops -m32 -march=nocona -Os -fomit-frame-pointer  20190805  20190803
4912986  ref  gcc -m32 -march=pentium4 -Os -fomit-frame-pointer  20190805  20190803
4916690  ref  gcc -m32 -march=pentium4 -O -fomit-frame-pointer  20190805  20190803
4918165  ref  gcc -funroll-loops -m32 -march=pentium4 -Os -fomit-frame-pointer  20190805  20190803
4924145  ref  gcc -m32 -march=pentium-m -Os -fomit-frame-pointer  20190805  20190803
4940819  ref  gcc -m32 -march=native -mtune=native -O -fomit-frame-pointer  20190805  20190803
4941978  ref  gcc -funroll-loops -m32 -march=prescott -Os -fomit-frame-pointer  20190805  20190803
4954099  ref  gcc -funroll-loops -m32 -march=pentium-m -Os -fomit-frame-pointer  20190805  20190803
4956903  ref  gcc -m32 -march=pentiumpro -O2 -fomit-frame-pointer  20190805  20190803
4957643  ref  gcc -funroll-loops -m32 -march=prescott -O -fomit-frame-pointer  20190805  20190803
4968722  ref  gcc -m32 -march=athlon -O -fomit-frame-pointer  20190805  20190803
5002746  ref  gcc -m32 -march=pentium3 -O2 -fomit-frame-pointer  20190805  20190803
5006053  ref  gcc -m32 -march=prescott -O2 -fomit-frame-pointer  20190805  20190803
5011722  ref  gcc -m32 -march=pentium2 -O2 -fomit-frame-pointer  20190805  20190803
5030238  ref  gcc -m32 -march=nocona -O2 -fomit-frame-pointer  20190805  20190803
5041302  ref  gcc -m32 -march=core2 -msse4.1 -O -fomit-frame-pointer  20190805  20190803
5044330  ref  gcc -m32 -march=core-avx-i -O -fomit-frame-pointer  20190805  20190803
5047490  ref  gcc -m32 -march=barcelona -O -fomit-frame-pointer  20190805  20190803
5050982  ref  gcc -m32 -march=core2 -O -fomit-frame-pointer  20190805  20190803
5058030  ref  gcc -m32 -march=corei7-avx -O -fomit-frame-pointer  20190805  20190803
5059002  ref  gcc -m32 -march=k6-2 -Os -fomit-frame-pointer  20190805  20190803
5063579  ref  gcc -m32 -march=core2 -msse4 -O -fomit-frame-pointer  20190805  20190803
5070910  ref  gcc -m32 -march=k6 -Os -fomit-frame-pointer  20190805  20190803
5074714  ref  gcc -funroll-loops -m32 -march=k6-3 -Os -fomit-frame-pointer  20190805  20190803
5085042  ref  gcc -m32 -march=corei7 -O -fomit-frame-pointer  20190805  20190803
5091363  ref  gcc -funroll-loops -m32 -march=k6 -Os -fomit-frame-pointer  20190805  20190803
5102684  ref  gcc -m32 -march=k6-3 -Os -fomit-frame-pointer  20190805  20190803
5111615  ref  gcc -funroll-loops -m32 -march=k6-2 -Os -fomit-frame-pointer  20190805  20190803
5120476  ref  gcc -m32 -O2 -fomit-frame-pointer  20190805  20190803
5135573  ref  gcc -m32 -march=k6-2 -O -fomit-frame-pointer  20190805  20190803
5138413  ref  gcc -m32 -march=k6-3 -O2 -fomit-frame-pointer  20190805  20190803
5149380  ref  gcc -m32 -march=k6 -O -fomit-frame-pointer  20190805  20190803
5163737  ref  gcc -m32 -march=i386 -Os -fomit-frame-pointer  20190805  20190803
5168680  ref  gcc -m32 -march=i486 -Os -fomit-frame-pointer  20190805  20190803
5172375  ref  gcc -m32 -march=k6 -O2 -fomit-frame-pointer  20190805  20190803
5183615  ref  gcc -m32 -march=pentium-mmx -Os -fomit-frame-pointer  20190805  20190803
5186746  ref  gcc -funroll-loops -m32 -march=i386 -Os -fomit-frame-pointer  20190805  20190803
5187962  ref  gcc -m32 -march=k6-3 -O -fomit-frame-pointer  20190805  20190803
5189205  ref  gcc -m32 -march=pentium -Os -fomit-frame-pointer  20190805  20190803
5198907  ref  gcc -funroll-loops -m32 -march=pentium-mmx -Os -fomit-frame-pointer  20190805  20190803
5202588  ref  gcc -m32 -march=prescott -O -fomit-frame-pointer  20190805  20190803
5204941  ref  gcc -funroll-loops -m32 -Os -fomit-frame-pointer  20190805  20190803
5210873  ref  gcc -funroll-loops -m32 -march=i486 -Os -fomit-frame-pointer  20190805  20190803
5220240  ref  gcc -m32 -march=athlon -Os -fomit-frame-pointer  20190805  20190803
5222946  ref  gcc -m32 -march=k6-2 -O2 -fomit-frame-pointer  20190805  20190803
5226430  ref  gcc -funroll-loops -m32 -march=athlon -Os -fomit-frame-pointer  20190805  20190803
5228399  ref  gcc -m32 -O -fomit-frame-pointer  20190805  20190803
5228488  ref  gcc -m32 -march=pentium3 -Os -fomit-frame-pointer  20190805  20190803
5233454  ref  gcc -m32 -march=nocona -O -fomit-frame-pointer  20190805  20190803
5241994  ref  gcc -funroll-loops -m32 -march=pentium -Os -fomit-frame-pointer  20190805  20190803
5243437  ref  gcc -m32 -march=pentiumpro -Os -fomit-frame-pointer  20190805  20190803
5246625  ref  gcc -m32 -march=pentium2 -Os -fomit-frame-pointer  20190805  20190803
5247060  ref  gcc -funroll-loops -m32 -march=pentium3 -Os -fomit-frame-pointer  20190805  20190803
5255430  ref  gcc -m32 -Os -fomit-frame-pointer  20190805  20190803
5258931  ref  gcc -funroll-loops -m32 -march=pentiumpro -Os -fomit-frame-pointer  20190805  20190803
5275340  ref  gcc -funroll-loops -m32 -march=pentium2 -Os -fomit-frame-pointer  20190805  20190803
5281131  ref  gcc -funroll-loops -m32 -march=i486 -O3 -fomit-frame-pointer  20190805  20190803
5290891  ref  gcc -funroll-loops -m32 -march=i486 -O2 -fomit-frame-pointer  20190805  20190803
5415198  ref  gcc -funroll-loops -m32 -march=pentium -O2 -fomit-frame-pointer  20190805  20190803
5416603  ref  gcc -funroll-loops -m32 -march=pentium-mmx -O3 -fomit-frame-pointer  20190805  20190803
5475357  ref  gcc -funroll-loops -m32 -march=pentium -O3 -fomit-frame-pointer  20190805  20190803
5506595  ref  gcc -m32 -march=i486 -O3 -fomit-frame-pointer  20190805  20190803
5516942  ref  gcc -funroll-loops -m32 -march=pentium-mmx -O2 -fomit-frame-pointer  20190805  20190803
5523316  ref  gcc -funroll-loops -m32 -march=i486 -O -fomit-frame-pointer  20190805  20190803
5549639  ref  gcc -funroll-loops -m32 -march=pentium -O -fomit-frame-pointer  20190805  20190803
5550994  ref  gcc -funroll-loops -m32 -march=pentium-mmx -O -fomit-frame-pointer  20190805  20190803
5717762  ref  gcc -m32 -march=pentium-mmx -O3 -fomit-frame-pointer  20190805  20190803
5721311  ref  gcc -m32 -march=pentium -O3 -fomit-frame-pointer  20190805  20190803
5759984  ref  gcc -funroll-loops -m32 -march=i386 -O -fomit-frame-pointer  20190805  20190803
5798946  ref  gcc -m32 -march=i486 -O2 -fomit-frame-pointer  20190805  20190803
5869236  ref  gcc -m32 -march=i486 -O -fomit-frame-pointer  20190805  20190803
5917494  ref  gcc -m32 -march=pentium -O -fomit-frame-pointer  20190805  20190803
5919391  ref  gcc -m32 -march=pentium-mmx -O -fomit-frame-pointer  20190805  20190803
6000155  ref  gcc -funroll-loops -m32 -march=i386 -O3 -fomit-frame-pointer  20190805  20190803
6038982  ref  gcc -m32 -march=pentium -O2 -fomit-frame-pointer  20190805  20190803
6100510  ref  gcc -m32 -march=pentium-mmx -O2 -fomit-frame-pointer  20190805  20190803
6137679  ref  gcc -funroll-loops -m32 -march=i386 -O2 -fomit-frame-pointer  20190805  20190803
6142889  ref  gcc -m32 -march=i386 -O -fomit-frame-pointer  20190805  20190803
6324858  ref  gcc -m32 -march=i386 -O3 -fomit-frame-pointer  20190805  20190803
6596345  ref  gcc -m32 -march=i386 -O2 -fomit-frame-pointer  20190805  20190803
7607529  ref  gcc -funroll-loops -m32 -march=barcelona -O3 -fomit-frame-pointer  20190805  20190803
7714853  ref  gcc -m32 -march=barcelona -O3 -fomit-frame-pointer  20190805  20190803
7740024  ref  gcc -funroll-loops -m32 -march=barcelona -O2 -fomit-frame-pointer  20190805  20190803
7789153  ref  gcc -funroll-loops -m32 -march=k8 -O2 -fomit-frame-pointer  20190805  20190803
8083546  ref  gcc -funroll-loops -m32 -march=k8 -O3 -fomit-frame-pointer  20190805  20190803
8111241  ref  gcc -m32 -march=k8 -O3 -fomit-frame-pointer  20190805  20190803
8178040  ref  gcc -funroll-loops -m32 -march=barcelona -Os -fomit-frame-pointer  20190805  20190803
8231441  ref  gcc -m32 -march=barcelona -Os -fomit-frame-pointer  20190805  20190803
8247059  ref  gcc -m32 -march=barcelona -O2 -fomit-frame-pointer  20190805  20190803
8347327  ref  gcc -m32 -march=k8 -O2 -fomit-frame-pointer  20190805  20190803
8401601  ref  gcc -funroll-loops -m32 -march=k8 -Os -fomit-frame-pointer  20190805  20190803
8417721  ref  gcc -m32 -march=k8 -Os -fomit-frame-pointer  20190805  20190803
9616853  ref  gcc -funroll-loops -m32 -march=pentium-m -O3 -fomit-frame-pointer  20190805  20190803
9682912  ref  gcc -funroll-loops -m32 -march=pentium-m -O2 -fomit-frame-pointer  20190805  20190803
9692520  ref  gcc -m32 -march=pentium-m -O3 -fomit-frame-pointer  20190805  20190803
9887230  ref  gcc -m32 -march=pentium-m -O2 -fomit-frame-pointer  20190805  20190803
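
As a rough aid for reading the table above, the following Python sketch expresses a few of the listed Time values relative to the fastest one. The numbers are copied from the table; the dictionary labels are abbreviated compiler settings (the common `gcc`, `-m32` and `-fomit-frame-pointer` parts are dropped for brevity), and the helper name `relative_slowdown` is hypothetical, not part of SUPERCOP.

```python
# Time values copied from the ref rows of the table above.
# Keys are shortened compiler settings, used only as labels here.
times = {
    "-funroll-loops -march=pentium4 -O3": 4264613,  # first (fastest) row
    "-march=native -mtune=native -O3": 4410700,
    "-march=i386 -O2": 6596345,
    "-march=k8 -Os": 8417721,
    "-march=pentium-m -O2": 9887230,                # last (slowest) row
}


def relative_slowdown(times):
    """Return each time divided by the smallest time in the mapping."""
    best = min(times.values())
    return {flags: t / best for flags, t in times.items()}


ratios = relative_slowdown(times)
for flags, ratio in sorted(ratios.items(), key=lambda kv: kv[1]):
    print(f"{ratio:5.2f}x  {flags}")
```

Read this way, the slowest listed setting takes a little over twice the time of the fastest one for the same `ref` implementation on this machine.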

Compiler output

Implementation: crypto_sign/dilithium2/avx2
Compiler: gcc -funroll-loops -m32 -O2 -fomit-frame-pointer
KeccakP-1600-times4-SIMD256.c: KeccakP-1600-times4-SIMD256.c: In function 'KeccakP1600times4_AddLanesAll':
KeccakP-1600-times4-SIMD256.c: KeccakP-1600-times4-SIMD256.c:135:40: warning: AVX vector return without AVX enabled changes the ABI [-Wpsabi]
KeccakP-1600-times4-SIMD256.c: #define Xor_In4( argIndex ) lanes0 = LOAD256u( curData0[argIndex]),\
KeccakP-1600-times4-SIMD256.c: ^
KeccakP-1600-times4-SIMD256.c: KeccakP-1600-times4-SIMD256.c:146:9: note: in expansion of macro 'Xor_In4'
KeccakP-1600-times4-SIMD256.c: Xor_In4( 0 );
KeccakP-1600-times4-SIMD256.c: ^~~~~~~
KeccakP-1600-times4-SIMD256.c: In file included from /usr/lib/gcc/x86_64-linux-gnu/7/include/immintrin.h:43:0,
KeccakP-1600-times4-SIMD256.c: from KeccakP-1600-times4-SIMD256.c:21:
KeccakP-1600-times4-SIMD256.c: /usr/lib/gcc/x86_64-linux-gnu/7/include/avx2intrin.h:913:1: error: inlining failed in call to always_inline '_mm256_xor_si256': target specific option mismatch
KeccakP-1600-times4-SIMD256.c: _mm256_xor_si256 (__m256i __A, __m256i __B)
KeccakP-1600-times4-SIMD256.c: ^~~~~~~~~~~~~~~~
KeccakP-1600-times4-SIMD256.c: KeccakP-1600-times4-SIMD256.c:55:41: note: called from here
KeccakP-1600-times4-SIMD256.c: #define XOReq256(a, b) a = _mm256_xor_si256(a, b)
KeccakP-1600-times4-SIMD256.c: ^~~~~~~~~~~~~~~~~~~~~~
KeccakP-1600-times4-SIMD256.c: KeccakP-1600-times4-SIMD256.c:143:33: note: in expansion of macro 'XOReq256'
KeccakP-1600-times4-SIMD256.c: XOReq256( stateAsLanes[argIndex+3], lanes3 )
KeccakP-1600-times4-SIMD256.c: ^~~~~~~~
KeccakP-1600-times4-SIMD256.c: KeccakP-1600-times4-SIMD256.c:149:9: note: in expansion of macro 'Xor_In4'
KeccakP-1600-times4-SIMD256.c: Xor_In4( 12 );
KeccakP-1600-times4-SIMD256.c: ^~~~~~~
KeccakP-1600-times4-SIMD256.c: In file included from /usr/lib/gcc/x86_64-linux-gnu/7/include/immintrin.h:43:0,
KeccakP-1600-times4-SIMD256.c: from KeccakP-1600-times4-SIMD256.c:21:
KeccakP-1600-times4-SIMD256.c: /usr/lib/gcc/x86_64-linux-gnu/7/include/avx2intrin.h:913:1: error: inlining failed in call to always_inline '_mm256_xor_si256': target specific option mismatch
KeccakP-1600-times4-SIMD256.c: _mm256_xor_si256 (__m256i __A, __m256i __B)
KeccakP-1600-times4-SIMD256.c: ...

Number of similar (compiler,implementation) pairs: 156, namely:
Compiler Implementations
gcc -funroll-loops -m32 -O2 -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -O3 -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -O -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -Os -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=athlon -O2 -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=athlon -O3 -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=athlon -O -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=athlon -Os -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=barcelona -O2 -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=barcelona -O3 -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=barcelona -O -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=barcelona -Os -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=i386 -O2 -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=i386 -O3 -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=i386 -O -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=i386 -Os -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=i486 -O2 -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=i486 -O3 -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=i486 -O -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=i486 -Os -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=k6-2 -O2 -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=k6-2 -O3 -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=k6-2 -O -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=k6-2 -Os -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=k6-3 -O2 -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=k6-3 -O3 -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=k6-3 -O -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=k6-3 -Os -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=k6 -O2 -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=k6 -O3 -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=k6 -O -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=k6 -Os -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=k8 -O2 -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=k8 -O3 -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=k8 -O -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=k8 -Os -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=nocona -O2 -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=nocona -O3 -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=nocona -O -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=nocona -Os -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=pentium-m -O2 -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=pentium-m -O3 -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=pentium-m -O -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=pentium-m -Os -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=pentium-mmx -O2 -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=pentium-mmx -O3 -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=pentium-mmx -O -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=pentium-mmx -Os -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=pentium2 -O2 -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=pentium2 -O3 -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=pentium2 -O -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=pentium2 -Os -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=pentium3 -O2 -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=pentium3 -O3 -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=pentium3 -O -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=pentium3 -Os -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=pentium4 -O2 -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=pentium4 -O3 -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=pentium4 -O -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=pentium4 -Os -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=pentium -O2 -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=pentium -O3 -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=pentium -O -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=pentium -Os -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=pentiumpro -O2 -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=pentiumpro -O3 -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=pentiumpro -O -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=pentiumpro -Os -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=prescott -O2 -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=prescott -O3 -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=prescott -O -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=prescott -Os -fomit-frame-pointer avx2
gcc -m32 -O2 -fomit-frame-pointer avx2
gcc -m32 -O3 -fomit-frame-pointer avx2
gcc -m32 -O -fomit-frame-pointer avx2
gcc -m32 -Os -fomit-frame-pointer avx2
gcc -m32 -march=athlon -O2 -fomit-frame-pointer avx2
gcc -m32 -march=athlon -O3 -fomit-frame-pointer avx2
gcc -m32 -march=athlon -O -fomit-frame-pointer avx2
gcc -m32 -march=athlon -Os -fomit-frame-pointer avx2
gcc -m32 -march=core2 -O2 -fomit-frame-pointer avx2
gcc -m32 -march=core2 -O3 -fomit-frame-pointer avx2
gcc -m32 -march=core2 -O -fomit-frame-pointer avx2
gcc -m32 -march=core2 -Os -fomit-frame-pointer avx2
gcc -m32 -march=core2 -msse4.1 -O2 -fomit-frame-pointer avx2
gcc -m32 -march=core2 -msse4.1 -O3 -fomit-frame-pointer avx2
gcc -m32 -march=core2 -msse4.1 -O -fomit-frame-pointer avx2
gcc -m32 -march=core2 -msse4.1 -Os -fomit-frame-pointer avx2
gcc -m32 -march=core2 -msse4 -O2 -fomit-frame-pointer avx2
gcc -m32 -march=core2 -msse4 -O3 -fomit-frame-pointer avx2
gcc -m32 -march=core2 -msse4 -O -fomit-frame-pointer avx2
gcc -m32 -march=core2 -msse4 -Os -fomit-frame-pointer avx2
gcc -m32 -march=corei7 -O2 -fomit-frame-pointer avx2
gcc -m32 -march=corei7 -O3 -fomit-frame-pointer avx2
gcc -m32 -march=corei7 -O -fomit-frame-pointer avx2
gcc -m32 -march=corei7 -Os -fomit-frame-pointer avx2
gcc -m32 -march=i386 -O2 -fomit-frame-pointer avx2
gcc -m32 -march=i386 -O3 -fomit-frame-pointer avx2
gcc -m32 -march=i386 -O -fomit-frame-pointer avx2
gcc -m32 -march=i386 -Os -fomit-frame-pointer avx2
gcc -m32 -march=i486 -O2 -fomit-frame-pointer avx2
gcc -m32 -march=i486 -O3 -fomit-frame-pointer avx2
gcc -m32 -march=i486 -O -fomit-frame-pointer avx2
gcc -m32 -march=i486 -Os -fomit-frame-pointer avx2
gcc -m32 -march=k6-2 -O2 -fomit-frame-pointer avx2
gcc -m32 -march=k6-2 -O3 -fomit-frame-pointer avx2
gcc -m32 -march=k6-2 -O -fomit-frame-pointer avx2
gcc -m32 -march=k6-2 -Os -fomit-frame-pointer avx2
gcc -m32 -march=k6-3 -O2 -fomit-frame-pointer avx2
gcc -m32 -march=k6-3 -O3 -fomit-frame-pointer avx2
gcc -m32 -march=k6-3 -O -fomit-frame-pointer avx2
gcc -m32 -march=k6-3 -Os -fomit-frame-pointer avx2
gcc -m32 -march=k6 -O2 -fomit-frame-pointer avx2
gcc -m32 -march=k6 -O3 -fomit-frame-pointer avx2
gcc -m32 -march=k6 -O -fomit-frame-pointer avx2
gcc -m32 -march=k6 -Os -fomit-frame-pointer avx2
gcc -m32 -march=k8 -O2 -fomit-frame-pointer avx2
gcc -m32 -march=k8 -O3 -fomit-frame-pointer avx2
gcc -m32 -march=k8 -O -fomit-frame-pointer avx2
gcc -m32 -march=k8 -Os -fomit-frame-pointer avx2
gcc -m32 -march=nocona -O2 -fomit-frame-pointer avx2
gcc -m32 -march=nocona -O3 -fomit-frame-pointer avx2
gcc -m32 -march=nocona -O -fomit-frame-pointer avx2
gcc -m32 -march=nocona -Os -fomit-frame-pointer avx2
gcc -m32 -march=pentium-m -O2 -fomit-frame-pointer avx2
gcc -m32 -march=pentium-m -O3 -fomit-frame-pointer avx2
gcc -m32 -march=pentium-m -O -fomit-frame-pointer avx2
gcc -m32 -march=pentium-m -Os -fomit-frame-pointer avx2
gcc -m32 -march=pentium-mmx -O2 -fomit-frame-pointer avx2
gcc -m32 -march=pentium-mmx -O3 -fomit-frame-pointer avx2
gcc -m32 -march=pentium-mmx -O -fomit-frame-pointer avx2
gcc -m32 -march=pentium-mmx -Os -fomit-frame-pointer avx2
gcc -m32 -march=pentium2 -O2 -fomit-frame-pointer avx2
gcc -m32 -march=pentium2 -O3 -fomit-frame-pointer avx2
gcc -m32 -march=pentium2 -O -fomit-frame-pointer avx2
gcc -m32 -march=pentium2 -Os -fomit-frame-pointer avx2
gcc -m32 -march=pentium3 -O2 -fomit-frame-pointer avx2
gcc -m32 -march=pentium3 -O3 -fomit-frame-pointer avx2
gcc -m32 -march=pentium3 -O -fomit-frame-pointer avx2
gcc -m32 -march=pentium3 -Os -fomit-frame-pointer avx2
gcc -m32 -march=pentium4 -O2 -fomit-frame-pointer avx2
gcc -m32 -march=pentium4 -O3 -fomit-frame-pointer avx2
gcc -m32 -march=pentium4 -O -fomit-frame-pointer avx2
gcc -m32 -march=pentium4 -Os -fomit-frame-pointer avx2
gcc -m32 -march=pentium -O2 -fomit-frame-pointer avx2
gcc -m32 -march=pentium -O3 -fomit-frame-pointer avx2
gcc -m32 -march=pentium -O -fomit-frame-pointer avx2
gcc -m32 -march=pentium -Os -fomit-frame-pointer avx2
gcc -m32 -march=pentiumpro -O2 -fomit-frame-pointer avx2
gcc -m32 -march=pentiumpro -O3 -fomit-frame-pointer avx2
gcc -m32 -march=pentiumpro -O -fomit-frame-pointer avx2
gcc -m32 -march=pentiumpro -Os -fomit-frame-pointer avx2
gcc -m32 -march=prescott -O2 -fomit-frame-pointer avx2
gcc -m32 -march=prescott -O3 -fomit-frame-pointer avx2
gcc -m32 -march=prescott -O -fomit-frame-pointer avx2
gcc -m32 -march=prescott -Os -fomit-frame-pointer avx2

Compiler output

Implementation: crypto_sign/dilithium2/avx2
Compiler: gcc -m32 -march=barcelona -O2 -fomit-frame-pointer
KeccakP-1600-times4-SIMD256.c: KeccakP-1600-times4-SIMD256.c: In function 'KeccakP1600times4_AddLanesAll':
KeccakP-1600-times4-SIMD256.c: KeccakP-1600-times4-SIMD256.c:135:40: warning: AVX vector return without AVX enabled changes the ABI [-Wpsabi]
KeccakP-1600-times4-SIMD256.c: #define Xor_In4( argIndex ) lanes0 = LOAD256u( curData0[argIndex]),\
KeccakP-1600-times4-SIMD256.c: ^
KeccakP-1600-times4-SIMD256.c: KeccakP-1600-times4-SIMD256.c:146:9: note: in expansion of macro 'Xor_In4'
KeccakP-1600-times4-SIMD256.c: Xor_In4( 0 );
KeccakP-1600-times4-SIMD256.c: ^~~~~~~
KeccakP-1600-times4-SIMD256.c: In file included from /usr/lib/gcc/x86_64-linux-gnu/7/include/immintrin.h:43:0,
KeccakP-1600-times4-SIMD256.c: from KeccakP-1600-times4-SIMD256.c:21:
KeccakP-1600-times4-SIMD256.c: /usr/lib/gcc/x86_64-linux-gnu/7/include/avx2intrin.h:913:1: error: inlining failed in call to always_inline '_mm256_xor_si256': target specific option mismatch
KeccakP-1600-times4-SIMD256.c: _mm256_xor_si256 (__m256i __A, __m256i __B)
KeccakP-1600-times4-SIMD256.c: ^~~~~~~~~~~~~~~~
KeccakP-1600-times4-SIMD256.c: KeccakP-1600-times4-SIMD256.c:55:41: note: called from here
KeccakP-1600-times4-SIMD256.c: #define XOReq256(a, b) a = _mm256_xor_si256(a, b)
KeccakP-1600-times4-SIMD256.c: ^~~~~~~~~~~~~~~~~~~~~~
KeccakP-1600-times4-SIMD256.c: KeccakP-1600-times4-SIMD256.c:143:33: note: in expansion of macro 'XOReq256'
KeccakP-1600-times4-SIMD256.c: XOReq256( stateAsLanes[argIndex+3], lanes3 )
KeccakP-1600-times4-SIMD256.c: ^~~~~~~~
KeccakP-1600-times4-SIMD256.c: KeccakP-1600-times4-SIMD256.c:149:9: note: in expansion of macro 'Xor_In4'
KeccakP-1600-times4-SIMD256.c: Xor_In4( 12 );
KeccakP-1600-times4-SIMD256.c: ^~~~~~~
KeccakP-1600-times4-SIMD256.c: In file included from /usr/lib/gcc/x86_64-linux-gnu/7/include/immintrin.h:43:0,
KeccakP-1600-times4-SIMD256.c: from KeccakP-1600-times4-SIMD256.c:21:
KeccakP-1600-times4-SIMD256.c: /usr/lib/gcc/x86_64-linux-gnu/7/include/avx2intrin.h:913:1: error: inlining failed in call to always_inline '_mm256_xor_si256': target specific option mismatch
KeccakP-1600-times4-SIMD256.c: _mm256_xor_si256 (__m256i __A, __m256i __B)
KeccakP-1600-times4-SIMD256.c: ...
KeccakP-1600-times4-SIMD256.c: KeccakP-1600-times4-SIMD256.c: In function 'KeccakP1600times4_AddLanesAll':
KeccakP-1600-times4-SIMD256.c: KeccakP-1600-times4-SIMD256.c:135:40: warning: AVX vector return without AVX enabled changes the ABI [-Wpsabi]
KeccakP-1600-times4-SIMD256.c: #define Xor_In4( argIndex ) lanes0 = LOAD256u( curData0[argIndex]),\
KeccakP-1600-times4-SIMD256.c: ^
KeccakP-1600-times4-SIMD256.c: KeccakP-1600-times4-SIMD256.c:146:9: note: in expansion of macro 'Xor_In4'
KeccakP-1600-times4-SIMD256.c: Xor_In4( 0 );
KeccakP-1600-times4-SIMD256.c: ^~~~~~~
KeccakP-1600-times4-SIMD256.c: In file included from /usr/lib/gcc/x86_64-linux-gnu/7/include/immintrin.h:43:0,
KeccakP-1600-times4-SIMD256.c: from KeccakP-1600-times4-SIMD256.c:21:
KeccakP-1600-times4-SIMD256.c: /usr/lib/gcc/x86_64-linux-gnu/7/include/avx2intrin.h:913:1: error: inlining failed in call to always_inline '_mm256_xor_si256': target specific option mismatch
KeccakP-1600-times4-SIMD256.c: _mm256_xor_si256 (__m256i __A, __m256i __B)
KeccakP-1600-times4-SIMD256.c: ^~~~~~~~~~~~~~~~
KeccakP-1600-times4-SIMD256.c: KeccakP-1600-times4-SIMD256.c:55:41: note: called from here
KeccakP-1600-times4-SIMD256.c: #define XOReq256(a, b) a = _mm256_xor_si256(a, b)
KeccakP-1600-times4-SIMD256.c: ^~~~~~~~~~~~~~~~~~~~~~
KeccakP-1600-times4-SIMD256.c: KeccakP-1600-times4-SIMD256.c:143:33: note: in expansion of macro 'XOReq256'
KeccakP-1600-times4-SIMD256.c: XOReq256( stateAsLanes[argIndex+3], lanes3 )
KeccakP-1600-times4-SIMD256.c: ^~~~~~~~
KeccakP-1600-times4-SIMD256.c: KeccakP-1600-times4-SIMD256.c:149:9: note: in expansion of macro 'Xor_In4'
KeccakP-1600-times4-SIMD256.c: Xor_In4( 12 );
KeccakP-1600-times4-SIMD256.c: ^~~~~~~
KeccakP-1600-times4-SIMD256.c: In file included from /usr/lib/gcc/x86_64-linux-gnu/7/include/immintrin.h:43:0,
KeccakP-1600-times4-SIMD256.c: from KeccakP-1600-times4-SIMD256.c:21:
KeccakP-1600-times4-SIMD256.c: /usr/lib/gcc/x86_64-linux-gnu/7/include/avx2intrin.h:913:1: error: inlining failed in call to always_inline '_mm256_xor_si256': target specific option mismatch
KeccakP-1600-times4-SIMD256.c: _mm256_xor_si256 (__m256i __A, __m256i __B)
KeccakP-1600-times4-SIMD256.c: ...

Number of similar (compiler,implementation) pairs: 4, namely:
Compiler Implementations
gcc -m32 -march=barcelona -O2 -fomit-frame-pointer avx2
gcc -m32 -march=barcelona -O3 -fomit-frame-pointer avx2
gcc -m32 -march=barcelona -O -fomit-frame-pointer avx2
gcc -m32 -march=barcelona -Os -fomit-frame-pointer avx2

Compiler output

Implementation: crypto_sign/dilithium2/avx2
Compiler: gcc -m32 -march=core-avx-i -O2 -fomit-frame-pointer
KeccakP-1600-times4-SIMD256.c: In file included from /usr/lib/gcc/x86_64-linux-gnu/7/include/immintrin.h:43:0,
KeccakP-1600-times4-SIMD256.c: from KeccakP-1600-times4-SIMD256.c:21:
KeccakP-1600-times4-SIMD256.c: KeccakP-1600-times4-SIMD256.c: In function 'KeccakP1600times4_AddLanesAll':
KeccakP-1600-times4-SIMD256.c: /usr/lib/gcc/x86_64-linux-gnu/7/include/avx2intrin.h:913:1: error: inlining failed in call to always_inline '_mm256_xor_si256': target specific option mismatch
KeccakP-1600-times4-SIMD256.c: _mm256_xor_si256 (__m256i __A, __m256i __B)
KeccakP-1600-times4-SIMD256.c: ^~~~~~~~~~~~~~~~
KeccakP-1600-times4-SIMD256.c: KeccakP-1600-times4-SIMD256.c:55:41: note: called from here
KeccakP-1600-times4-SIMD256.c: #define XOReq256(a, b) a = _mm256_xor_si256(a, b)
KeccakP-1600-times4-SIMD256.c: ^~~~~~~~~~~~~~~~~~~~~~
KeccakP-1600-times4-SIMD256.c: KeccakP-1600-times4-SIMD256.c:143:33: note: in expansion of macro 'XOReq256'
KeccakP-1600-times4-SIMD256.c: XOReq256( stateAsLanes[argIndex+3], lanes3 )
KeccakP-1600-times4-SIMD256.c: ^~~~~~~~
KeccakP-1600-times4-SIMD256.c: KeccakP-1600-times4-SIMD256.c:149:9: note: in expansion of macro 'Xor_In4'
KeccakP-1600-times4-SIMD256.c: Xor_In4( 12 );
KeccakP-1600-times4-SIMD256.c: ^~~~~~~
KeccakP-1600-times4-SIMD256.c: In file included from /usr/lib/gcc/x86_64-linux-gnu/7/include/immintrin.h:43:0,
KeccakP-1600-times4-SIMD256.c: from KeccakP-1600-times4-SIMD256.c:21:
KeccakP-1600-times4-SIMD256.c: /usr/lib/gcc/x86_64-linux-gnu/7/include/avx2intrin.h:913:1: error: inlining failed in call to always_inline '_mm256_xor_si256': target specific option mismatch
KeccakP-1600-times4-SIMD256.c: _mm256_xor_si256 (__m256i __A, __m256i __B)
KeccakP-1600-times4-SIMD256.c: ^~~~~~~~~~~~~~~~
KeccakP-1600-times4-SIMD256.c: KeccakP-1600-times4-SIMD256.c:55:41: note: called from here
KeccakP-1600-times4-SIMD256.c: #define XOReq256(a, b) a = _mm256_xor_si256(a, b)
KeccakP-1600-times4-SIMD256.c: ^~~~~~~~~~~~~~~~~~~~~~
KeccakP-1600-times4-SIMD256.c: KeccakP-1600-times4-SIMD256.c:142:33: note: in expansion of macro 'XOReq256'
KeccakP-1600-times4-SIMD256.c: XOReq256( stateAsLanes[argIndex+2], lanes2 ),\
KeccakP-1600-times4-SIMD256.c: ...

Number of similar (compiler,implementation) pairs: 8, namely:
Compiler Implementations
gcc -m32 -march=core-avx-i -O2 -fomit-frame-pointer avx2
gcc -m32 -march=core-avx-i -O3 -fomit-frame-pointer avx2
gcc -m32 -march=core-avx-i -O -fomit-frame-pointer avx2
gcc -m32 -march=core-avx-i -Os -fomit-frame-pointer avx2
gcc -m32 -march=corei7-avx -O2 -fomit-frame-pointer avx2
gcc -m32 -march=corei7-avx -O3 -fomit-frame-pointer avx2
gcc -m32 -march=corei7-avx -O -fomit-frame-pointer avx2
gcc -m32 -march=corei7-avx -Os -fomit-frame-pointer avx2
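
The compiler outputs above differ in one respect: settings without AVX show both the `-Wpsabi` warning and the `always_inline` error for `_mm256_xor_si256`, while the `core-avx-i` and `corei7-avx` settings show only the error. A plausible reading is that those two `-march` values enable AVX but not AVX2, and `_mm256_xor_si256` is declared in `avx2intrin.h`. The Python sketch below encodes that reading for the `-march` values seen in this report. The function names and the two sets are illustrative assumptions, not something SUPERCOP or GCC provides, and `-march=native` is left out because it depends on the build host.

```python
# -march values from this report that enable AVX / AVX2 in GCC.
# Assumption: no other flag in the command line turns these features on or off,
# apart from an explicit -mavx or -mavx2.
MARCH_WITH_AVX = {"corei7-avx", "core-avx-i", "core-avx2"}
MARCH_WITH_AVX2 = {"core-avx2"}


def march_of(flags):
    """Return the value of the -march= option, or None if it is absent."""
    for token in flags.split():
        if token.startswith("-march="):
            return token[len("-march="):]
    return None


def expected_diagnostics(flags):
    """Guess which C-level diagnostics the avx2 implementation triggers."""
    tokens = flags.split()
    march = march_of(flags)
    has_avx2 = march in MARCH_WITH_AVX2 or "-mavx2" in tokens
    has_avx = has_avx2 or march in MARCH_WITH_AVX or "-mavx" in tokens
    diagnostics = []
    if not has_avx:
        diagnostics.append("-Wpsabi warning (AVX vector without AVX enabled)")
    if not has_avx2:
        diagnostics.append("always_inline error for _mm256_xor_si256")
    return diagnostics


print(expected_diagnostics("gcc -m32 -march=core-avx-i -O2 -fomit-frame-pointer"))
print(expected_diagnostics("gcc -funroll-loops -m32 -O2 -fomit-frame-pointer"))
```

An empty result only means that the intrinsics would be expected to compile; it says nothing about later build stages such as assembling `.s` files.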

Compiler output

Implementation: crypto_sign/dilithium2/avx2
Compiler: gcc -m32 -march=core-avx2 -O2 -fomit-frame-pointer
invntt.s: invntt.s: Assembler messages:
invntt.s: invntt.s:47: Error: bad register name `%rip)'
invntt.s: invntt.s:48: Error: bad register name `%rip)'
invntt.s: invntt.s:49: Error: bad register name `%rip)'
invntt.s: invntt.s:52: Error: bad register name `%rsi)'
invntt.s: invntt.s:53: Error: bad register name `%rsi)'
invntt.s: invntt.s:54: Error: bad register name `%rsi)'
invntt.s: invntt.s:55: Error: bad register name `%rsi)'
invntt.s: invntt.s:58: Error: bad register name `%ymm8'
invntt.s: invntt.s:59: Error: bad register name `%ymm10'
invntt.s: invntt.s:59: Error: bad register name `%ymm10'
invntt.s: invntt.s:61: Error: bad register name `%ymm8'
invntt.s: invntt.s:61: Error: bad register name `%ymm8'
invntt.s: invntt.s:62: Error: bad register name `%ymm10'
invntt.s: invntt.s:62: Error: bad register name `%ymm10'
invntt.s: invntt.s:66: Error: bad register name `%ymm8'
invntt.s: invntt.s:67: Error: bad register name `%ymm10'
invntt.s: invntt.s:70: Error: bad register name `%rdx)'
invntt.s: invntt.s:71: Error: bad register name `%rdx)'
invntt.s: invntt.s:72: Error: bad register name `%ymm12'
invntt.s: invntt.s:73: Error: bad register name `%ymm13'
invntt.s: invntt.s:74: Error: bad register name `%ymm8'
invntt.s: invntt.s:76: Error: bad register name `%ymm12'
invntt.s: invntt.s:77: Error: bad register name `%ymm13'
invntt.s: invntt.s:78: Error: bad register name `%ymm9'
invntt.s: ...

Number of similar (compiler,implementation) pairs: 8, namely:
Compiler Implementations
gcc -m32 -march=core-avx2 -O2 -fomit-frame-pointer avx2
gcc -m32 -march=core-avx2 -O3 -fomit-frame-pointer avx2
gcc -m32 -march=core-avx2 -O -fomit-frame-pointer avx2
gcc -m32 -march=core-avx2 -Os -fomit-frame-pointer avx2
gcc -m32 -march=native -mtune=native -O2 -fomit-frame-pointer avx2
gcc -m32 -march=native -mtune=native -O3 -fomit-frame-pointer avx2
gcc -m32 -march=native -mtune=native -O -fomit-frame-pointer avx2
gcc -m32 -march=native -mtune=native -Os -fomit-frame-pointer avx2
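
The assembler messages for `invntt.s` name registers such as `%rip`, `%rsi`, `%rdx` and `%ymm8` and above. These exist only in 64-bit mode, which would explain why the file is rejected under `-m32`. The Python sketch below pulls the register names out of a few of the log lines quoted above and labels them. The function names are hypothetical and the prefix test is a simple heuristic, not a full model of the x86 register set.

```python
import re

# A few lines copied from the assembler output above.
LOG = """\
invntt.s: invntt.s:47: Error: bad register name `%rip)'
invntt.s: invntt.s:52: Error: bad register name `%rsi)'
invntt.s: invntt.s:58: Error: bad register name `%ymm8'
invntt.s: invntt.s:70: Error: bad register name `%rdx)'
invntt.s: invntt.s:72: Error: bad register name `%ymm12'
"""

BAD_REGISTER = re.compile(r"bad register name `%([a-z0-9]+)")


def why_invalid_in_32bit(reg):
    """Heuristic label for a register name rejected by a 32-bit assembler."""
    if reg.startswith("r"):
        return "64-bit register"
    m = re.fullmatch(r"[xy]mm(\d+)", reg)
    if m and int(m.group(1)) >= 8:
        return "vector register 8-15, 64-bit mode only"
    return "unclassified"


def summarize(log):
    """Map each distinct rejected register to its heuristic label."""
    seen = {}
    for line in log.splitlines():
        m = BAD_REGISTER.search(line)
        if m:
            seen.setdefault(m.group(1), why_invalid_in_32bit(m.group(1)))
    return seen


for reg, reason in summarize(LOG).items():
    print(f"%{reg}: {reason}")
```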