Implementation notes: x86, titan0, crypto_sign/dilithium3aes

Computer: titan0
Architecture: x86
CPU ID: GenuineIntel-000306c3-bfebfbff
SUPERCOP version: 20190803
Operation: crypto_sign
Primitive: dilithium3aes
Time  Implementation  Compiler  Benchmark date  SUPERCOP version
8055456  ref  gcc -funroll-loops -m32 -march=athlon -O3 -fomit-frame-pointer  20190805  20190803
8168092  ref  gcc -funroll-loops -m32 -march=athlon -O2 -fomit-frame-pointer  20190805  20190803
8250128  ref  gcc -funroll-loops -m32 -O3 -fomit-frame-pointer  20190805  20190803
8298612  ref  gcc -m32 -march=athlon -O3 -fomit-frame-pointer  20190805  20190803
8397900  ref  gcc -funroll-loops -m32 -O2 -fomit-frame-pointer  20190805  20190803
8404044  ref  gcc -funroll-loops -m32 -march=pentium2 -O3 -fomit-frame-pointer  20190805  20190803
8404812  ref  gcc -funroll-loops -m32 -march=pentium3 -O3 -fomit-frame-pointer  20190805  20190803
8414636  ref  gcc -funroll-loops -m32 -march=pentiumpro -O3 -fomit-frame-pointer  20190805  20190803
8450756  ref  gcc -m32 -march=athlon -O2 -fomit-frame-pointer  20190805  20190803
8477600  ref  gcc -funroll-loops -m32 -march=pentiumpro -O -fomit-frame-pointer  20190805  20190803
8490704  ref  gcc -funroll-loops -m32 -march=pentium2 -O -fomit-frame-pointer  20190805  20190803
8492216  ref  gcc -funroll-loops -m32 -march=pentium3 -O -fomit-frame-pointer  20190805  20190803
8507520  ref  gcc -m32 -march=native -mtune=native -Os -fomit-frame-pointer  20190805  20190803
8507720  ref  gcc -funroll-loops -m32 -march=k6 -O3 -fomit-frame-pointer  20190805  20190803
8508356  ref  gcc -funroll-loops -m32 -march=k6-3 -O3 -fomit-frame-pointer  20190805  20190803
8509176  ref  gcc -funroll-loops -m32 -march=athlon -O -fomit-frame-pointer  20190805  20190803
8518000  ref  gcc -m32 -march=core-avx2 -Os -fomit-frame-pointer  20190805  20190803
8543760  ref  gcc -funroll-loops -m32 -march=pentium3 -O2 -fomit-frame-pointer  20190805  20190803
8551564  ref  gcc -funroll-loops -m32 -march=pentiumpro -O2 -fomit-frame-pointer  20190805  20190803
8558452  ref  gcc -funroll-loops -m32 -march=k6 -O2 -fomit-frame-pointer  20190805  20190803
8562940  ref  gcc -funroll-loops -m32 -march=k6-2 -O2 -fomit-frame-pointer  20190805  20190803
8563400  ref  gcc -funroll-loops -m32 -march=pentium2 -O2 -fomit-frame-pointer  20190805  20190803
8570856  ref  gcc -m32 -march=k6-3 -O3 -fomit-frame-pointer  20190805  20190803
8579996  ref  gcc -m32 -march=k6-2 -O3 -fomit-frame-pointer  20190805  20190803
8597528  ref  gcc -funroll-loops -m32 -march=k6-2 -O3 -fomit-frame-pointer  20190805  20190803
8604260  ref  gcc -funroll-loops -m32 -march=k6-3 -O2 -fomit-frame-pointer  20190805  20190803
8637908  ref  gcc -m32 -march=athlon -O -fomit-frame-pointer  20190805  20190803
8646684  ref  gcc -m32 -march=pentium2 -O -fomit-frame-pointer  20190805  20190803
8647396  ref  gcc -m32 -march=pentium3 -O -fomit-frame-pointer  20190805  20190803
8692296  ref  gcc -m32 -march=pentiumpro -O -fomit-frame-pointer  20190805  20190803
8694872  ref  gcc -m32 -march=pentium3 -O3 -fomit-frame-pointer  20190805  20190803
8722116  ref  gcc -funroll-loops -m32 -O -fomit-frame-pointer  20190805  20190803
8725864  ref  gcc -m32 -O3 -fomit-frame-pointer  20190805  20190803
8730096  ref  gcc -m32 -march=pentiumpro -O3 -fomit-frame-pointer  20190805  20190803
8733780  ref  gcc -m32 -march=core-avx-i -Os -fomit-frame-pointer  20190805  20190803
8747036  ref  gcc -m32 -march=pentium2 -O3 -fomit-frame-pointer  20190805  20190803
8748736  ref  gcc -m32 -march=corei7-avx -Os -fomit-frame-pointer  20190805  20190803
8792488  ref  gcc -funroll-loops -m32 -march=k6-2 -O -fomit-frame-pointer  20190805  20190803
8794024  ref  gcc -funroll-loops -m32 -march=k6 -O -fomit-frame-pointer  20190805  20190803
8802540  ref  gcc -m32 -march=pentiumpro -O2 -fomit-frame-pointer  20190805  20190803
8814368  ref  gcc -m32 -march=pentium2 -O2 -fomit-frame-pointer  20190805  20190803
8829040  ref  gcc -funroll-loops -m32 -march=k6-3 -O -fomit-frame-pointer  20190805  20190803
8836516  ref  gcc -m32 -O2 -fomit-frame-pointer  20190805  20190803
8875812  ref  gcc -m32 -march=pentium3 -O2 -fomit-frame-pointer  20190805  20190803
8879752  ref  gcc -m32 -march=k6 -O3 -fomit-frame-pointer  20190805  20190803
8937820  ref  gcc -m32 -O -fomit-frame-pointer  20190805  20190803
8949596  ref  gcc -m32 -march=core2 -O3 -fomit-frame-pointer  20190805  20190803
8980252  ref  gcc -m32 -march=k6 -O -fomit-frame-pointer  20190805  20190803
8981844  ref  gcc -m32 -march=k6-2 -O -fomit-frame-pointer  20190805  20190803
8987640  ref  gcc -m32 -march=core2 -msse4.1 -O3 -fomit-frame-pointer  20190805  20190803
9000588  ref  gcc -m32 -march=core2 -msse4 -O3 -fomit-frame-pointer  20190805  20190803
9004016  ref  gcc -m32 -march=k6-3 -O -fomit-frame-pointer  20190805  20190803
9017700  ref  gcc -m32 -march=corei7 -O3 -fomit-frame-pointer  20190805  20190803
9027072  ref  gcc -m32 -march=k6 -O2 -fomit-frame-pointer  20190805  20190803
9027736  ref  gcc -m32 -march=k6-3 -O2 -fomit-frame-pointer  20190805  20190803
9079000  ref  gcc -m32 -march=k6-2 -O2 -fomit-frame-pointer  20190805  20190803
9188156  ref  gcc -funroll-loops -m32 -march=pentium4 -O3 -fomit-frame-pointer  20190805  20190803
9249296  ref  gcc -funroll-loops -m32 -march=k6-3 -Os -fomit-frame-pointer  20190805  20190803
9256524  ref  gcc -funroll-loops -m32 -march=k6 -Os -fomit-frame-pointer  20190805  20190803
9261168  ref  gcc -funroll-loops -m32 -march=k6-2 -Os -fomit-frame-pointer  20190805  20190803
9285616  ref  gcc -m32 -march=k6-2 -Os -fomit-frame-pointer  20190805  20190803
9289760  ref  gcc -m32 -march=k6 -Os -fomit-frame-pointer  20190805  20190803
9292808  ref  gcc -funroll-loops -m32 -march=pentium-m -O -fomit-frame-pointer  20190805  20190803
9295212  ref  gcc -m32 -march=k6-3 -Os -fomit-frame-pointer  20190805  20190803
9298416  ref  gcc -funroll-loops -m32 -march=k8 -O -fomit-frame-pointer  20190805  20190803
9338224  ref  gcc -m32 -march=pentium4 -O3 -fomit-frame-pointer  20190805  20190803
9341596  ref  gcc -funroll-loops -m32 -march=pentium4 -O -fomit-frame-pointer  20190805  20190803
9353392  ref  gcc -m32 -march=corei7 -O2 -fomit-frame-pointer  20190805  20190803
9358036  ref  gcc -m32 -march=core2 -msse4 -O2 -fomit-frame-pointer  20190805  20190803
9358880  ref  gcc -funroll-loops -m32 -march=pentium -Os -fomit-frame-pointer  20190805  20190803
9361080  ref  gcc -funroll-loops -m32 -march=pentium-mmx -Os -fomit-frame-pointer  20190805  20190803
9370752  ref  gcc -funroll-loops -m32 -march=pentium4 -O2 -fomit-frame-pointer  20190805  20190803
9386104  ref  gcc -funroll-loops -m32 -march=barcelona -O -fomit-frame-pointer  20190805  20190803
9391672  ref  gcc -funroll-loops -m32 -march=pentium2 -Os -fomit-frame-pointer  20190805  20190803
9403064  ref  gcc -m32 -march=core2 -msse4.1 -O2 -fomit-frame-pointer  20190805  20190803
9403168  ref  gcc -m32 -march=core2 -O2 -fomit-frame-pointer  20190805  20190803
9405860  ref  gcc -funroll-loops -m32 -march=pentium3 -Os -fomit-frame-pointer  20190805  20190803
9406480  ref  gcc -funroll-loops -m32 -march=pentiumpro -Os -fomit-frame-pointer  20190805  20190803
9411960  ref  gcc -m32 -march=pentium-mmx -Os -fomit-frame-pointer  20190805  20190803
9412340  ref  gcc -m32 -march=pentium -Os -fomit-frame-pointer  20190805  20190803
9428608  ref  gcc -m32 -march=k8 -O -fomit-frame-pointer  20190805  20190803
9432072  ref  gcc -funroll-loops -m32 -Os -fomit-frame-pointer  20190805  20190803
9434640  ref  gcc -m32 -march=core-avx2 -O3 -fomit-frame-pointer  20190805  20190803
9437044  ref  gcc -m32 -march=pentium-m -O -fomit-frame-pointer  20190805  20190803
9444568  ref  gcc -m32 -march=athlon -Os -fomit-frame-pointer  20190805  20190803
9456168  ref  gcc -m32 -Os -fomit-frame-pointer  20190805  20190803
9459204  ref  gcc -m32 -march=pentium2 -Os -fomit-frame-pointer  20190805  20190803
9460040  ref  gcc -funroll-loops -m32 -march=athlon -Os -fomit-frame-pointer  20190805  20190803
9465572  ref  gcc -m32 -march=pentiumpro -Os -fomit-frame-pointer  20190805  20190803
9467312  ref  gcc -m32 -march=native -mtune=native -O3 -fomit-frame-pointer  20190805  20190803
9467720  ref  gcc -m32 -march=pentium3 -Os -fomit-frame-pointer  20190805  20190803
9510420  ref  gcc -m32 -march=pentium4 -O -fomit-frame-pointer  20190805  20190803
9520772  ref  gcc -m32 -march=core-avx-i -O3 -fomit-frame-pointer  20190805  20190803
9572640  ref  gcc -m32 -march=corei7-avx -O3 -fomit-frame-pointer  20190805  20190803
9574536  ref  gcc -m32 -march=native -mtune=native -O2 -fomit-frame-pointer  20190805  20190803
9587876  ref  gcc -m32 -march=core-avx2 -O2 -fomit-frame-pointer  20190805  20190803
9601276  ref  gcc -funroll-loops -m32 -march=i386 -Os -fomit-frame-pointer  20190805  20190803
9606656  ref  gcc -m32 -march=core-avx2 -O -fomit-frame-pointer  20190805  20190803
9607056  ref  gcc -m32 -march=i486 -Os -fomit-frame-pointer  20190805  20190803
9628624  ref  gcc -funroll-loops -m32 -march=i486 -Os -fomit-frame-pointer  20190805  20190803
9629100  ref  gcc -m32 -march=i386 -Os -fomit-frame-pointer  20190805  20190803
9629172  ref  gcc -m32 -march=native -mtune=native -O -fomit-frame-pointer  20190805  20190803
9632836  ref  gcc -m32 -march=barcelona -O -fomit-frame-pointer  20190805  20190803
9643804  ref  gcc -m32 -march=core2 -O -fomit-frame-pointer  20190805  20190803
9645404  ref  gcc -m32 -march=core-avx-i -O2 -fomit-frame-pointer  20190805  20190803
9646124  ref  gcc -m32 -march=core-avx-i -O -fomit-frame-pointer  20190805  20190803
9657104  ref  gcc -m32 -march=corei7-avx -O -fomit-frame-pointer  20190805  20190803
9657300  ref  gcc -m32 -march=corei7 -O -fomit-frame-pointer  20190805  20190803
9657992  ref  gcc -m32 -march=corei7-avx -O2 -fomit-frame-pointer  20190805  20190803
9680800  ref  gcc -m32 -march=core2 -msse4.1 -O -fomit-frame-pointer  20190805  20190803
9723824  ref  gcc -funroll-loops -m32 -march=i486 -O3 -fomit-frame-pointer  20190805  20190803
9728924  ref  gcc -m32 -march=core2 -msse4 -O -fomit-frame-pointer  20190805  20190803
9778792  ref  gcc -funroll-loops -m32 -march=pentium-m -Os -fomit-frame-pointer  20190805  20190803
9785956  ref  gcc -m32 -march=pentium4 -O2 -fomit-frame-pointer  20190805  20190803
9811192  ref  gcc -funroll-loops -m32 -march=pentium -O3 -fomit-frame-pointer  20190805  20190803
9812148  ref  gcc -funroll-loops -m32 -march=pentium-mmx -O3 -fomit-frame-pointer  20190805  20190803
9812392  ref  gcc -funroll-loops -m32 -march=pentium-mmx -O -fomit-frame-pointer  20190805  20190803
9826920  ref  gcc -funroll-loops -m32 -march=pentium -O -fomit-frame-pointer  20190805  20190803
9839336  ref  gcc -funroll-loops -m32 -march=i486 -O -fomit-frame-pointer  20190805  20190803
9901928  ref  gcc -funroll-loops -m32 -march=i486 -O2 -fomit-frame-pointer  20190805  20190803
9931532  ref  gcc -funroll-loops -m32 -march=nocona -O -fomit-frame-pointer  20190805  20190803
9939020  ref  gcc -funroll-loops -m32 -march=nocona -O3 -fomit-frame-pointer  20190805  20190803
9941008  ref  gcc -funroll-loops -m32 -march=nocona -Os -fomit-frame-pointer  20190805  20190803
9942068  ref  gcc -funroll-loops -m32 -march=prescott -O3 -fomit-frame-pointer  20190805  20190803
9942184  ref  gcc -m32 -march=core2 -Os -fomit-frame-pointer  20190805  20190803
9956540  ref  gcc -funroll-loops -m32 -march=pentium -O2 -fomit-frame-pointer  20190805  20190803
9960444  ref  gcc -funroll-loops -m32 -march=pentium-mmx -O2 -fomit-frame-pointer  20190805  20190803
9977280  ref  gcc -funroll-loops -m32 -march=prescott -O -fomit-frame-pointer  20190805  20190803
9994612  ref  gcc -funroll-loops -m32 -march=pentium4 -Os -fomit-frame-pointer  20190805  20190803
10006396  ref  gcc -funroll-loops -m32 -march=prescott -Os -fomit-frame-pointer  20190805  20190803
10007644  ref  gcc -m32 -march=nocona -O -fomit-frame-pointer  20190805  20190803
10010900  ref  gcc -m32 -march=prescott -O3 -fomit-frame-pointer  20190805  20190803
10013448  ref  gcc -m32 -march=corei7 -Os -fomit-frame-pointer  20190805  20190803
10014940  ref  gcc -m32 -march=core2 -msse4.1 -Os -fomit-frame-pointer  20190805  20190803
10018844  ref  gcc -m32 -march=core2 -msse4 -Os -fomit-frame-pointer  20190805  20190803
10025852  ref  gcc -m32 -march=prescott -O -fomit-frame-pointer  20190805  20190803
10058552  ref  gcc -m32 -march=pentium-m -Os -fomit-frame-pointer  20190805  20190803
10081600  ref  gcc -m32 -march=prescott -Os -fomit-frame-pointer  20190805  20190803
10082996  ref  gcc -m32 -march=i486 -O3 -fomit-frame-pointer  20190805  20190803
10086596  ref  gcc -m32 -march=nocona -O3 -fomit-frame-pointer  20190805  20190803
10095164  ref  gcc -m32 -march=nocona -Os -fomit-frame-pointer  20190805  20190803
10147716  ref  gcc -funroll-loops -m32 -march=i386 -O -fomit-frame-pointer  20190805  20190803
10148736  ref  gcc -m32 -march=pentium -O -fomit-frame-pointer  20190805  20190803
10152456  ref  gcc -m32 -march=i486 -O -fomit-frame-pointer  20190805  20190803
10153848  ref  gcc -m32 -march=pentium-mmx -O -fomit-frame-pointer  20190805  20190803
10172604  ref  gcc -m32 -march=pentium4 -Os -fomit-frame-pointer  20190805  20190803
10201244  ref  gcc -funroll-loops -m32 -march=prescott -O2 -fomit-frame-pointer  20190805  20190803
10214120  ref  gcc -funroll-loops -m32 -march=nocona -O2 -fomit-frame-pointer  20190805  20190803
10264412  ref  gcc -m32 -march=pentium -O3 -fomit-frame-pointer  20190805  20190803
10300620  ref  gcc -m32 -march=pentium-mmx -O3 -fomit-frame-pointer  20190805  20190803
10345596  ref  gcc -m32 -march=i486 -O2 -fomit-frame-pointer  20190805  20190803
10384948  ref  gcc -m32 -march=pentium-mmx -O2 -fomit-frame-pointer  20190805  20190803
10389784  ref  gcc -m32 -march=pentium -O2 -fomit-frame-pointer  20190805  20190803
10410720  ref  gcc -m32 -march=i386 -O -fomit-frame-pointer  20190805  20190803
10451628  ref  gcc -m32 -march=prescott -O2 -fomit-frame-pointer  20190805  20190803
10458848  ref  gcc -m32 -march=nocona -O2 -fomit-frame-pointer  20190805  20190803
10723740  ref  gcc -funroll-loops -m32 -march=i386 -O2 -fomit-frame-pointer  20190805  20190803
10760212  ref  gcc -funroll-loops -m32 -march=i386 -O3 -fomit-frame-pointer  20190805  20190803
10977360  ref  gcc -m32 -march=i386 -O3 -fomit-frame-pointer  20190805  20190803
11107416  ref  gcc -m32 -march=i386 -O2 -fomit-frame-pointer  20190805  20190803
11442792  ref  gcc -m32 -march=k8 -O2 -fomit-frame-pointer  20190805  20190803
12573360  ref  gcc -funroll-loops -m32 -march=barcelona -Os -fomit-frame-pointer  20190805  20190803
12748032  ref  gcc -m32 -march=barcelona -Os -fomit-frame-pointer  20190805  20190803
12772416  ref  gcc -funroll-loops -m32 -march=k8 -O3 -fomit-frame-pointer  20190805  20190803
12800744  ref  gcc -funroll-loops -m32 -march=k8 -Os -fomit-frame-pointer  20190805  20190803
12878364  ref  gcc -m32 -march=k8 -Os -fomit-frame-pointer  20190805  20190803
13125032  ref  gcc -funroll-loops -m32 -march=pentium-m -O3 -fomit-frame-pointer  20190805  20190803
13152048  ref  gcc -m32 -march=k8 -O3 -fomit-frame-pointer  20190805  20190803
13167884  ref  gcc -m32 -march=pentium-m -O3 -fomit-frame-pointer  20190805  20190803
13383812  ref  gcc -funroll-loops -m32 -march=k8 -O2 -fomit-frame-pointer  20190805  20190803
13466188  ref  gcc -funroll-loops -m32 -march=pentium-m -O2 -fomit-frame-pointer  20190805  20190803
14606776  ref  gcc -m32 -march=barcelona -O3 -fomit-frame-pointer  20190805  20190803
14609860  ref  gcc -funroll-loops -m32 -march=barcelona -O3 -fomit-frame-pointer  20190805  20190803
14745996  ref  gcc -m32 -march=pentium-m -O2 -fomit-frame-pointer  20190805  20190803
15219968  ref  gcc -funroll-loops -m32 -march=barcelona -O2 -fomit-frame-pointer  20190805  20190803
16129988  ref  gcc -m32 -march=barcelona -O2 -fomit-frame-pointer  20190805  20190803
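
To compare rows of the timing table programmatically, a small parser can help. The following is a minimal sketch, not part of SUPERCOP: it assumes each row consists of a numeric time, a lowercase implementation name, a compiler command starting with `gcc`, and two 8-digit dates, and it tolerates missing whitespace between columns. The names `ROW` and `parse_row` are made up for this illustration, and the four sample rows are copied from the table.

```python
import re

# Columns: time, implementation, compiler command, benchmark date,
# SUPERCOP version. Whitespace between columns is optional (assumption:
# the implementation name is lowercase alphanumeric and the compiler
# command starts with "gcc").
ROW = re.compile(r"^(\d+)\s*([a-z0-9]+?)\s*(gcc\b.*?)\s*(\d{8})\s*(\d{8})$")

def parse_row(line):
    """Split one table row into a dict of named fields."""
    m = ROW.match(line.strip())
    if m is None:
        raise ValueError(f"unrecognised row: {line!r}")
    time, impl, compiler, bench_date, version = m.groups()
    return {
        "time": int(time),
        "implementation": impl,
        "compiler": compiler,
        "date": bench_date,
        "version": version,
    }

# Four rows copied from the table: the fastest, the slowest, and two
# -march=native builds.
SAMPLE = [
    "8055456 ref gcc -funroll-loops -m32 -march=athlon -O3 -fomit-frame-pointer 20190805 20190803",
    "8507520 ref gcc -m32 -march=native -mtune=native -Os -fomit-frame-pointer 20190805 20190803",
    "9467312 ref gcc -m32 -march=native -mtune=native -O3 -fomit-frame-pointer 20190805 20190803",
    "16129988 ref gcc -m32 -march=barcelona -O2 -fomit-frame-pointer 20190805 20190803",
]

rows = [parse_row(s) for s in SAMPLE]
fastest = min(rows, key=lambda r: r["time"])
slowest = max(rows, key=lambda r: r["time"])

print("fastest:", fastest["compiler"])
print("slowest:", slowest["compiler"])
print("slowest / fastest:", round(slowest["time"] / fastest["time"], 2))
```

On these rows the slowest build takes about twice the time of the fastest one, and the `-march=native` builds fall in between even though they target the actual CPU.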

Compiler output

Implementation: crypto_sign/dilithium3aes/avx2
Compiler: gcc -funroll-loops -m32 -O2 -fomit-frame-pointer
KeccakP-1600-times4-SIMD256.c: KeccakP-1600-times4-SIMD256.c: In function 'KeccakP1600times4_AddLanesAll':
KeccakP-1600-times4-SIMD256.c: KeccakP-1600-times4-SIMD256.c:135:40: warning: AVX vector return without AVX enabled changes the ABI [-Wpsabi]
KeccakP-1600-times4-SIMD256.c: #define Xor_In4( argIndex ) lanes0 = LOAD256u( curData0[argIndex]),\
KeccakP-1600-times4-SIMD256.c: ^
KeccakP-1600-times4-SIMD256.c: KeccakP-1600-times4-SIMD256.c:146:9: note: in expansion of macro 'Xor_In4'
KeccakP-1600-times4-SIMD256.c: Xor_In4( 0 );
KeccakP-1600-times4-SIMD256.c: ^~~~~~~
KeccakP-1600-times4-SIMD256.c: In file included from /usr/lib/gcc/x86_64-linux-gnu/7/include/immintrin.h:43:0,
KeccakP-1600-times4-SIMD256.c: from KeccakP-1600-times4-SIMD256.c:21:
KeccakP-1600-times4-SIMD256.c: /usr/lib/gcc/x86_64-linux-gnu/7/include/avx2intrin.h:913:1: error: inlining failed in call to always_inline '_mm256_xor_si256': target specific option mismatch
KeccakP-1600-times4-SIMD256.c: _mm256_xor_si256 (__m256i __A, __m256i __B)
KeccakP-1600-times4-SIMD256.c: ^~~~~~~~~~~~~~~~
KeccakP-1600-times4-SIMD256.c: KeccakP-1600-times4-SIMD256.c:55:41: note: called from here
KeccakP-1600-times4-SIMD256.c: #define XOReq256(a, b) a = _mm256_xor_si256(a, b)
KeccakP-1600-times4-SIMD256.c: ^~~~~~~~~~~~~~~~~~~~~~
KeccakP-1600-times4-SIMD256.c: KeccakP-1600-times4-SIMD256.c:143:33: note: in expansion of macro 'XOReq256'
KeccakP-1600-times4-SIMD256.c: XOReq256( stateAsLanes[argIndex+3], lanes3 )
KeccakP-1600-times4-SIMD256.c: ^~~~~~~~
KeccakP-1600-times4-SIMD256.c: KeccakP-1600-times4-SIMD256.c:149:9: note: in expansion of macro 'Xor_In4'
KeccakP-1600-times4-SIMD256.c: Xor_In4( 12 );
KeccakP-1600-times4-SIMD256.c: ^~~~~~~
KeccakP-1600-times4-SIMD256.c: In file included from /usr/lib/gcc/x86_64-linux-gnu/7/include/immintrin.h:43:0,
KeccakP-1600-times4-SIMD256.c: from KeccakP-1600-times4-SIMD256.c:21:
KeccakP-1600-times4-SIMD256.c: /usr/lib/gcc/x86_64-linux-gnu/7/include/avx2intrin.h:913:1: error: inlining failed in call to always_inline '_mm256_xor_si256': target specific option mismatch
KeccakP-1600-times4-SIMD256.c: _mm256_xor_si256 (__m256i __A, __m256i __B)
KeccakP-1600-times4-SIMD256.c: ...

Number of similar (compiler,implementation) pairs: 156, namely:
Compiler  Implementations
gcc -funroll-loops -m32 -O2 -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -O3 -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -O -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -Os -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=athlon -O2 -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=athlon -O3 -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=athlon -O -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=athlon -Os -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=barcelona -O2 -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=barcelona -O3 -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=barcelona -O -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=barcelona -Os -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=i386 -O2 -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=i386 -O3 -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=i386 -O -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=i386 -Os -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=i486 -O2 -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=i486 -O3 -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=i486 -O -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=i486 -Os -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=k6-2 -O2 -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=k6-2 -O3 -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=k6-2 -O -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=k6-2 -Os -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=k6-3 -O2 -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=k6-3 -O3 -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=k6-3 -O -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=k6-3 -Os -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=k6 -O2 -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=k6 -O3 -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=k6 -O -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=k6 -Os -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=k8 -O2 -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=k8 -O3 -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=k8 -O -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=k8 -Os -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=nocona -O2 -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=nocona -O3 -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=nocona -O -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=nocona -Os -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=pentium-m -O2 -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=pentium-m -O3 -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=pentium-m -O -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=pentium-m -Os -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=pentium-mmx -O2 -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=pentium-mmx -O3 -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=pentium-mmx -O -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=pentium-mmx -Os -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=pentium2 -O2 -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=pentium2 -O3 -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=pentium2 -O -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=pentium2 -Os -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=pentium3 -O2 -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=pentium3 -O3 -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=pentium3 -O -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=pentium3 -Os -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=pentium4 -O2 -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=pentium4 -O3 -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=pentium4 -O -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=pentium4 -Os -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=pentium -O2 -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=pentium -O3 -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=pentium -O -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=pentium -Os -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=pentiumpro -O2 -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=pentiumpro -O3 -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=pentiumpro -O -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=pentiumpro -Os -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=prescott -O2 -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=prescott -O3 -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=prescott -O -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=prescott -Os -fomit-frame-pointer avx2
gcc -m32 -O2 -fomit-frame-pointer avx2
gcc -m32 -O3 -fomit-frame-pointer avx2
gcc -m32 -O -fomit-frame-pointer avx2
gcc -m32 -Os -fomit-frame-pointer avx2
gcc -m32 -march=athlon -O2 -fomit-frame-pointer avx2
gcc -m32 -march=athlon -O3 -fomit-frame-pointer avx2
gcc -m32 -march=athlon -O -fomit-frame-pointer avx2
gcc -m32 -march=athlon -Os -fomit-frame-pointer avx2
gcc -m32 -march=core2 -O2 -fomit-frame-pointer avx2
gcc -m32 -march=core2 -O3 -fomit-frame-pointer avx2
gcc -m32 -march=core2 -O -fomit-frame-pointer avx2
gcc -m32 -march=core2 -Os -fomit-frame-pointer avx2
gcc -m32 -march=core2 -msse4.1 -O2 -fomit-frame-pointer avx2
gcc -m32 -march=core2 -msse4.1 -O3 -fomit-frame-pointer avx2
gcc -m32 -march=core2 -msse4.1 -O -fomit-frame-pointer avx2
gcc -m32 -march=core2 -msse4.1 -Os -fomit-frame-pointer avx2
gcc -m32 -march=core2 -msse4 -O2 -fomit-frame-pointer avx2
gcc -m32 -march=core2 -msse4 -O3 -fomit-frame-pointer avx2
gcc -m32 -march=core2 -msse4 -O -fomit-frame-pointer avx2
gcc -m32 -march=core2 -msse4 -Os -fomit-frame-pointer avx2
gcc -m32 -march=corei7 -O2 -fomit-frame-pointer avx2
gcc -m32 -march=corei7 -O3 -fomit-frame-pointer avx2
gcc -m32 -march=corei7 -O -fomit-frame-pointer avx2
gcc -m32 -march=corei7 -Os -fomit-frame-pointer avx2
gcc -m32 -march=i386 -O2 -fomit-frame-pointer avx2
gcc -m32 -march=i386 -O3 -fomit-frame-pointer avx2
gcc -m32 -march=i386 -O -fomit-frame-pointer avx2
gcc -m32 -march=i386 -Os -fomit-frame-pointer avx2
gcc -m32 -march=i486 -O2 -fomit-frame-pointer avx2
gcc -m32 -march=i486 -O3 -fomit-frame-pointer avx2
gcc -m32 -march=i486 -O -fomit-frame-pointer avx2
gcc -m32 -march=i486 -Os -fomit-frame-pointer avx2
gcc -m32 -march=k6-2 -O2 -fomit-frame-pointer avx2
gcc -m32 -march=k6-2 -O3 -fomit-frame-pointer avx2
gcc -m32 -march=k6-2 -O -fomit-frame-pointer avx2
gcc -m32 -march=k6-2 -Os -fomit-frame-pointer avx2
gcc -m32 -march=k6-3 -O2 -fomit-frame-pointer avx2
gcc -m32 -march=k6-3 -O3 -fomit-frame-pointer avx2
gcc -m32 -march=k6-3 -O -fomit-frame-pointer avx2
gcc -m32 -march=k6-3 -Os -fomit-frame-pointer avx2
gcc -m32 -march=k6 -O2 -fomit-frame-pointer avx2
gcc -m32 -march=k6 -O3 -fomit-frame-pointer avx2
gcc -m32 -march=k6 -O -fomit-frame-pointer avx2
gcc -m32 -march=k6 -Os -fomit-frame-pointer avx2
gcc -m32 -march=k8 -O2 -fomit-frame-pointer avx2
gcc -m32 -march=k8 -O3 -fomit-frame-pointer avx2
gcc -m32 -march=k8 -O -fomit-frame-pointer avx2
gcc -m32 -march=k8 -Os -fomit-frame-pointer avx2
gcc -m32 -march=nocona -O2 -fomit-frame-pointer avx2
gcc -m32 -march=nocona -O3 -fomit-frame-pointer avx2
gcc -m32 -march=nocona -O -fomit-frame-pointer avx2
gcc -m32 -march=nocona -Os -fomit-frame-pointer avx2
gcc -m32 -march=pentium-m -O2 -fomit-frame-pointer avx2
gcc -m32 -march=pentium-m -O3 -fomit-frame-pointer avx2
gcc -m32 -march=pentium-m -O -fomit-frame-pointer avx2
gcc -m32 -march=pentium-m -Os -fomit-frame-pointer avx2
gcc -m32 -march=pentium-mmx -O2 -fomit-frame-pointer avx2
gcc -m32 -march=pentium-mmx -O3 -fomit-frame-pointer avx2
gcc -m32 -march=pentium-mmx -O -fomit-frame-pointer avx2
gcc -m32 -march=pentium-mmx -Os -fomit-frame-pointer avx2
gcc -m32 -march=pentium2 -O2 -fomit-frame-pointer avx2
gcc -m32 -march=pentium2 -O3 -fomit-frame-pointer avx2
gcc -m32 -march=pentium2 -O -fomit-frame-pointer avx2
gcc -m32 -march=pentium2 -Os -fomit-frame-pointer avx2
gcc -m32 -march=pentium3 -O2 -fomit-frame-pointer avx2
gcc -m32 -march=pentium3 -O3 -fomit-frame-pointer avx2
gcc -m32 -march=pentium3 -O -fomit-frame-pointer avx2
gcc -m32 -march=pentium3 -Os -fomit-frame-pointer avx2
gcc -m32 -march=pentium4 -O2 -fomit-frame-pointer avx2
gcc -m32 -march=pentium4 -O3 -fomit-frame-pointer avx2
gcc -m32 -march=pentium4 -O -fomit-frame-pointer avx2
gcc -m32 -march=pentium4 -Os -fomit-frame-pointer avx2
gcc -m32 -march=pentium -O2 -fomit-frame-pointer avx2
gcc -m32 -march=pentium -O3 -fomit-frame-pointer avx2
gcc -m32 -march=pentium -O -fomit-frame-pointer avx2
gcc -m32 -march=pentium -Os -fomit-frame-pointer avx2
gcc -m32 -march=pentiumpro -O2 -fomit-frame-pointer avx2
gcc -m32 -march=pentiumpro -O3 -fomit-frame-pointer avx2
gcc -m32 -march=pentiumpro -O -fomit-frame-pointer avx2
gcc -m32 -march=pentiumpro -Os -fomit-frame-pointer avx2
gcc -m32 -march=prescott -O2 -fomit-frame-pointer avx2
gcc -m32 -march=prescott -O3 -fomit-frame-pointer avx2
gcc -m32 -march=prescott -O -fomit-frame-pointer avx2
gcc -m32 -march=prescott -Os -fomit-frame-pointer avx2

Compiler output

Implementation: crypto_sign/dilithium3aes/avx2
Compiler: gcc -m32 -march=barcelona -O2 -fomit-frame-pointer
KeccakP-1600-times4-SIMD256.c: KeccakP-1600-times4-SIMD256.c: In function 'KeccakP1600times4_AddLanesAll':
KeccakP-1600-times4-SIMD256.c: KeccakP-1600-times4-SIMD256.c:135:40: warning: AVX vector return without AVX enabled changes the ABI [-Wpsabi]
KeccakP-1600-times4-SIMD256.c: #define Xor_In4( argIndex ) lanes0 = LOAD256u( curData0[argIndex]),\
KeccakP-1600-times4-SIMD256.c: ^
KeccakP-1600-times4-SIMD256.c: KeccakP-1600-times4-SIMD256.c:146:9: note: in expansion of macro 'Xor_In4'
KeccakP-1600-times4-SIMD256.c: Xor_In4( 0 );
KeccakP-1600-times4-SIMD256.c: ^~~~~~~
KeccakP-1600-times4-SIMD256.c: In file included from /usr/lib/gcc/x86_64-linux-gnu/7/include/immintrin.h:43:0,
KeccakP-1600-times4-SIMD256.c: from KeccakP-1600-times4-SIMD256.c:21:
KeccakP-1600-times4-SIMD256.c: /usr/lib/gcc/x86_64-linux-gnu/7/include/avx2intrin.h:913:1: error: inlining failed in call to always_inline '_mm256_xor_si256': target specific option mismatch
KeccakP-1600-times4-SIMD256.c: _mm256_xor_si256 (__m256i __A, __m256i __B)
KeccakP-1600-times4-SIMD256.c: ^~~~~~~~~~~~~~~~
KeccakP-1600-times4-SIMD256.c: KeccakP-1600-times4-SIMD256.c:55:41: note: called from here
KeccakP-1600-times4-SIMD256.c: #define XOReq256(a, b) a = _mm256_xor_si256(a, b)
KeccakP-1600-times4-SIMD256.c: ^~~~~~~~~~~~~~~~~~~~~~
KeccakP-1600-times4-SIMD256.c: KeccakP-1600-times4-SIMD256.c:143:33: note: in expansion of macro 'XOReq256'
KeccakP-1600-times4-SIMD256.c: XOReq256( stateAsLanes[argIndex+3], lanes3 )
KeccakP-1600-times4-SIMD256.c: ^~~~~~~~
KeccakP-1600-times4-SIMD256.c: KeccakP-1600-times4-SIMD256.c:149:9: note: in expansion of macro 'Xor_In4'
KeccakP-1600-times4-SIMD256.c: Xor_In4( 12 );
KeccakP-1600-times4-SIMD256.c: ^~~~~~~
KeccakP-1600-times4-SIMD256.c: In file included from /usr/lib/gcc/x86_64-linux-gnu/7/include/immintrin.h:43:0,
KeccakP-1600-times4-SIMD256.c: from KeccakP-1600-times4-SIMD256.c:21:
KeccakP-1600-times4-SIMD256.c: /usr/lib/gcc/x86_64-linux-gnu/7/include/avx2intrin.h:913:1: error: inlining failed in call to always_inline '_mm256_xor_si256': target specific option mismatch
KeccakP-1600-times4-SIMD256.c: _mm256_xor_si256 (__m256i __A, __m256i __B)
KeccakP-1600-times4-SIMD256.c: ...
KeccakP-1600-times4-SIMD256.c: KeccakP-1600-times4-SIMD256.c: In function 'KeccakP1600times4_AddLanesAll':
KeccakP-1600-times4-SIMD256.c: KeccakP-1600-times4-SIMD256.c:135:40: warning: AVX vector return without AVX enabled changes the ABI [-Wpsabi]
KeccakP-1600-times4-SIMD256.c: #define Xor_In4( argIndex ) lanes0 = LOAD256u( curData0[argIndex]),\
KeccakP-1600-times4-SIMD256.c: ^
KeccakP-1600-times4-SIMD256.c: KeccakP-1600-times4-SIMD256.c:146:9: note: in expansion of macro 'Xor_In4'
KeccakP-1600-times4-SIMD256.c: Xor_In4( 0 );
KeccakP-1600-times4-SIMD256.c: ^~~~~~~
KeccakP-1600-times4-SIMD256.c: In file included from /usr/lib/gcc/x86_64-linux-gnu/7/include/immintrin.h:43:0,
KeccakP-1600-times4-SIMD256.c: from KeccakP-1600-times4-SIMD256.c:21:
KeccakP-1600-times4-SIMD256.c: /usr/lib/gcc/x86_64-linux-gnu/7/include/avx2intrin.h:913:1: error: inlining failed in call to always_inline '_mm256_xor_si256': target specific option mismatch
KeccakP-1600-times4-SIMD256.c: _mm256_xor_si256 (__m256i __A, __m256i __B)
KeccakP-1600-times4-SIMD256.c: ^~~~~~~~~~~~~~~~
KeccakP-1600-times4-SIMD256.c: KeccakP-1600-times4-SIMD256.c:55:41: note: called from here
KeccakP-1600-times4-SIMD256.c: #define XOReq256(a, b) a = _mm256_xor_si256(a, b)
KeccakP-1600-times4-SIMD256.c: ^~~~~~~~~~~~~~~~~~~~~~
KeccakP-1600-times4-SIMD256.c: KeccakP-1600-times4-SIMD256.c:143:33: note: in expansion of macro 'XOReq256'
KeccakP-1600-times4-SIMD256.c: XOReq256( stateAsLanes[argIndex+3], lanes3 )
KeccakP-1600-times4-SIMD256.c: ^~~~~~~~
KeccakP-1600-times4-SIMD256.c: KeccakP-1600-times4-SIMD256.c:149:9: note: in expansion of macro 'Xor_In4'
KeccakP-1600-times4-SIMD256.c: Xor_In4( 12 );
KeccakP-1600-times4-SIMD256.c: ^~~~~~~
KeccakP-1600-times4-SIMD256.c: In file included from /usr/lib/gcc/x86_64-linux-gnu/7/include/immintrin.h:43:0,
KeccakP-1600-times4-SIMD256.c: from KeccakP-1600-times4-SIMD256.c:21:
KeccakP-1600-times4-SIMD256.c: /usr/lib/gcc/x86_64-linux-gnu/7/include/avx2intrin.h:913:1: error: inlining failed in call to always_inline '_mm256_xor_si256': target specific option mismatch
KeccakP-1600-times4-SIMD256.c: _mm256_xor_si256 (__m256i __A, __m256i __B)
KeccakP-1600-times4-SIMD256.c: ...

Number of similar (compiler,implementation) pairs: 4, namely:
Compiler  Implementations
gcc -m32 -march=barcelona -O2 -fomit-frame-pointer avx2
gcc -m32 -march=barcelona -O3 -fomit-frame-pointer avx2
gcc -m32 -march=barcelona -O -fomit-frame-pointer avx2
gcc -m32 -march=barcelona -Os -fomit-frame-pointer avx2

Compiler output

Implementation: crypto_sign/dilithium3aes/avx2
Compiler: gcc -m32 -march=core-avx-i -O2 -fomit-frame-pointer
KeccakP-1600-times4-SIMD256.c: In file included from /usr/lib/gcc/x86_64-linux-gnu/7/include/immintrin.h:43:0,
KeccakP-1600-times4-SIMD256.c: from KeccakP-1600-times4-SIMD256.c:21:
KeccakP-1600-times4-SIMD256.c: KeccakP-1600-times4-SIMD256.c: In function 'KeccakP1600times4_AddLanesAll':
KeccakP-1600-times4-SIMD256.c: /usr/lib/gcc/x86_64-linux-gnu/7/include/avx2intrin.h:913:1: error: inlining failed in call to always_inline '_mm256_xor_si256': target specific option mismatch
KeccakP-1600-times4-SIMD256.c: _mm256_xor_si256 (__m256i __A, __m256i __B)
KeccakP-1600-times4-SIMD256.c: ^~~~~~~~~~~~~~~~
KeccakP-1600-times4-SIMD256.c: KeccakP-1600-times4-SIMD256.c:55:41: note: called from here
KeccakP-1600-times4-SIMD256.c: #define XOReq256(a, b) a = _mm256_xor_si256(a, b)
KeccakP-1600-times4-SIMD256.c: ^~~~~~~~~~~~~~~~~~~~~~
KeccakP-1600-times4-SIMD256.c: KeccakP-1600-times4-SIMD256.c:143:33: note: in expansion of macro 'XOReq256'
KeccakP-1600-times4-SIMD256.c: XOReq256( stateAsLanes[argIndex+3], lanes3 )
KeccakP-1600-times4-SIMD256.c: ^~~~~~~~
KeccakP-1600-times4-SIMD256.c: KeccakP-1600-times4-SIMD256.c:149:9: note: in expansion of macro 'Xor_In4'
KeccakP-1600-times4-SIMD256.c: Xor_In4( 12 );
KeccakP-1600-times4-SIMD256.c: ^~~~~~~
KeccakP-1600-times4-SIMD256.c: In file included from /usr/lib/gcc/x86_64-linux-gnu/7/include/immintrin.h:43:0,
KeccakP-1600-times4-SIMD256.c: from KeccakP-1600-times4-SIMD256.c:21:
KeccakP-1600-times4-SIMD256.c: /usr/lib/gcc/x86_64-linux-gnu/7/include/avx2intrin.h:913:1: error: inlining failed in call to always_inline '_mm256_xor_si256': target specific option mismatch
KeccakP-1600-times4-SIMD256.c: _mm256_xor_si256 (__m256i __A, __m256i __B)
KeccakP-1600-times4-SIMD256.c: ^~~~~~~~~~~~~~~~
KeccakP-1600-times4-SIMD256.c: KeccakP-1600-times4-SIMD256.c:55:41: note: called from here
KeccakP-1600-times4-SIMD256.c: #define XOReq256(a, b) a = _mm256_xor_si256(a, b)
KeccakP-1600-times4-SIMD256.c: ^~~~~~~~~~~~~~~~~~~~~~
KeccakP-1600-times4-SIMD256.c: KeccakP-1600-times4-SIMD256.c:142:33: note: in expansion of macro 'XOReq256'
KeccakP-1600-times4-SIMD256.c: XOReq256( stateAsLanes[argIndex+2], lanes2 ),\
KeccakP-1600-times4-SIMD256.c: ...

Number of similar (compiler,implementation) pairs: 8, namely:
Compiler  Implementations
gcc -m32 -march=core-avx-i -O2 -fomit-frame-pointer avx2
gcc -m32 -march=core-avx-i -O3 -fomit-frame-pointer avx2
gcc -m32 -march=core-avx-i -O -fomit-frame-pointer avx2
gcc -m32 -march=core-avx-i -Os -fomit-frame-pointer avx2
gcc -m32 -march=corei7-avx -O2 -fomit-frame-pointer avx2
gcc -m32 -march=corei7-avx -O3 -fomit-frame-pointer avx2
gcc -m32 -march=corei7-avx -O -fomit-frame-pointer avx2
gcc -m32 -march=corei7-avx -Os -fomit-frame-pointer avx2
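
The "target specific option mismatch" error is what GCC reports when an always_inline intrinsic such as `_mm256_xor_si256` is called from code compiled without the target feature it needs, here AVX2. Command lines with no AVX at all additionally trigger the `-Wpsabi` warning, while `-march=core-avx-i` and `-march=corei7-avx` enable AVX but not AVX2 and so produce only the error. The sketch below classifies a gcc command line by the vector level its flags enable. It is an illustration under assumptions: the `-march` tables cover only names appearing in this report plus their newer GCC aliases, and `-march=native` is treated as AVX2 because this report groups it with `-march=core-avx2` on titan0. The function name `vector_isa` and the parameter `native_isa` are hypothetical.

```python
import shlex

# Partial -march tables (assumption: only the names relevant to this
# report are listed; every other -march value is treated as "no AVX").
AVX2_ARCHES = {"core-avx2", "haswell"}
AVX_ARCHES = {"corei7-avx", "core-avx-i", "sandybridge", "ivybridge"}

RANK = {"none": 0, "avx": 1, "avx2": 2}

def vector_isa(command, native_isa="avx2"):
    """Return 'avx2', 'avx' or 'none' for a gcc command line.

    native_isa is what -march=native resolves to on the build host;
    the default reflects this report's machine, not a general rule.
    """
    level = "none"
    for token in shlex.split(command):
        if token.startswith("-march="):
            arch = token[len("-march="):]
            if arch == "native":
                found = native_isa
            elif arch in AVX2_ARCHES:
                found = "avx2"
            elif arch in AVX_ARCHES:
                found = "avx"
            else:
                found = "none"
        elif token == "-mavx2":
            found = "avx2"
        elif token == "-mavx":
            found = "avx"
        else:
            continue
        # Keep the highest level enabled by any flag.
        if RANK[found] > RANK[level]:
            level = found
    return level

for cmd in [
    "gcc -funroll-loops -m32 -O2 -fomit-frame-pointer",
    "gcc -m32 -march=core-avx-i -O2 -fomit-frame-pointer",
    "gcc -m32 -march=core-avx2 -O2 -fomit-frame-pointer",
]:
    print(vector_isa(cmd), "<-", cmd)
```

Only command lines classified as `avx2` can get past the intrinsics in KeccakP-1600-times4-SIMD256.c; the other two classes correspond to the failure groups listed above.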

Compiler output

Implementation: crypto_sign/dilithium3aes/avx2
Compiler: gcc -m32 -march=core-avx2 -O2 -fomit-frame-pointer
invntt.s: invntt.s: Assembler messages:
invntt.s: invntt.s:47: Error: bad register name `%rip)'
invntt.s: invntt.s:48: Error: bad register name `%rip)'
invntt.s: invntt.s:49: Error: bad register name `%rip)'
invntt.s: invntt.s:52: Error: bad register name `%rsi)'
invntt.s: invntt.s:53: Error: bad register name `%rsi)'
invntt.s: invntt.s:54: Error: bad register name `%rsi)'
invntt.s: invntt.s:55: Error: bad register name `%rsi)'
invntt.s: invntt.s:58: Error: bad register name `%ymm8'
invntt.s: invntt.s:59: Error: bad register name `%ymm10'
invntt.s: invntt.s:59: Error: bad register name `%ymm10'
invntt.s: invntt.s:61: Error: bad register name `%ymm8'
invntt.s: invntt.s:61: Error: bad register name `%ymm8'
invntt.s: invntt.s:62: Error: bad register name `%ymm10'
invntt.s: invntt.s:62: Error: bad register name `%ymm10'
invntt.s: invntt.s:66: Error: bad register name `%ymm8'
invntt.s: invntt.s:67: Error: bad register name `%ymm10'
invntt.s: invntt.s:70: Error: bad register name `%rdx)'
invntt.s: invntt.s:71: Error: bad register name `%rdx)'
invntt.s: invntt.s:72: Error: bad register name `%ymm12'
invntt.s: invntt.s:73: Error: bad register name `%ymm13'
invntt.s: invntt.s:74: Error: bad register name `%ymm8'
invntt.s: invntt.s:76: Error: bad register name `%ymm12'
invntt.s: invntt.s:77: Error: bad register name `%ymm13'
invntt.s: invntt.s:78: Error: bad register name `%ymm9'
invntt.s: ...

Number of similar (compiler,implementation) pairs: 8, namely:
Compiler  Implementations
gcc -m32 -march=core-avx2 -O2 -fomit-frame-pointer avx2
gcc -m32 -march=core-avx2 -O3 -fomit-frame-pointer avx2
gcc -m32 -march=core-avx2 -O -fomit-frame-pointer avx2
gcc -m32 -march=core-avx2 -Os -fomit-frame-pointer avx2
gcc -m32 -march=native -mtune=native -O2 -fomit-frame-pointer avx2
gcc -m32 -march=native -mtune=native -O3 -fomit-frame-pointer avx2
gcc -m32 -march=native -mtune=native -O -fomit-frame-pointer avx2
gcc -m32 -march=native -mtune=native -Os -fomit-frame-pointer avx2
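
The `bad register name` messages for invntt.s arise because the file is assembled in 32-bit mode (`-m32`), where `%rip`, `%rsi`, `%rdx` and `%ymm8` through `%ymm15` do not exist. A rough pre-check for this situation is to scan an AT&T-syntax assembly source for registers that are only available in 64-bit mode. The sketch below does that with a regular expression. The function name is hypothetical, the pattern deliberately ignores AVX-512 registers, and the sample instructions are invented for illustration rather than taken from invntt.s.

```python
import re

# Registers that exist only in 64-bit mode (AT&T syntax, "%" prefix):
# the r-prefixed general-purpose registers, %rip, %r8..%r15 and their
# sub-registers, the low-byte registers %sil/%dil/%spl/%bpl, and
# vector registers 8..15.
ONLY_64BIT = re.compile(
    r"%(r[abcd]x|r[sd]i|r[sb]p|rip"
    r"|r(?:[89]|1[0-5])[dwb]?"
    r"|[xy]mm(?:[89]|1[0-5])"
    r"|[sd]il|[sb]pl)\b"
)

def find_64bit_only_registers(line):
    """Return the 64-bit-only register names used on one source line."""
    return ONLY_64BIT.findall(line)

# Hypothetical instructions in the style of an AVX2 NTT routine.
source = [
    "vmovdqa (%rsi),%ymm4",
    "vpmuludq %ymm8,%ymm10,%ymm10",
    "vpaddd %ymm1,%ymm2,%ymm3",
]

for number, line in enumerate(source, start=1):
    found = find_64bit_only_registers(line)
    if found:
        print(f"line {number}: needs 64-bit mode for {', '.join(found)}")
```

A file that triggers this check is unlikely to assemble under `-m32` regardless of which `-march` value is given.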