Implementation notes: x86, titan0, crypto_sign/dilithium4aes

Computer: titan0
Architecture: x86
CPU ID: GenuineIntel-000306c3-bfebfbff
SUPERCOP version: 20190803
Operation: crypto_sign
Primitive: dilithium4aes
Time  Implementation  Compiler  Benchmark date  SUPERCOP version
6274128  ref  gcc -m32 -march=native -mtune=native -Os -fomit-frame-pointer  20190805  20190803
6337168  ref  gcc -m32 -march=core-avx2 -Os -fomit-frame-pointer  20190805  20190803
6453184  ref  gcc -m32 -march=corei7-avx -Os -fomit-frame-pointer  20190805  20190803
6454248  ref  gcc -m32 -march=core-avx-i -Os -fomit-frame-pointer  20190805  20190803
6541916  ref  gcc -funroll-loops -m32 -march=athlon -O3 -fomit-frame-pointer  20190805  20190803
6639556  ref  gcc -funroll-loops -m32 -march=athlon -O2 -fomit-frame-pointer  20190805  20190803
6657136  ref  gcc -funroll-loops -m32 -O3 -fomit-frame-pointer  20190805  20190803
6717012  ref  gcc -funroll-loops -m32 -O2 -fomit-frame-pointer  20190805  20190803
6752300  ref  gcc -m32 -march=athlon -O3 -fomit-frame-pointer  20190805  20190803
6779864  ref  gcc -funroll-loops -m32 -march=pentium3 -O -fomit-frame-pointer  20190805  20190803
6789216  ref  gcc -funroll-loops -m32 -march=pentium2 -O -fomit-frame-pointer  20190805  20190803
6794256  ref  gcc -funroll-loops -m32 -march=k6-3 -O3 -fomit-frame-pointer  20190805  20190803
6794664  ref  gcc -funroll-loops -m32 -march=pentiumpro -O -fomit-frame-pointer  20190805  20190803
6797376  ref  gcc -funroll-loops -m32 -march=athlon -O -fomit-frame-pointer  20190805  20190803
6821404  ref  gcc -funroll-loops -m32 -march=k6 -O3 -fomit-frame-pointer  20190805  20190803
6834756  ref  gcc -funroll-loops -m32 -march=pentium2 -O3 -fomit-frame-pointer  20190805  20190803
6848340  ref  gcc -funroll-loops -m32 -march=pentium3 -O3 -fomit-frame-pointer  20190805  20190803
6857996  ref  gcc -funroll-loops -m32 -O -fomit-frame-pointer  20190805  20190803
6872016  ref  gcc -funroll-loops -m32 -march=k6-2 -O2 -fomit-frame-pointer  20190805  20190803
6872236  ref  gcc -funroll-loops -m32 -march=k6 -O2 -fomit-frame-pointer  20190805  20190803
6875008  ref  gcc -funroll-loops -m32 -march=k6-2 -O3 -fomit-frame-pointer  20190805  20190803
6883772  ref  gcc -funroll-loops -m32 -march=pentiumpro -O3 -fomit-frame-pointer  20190805  20190803
6895764  ref  gcc -m32 -march=k6-3 -O3 -fomit-frame-pointer  20190805  20190803
6897544  ref  gcc -m32 -march=athlon -O2 -fomit-frame-pointer  20190805  20190803
6928516  ref  gcc -m32 -march=k6-2 -O3 -fomit-frame-pointer  20190805  20190803
6932128  ref  gcc -m32 -O3 -fomit-frame-pointer  20190805  20190803
6935972  ref  gcc -m32 -march=athlon -O -fomit-frame-pointer  20190805  20190803
6943988  ref  gcc -funroll-loops -m32 -march=k6-3 -O2 -fomit-frame-pointer  20190805  20190803
6950288  ref  gcc -funroll-loops -m32 -march=pentiumpro -O2 -fomit-frame-pointer  20190805  20190803
6991648  ref  gcc -funroll-loops -m32 -march=pentium2 -O2 -fomit-frame-pointer  20190805  20190803
6993604  ref  gcc -funroll-loops -m32 -march=k6-2 -O -fomit-frame-pointer  20190805  20190803
6996876  ref  gcc -funroll-loops -m32 -march=k6 -O -fomit-frame-pointer  20190805  20190803
7001000  ref  gcc -funroll-loops -m32 -march=pentium3 -O2 -fomit-frame-pointer  20190805  20190803
7002692  ref  gcc -m32 -march=pentium2 -O -fomit-frame-pointer  20190805  20190803
7006540  ref  gcc -funroll-loops -m32 -march=k6-3 -O -fomit-frame-pointer  20190805  20190803
7007544  ref  gcc -m32 -march=pentium3 -O -fomit-frame-pointer  20190805  20190803
7018544  ref  gcc -m32 -march=pentiumpro -O -fomit-frame-pointer  20190805  20190803
7035728  ref  gcc -m32 -march=k6 -O3 -fomit-frame-pointer  20190805  20190803
7037500  ref  gcc -m32 -march=core2 -O3 -fomit-frame-pointer  20190805  20190803
7038700  ref  gcc -m32 -O2 -fomit-frame-pointer  20190805  20190803
7077052  ref  gcc -m32 -O -fomit-frame-pointer  20190805  20190803
7086252  ref  gcc -m32 -march=core2 -msse4 -O3 -fomit-frame-pointer  20190805  20190803
7091884  ref  gcc -m32 -march=pentiumpro -O3 -fomit-frame-pointer  20190805  20190803
7091976  ref  gcc -m32 -march=core2 -msse4.1 -O3 -fomit-frame-pointer  20190805  20190803
7094904  ref  gcc -m32 -march=pentium3 -O3 -fomit-frame-pointer  20190805  20190803
7125656  ref  gcc -m32 -march=k6 -O -fomit-frame-pointer  20190805  20190803
7127896  ref  gcc -m32 -march=k6-2 -O -fomit-frame-pointer  20190805  20190803
7131224  ref  gcc -m32 -march=corei7 -O3 -fomit-frame-pointer  20190805  20190803
7135084  ref  gcc -m32 -march=k6-3 -O -fomit-frame-pointer  20190805  20190803
7152448  ref  gcc -m32 -march=pentium2 -O3 -fomit-frame-pointer  20190805  20190803
7180160  ref  gcc -m32 -march=k6-3 -O2 -fomit-frame-pointer  20190805  20190803
7185332  ref  gcc -funroll-loops -m32 -march=k6-3 -Os -fomit-frame-pointer  20190805  20190803
7186240  ref  gcc -funroll-loops -m32 -march=k6 -Os -fomit-frame-pointer  20190805  20190803
7186956  ref  gcc -funroll-loops -m32 -march=k6-2 -Os -fomit-frame-pointer  20190805  20190803
7193052  ref  gcc -m32 -march=k6-2 -O2 -fomit-frame-pointer  20190805  20190803
7196520  ref  gcc -m32 -march=pentiumpro -O2 -fomit-frame-pointer  20190805  20190803
7197744  ref  gcc -m32 -march=pentium3 -O2 -fomit-frame-pointer  20190805  20190803
7207448  ref  gcc -m32 -march=k6 -O2 -fomit-frame-pointer  20190805  20190803
7222380  ref  gcc -m32 -march=pentium2 -O2 -fomit-frame-pointer  20190805  20190803
7250840  ref  gcc -funroll-loops -m32 -march=pentium-mmx -Os -fomit-frame-pointer  20190805  20190803
7277836  ref  gcc -funroll-loops -m32 -march=pentium -Os -fomit-frame-pointer  20190805  20190803
7281920  ref  gcc -m32 -march=k6 -Os -fomit-frame-pointer  20190805  20190803
7284208  ref  gcc -m32 -march=k6-2 -Os -fomit-frame-pointer  20190805  20190803
7284672  ref  gcc -funroll-loops -m32 -Os -fomit-frame-pointer  20190805  20190803
7285216  ref  gcc -m32 -march=k6-3 -Os -fomit-frame-pointer  20190805  20190803
7293496  ref  gcc -funroll-loops -m32 -march=pentiumpro -Os -fomit-frame-pointer  20190805  20190803
7295136  ref  gcc -funroll-loops -m32 -march=pentium2 -Os -fomit-frame-pointer  20190805  20190803
7308400  ref  gcc -m32 -march=pentiumpro -Os -fomit-frame-pointer  20190805  20190803
7310208  ref  gcc -m32 -march=pentium3 -Os -fomit-frame-pointer  20190805  20190803
7310772  ref  gcc -m32 -march=pentium-mmx -Os -fomit-frame-pointer  20190805  20190803
7313696  ref  gcc -m32 -march=pentium -Os -fomit-frame-pointer  20190805  20190803
7329160  ref  gcc -m32 -march=pentium2 -Os -fomit-frame-pointer  20190805  20190803
7331840  ref  gcc -funroll-loops -m32 -march=pentium3 -Os -fomit-frame-pointer  20190805  20190803
7333088  ref  gcc -m32 -march=athlon -Os -fomit-frame-pointer  20190805  20190803
7341800  ref  gcc -funroll-loops -m32 -march=athlon -Os -fomit-frame-pointer  20190805  20190803
7343392  ref  gcc -m32 -Os -fomit-frame-pointer  20190805  20190803
7396672  ref  gcc -funroll-loops -m32 -march=pentium-m -O -fomit-frame-pointer  20190805  20190803
7441356  ref  gcc -funroll-loops -m32 -march=k8 -O -fomit-frame-pointer  20190805  20190803
7442288  ref  gcc -funroll-loops -m32 -march=pentium -O -fomit-frame-pointer  20190805  20190803
7443448  ref  gcc -m32 -march=corei7 -O2 -fomit-frame-pointer  20190805  20190803
7449180  ref  gcc -funroll-loops -m32 -march=pentium-mmx -O -fomit-frame-pointer  20190805  20190803
7452708  ref  gcc -m32 -march=core2 -O2 -fomit-frame-pointer  20190805  20190803
7457884  ref  gcc -funroll-loops -m32 -march=pentium4 -O3 -fomit-frame-pointer  20190805  20190803
7469292  ref  gcc -m32 -march=core2 -msse4.1 -O2 -fomit-frame-pointer  20190805  20190803
7471576  ref  gcc -funroll-loops -m32 -march=pentium4 -O -fomit-frame-pointer  20190805  20190803
7477776  ref  gcc -funroll-loops -m32 -march=i486 -O -fomit-frame-pointer  20190805  20190803
7493736  ref  gcc -funroll-loops -m32 -march=i486 -O3 -fomit-frame-pointer  20190805  20190803
7499032  ref  gcc -m32 -march=core2 -msse4 -O2 -fomit-frame-pointer  20190805  20190803
7507964  ref  gcc -funroll-loops -m32 -march=barcelona -O -fomit-frame-pointer  20190805  20190803
7509064  ref  gcc -m32 -march=pentium-m -O -fomit-frame-pointer  20190805  20190803
7513792  ref  gcc -funroll-loops -m32 -march=i486 -Os -fomit-frame-pointer  20190805  20190803
7537372  ref  gcc -m32 -march=i486 -Os -fomit-frame-pointer  20190805  20190803
7541400  ref  gcc -m32 -march=k8 -O -fomit-frame-pointer  20190805  20190803
7546476  ref  gcc -m32 -march=i386 -Os -fomit-frame-pointer  20190805  20190803
7549984  ref  gcc -funroll-loops -m32 -march=i386 -Os -fomit-frame-pointer  20190805  20190803
7551904  ref  gcc -funroll-loops -m32 -march=i386 -O -fomit-frame-pointer  20190805  20190803
7560988  ref  gcc -m32 -march=native -mtune=native -O3 -fomit-frame-pointer  20190805  20190803
7563364  ref  gcc -m32 -march=core2 -O -fomit-frame-pointer  20190805  20190803
7565436  ref  gcc -funroll-loops -m32 -march=pentium-m -Os -fomit-frame-pointer  20190805  20190803
7567816  ref  gcc -funroll-loops -m32 -march=pentium-mmx -O3 -fomit-frame-pointer  20190805  20190803
7575684  ref  gcc -m32 -march=barcelona -O -fomit-frame-pointer  20190805  20190803
7577760  ref  gcc -m32 -march=core-avx-i -O -fomit-frame-pointer  20190805  20190803
7578716  ref  gcc -m32 -march=corei7-avx -O -fomit-frame-pointer  20190805  20190803
7579740  ref  gcc -m32 -march=pentium4 -O3 -fomit-frame-pointer  20190805  20190803
7582848  ref  gcc -m32 -march=core-avx2 -O -fomit-frame-pointer  20190805  20190803
7597628  ref  gcc -m32 -march=corei7 -O -fomit-frame-pointer  20190805  20190803
7600828  ref  gcc -m32 -march=core2 -msse4.1 -O -fomit-frame-pointer  20190805  20190803
7604648  ref  gcc -m32 -march=core2 -msse4 -O -fomit-frame-pointer  20190805  20190803
7611484  ref  gcc -funroll-loops -m32 -march=i486 -O2 -fomit-frame-pointer  20190805  20190803
7615112  ref  gcc -m32 -march=native -mtune=native -O -fomit-frame-pointer  20190805  20190803
7624772  ref  gcc -funroll-loops -m32 -march=pentium -O3 -fomit-frame-pointer  20190805  20190803
7628272  ref  gcc -m32 -march=core-avx-i -O3 -fomit-frame-pointer  20190805  20190803
7631416  ref  gcc -m32 -march=pentium4 -O -fomit-frame-pointer  20190805  20190803
7637308  ref  gcc -m32 -march=core-avx2 -O3 -fomit-frame-pointer  20190805  20190803
7645408  ref  gcc -m32 -march=pentium -O -fomit-frame-pointer  20190805  20190803
7647292  ref  gcc -m32 -march=pentium-mmx -O -fomit-frame-pointer  20190805  20190803
7659420  ref  gcc -m32 -march=corei7-avx -O3 -fomit-frame-pointer  20190805  20190803
7669736  ref  gcc -m32 -march=i486 -O3 -fomit-frame-pointer  20190805  20190803
7689416  ref  gcc -m32 -march=core-avx2 -O2 -fomit-frame-pointer  20190805  20190803
7694052  ref  gcc -m32 -march=i486 -O -fomit-frame-pointer  20190805  20190803
7696000  ref  gcc -m32 -march=native -mtune=native -O2 -fomit-frame-pointer  20190805  20190803
7698548  ref  gcc -funroll-loops -m32 -march=pentium -O2 -fomit-frame-pointer  20190805  20190803
7720352  ref  gcc -funroll-loops -m32 -march=pentium-mmx -O2 -fomit-frame-pointer  20190805  20190803
7725344  ref  gcc -funroll-loops -m32 -march=nocona -Os -fomit-frame-pointer  20190805  20190803
7725388  ref  gcc -funroll-loops -m32 -march=pentium4 -Os -fomit-frame-pointer  20190805  20190803
7730260  ref  gcc -funroll-loops -m32 -march=pentium4 -O2 -fomit-frame-pointer  20190805  20190803
7732556  ref  gcc -funroll-loops -m32 -march=prescott -Os -fomit-frame-pointer  20190805  20190803
7748268  ref  gcc -m32 -march=corei7-avx -O2 -fomit-frame-pointer  20190805  20190803
7754984  ref  gcc -funroll-loops -m32 -march=prescott -O -fomit-frame-pointer  20190805  20190803
7755300  ref  gcc -funroll-loops -m32 -march=nocona -O -fomit-frame-pointer  20190805  20190803
7759304  ref  gcc -m32 -march=i386 -O -fomit-frame-pointer  20190805  20190803
7762292  ref  gcc -m32 -march=core-avx-i -O2 -fomit-frame-pointer  20190805  20190803
7763376  ref  gcc -m32 -march=core2 -Os -fomit-frame-pointer  20190805  20190803
7800532  ref  gcc -funroll-loops -m32 -march=prescott -O3 -fomit-frame-pointer  20190805  20190803
7819468  ref  gcc -m32 -march=pentium4 -Os -fomit-frame-pointer  20190805  20190803
7824672  ref  gcc -m32 -march=pentium-m -Os -fomit-frame-pointer  20190805  20190803
7831932  ref  gcc -m32 -march=prescott -O -fomit-frame-pointer  20190805  20190803
7835288  ref  gcc -m32 -march=nocona -O -fomit-frame-pointer  20190805  20190803
7836196  ref  gcc -funroll-loops -m32 -march=nocona -O3 -fomit-frame-pointer  20190805  20190803
7841072  ref  gcc -m32 -march=prescott -Os -fomit-frame-pointer  20190805  20190803
7844588  ref  gcc -m32 -march=core2 -msse4.1 -Os -fomit-frame-pointer  20190805  20190803
7846952  ref  gcc -m32 -march=nocona -Os -fomit-frame-pointer  20190805  20190803
7856348  ref  gcc -m32 -march=core2 -msse4 -Os -fomit-frame-pointer  20190805  20190803
7866160  ref  gcc -m32 -march=i486 -O2 -fomit-frame-pointer  20190805  20190803
7877408  ref  gcc -m32 -march=corei7 -Os -fomit-frame-pointer  20190805  20190803
7891052  ref  gcc -funroll-loops -m32 -march=i386 -O3 -fomit-frame-pointer  20190805  20190803
7902948  ref  gcc -m32 -march=prescott -O3 -fomit-frame-pointer  20190805  20190803
7935976  ref  gcc -m32 -march=pentium-mmx -O3 -fomit-frame-pointer  20190805  20190803
7938204  ref  gcc -m32 -march=pentium -O3 -fomit-frame-pointer  20190805  20190803
7958740  ref  gcc -funroll-loops -m32 -march=i386 -O2 -fomit-frame-pointer  20190805  20190803
7966764  ref  gcc -m32 -march=pentium -O2 -fomit-frame-pointer  20190805  20190803
7996160  ref  gcc -m32 -march=pentium-mmx -O2 -fomit-frame-pointer  20190805  20190803
7997220  ref  gcc -m32 -march=nocona -O3 -fomit-frame-pointer  20190805  20190803
8084944  ref  gcc -m32 -march=pentium4 -O2 -fomit-frame-pointer  20190805  20190803
8139316  ref  gcc -funroll-loops -m32 -march=nocona -O2 -fomit-frame-pointer  20190805  20190803
8144440  ref  gcc -m32 -march=i386 -O3 -fomit-frame-pointer  20190805  20190803
8165368  ref  gcc -funroll-loops -m32 -march=prescott -O2 -fomit-frame-pointer  20190805  20190803
8175264  ref  gcc -m32 -march=i386 -O2 -fomit-frame-pointer  20190805  20190803
8331316  ref  gcc -m32 -march=prescott -O2 -fomit-frame-pointer  20190805  20190803
8366392  ref  gcc -m32 -march=nocona -O2 -fomit-frame-pointer  20190805  20190803
8881800  ref  gcc -funroll-loops -m32 -march=pentium-m -O3 -fomit-frame-pointer  20190805  20190803
8906112  ref  gcc -m32 -march=pentium-m -O3 -fomit-frame-pointer  20190805  20190803
9170232  ref  gcc -funroll-loops -m32 -march=pentium-m -O2 -fomit-frame-pointer  20190805  20190803
9386092  ref  gcc -m32 -march=k8 -O2 -fomit-frame-pointer  20190805  20190803
10172488  ref  gcc -funroll-loops -m32 -march=barcelona -Os -fomit-frame-pointer  20190805  20190803
10511808  ref  gcc -m32 -march=k8 -Os -fomit-frame-pointer  20190805  20190803
10562840  ref  gcc -funroll-loops -m32 -march=k8 -Os -fomit-frame-pointer  20190805  20190803
10563612  ref  gcc -m32 -march=barcelona -Os -fomit-frame-pointer  20190805  20190803
10590220  ref  gcc -m32 -march=pentium-m -O2 -fomit-frame-pointer  20190805  20190803
10978572  ref  gcc -funroll-loops -m32 -march=k8 -O3 -fomit-frame-pointer  20190805  20190803
11209740  ref  gcc -m32 -march=k8 -O3 -fomit-frame-pointer  20190805  20190803
11478388  ref  gcc -funroll-loops -m32 -march=k8 -O2 -fomit-frame-pointer  20190805  20190803
12765888  ref  gcc -m32 -march=barcelona -O3 -fomit-frame-pointer  20190805  20190803
12782336  ref  gcc -funroll-loops -m32 -march=barcelona -O3 -fomit-frame-pointer  20190805  20190803
13484320  ref  gcc -funroll-loops -m32 -march=barcelona -O2 -fomit-frame-pointer  20190805  20190803
14280004  ref  gcc -m32 -march=barcelona -O2 -fomit-frame-pointer  20190805  20190803
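
Each row of the table above carries five fields: time, implementation, compiler command line, benchmark date and SUPERCOP version. The following is a minimal Python sketch of how such rows might be split for further analysis. It assumes every compiler string starts with `gcc` and that both dates are eight digits, as they are here; the names `ROW`, `parse_row` and `spread` are hypothetical and not part of SUPERCOP.

```python
import re

# Five fields per row: time, implementation, compiler command line,
# benchmark date (YYYYMMDD), SUPERCOP version (YYYYMMDD).
# \s* between fields accepts rows with or without separating whitespace.
# Assumption: the compiler command always begins with "gcc".
ROW = re.compile(r"^(\d+)\s*([a-z]\w*?)\s*(gcc\b.*?)\s*(\d{8})\s*(\d{8})$")


def parse_row(line):
    """Split one table row into a dict, or return None if it is not a row."""
    m = ROW.match(line.strip())
    if m is None:
        return None
    time, impl, compiler, date, version = m.groups()
    return {
        "time": int(time),
        "implementation": impl,
        "compiler": compiler,
        "benchmark_date": date,
        "supercop_version": version,
    }


def spread(rows):
    """Ratio between the largest and the smallest time among parsed rows."""
    times = [r["time"] for r in rows]
    return max(times) / min(times)
```

Applied to the first and last rows of the table, `spread` gives roughly 2.28: the slowest listed option set (`-march=barcelona -O2`) takes a little more than twice as long as the fastest (`-march=native -mtune=native -Os`).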

Compiler output

Implementation: crypto_sign/dilithium4aes/avx2
Compiler: gcc -funroll-loops -m32 -O2 -fomit-frame-pointer
KeccakP-1600-times4-SIMD256.c: KeccakP-1600-times4-SIMD256.c: In function 'KeccakP1600times4_AddLanesAll':
KeccakP-1600-times4-SIMD256.c: KeccakP-1600-times4-SIMD256.c:135:40: warning: AVX vector return without AVX enabled changes the ABI [-Wpsabi]
KeccakP-1600-times4-SIMD256.c: #define Xor_In4( argIndex ) lanes0 = LOAD256u( curData0[argIndex]),\
KeccakP-1600-times4-SIMD256.c: ^
KeccakP-1600-times4-SIMD256.c: KeccakP-1600-times4-SIMD256.c:146:9: note: in expansion of macro 'Xor_In4'
KeccakP-1600-times4-SIMD256.c: Xor_In4( 0 );
KeccakP-1600-times4-SIMD256.c: ^~~~~~~
KeccakP-1600-times4-SIMD256.c: In file included from /usr/lib/gcc/x86_64-linux-gnu/7/include/immintrin.h:43:0,
KeccakP-1600-times4-SIMD256.c: from KeccakP-1600-times4-SIMD256.c:21:
KeccakP-1600-times4-SIMD256.c: /usr/lib/gcc/x86_64-linux-gnu/7/include/avx2intrin.h:913:1: error: inlining failed in call to always_inline '_mm256_xor_si256': target specific option mismatch
KeccakP-1600-times4-SIMD256.c: _mm256_xor_si256 (__m256i __A, __m256i __B)
KeccakP-1600-times4-SIMD256.c: ^~~~~~~~~~~~~~~~
KeccakP-1600-times4-SIMD256.c: KeccakP-1600-times4-SIMD256.c:55:41: note: called from here
KeccakP-1600-times4-SIMD256.c: #define XOReq256(a, b) a = _mm256_xor_si256(a, b)
KeccakP-1600-times4-SIMD256.c: ^~~~~~~~~~~~~~~~~~~~~~
KeccakP-1600-times4-SIMD256.c: KeccakP-1600-times4-SIMD256.c:143:33: note: in expansion of macro 'XOReq256'
KeccakP-1600-times4-SIMD256.c: XOReq256( stateAsLanes[argIndex+3], lanes3 )
KeccakP-1600-times4-SIMD256.c: ^~~~~~~~
KeccakP-1600-times4-SIMD256.c: KeccakP-1600-times4-SIMD256.c:149:9: note: in expansion of macro 'Xor_In4'
KeccakP-1600-times4-SIMD256.c: Xor_In4( 12 );
KeccakP-1600-times4-SIMD256.c: ^~~~~~~
KeccakP-1600-times4-SIMD256.c: In file included from /usr/lib/gcc/x86_64-linux-gnu/7/include/immintrin.h:43:0,
KeccakP-1600-times4-SIMD256.c: from KeccakP-1600-times4-SIMD256.c:21:
KeccakP-1600-times4-SIMD256.c: /usr/lib/gcc/x86_64-linux-gnu/7/include/avx2intrin.h:913:1: error: inlining failed in call to always_inline '_mm256_xor_si256': target specific option mismatch
KeccakP-1600-times4-SIMD256.c: _mm256_xor_si256 (__m256i __A, __m256i __B)
KeccakP-1600-times4-SIMD256.c: ...

Number of similar (compiler,implementation) pairs: 156, namely:
Compiler Implementations
gcc -funroll-loops -m32 -O2 -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -O3 -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -O -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -Os -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=athlon -O2 -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=athlon -O3 -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=athlon -O -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=athlon -Os -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=barcelona -O2 -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=barcelona -O3 -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=barcelona -O -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=barcelona -Os -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=i386 -O2 -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=i386 -O3 -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=i386 -O -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=i386 -Os -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=i486 -O2 -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=i486 -O3 -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=i486 -O -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=i486 -Os -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=k6-2 -O2 -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=k6-2 -O3 -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=k6-2 -O -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=k6-2 -Os -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=k6-3 -O2 -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=k6-3 -O3 -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=k6-3 -O -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=k6-3 -Os -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=k6 -O2 -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=k6 -O3 -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=k6 -O -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=k6 -Os -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=k8 -O2 -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=k8 -O3 -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=k8 -O -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=k8 -Os -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=nocona -O2 -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=nocona -O3 -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=nocona -O -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=nocona -Os -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=pentium-m -O2 -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=pentium-m -O3 -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=pentium-m -O -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=pentium-m -Os -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=pentium-mmx -O2 -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=pentium-mmx -O3 -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=pentium-mmx -O -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=pentium-mmx -Os -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=pentium2 -O2 -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=pentium2 -O3 -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=pentium2 -O -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=pentium2 -Os -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=pentium3 -O2 -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=pentium3 -O3 -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=pentium3 -O -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=pentium3 -Os -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=pentium4 -O2 -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=pentium4 -O3 -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=pentium4 -O -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=pentium4 -Os -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=pentium -O2 -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=pentium -O3 -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=pentium -O -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=pentium -Os -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=pentiumpro -O2 -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=pentiumpro -O3 -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=pentiumpro -O -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=pentiumpro -Os -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=prescott -O2 -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=prescott -O3 -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=prescott -O -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=prescott -Os -fomit-frame-pointer avx2
gcc -m32 -O2 -fomit-frame-pointer avx2
gcc -m32 -O3 -fomit-frame-pointer avx2
gcc -m32 -O -fomit-frame-pointer avx2
gcc -m32 -Os -fomit-frame-pointer avx2
gcc -m32 -march=athlon -O2 -fomit-frame-pointer avx2
gcc -m32 -march=athlon -O3 -fomit-frame-pointer avx2
gcc -m32 -march=athlon -O -fomit-frame-pointer avx2
gcc -m32 -march=athlon -Os -fomit-frame-pointer avx2
gcc -m32 -march=core2 -O2 -fomit-frame-pointer avx2
gcc -m32 -march=core2 -O3 -fomit-frame-pointer avx2
gcc -m32 -march=core2 -O -fomit-frame-pointer avx2
gcc -m32 -march=core2 -Os -fomit-frame-pointer avx2
gcc -m32 -march=core2 -msse4.1 -O2 -fomit-frame-pointer avx2
gcc -m32 -march=core2 -msse4.1 -O3 -fomit-frame-pointer avx2
gcc -m32 -march=core2 -msse4.1 -O -fomit-frame-pointer avx2
gcc -m32 -march=core2 -msse4.1 -Os -fomit-frame-pointer avx2
gcc -m32 -march=core2 -msse4 -O2 -fomit-frame-pointer avx2
gcc -m32 -march=core2 -msse4 -O3 -fomit-frame-pointer avx2
gcc -m32 -march=core2 -msse4 -O -fomit-frame-pointer avx2
gcc -m32 -march=core2 -msse4 -Os -fomit-frame-pointer avx2
gcc -m32 -march=corei7 -O2 -fomit-frame-pointer avx2
gcc -m32 -march=corei7 -O3 -fomit-frame-pointer avx2
gcc -m32 -march=corei7 -O -fomit-frame-pointer avx2
gcc -m32 -march=corei7 -Os -fomit-frame-pointer avx2
gcc -m32 -march=i386 -O2 -fomit-frame-pointer avx2
gcc -m32 -march=i386 -O3 -fomit-frame-pointer avx2
gcc -m32 -march=i386 -O -fomit-frame-pointer avx2
gcc -m32 -march=i386 -Os -fomit-frame-pointer avx2
gcc -m32 -march=i486 -O2 -fomit-frame-pointer avx2
gcc -m32 -march=i486 -O3 -fomit-frame-pointer avx2
gcc -m32 -march=i486 -O -fomit-frame-pointer avx2
gcc -m32 -march=i486 -Os -fomit-frame-pointer avx2
gcc -m32 -march=k6-2 -O2 -fomit-frame-pointer avx2
gcc -m32 -march=k6-2 -O3 -fomit-frame-pointer avx2
gcc -m32 -march=k6-2 -O -fomit-frame-pointer avx2
gcc -m32 -march=k6-2 -Os -fomit-frame-pointer avx2
gcc -m32 -march=k6-3 -O2 -fomit-frame-pointer avx2
gcc -m32 -march=k6-3 -O3 -fomit-frame-pointer avx2
gcc -m32 -march=k6-3 -O -fomit-frame-pointer avx2
gcc -m32 -march=k6-3 -Os -fomit-frame-pointer avx2
gcc -m32 -march=k6 -O2 -fomit-frame-pointer avx2
gcc -m32 -march=k6 -O3 -fomit-frame-pointer avx2
gcc -m32 -march=k6 -O -fomit-frame-pointer avx2
gcc -m32 -march=k6 -Os -fomit-frame-pointer avx2
gcc -m32 -march=k8 -O2 -fomit-frame-pointer avx2
gcc -m32 -march=k8 -O3 -fomit-frame-pointer avx2
gcc -m32 -march=k8 -O -fomit-frame-pointer avx2
gcc -m32 -march=k8 -Os -fomit-frame-pointer avx2
gcc -m32 -march=nocona -O2 -fomit-frame-pointer avx2
gcc -m32 -march=nocona -O3 -fomit-frame-pointer avx2
gcc -m32 -march=nocona -O -fomit-frame-pointer avx2
gcc -m32 -march=nocona -Os -fomit-frame-pointer avx2
gcc -m32 -march=pentium-m -O2 -fomit-frame-pointer avx2
gcc -m32 -march=pentium-m -O3 -fomit-frame-pointer avx2
gcc -m32 -march=pentium-m -O -fomit-frame-pointer avx2
gcc -m32 -march=pentium-m -Os -fomit-frame-pointer avx2
gcc -m32 -march=pentium-mmx -O2 -fomit-frame-pointer avx2
gcc -m32 -march=pentium-mmx -O3 -fomit-frame-pointer avx2
gcc -m32 -march=pentium-mmx -O -fomit-frame-pointer avx2
gcc -m32 -march=pentium-mmx -Os -fomit-frame-pointer avx2
gcc -m32 -march=pentium2 -O2 -fomit-frame-pointer avx2
gcc -m32 -march=pentium2 -O3 -fomit-frame-pointer avx2
gcc -m32 -march=pentium2 -O -fomit-frame-pointer avx2
gcc -m32 -march=pentium2 -Os -fomit-frame-pointer avx2
gcc -m32 -march=pentium3 -O2 -fomit-frame-pointer avx2
gcc -m32 -march=pentium3 -O3 -fomit-frame-pointer avx2
gcc -m32 -march=pentium3 -O -fomit-frame-pointer avx2
gcc -m32 -march=pentium3 -Os -fomit-frame-pointer avx2
gcc -m32 -march=pentium4 -O2 -fomit-frame-pointer avx2
gcc -m32 -march=pentium4 -O3 -fomit-frame-pointer avx2
gcc -m32 -march=pentium4 -O -fomit-frame-pointer avx2
gcc -m32 -march=pentium4 -Os -fomit-frame-pointer avx2
gcc -m32 -march=pentium -O2 -fomit-frame-pointer avx2
gcc -m32 -march=pentium -O3 -fomit-frame-pointer avx2
gcc -m32 -march=pentium -O -fomit-frame-pointer avx2
gcc -m32 -march=pentium -Os -fomit-frame-pointer avx2
gcc -m32 -march=pentiumpro -O2 -fomit-frame-pointer avx2
gcc -m32 -march=pentiumpro -O3 -fomit-frame-pointer avx2
gcc -m32 -march=pentiumpro -O -fomit-frame-pointer avx2
gcc -m32 -march=pentiumpro -Os -fomit-frame-pointer avx2
gcc -m32 -march=prescott -O2 -fomit-frame-pointer avx2
gcc -m32 -march=prescott -O3 -fomit-frame-pointer avx2
gcc -m32 -march=prescott -O -fomit-frame-pointer avx2
gcc -m32 -march=prescott -Os -fomit-frame-pointer avx2

Compiler output

Implementation: crypto_sign/dilithium4aes/avx2
Compiler: gcc -m32 -march=barcelona -O2 -fomit-frame-pointer
KeccakP-1600-times4-SIMD256.c: KeccakP-1600-times4-SIMD256.c: In function 'KeccakP1600times4_AddLanesAll':
KeccakP-1600-times4-SIMD256.c: KeccakP-1600-times4-SIMD256.c:135:40: warning: AVX vector return without AVX enabled changes the ABI [-Wpsabi]
KeccakP-1600-times4-SIMD256.c: #define Xor_In4( argIndex ) lanes0 = LOAD256u( curData0[argIndex]),\
KeccakP-1600-times4-SIMD256.c: ^
KeccakP-1600-times4-SIMD256.c: KeccakP-1600-times4-SIMD256.c:146:9: note: in expansion of macro 'Xor_In4'
KeccakP-1600-times4-SIMD256.c: Xor_In4( 0 );
KeccakP-1600-times4-SIMD256.c: ^~~~~~~
KeccakP-1600-times4-SIMD256.c: In file included from /usr/lib/gcc/x86_64-linux-gnu/7/include/immintrin.h:43:0,
KeccakP-1600-times4-SIMD256.c: from KeccakP-1600-times4-SIMD256.c:21:
KeccakP-1600-times4-SIMD256.c: /usr/lib/gcc/x86_64-linux-gnu/7/include/avx2intrin.h:913:1: error: inlining failed in call to always_inline '_mm256_xor_si256': target specific option mismatch
KeccakP-1600-times4-SIMD256.c: _mm256_xor_si256 (__m256i __A, __m256i __B)
KeccakP-1600-times4-SIMD256.c: ^~~~~~~~~~~~~~~~
KeccakP-1600-times4-SIMD256.c: KeccakP-1600-times4-SIMD256.c:55:41: note: called from here
KeccakP-1600-times4-SIMD256.c: #define XOReq256(a, b) a = _mm256_xor_si256(a, b)
KeccakP-1600-times4-SIMD256.c: ^~~~~~~~~~~~~~~~~~~~~~
KeccakP-1600-times4-SIMD256.c: KeccakP-1600-times4-SIMD256.c:143:33: note: in expansion of macro 'XOReq256'
KeccakP-1600-times4-SIMD256.c: XOReq256( stateAsLanes[argIndex+3], lanes3 )
KeccakP-1600-times4-SIMD256.c: ^~~~~~~~
KeccakP-1600-times4-SIMD256.c: KeccakP-1600-times4-SIMD256.c:149:9: note: in expansion of macro 'Xor_In4'
KeccakP-1600-times4-SIMD256.c: Xor_In4( 12 );
KeccakP-1600-times4-SIMD256.c: ^~~~~~~
KeccakP-1600-times4-SIMD256.c: In file included from /usr/lib/gcc/x86_64-linux-gnu/7/include/immintrin.h:43:0,
KeccakP-1600-times4-SIMD256.c: from KeccakP-1600-times4-SIMD256.c:21:
KeccakP-1600-times4-SIMD256.c: /usr/lib/gcc/x86_64-linux-gnu/7/include/avx2intrin.h:913:1: error: inlining failed in call to always_inline '_mm256_xor_si256': target specific option mismatch
KeccakP-1600-times4-SIMD256.c: _mm256_xor_si256 (__m256i __A, __m256i __B)
KeccakP-1600-times4-SIMD256.c: ...
KeccakP-1600-times4-SIMD256.c: KeccakP-1600-times4-SIMD256.c: In function 'KeccakP1600times4_AddLanesAll':
KeccakP-1600-times4-SIMD256.c: KeccakP-1600-times4-SIMD256.c:135:40: warning: AVX vector return without AVX enabled changes the ABI [-Wpsabi]
KeccakP-1600-times4-SIMD256.c: #define Xor_In4( argIndex ) lanes0 = LOAD256u( curData0[argIndex]),\
KeccakP-1600-times4-SIMD256.c: ^
KeccakP-1600-times4-SIMD256.c: KeccakP-1600-times4-SIMD256.c:146:9: note: in expansion of macro 'Xor_In4'
KeccakP-1600-times4-SIMD256.c: Xor_In4( 0 );
KeccakP-1600-times4-SIMD256.c: ^~~~~~~
KeccakP-1600-times4-SIMD256.c: In file included from /usr/lib/gcc/x86_64-linux-gnu/7/include/immintrin.h:43:0,
KeccakP-1600-times4-SIMD256.c: from KeccakP-1600-times4-SIMD256.c:21:
KeccakP-1600-times4-SIMD256.c: /usr/lib/gcc/x86_64-linux-gnu/7/include/avx2intrin.h:913:1: error: inlining failed in call to always_inline '_mm256_xor_si256': target specific option mismatch
KeccakP-1600-times4-SIMD256.c: _mm256_xor_si256 (__m256i __A, __m256i __B)
KeccakP-1600-times4-SIMD256.c: ^~~~~~~~~~~~~~~~
KeccakP-1600-times4-SIMD256.c: KeccakP-1600-times4-SIMD256.c:55:41: note: called from here
KeccakP-1600-times4-SIMD256.c: #define XOReq256(a, b) a = _mm256_xor_si256(a, b)
KeccakP-1600-times4-SIMD256.c: ^~~~~~~~~~~~~~~~~~~~~~
KeccakP-1600-times4-SIMD256.c: KeccakP-1600-times4-SIMD256.c:143:33: note: in expansion of macro 'XOReq256'
KeccakP-1600-times4-SIMD256.c: XOReq256( stateAsLanes[argIndex+3], lanes3 )
KeccakP-1600-times4-SIMD256.c: ^~~~~~~~
KeccakP-1600-times4-SIMD256.c: KeccakP-1600-times4-SIMD256.c:149:9: note: in expansion of macro 'Xor_In4'
KeccakP-1600-times4-SIMD256.c: Xor_In4( 12 );
KeccakP-1600-times4-SIMD256.c: ^~~~~~~
KeccakP-1600-times4-SIMD256.c: In file included from /usr/lib/gcc/x86_64-linux-gnu/7/include/immintrin.h:43:0,
KeccakP-1600-times4-SIMD256.c: from KeccakP-1600-times4-SIMD256.c:21:
KeccakP-1600-times4-SIMD256.c: /usr/lib/gcc/x86_64-linux-gnu/7/include/avx2intrin.h:913:1: error: inlining failed in call to always_inline '_mm256_xor_si256': target specific option mismatch
KeccakP-1600-times4-SIMD256.c: _mm256_xor_si256 (__m256i __A, __m256i __B)
KeccakP-1600-times4-SIMD256.c: ...

Number of similar (compiler,implementation) pairs: 4, namely:
Compiler Implementations
gcc -m32 -march=barcelona -O2 -fomit-frame-pointer avx2
gcc -m32 -march=barcelona -O3 -fomit-frame-pointer avx2
gcc -m32 -march=barcelona -O -fomit-frame-pointer avx2
gcc -m32 -march=barcelona -Os -fomit-frame-pointer avx2

Compiler output

Implementation: crypto_sign/dilithium4aes/avx2
Compiler: gcc -m32 -march=core-avx-i -O2 -fomit-frame-pointer
KeccakP-1600-times4-SIMD256.c: In file included from /usr/lib/gcc/x86_64-linux-gnu/7/include/immintrin.h:43:0,
KeccakP-1600-times4-SIMD256.c: from KeccakP-1600-times4-SIMD256.c:21:
KeccakP-1600-times4-SIMD256.c: KeccakP-1600-times4-SIMD256.c: In function 'KeccakP1600times4_AddLanesAll':
KeccakP-1600-times4-SIMD256.c: /usr/lib/gcc/x86_64-linux-gnu/7/include/avx2intrin.h:913:1: error: inlining failed in call to always_inline '_mm256_xor_si256': target specific option mismatch
KeccakP-1600-times4-SIMD256.c: _mm256_xor_si256 (__m256i __A, __m256i __B)
KeccakP-1600-times4-SIMD256.c: ^~~~~~~~~~~~~~~~
KeccakP-1600-times4-SIMD256.c: KeccakP-1600-times4-SIMD256.c:55:41: note: called from here
KeccakP-1600-times4-SIMD256.c: #define XOReq256(a, b) a = _mm256_xor_si256(a, b)
KeccakP-1600-times4-SIMD256.c: ^~~~~~~~~~~~~~~~~~~~~~
KeccakP-1600-times4-SIMD256.c: KeccakP-1600-times4-SIMD256.c:143:33: note: in expansion of macro 'XOReq256'
KeccakP-1600-times4-SIMD256.c: XOReq256( stateAsLanes[argIndex+3], lanes3 )
KeccakP-1600-times4-SIMD256.c: ^~~~~~~~
KeccakP-1600-times4-SIMD256.c: KeccakP-1600-times4-SIMD256.c:149:9: note: in expansion of macro 'Xor_In4'
KeccakP-1600-times4-SIMD256.c: Xor_In4( 12 );
KeccakP-1600-times4-SIMD256.c: ^~~~~~~
KeccakP-1600-times4-SIMD256.c: In file included from /usr/lib/gcc/x86_64-linux-gnu/7/include/immintrin.h:43:0,
KeccakP-1600-times4-SIMD256.c: from KeccakP-1600-times4-SIMD256.c:21:
KeccakP-1600-times4-SIMD256.c: /usr/lib/gcc/x86_64-linux-gnu/7/include/avx2intrin.h:913:1: error: inlining failed in call to always_inline '_mm256_xor_si256': target specific option mismatch
KeccakP-1600-times4-SIMD256.c: _mm256_xor_si256 (__m256i __A, __m256i __B)
KeccakP-1600-times4-SIMD256.c: ^~~~~~~~~~~~~~~~
KeccakP-1600-times4-SIMD256.c: KeccakP-1600-times4-SIMD256.c:55:41: note: called from here
KeccakP-1600-times4-SIMD256.c: #define XOReq256(a, b) a = _mm256_xor_si256(a, b)
KeccakP-1600-times4-SIMD256.c: ^~~~~~~~~~~~~~~~~~~~~~
KeccakP-1600-times4-SIMD256.c: KeccakP-1600-times4-SIMD256.c:142:33: note: in expansion of macro 'XOReq256'
KeccakP-1600-times4-SIMD256.c: XOReq256( stateAsLanes[argIndex+2], lanes2 ),\
KeccakP-1600-times4-SIMD256.c: ...

Number of similar (compiler,implementation) pairs: 8, namely:
Compiler Implementations
gcc -m32 -march=core-avx-i -O2 -fomit-frame-pointer avx2
gcc -m32 -march=core-avx-i -O3 -fomit-frame-pointer avx2
gcc -m32 -march=core-avx-i -O -fomit-frame-pointer avx2
gcc -m32 -march=core-avx-i -Os -fomit-frame-pointer avx2
gcc -m32 -march=corei7-avx -O2 -fomit-frame-pointer avx2
gcc -m32 -march=corei7-avx -O3 -fomit-frame-pointer avx2
gcc -m32 -march=corei7-avx -O -fomit-frame-pointer avx2
gcc -m32 -march=corei7-avx -Os -fomit-frame-pointer avx2

Compiler output

Implementation: crypto_sign/dilithium4aes/avx2
Compiler: gcc -m32 -march=core-avx2 -O2 -fomit-frame-pointer
invntt.s: invntt.s: Assembler messages:
invntt.s: invntt.s:47: Error: bad register name `%rip)'
invntt.s: invntt.s:48: Error: bad register name `%rip)'
invntt.s: invntt.s:49: Error: bad register name `%rip)'
invntt.s: invntt.s:52: Error: bad register name `%rsi)'
invntt.s: invntt.s:53: Error: bad register name `%rsi)'
invntt.s: invntt.s:54: Error: bad register name `%rsi)'
invntt.s: invntt.s:55: Error: bad register name `%rsi)'
invntt.s: invntt.s:58: Error: bad register name `%ymm8'
invntt.s: invntt.s:59: Error: bad register name `%ymm10'
invntt.s: invntt.s:59: Error: bad register name `%ymm10'
invntt.s: invntt.s:61: Error: bad register name `%ymm8'
invntt.s: invntt.s:61: Error: bad register name `%ymm8'
invntt.s: invntt.s:62: Error: bad register name `%ymm10'
invntt.s: invntt.s:62: Error: bad register name `%ymm10'
invntt.s: invntt.s:66: Error: bad register name `%ymm8'
invntt.s: invntt.s:67: Error: bad register name `%ymm10'
invntt.s: invntt.s:70: Error: bad register name `%rdx)'
invntt.s: invntt.s:71: Error: bad register name `%rdx)'
invntt.s: invntt.s:72: Error: bad register name `%ymm12'
invntt.s: invntt.s:73: Error: bad register name `%ymm13'
invntt.s: invntt.s:74: Error: bad register name `%ymm8'
invntt.s: invntt.s:76: Error: bad register name `%ymm12'
invntt.s: invntt.s:77: Error: bad register name `%ymm13'
invntt.s: invntt.s:78: Error: bad register name `%ymm9'
invntt.s: ...

Number of similar (compiler,implementation) pairs: 8, namely:
Compiler Implementations
gcc -m32 -march=core-avx2 -O2 -fomit-frame-pointer avx2
gcc -m32 -march=core-avx2 -O3 -fomit-frame-pointer avx2
gcc -m32 -march=core-avx2 -O -fomit-frame-pointer avx2
gcc -m32 -march=core-avx2 -Os -fomit-frame-pointer avx2
gcc -m32 -march=native -mtune=native -O2 -fomit-frame-pointer avx2
gcc -m32 -march=native -mtune=native -O3 -fomit-frame-pointer avx2
gcc -m32 -march=native -mtune=native -O -fomit-frame-pointer avx2
gcc -m32 -march=native -mtune=native -Os -fomit-frame-pointer avx2
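
The registers rejected in the invntt.s assembler messages above (`%rip`, `%rsi`, `%rdx`, `%ymm8` and higher) all exist only in 64-bit mode, which is consistent with x86-64 assembly being assembled under `-m32`. The following Python sketch collects the rejected register names from messages of that form and marks which of them need 64-bit mode. The function names are hypothetical, and the pattern is written only against the message format shown in this log.

```python
import re

# Matches GNU assembler messages of the form shown in the log, e.g.
#   invntt.s:47: Error: bad register name `%rip)'
BAD_REG = re.compile(
    r"(?P<file>\S+):(?P<line>\d+): Error: bad register name `%(?P<reg>[a-z0-9]+)"
)


def needs_64bit_mode(reg):
    """Return True if the register is available only in x86-64 mode."""
    # 64-bit general-purpose registers and the instruction pointer
    # (rax, rsi, r8, rip, ...) all start with "r"; no 32-bit register does.
    if reg.startswith("r"):
        return True
    # Vector registers numbered 8 to 15 are addressable only in 64-bit mode.
    m = re.fullmatch(r"[xy]mm(\d+)", reg)
    return bool(m) and int(m.group(1)) >= 8


def rejected_registers(log_lines):
    """Map each rejected register name to whether it needs 64-bit mode."""
    regs = set()
    for line in log_lines:
        m = BAD_REG.search(line)
        if m:
            regs.add(m.group("reg"))
    return {reg: needs_64bit_mode(reg) for reg in sorted(regs)}
```

Run over the messages listed for `gcc -m32 -march=core-avx2`, every register in the result maps to `True`, so under these assumptions none of the reported errors would be expected in a 64-bit build.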