Implementation notes: x86, samba, crypto_sign/dilithium3aes

Computer: samba
Architecture: x86
CPU ID: GenuineIntel-000506e3-bfebfbff
SUPERCOP version: 20190803
Operation: crypto_sign
Primitive: dilithium3aes
Time  Implementation  Compiler  Benchmark date  SUPERCOP version
7260506  ref  gcc -funroll-loops -m32 -march=athlon -O3 -fomit-frame-pointer  20190805  20190803
7514460  ref  gcc -funroll-loops -m32 -march=athlon -O2 -fomit-frame-pointer  20190805  20190803
7521123  ref  gcc -funroll-loops -m32 -O3 -fomit-frame-pointer  20190805  20190803
7562242  ref  gcc -funroll-loops -m32 -O2 -fomit-frame-pointer  20190805  20190803
7575195  ref  gcc -m32 -march=core-avx2 -Os -fomit-frame-pointer  20190805  20190803
7584065  ref  gcc -funroll-loops -m32 -march=pentium2 -O -fomit-frame-pointer  20190805  20190803
7585226  ref  gcc -funroll-loops -m32 -march=athlon -O -fomit-frame-pointer  20190805  20190803
7600927  ref  gcc -m32 -march=athlon -O2 -fomit-frame-pointer  20190805  20190803
7604391  ref  gcc -m32 -march=athlon -O3 -fomit-frame-pointer  20190805  20190803
7614421  ref  gcc -funroll-loops -m32 -march=pentium3 -O -fomit-frame-pointer  20190805  20190803
7614429  ref  gcc -m32 -march=native -mtune=native -Os -fomit-frame-pointer  20190805  20190803
7628254  ref  gcc -funroll-loops -m32 -march=pentiumpro -O -fomit-frame-pointer  20190805  20190803
7629399  ref  gcc -funroll-loops -m32 -march=k6 -O3 -fomit-frame-pointer  20190805  20190803
7645167  ref  gcc -funroll-loops -m32 -march=pentium3 -O3 -fomit-frame-pointer  20190805  20190803
7670316  ref  gcc -funroll-loops -m32 -march=k6-3 -O3 -fomit-frame-pointer  20190805  20190803
7712519  ref  gcc -funroll-loops -m32 -march=pentiumpro -O3 -fomit-frame-pointer  20190805  20190803
7724168  ref  gcc -funroll-loops -m32 -march=k6 -O2 -fomit-frame-pointer  20190805  20190803
7729899  ref  gcc -funroll-loops -m32 -march=pentium3 -O2 -fomit-frame-pointer  20190805  20190803
7735129  ref  gcc -m32 -march=corei7-avx -Os -fomit-frame-pointer  20190805  20190803
7742143  ref  gcc -funroll-loops -m32 -march=pentium2 -O3 -fomit-frame-pointer  20190805  20190803
7749555  ref  gcc -m32 -march=pentium2 -O -fomit-frame-pointer  20190805  20190803
7761909  ref  gcc -funroll-loops -m32 -march=k6-2 -O3 -fomit-frame-pointer  20190805  20190803
7762416  ref  gcc -m32 -march=core-avx-i -Os -fomit-frame-pointer  20190805  20190803
7763480  ref  gcc -funroll-loops -m32 -march=pentiumpro -O2 -fomit-frame-pointer  20190805  20190803
7767054  ref  gcc -funroll-loops -m32 -march=pentium-m -Os -fomit-frame-pointer  20190805  20190803
7792786  ref  gcc -m32 -march=athlon -O -fomit-frame-pointer  20190805  20190803
7797113  ref  gcc -m32 -march=pentium3 -O -fomit-frame-pointer  20190805  20190803
7805152  ref  gcc -funroll-loops -m32 -march=k6-2 -O2 -fomit-frame-pointer  20190805  20190803
7813391  ref  gcc -m32 -march=pentiumpro -O -fomit-frame-pointer  20190805  20190803
7815853  ref  gcc -funroll-loops -m32 -O -fomit-frame-pointer  20190805  20190803
7816538  ref  gcc -funroll-loops -m32 -march=k6-3 -O2 -fomit-frame-pointer  20190805  20190803
7877573  ref  gcc -funroll-loops -m32 -march=pentium2 -O2 -fomit-frame-pointer  20190805  20190803
7881163  ref  gcc -m32 -march=k6-2 -O3 -fomit-frame-pointer  20190805  20190803
7892752  ref  gcc -m32 -march=k6-3 -O3 -fomit-frame-pointer  20190805  20190803
7911881  ref  gcc -funroll-loops -m32 -march=k6-2 -O -fomit-frame-pointer  20190805  20190803
7923843  ref  gcc -funroll-loops -m32 -march=k6 -O -fomit-frame-pointer  20190805  20190803
7944145  ref  gcc -m32 -O3 -fomit-frame-pointer  20190805  20190803
7959496  ref  gcc -funroll-loops -m32 -march=k6-3 -O -fomit-frame-pointer  20190805  20190803
7965919  ref  gcc -m32 -march=pentiumpro -O3 -fomit-frame-pointer  20190805  20190803
7981976  ref  gcc -m32 -march=pentium3 -O3 -fomit-frame-pointer  20190805  20190803
7986740  ref  gcc -m32 -march=core2 -Os -fomit-frame-pointer  20190805  20190803
7988627  ref  gcc -m32 -march=core2 -msse4.1 -Os -fomit-frame-pointer  20190805  20190803
8000992  ref  gcc -m32 -march=corei7 -Os -fomit-frame-pointer  20190805  20190803
8015171  ref  gcc -m32 -march=core2 -msse4 -Os -fomit-frame-pointer  20190805  20190803
8037836  ref  gcc -funroll-loops -m32 -march=pentium4 -Os -fomit-frame-pointer  20190805  20190803
8047637  ref  gcc -funroll-loops -m32 -march=nocona -Os -fomit-frame-pointer  20190805  20190803
8053401  ref  gcc -m32 -march=pentium2 -O3 -fomit-frame-pointer  20190805  20190803
8081002  ref  gcc -m32 -march=k6 -O3 -fomit-frame-pointer  20190805  20190803
8091502  ref  gcc -m32 -march=pentium-m -Os -fomit-frame-pointer  20190805  20190803
8101173  ref  gcc -m32 -march=pentiumpro -O2 -fomit-frame-pointer  20190805  20190803
8105710  ref  gcc -m32 -O -fomit-frame-pointer  20190805  20190803
8123932  ref  gcc -m32 -march=k6 -O2 -fomit-frame-pointer  20190805  20190803
8124511  ref  gcc -m32 -march=prescott -Os -fomit-frame-pointer  20190805  20190803
8127530  ref  gcc -m32 -O2 -fomit-frame-pointer  20190805  20190803
8131032  ref  gcc -funroll-loops -m32 -march=prescott -Os -fomit-frame-pointer  20190805  20190803
8139556  ref  gcc -m32 -march=pentium4 -Os -fomit-frame-pointer  20190805  20190803
8148167  ref  gcc -m32 -march=k6 -O -fomit-frame-pointer  20190805  20190803
8153663  ref  gcc -m32 -march=k6-2 -O2 -fomit-frame-pointer  20190805  20190803
8155308  ref  gcc -m32 -march=k6-3 -O -fomit-frame-pointer  20190805  20190803
8163405  ref  gcc -m32 -march=k6-2 -O -fomit-frame-pointer  20190805  20190803
8169423  ref  gcc -m32 -march=pentium2 -O2 -fomit-frame-pointer  20190805  20190803
8173240  ref  gcc -m32 -march=pentium3 -O2 -fomit-frame-pointer  20190805  20190803
8183059  ref  gcc -m32 -march=nocona -Os -fomit-frame-pointer  20190805  20190803
8220101  ref  gcc -m32 -march=k6-3 -O2 -fomit-frame-pointer  20190805  20190803
8290595  ref  gcc -m32 -march=k6 -Os -fomit-frame-pointer  20190805  20190803
8292496  ref  gcc -m32 -march=k6-3 -Os -fomit-frame-pointer  20190805  20190803
8293292  ref  gcc -m32 -march=k6-2 -Os -fomit-frame-pointer  20190805  20190803
8300856  ref  gcc -funroll-loops -m32 -march=k6-3 -Os -fomit-frame-pointer  20190805  20190803
8310981  ref  gcc -funroll-loops -m32 -march=k6-2 -Os -fomit-frame-pointer  20190805  20190803
8322941  ref  gcc -funroll-loops -m32 -march=pentium-m -O -fomit-frame-pointer  20190805  20190803
8375911  ref  gcc -funroll-loops -m32 -march=pentium -Os -fomit-frame-pointer  20190805  20190803
8376528  ref  gcc -funroll-loops -m32 -march=k6 -Os -fomit-frame-pointer  20190805  20190803
8378068  ref  gcc -funroll-loops -m32 -march=pentium-mmx -Os -fomit-frame-pointer  20190805  20190803
8390298  ref  gcc -funroll-loops -m32 -march=k8 -O -fomit-frame-pointer  20190805  20190803
8393749  ref  gcc -m32 -march=core2 -O3 -fomit-frame-pointer  20190805  20190803
8401815  ref  gcc -m32 -march=core2 -msse4 -O3 -fomit-frame-pointer  20190805  20190803
8413568  ref  gcc -funroll-loops -m32 -march=pentium2 -Os -fomit-frame-pointer  20190805  20190803
8415313  ref  gcc -funroll-loops -m32 -march=pentium3 -Os -fomit-frame-pointer  20190805  20190803
8439383  ref  gcc -m32 -march=core2 -msse4.1 -O3 -fomit-frame-pointer  20190805  20190803
8440633  ref  gcc -funroll-loops -m32 -march=pentiumpro -Os -fomit-frame-pointer  20190805  20190803
8448470  ref  gcc -m32 -march=pentium-mmx -Os -fomit-frame-pointer  20190805  20190803
8449208  ref  gcc -m32 -march=pentium -Os -fomit-frame-pointer  20190805  20190803
8453187  ref  gcc -m32 -march=k8 -O -fomit-frame-pointer  20190805  20190803
8459058  ref  gcc -m32 -march=athlon -Os -fomit-frame-pointer  20190805  20190803
8460369  ref  gcc -funroll-loops -m32 -march=barcelona -O -fomit-frame-pointer  20190805  20190803
8467777  ref  gcc -m32 -march=pentium-m -O -fomit-frame-pointer  20190805  20190803
8473889  ref  gcc -m32 -Os -fomit-frame-pointer  20190805  20190803
8520266  ref  gcc -funroll-loops -m32 -march=pentium4 -O -fomit-frame-pointer  20190805  20190803
8531041  ref  gcc -m32 -march=pentium4 -O -fomit-frame-pointer  20190805  20190803
8536465  ref  gcc -funroll-loops -m32 -march=i486 -O3 -fomit-frame-pointer  20190805  20190803
8539703  ref  gcc -m32 -march=corei7 -O3 -fomit-frame-pointer  20190805  20190803
8547532  ref  gcc -funroll-loops -m32 -march=athlon -Os -fomit-frame-pointer  20190805  20190803
8555514  ref  gcc -funroll-loops -m32 -Os -fomit-frame-pointer  20190805  20190803
8563068  ref  gcc -funroll-loops -m32 -march=pentium-mmx -O -fomit-frame-pointer  20190805  20190803
8580867  ref  gcc -funroll-loops -m32 -march=pentium -O -fomit-frame-pointer  20190805  20190803
8581167  ref  gcc -funroll-loops -m32 -march=i486 -O -fomit-frame-pointer  20190805  20190803
8581488  ref  gcc -m32 -march=pentiumpro -Os -fomit-frame-pointer  20190805  20190803
8584134  ref  gcc -m32 -march=pentium2 -Os -fomit-frame-pointer  20190805  20190803
8589198  ref  gcc -m32 -march=pentium3 -Os -fomit-frame-pointer  20190805  20190803
8631504  ref  gcc -funroll-loops -m32 -march=i386 -Os -fomit-frame-pointer  20190805  20190803
8634065  ref  gcc -m32 -march=i486 -Os -fomit-frame-pointer  20190805  20190803
8641984  ref  gcc -m32 -march=i386 -Os -fomit-frame-pointer  20190805  20190803
8644044  ref  gcc -funroll-loops -m32 -march=i486 -Os -fomit-frame-pointer  20190805  20190803
8644641  ref  gcc -m32 -march=barcelona -O -fomit-frame-pointer  20190805  20190803
8647851  ref  gcc -m32 -march=core-avx-i -O -fomit-frame-pointer  20190805  20190803
8647921  ref  gcc -m32 -march=core2 -msse4.1 -O -fomit-frame-pointer  20190805  20190803
8651962  ref  gcc -m32 -march=core2 -msse4 -O -fomit-frame-pointer  20190805  20190803
8654683  ref  gcc -m32 -march=core-avx2 -O -fomit-frame-pointer  20190805  20190803
8671900  ref  gcc -m32 -march=core2 -O -fomit-frame-pointer  20190805  20190803
8680113  ref  gcc -funroll-loops -m32 -march=pentium4 -O3 -fomit-frame-pointer  20190805  20190803
8680871  ref  gcc -m32 -march=corei7-avx -O -fomit-frame-pointer  20190805  20190803
8682131  ref  gcc -m32 -march=corei7 -O -fomit-frame-pointer  20190805  20190803
8713733  ref  gcc -m32 -march=corei7 -O2 -fomit-frame-pointer  20190805  20190803
8719309  ref  gcc -m32 -march=native -mtune=native -O -fomit-frame-pointer  20190805  20190803
8730689  ref  gcc -funroll-loops -m32 -march=i486 -O2 -fomit-frame-pointer  20190805  20190803
8758052  ref  gcc -funroll-loops -m32 -march=pentium -O3 -fomit-frame-pointer  20190805  20190803
8764437  ref  gcc -m32 -march=pentium4 -O3 -fomit-frame-pointer  20190805  20190803
8765318  ref  gcc -m32 -march=core2 -msse4.1 -O2 -fomit-frame-pointer  20190805  20190803
8807250  ref  gcc -m32 -march=core2 -msse4 -O2 -fomit-frame-pointer  20190805  20190803
8809345  ref  gcc -m32 -march=i486 -O3 -fomit-frame-pointer  20190805  20190803
8814718  ref  gcc -m32 -march=nocona -O -fomit-frame-pointer  20190805  20190803
8826046  ref  gcc -funroll-loops -m32 -march=i386 -O -fomit-frame-pointer  20190805  20190803
8839361  ref  gcc -funroll-loops -m32 -march=nocona -O -fomit-frame-pointer  20190805  20190803
8841748  ref  gcc -funroll-loops -m32 -march=pentium -O2 -fomit-frame-pointer  20190805  20190803
8843893  ref  gcc -m32 -march=i486 -O -fomit-frame-pointer  20190805  20190803
8856566  ref  gcc -funroll-loops -m32 -march=pentium4 -O2 -fomit-frame-pointer  20190805  20190803
8859460  ref  gcc -m32 -march=core2 -O2 -fomit-frame-pointer  20190805  20190803
8859491  ref  gcc -m32 -march=pentium-mmx -O -fomit-frame-pointer  20190805  20190803
8859971  ref  gcc -m32 -march=prescott -O -fomit-frame-pointer  20190805  20190803
8863887  ref  gcc -funroll-loops -m32 -march=pentium-mmx -O3 -fomit-frame-pointer  20190805  20190803
8880984  ref  gcc -m32 -march=pentium -O -fomit-frame-pointer  20190805  20190803
8898805  ref  gcc -funroll-loops -m32 -march=prescott -O -fomit-frame-pointer  20190805  20190803
8929501  ref  gcc -m32 -march=corei7-avx -O2 -fomit-frame-pointer  20190805  20190803
8948168  ref  gcc -funroll-loops -m32 -march=pentium-mmx -O2 -fomit-frame-pointer  20190805  20190803
8955985  ref  gcc -m32 -march=core-avx2 -O2 -fomit-frame-pointer  20190805  20190803
8957032  ref  gcc -m32 -march=i486 -O2 -fomit-frame-pointer  20190805  20190803
8972447  ref  gcc -m32 -march=native -mtune=native -O2 -fomit-frame-pointer  20190805  20190803
9022792  ref  gcc -m32 -march=core-avx-i -O2 -fomit-frame-pointer  20190805  20190803
9033681  ref  gcc -funroll-loops -m32 -march=prescott -O3 -fomit-frame-pointer  20190805  20190803
9051524  ref  gcc -m32 -march=native -mtune=native -O3 -fomit-frame-pointer  20190805  20190803
9080155  ref  gcc -m32 -march=i386 -O -fomit-frame-pointer  20190805  20190803
9087611  ref  gcc -m32 -march=corei7-avx -O3 -fomit-frame-pointer  20190805  20190803
9122515  ref  gcc -funroll-loops -m32 -march=nocona -O3 -fomit-frame-pointer  20190805  20190803
9178666  ref  gcc -m32 -march=core-avx2 -O3 -fomit-frame-pointer  20190805  20190803
9216703  ref  gcc -m32 -march=core-avx-i -O3 -fomit-frame-pointer  20190805  20190803
9310077  ref  gcc -m32 -march=prescott -O3 -fomit-frame-pointer  20190805  20190803
9312833  ref  gcc -m32 -march=nocona -O3 -fomit-frame-pointer  20190805  20190803
9337076  ref  gcc -m32 -march=pentium -O2 -fomit-frame-pointer  20190805  20190803
9354782  ref  gcc -m32 -march=pentium -O3 -fomit-frame-pointer  20190805  20190803
9358801  ref  gcc -m32 -march=pentium4 -O2 -fomit-frame-pointer  20190805  20190803
9365295  ref  gcc -m32 -march=pentium-mmx -O3 -fomit-frame-pointer  20190805  20190803
9370234  ref  gcc -m32 -march=pentium-mmx -O2 -fomit-frame-pointer  20190805  20190803
9377036  ref  gcc -funroll-loops -m32 -march=i386 -O3 -fomit-frame-pointer  20190805  20190803
9426472  ref  gcc -funroll-loops -m32 -march=i386 -O2 -fomit-frame-pointer  20190805  20190803
9436213  ref  gcc -funroll-loops -m32 -march=prescott -O2 -fomit-frame-pointer  20190805  20190803
9484355  ref  gcc -funroll-loops -m32 -march=nocona -O2 -fomit-frame-pointer  20190805  20190803
9594585  ref  gcc -m32 -march=prescott -O2 -fomit-frame-pointer  20190805  20190803
9603424  ref  gcc -m32 -march=nocona -O2 -fomit-frame-pointer  20190805  20190803
9754411  ref  gcc -m32 -march=i386 -O3 -fomit-frame-pointer  20190805  20190803
9817423  ref  gcc -m32 -march=i386 -O2 -fomit-frame-pointer  20190805  20190803
11000524  ref  gcc -m32 -march=k8 -O2 -fomit-frame-pointer  20190805  20190803
11135972  ref  gcc -funroll-loops -m32 -march=barcelona -Os -fomit-frame-pointer  20190805  20190803
11303236  ref  gcc -m32 -march=barcelona -Os -fomit-frame-pointer  20190805  20190803
11424357  ref  gcc -funroll-loops -m32 -march=k8 -Os -fomit-frame-pointer  20190805  20190803
11600950  ref  gcc -m32 -march=k8 -Os -fomit-frame-pointer  20190805  20190803
12302938  ref  gcc -funroll-loops -m32 -march=k8 -O3 -fomit-frame-pointer  20190805  20190803
12666590  ref  gcc -m32 -march=k8 -O3 -fomit-frame-pointer  20190805  20190803
12743486  ref  gcc -funroll-loops -m32 -march=k8 -O2 -fomit-frame-pointer  20190805  20190803
13875845  ref  gcc -m32 -march=pentium-m -O3 -fomit-frame-pointer  20190805  20190803
13893871  ref  gcc -funroll-loops -m32 -march=pentium-m -O3 -fomit-frame-pointer  20190805  20190803
14258512  ref  gcc -funroll-loops -m32 -march=barcelona -O3 -fomit-frame-pointer  20190805  20190803
14274232  ref  gcc -funroll-loops -m32 -march=pentium-m -O2 -fomit-frame-pointer  20190805  20190803
14344175  ref  gcc -m32 -march=barcelona -O3 -fomit-frame-pointer  20190805  20190803
15034539  ref  gcc -funroll-loops -m32 -march=barcelona -O2 -fomit-frame-pointer  20190805  20190803
15586825  ref  gcc -m32 -march=pentium-m -O2 -fomit-frame-pointer  20190805  20190803
15614199  ref  gcc -m32 -march=barcelona -O2 -fomit-frame-pointer  20190805  20190803

Compiler output

Implementation: crypto_sign/dilithium3aes/avx2
Compiler: gcc -funroll-loops -m32 -O2 -fomit-frame-pointer
KeccakP-1600-times4-SIMD256.c: KeccakP-1600-times4-SIMD256.c: In function 'KeccakP1600times4_AddLanesAll':
KeccakP-1600-times4-SIMD256.c: KeccakP-1600-times4-SIMD256.c:135:40: warning: AVX vector return without AVX enabled changes the ABI [-Wpsabi]
KeccakP-1600-times4-SIMD256.c: #define Xor_In4( argIndex ) lanes0 = LOAD256u( curData0[argIndex]),\
KeccakP-1600-times4-SIMD256.c: ^
KeccakP-1600-times4-SIMD256.c: KeccakP-1600-times4-SIMD256.c:146:9: note: in expansion of macro 'Xor_In4'
KeccakP-1600-times4-SIMD256.c: Xor_In4( 0 );
KeccakP-1600-times4-SIMD256.c: ^~~~~~~
KeccakP-1600-times4-SIMD256.c: In file included from /usr/lib/gcc/x86_64-linux-gnu/7/include/immintrin.h:43:0,
KeccakP-1600-times4-SIMD256.c: from KeccakP-1600-times4-SIMD256.c:21:
KeccakP-1600-times4-SIMD256.c: /usr/lib/gcc/x86_64-linux-gnu/7/include/avx2intrin.h:913:1: error: inlining failed in call to always_inline '_mm256_xor_si256': target specific option mismatch
KeccakP-1600-times4-SIMD256.c: _mm256_xor_si256 (__m256i __A, __m256i __B)
KeccakP-1600-times4-SIMD256.c: ^~~~~~~~~~~~~~~~
KeccakP-1600-times4-SIMD256.c: KeccakP-1600-times4-SIMD256.c:55:41: note: called from here
KeccakP-1600-times4-SIMD256.c: #define XOReq256(a, b) a = _mm256_xor_si256(a, b)
KeccakP-1600-times4-SIMD256.c: ^~~~~~~~~~~~~~~~~~~~~~
KeccakP-1600-times4-SIMD256.c: KeccakP-1600-times4-SIMD256.c:143:33: note: in expansion of macro 'XOReq256'
KeccakP-1600-times4-SIMD256.c: XOReq256( stateAsLanes[argIndex+3], lanes3 )
KeccakP-1600-times4-SIMD256.c: ^~~~~~~~
KeccakP-1600-times4-SIMD256.c: KeccakP-1600-times4-SIMD256.c:149:9: note: in expansion of macro 'Xor_In4'
KeccakP-1600-times4-SIMD256.c: Xor_In4( 12 );
KeccakP-1600-times4-SIMD256.c: ^~~~~~~
KeccakP-1600-times4-SIMD256.c: In file included from /usr/lib/gcc/x86_64-linux-gnu/7/include/immintrin.h:43:0,
KeccakP-1600-times4-SIMD256.c: from KeccakP-1600-times4-SIMD256.c:21:
KeccakP-1600-times4-SIMD256.c: /usr/lib/gcc/x86_64-linux-gnu/7/include/avx2intrin.h:913:1: error: inlining failed in call to always_inline '_mm256_xor_si256': target specific option mismatch
KeccakP-1600-times4-SIMD256.c: _mm256_xor_si256 (__m256i __A, __m256i __B)
KeccakP-1600-times4-SIMD256.c: ...

Number of similar (compiler,implementation) pairs: 156, namely:
Compiler  Implementations
gcc -funroll-loops -m32 -O2 -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -O3 -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -O -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -Os -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=athlon -O2 -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=athlon -O3 -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=athlon -O -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=athlon -Os -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=barcelona -O2 -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=barcelona -O3 -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=barcelona -O -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=barcelona -Os -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=i386 -O2 -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=i386 -O3 -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=i386 -O -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=i386 -Os -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=i486 -O2 -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=i486 -O3 -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=i486 -O -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=i486 -Os -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=k6-2 -O2 -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=k6-2 -O3 -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=k6-2 -O -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=k6-2 -Os -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=k6-3 -O2 -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=k6-3 -O3 -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=k6-3 -O -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=k6-3 -Os -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=k6 -O2 -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=k6 -O3 -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=k6 -O -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=k6 -Os -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=k8 -O2 -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=k8 -O3 -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=k8 -O -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=k8 -Os -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=nocona -O2 -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=nocona -O3 -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=nocona -O -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=nocona -Os -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=pentium-m -O2 -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=pentium-m -O3 -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=pentium-m -O -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=pentium-m -Os -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=pentium-mmx -O2 -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=pentium-mmx -O3 -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=pentium-mmx -O -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=pentium-mmx -Os -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=pentium2 -O2 -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=pentium2 -O3 -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=pentium2 -O -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=pentium2 -Os -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=pentium3 -O2 -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=pentium3 -O3 -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=pentium3 -O -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=pentium3 -Os -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=pentium4 -O2 -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=pentium4 -O3 -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=pentium4 -O -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=pentium4 -Os -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=pentium -O2 -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=pentium -O3 -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=pentium -O -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=pentium -Os -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=pentiumpro -O2 -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=pentiumpro -O3 -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=pentiumpro -O -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=pentiumpro -Os -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=prescott -O2 -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=prescott -O3 -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=prescott -O -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=prescott -Os -fomit-frame-pointer avx2
gcc -m32 -O2 -fomit-frame-pointer avx2
gcc -m32 -O3 -fomit-frame-pointer avx2
gcc -m32 -O -fomit-frame-pointer avx2
gcc -m32 -Os -fomit-frame-pointer avx2
gcc -m32 -march=athlon -O2 -fomit-frame-pointer avx2
gcc -m32 -march=athlon -O3 -fomit-frame-pointer avx2
gcc -m32 -march=athlon -O -fomit-frame-pointer avx2
gcc -m32 -march=athlon -Os -fomit-frame-pointer avx2
gcc -m32 -march=core2 -O2 -fomit-frame-pointer avx2
gcc -m32 -march=core2 -O3 -fomit-frame-pointer avx2
gcc -m32 -march=core2 -O -fomit-frame-pointer avx2
gcc -m32 -march=core2 -Os -fomit-frame-pointer avx2
gcc -m32 -march=core2 -msse4.1 -O2 -fomit-frame-pointer avx2
gcc -m32 -march=core2 -msse4.1 -O3 -fomit-frame-pointer avx2
gcc -m32 -march=core2 -msse4.1 -O -fomit-frame-pointer avx2
gcc -m32 -march=core2 -msse4.1 -Os -fomit-frame-pointer avx2
gcc -m32 -march=core2 -msse4 -O2 -fomit-frame-pointer avx2
gcc -m32 -march=core2 -msse4 -O3 -fomit-frame-pointer avx2
gcc -m32 -march=core2 -msse4 -O -fomit-frame-pointer avx2
gcc -m32 -march=core2 -msse4 -Os -fomit-frame-pointer avx2
gcc -m32 -march=corei7 -O2 -fomit-frame-pointer avx2
gcc -m32 -march=corei7 -O3 -fomit-frame-pointer avx2
gcc -m32 -march=corei7 -O -fomit-frame-pointer avx2
gcc -m32 -march=corei7 -Os -fomit-frame-pointer avx2
gcc -m32 -march=i386 -O2 -fomit-frame-pointer avx2
gcc -m32 -march=i386 -O3 -fomit-frame-pointer avx2
gcc -m32 -march=i386 -O -fomit-frame-pointer avx2
gcc -m32 -march=i386 -Os -fomit-frame-pointer avx2
gcc -m32 -march=i486 -O2 -fomit-frame-pointer avx2
gcc -m32 -march=i486 -O3 -fomit-frame-pointer avx2
gcc -m32 -march=i486 -O -fomit-frame-pointer avx2
gcc -m32 -march=i486 -Os -fomit-frame-pointer avx2
gcc -m32 -march=k6-2 -O2 -fomit-frame-pointer avx2
gcc -m32 -march=k6-2 -O3 -fomit-frame-pointer avx2
gcc -m32 -march=k6-2 -O -fomit-frame-pointer avx2
gcc -m32 -march=k6-2 -Os -fomit-frame-pointer avx2
gcc -m32 -march=k6-3 -O2 -fomit-frame-pointer avx2
gcc -m32 -march=k6-3 -O3 -fomit-frame-pointer avx2
gcc -m32 -march=k6-3 -O -fomit-frame-pointer avx2
gcc -m32 -march=k6-3 -Os -fomit-frame-pointer avx2
gcc -m32 -march=k6 -O2 -fomit-frame-pointer avx2
gcc -m32 -march=k6 -O3 -fomit-frame-pointer avx2
gcc -m32 -march=k6 -O -fomit-frame-pointer avx2
gcc -m32 -march=k6 -Os -fomit-frame-pointer avx2
gcc -m32 -march=k8 -O2 -fomit-frame-pointer avx2
gcc -m32 -march=k8 -O3 -fomit-frame-pointer avx2
gcc -m32 -march=k8 -O -fomit-frame-pointer avx2
gcc -m32 -march=k8 -Os -fomit-frame-pointer avx2
gcc -m32 -march=nocona -O2 -fomit-frame-pointer avx2
gcc -m32 -march=nocona -O3 -fomit-frame-pointer avx2
gcc -m32 -march=nocona -O -fomit-frame-pointer avx2
gcc -m32 -march=nocona -Os -fomit-frame-pointer avx2
gcc -m32 -march=pentium-m -O2 -fomit-frame-pointer avx2
gcc -m32 -march=pentium-m -O3 -fomit-frame-pointer avx2
gcc -m32 -march=pentium-m -O -fomit-frame-pointer avx2
gcc -m32 -march=pentium-m -Os -fomit-frame-pointer avx2
gcc -m32 -march=pentium-mmx -O2 -fomit-frame-pointer avx2
gcc -m32 -march=pentium-mmx -O3 -fomit-frame-pointer avx2
gcc -m32 -march=pentium-mmx -O -fomit-frame-pointer avx2
gcc -m32 -march=pentium-mmx -Os -fomit-frame-pointer avx2
gcc -m32 -march=pentium2 -O2 -fomit-frame-pointer avx2
gcc -m32 -march=pentium2 -O3 -fomit-frame-pointer avx2
gcc -m32 -march=pentium2 -O -fomit-frame-pointer avx2
gcc -m32 -march=pentium2 -Os -fomit-frame-pointer avx2
gcc -m32 -march=pentium3 -O2 -fomit-frame-pointer avx2
gcc -m32 -march=pentium3 -O3 -fomit-frame-pointer avx2
gcc -m32 -march=pentium3 -O -fomit-frame-pointer avx2
gcc -m32 -march=pentium3 -Os -fomit-frame-pointer avx2
gcc -m32 -march=pentium4 -O2 -fomit-frame-pointer avx2
gcc -m32 -march=pentium4 -O3 -fomit-frame-pointer avx2
gcc -m32 -march=pentium4 -O -fomit-frame-pointer avx2
gcc -m32 -march=pentium4 -Os -fomit-frame-pointer avx2
gcc -m32 -march=pentium -O2 -fomit-frame-pointer avx2
gcc -m32 -march=pentium -O3 -fomit-frame-pointer avx2
gcc -m32 -march=pentium -O -fomit-frame-pointer avx2
gcc -m32 -march=pentium -Os -fomit-frame-pointer avx2
gcc -m32 -march=pentiumpro -O2 -fomit-frame-pointer avx2
gcc -m32 -march=pentiumpro -O3 -fomit-frame-pointer avx2
gcc -m32 -march=pentiumpro -O -fomit-frame-pointer avx2
gcc -m32 -march=pentiumpro -Os -fomit-frame-pointer avx2
gcc -m32 -march=prescott -O2 -fomit-frame-pointer avx2
gcc -m32 -march=prescott -O3 -fomit-frame-pointer avx2
gcc -m32 -march=prescott -O -fomit-frame-pointer avx2
gcc -m32 -march=prescott -Os -fomit-frame-pointer avx2
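
Note on the errors in this group: none of these option sets enables AVX2, and GCC refuses to inline an always_inline intrinsic such as _mm256_xor_si256 into a function that is not compiled for that target, which is what "target specific option mismatch" reports. The following is a minimal sketch, not the SUPERCOP or Dilithium code, of guarding AVX2 intrinsics behind the compiler-defined __AVX2__ macro with a portable fallback. The helper name xor_lanes is hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

#if defined(__AVX2__)
#include <immintrin.h>
#endif

/* Hypothetical helper (not from the benchmarked sources):
 * XOR n 64-bit lanes of src into dst. */
void xor_lanes(uint64_t *dst, const uint64_t *src, size_t n)
{
    size_t i = 0;
#if defined(__AVX2__)
    /* Only compiled when AVX2 is enabled (e.g. -mavx2 or -march=core-avx2),
     * so the intrinsic can always be inlined. Four lanes per iteration. */
    for (; i + 4 <= n; i += 4) {
        __m256i a = _mm256_loadu_si256((const __m256i *)(dst + i));
        __m256i b = _mm256_loadu_si256((const __m256i *)(src + i));
        _mm256_storeu_si256((__m256i *)(dst + i), _mm256_xor_si256(a, b));
    }
#endif
    /* Scalar path: handles the tail, or everything when AVX2 is off. */
    for (; i < n; i++)
        dst[i] ^= src[i];
}
```

With a guard of this kind the same file compiles under every option set listed above; only the builds that enable AVX2 take the vector path.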

Compiler output

Implementation: crypto_sign/dilithium3aes/avx2
Compiler: gcc -m32 -march=barcelona -O2 -fomit-frame-pointer
KeccakP-1600-times4-SIMD256.c: KeccakP-1600-times4-SIMD256.c: In function 'KeccakP1600times4_AddLanesAll':
KeccakP-1600-times4-SIMD256.c: KeccakP-1600-times4-SIMD256.c:135:40: warning: AVX vector return without AVX enabled changes the ABI [-Wpsabi]
KeccakP-1600-times4-SIMD256.c: #define Xor_In4( argIndex ) lanes0 = LOAD256u( curData0[argIndex]),\
KeccakP-1600-times4-SIMD256.c: ^
KeccakP-1600-times4-SIMD256.c: KeccakP-1600-times4-SIMD256.c:146:9: note: in expansion of macro 'Xor_In4'
KeccakP-1600-times4-SIMD256.c: Xor_In4( 0 );
KeccakP-1600-times4-SIMD256.c: ^~~~~~~
KeccakP-1600-times4-SIMD256.c: In file included from /usr/lib/gcc/x86_64-linux-gnu/7/include/immintrin.h:43:0,
KeccakP-1600-times4-SIMD256.c: from KeccakP-1600-times4-SIMD256.c:21:
KeccakP-1600-times4-SIMD256.c: /usr/lib/gcc/x86_64-linux-gnu/7/include/avx2intrin.h:913:1: error: inlining failed in call to always_inline '_mm256_xor_si256': target specific option mismatch
KeccakP-1600-times4-SIMD256.c: _mm256_xor_si256 (__m256i __A, __m256i __B)
KeccakP-1600-times4-SIMD256.c: ^~~~~~~~~~~~~~~~
KeccakP-1600-times4-SIMD256.c: KeccakP-1600-times4-SIMD256.c:55:41: note: called from here
KeccakP-1600-times4-SIMD256.c: #define XOReq256(a, b) a = _mm256_xor_si256(a, b)
KeccakP-1600-times4-SIMD256.c: ^~~~~~~~~~~~~~~~~~~~~~
KeccakP-1600-times4-SIMD256.c: KeccakP-1600-times4-SIMD256.c:143:33: note: in expansion of macro 'XOReq256'
KeccakP-1600-times4-SIMD256.c: XOReq256( stateAsLanes[argIndex+3], lanes3 )
KeccakP-1600-times4-SIMD256.c: ^~~~~~~~
KeccakP-1600-times4-SIMD256.c: KeccakP-1600-times4-SIMD256.c:149:9: note: in expansion of macro 'Xor_In4'
KeccakP-1600-times4-SIMD256.c: Xor_In4( 12 );
KeccakP-1600-times4-SIMD256.c: ^~~~~~~
KeccakP-1600-times4-SIMD256.c: In file included from /usr/lib/gcc/x86_64-linux-gnu/7/include/immintrin.h:43:0,
KeccakP-1600-times4-SIMD256.c: from KeccakP-1600-times4-SIMD256.c:21:
KeccakP-1600-times4-SIMD256.c: /usr/lib/gcc/x86_64-linux-gnu/7/include/avx2intrin.h:913:1: error: inlining failed in call to always_inline '_mm256_xor_si256': target specific option mismatch
KeccakP-1600-times4-SIMD256.c: _mm256_xor_si256 (__m256i __A, __m256i __B)
KeccakP-1600-times4-SIMD256.c: ...
KeccakP-1600-times4-SIMD256.c: KeccakP-1600-times4-SIMD256.c: In function 'KeccakP1600times4_AddLanesAll':
KeccakP-1600-times4-SIMD256.c: KeccakP-1600-times4-SIMD256.c:135:40: warning: AVX vector return without AVX enabled changes the ABI [-Wpsabi]
KeccakP-1600-times4-SIMD256.c: #define Xor_In4( argIndex ) lanes0 = LOAD256u( curData0[argIndex]),\
KeccakP-1600-times4-SIMD256.c: ^
KeccakP-1600-times4-SIMD256.c: KeccakP-1600-times4-SIMD256.c:146:9: note: in expansion of macro 'Xor_In4'
KeccakP-1600-times4-SIMD256.c: Xor_In4( 0 );
KeccakP-1600-times4-SIMD256.c: ^~~~~~~
KeccakP-1600-times4-SIMD256.c: In file included from /usr/lib/gcc/x86_64-linux-gnu/7/include/immintrin.h:43:0,
KeccakP-1600-times4-SIMD256.c: from KeccakP-1600-times4-SIMD256.c:21:
KeccakP-1600-times4-SIMD256.c: /usr/lib/gcc/x86_64-linux-gnu/7/include/avx2intrin.h:913:1: error: inlining failed in call to always_inline '_mm256_xor_si256': target specific option mismatch
KeccakP-1600-times4-SIMD256.c: _mm256_xor_si256 (__m256i __A, __m256i __B)
KeccakP-1600-times4-SIMD256.c: ^~~~~~~~~~~~~~~~
KeccakP-1600-times4-SIMD256.c: KeccakP-1600-times4-SIMD256.c:55:41: note: called from here
KeccakP-1600-times4-SIMD256.c: #define XOReq256(a, b) a = _mm256_xor_si256(a, b)
KeccakP-1600-times4-SIMD256.c: ^~~~~~~~~~~~~~~~~~~~~~
KeccakP-1600-times4-SIMD256.c: KeccakP-1600-times4-SIMD256.c:143:33: note: in expansion of macro 'XOReq256'
KeccakP-1600-times4-SIMD256.c: XOReq256( stateAsLanes[argIndex+3], lanes3 )
KeccakP-1600-times4-SIMD256.c: ^~~~~~~~
KeccakP-1600-times4-SIMD256.c: KeccakP-1600-times4-SIMD256.c:149:9: note: in expansion of macro 'Xor_In4'
KeccakP-1600-times4-SIMD256.c: Xor_In4( 12 );
KeccakP-1600-times4-SIMD256.c: ^~~~~~~
KeccakP-1600-times4-SIMD256.c: In file included from /usr/lib/gcc/x86_64-linux-gnu/7/include/immintrin.h:43:0,
KeccakP-1600-times4-SIMD256.c: from KeccakP-1600-times4-SIMD256.c:21:
KeccakP-1600-times4-SIMD256.c: /usr/lib/gcc/x86_64-linux-gnu/7/include/avx2intrin.h:913:1: error: inlining failed in call to always_inline '_mm256_xor_si256': target specific option mismatch
KeccakP-1600-times4-SIMD256.c: _mm256_xor_si256 (__m256i __A, __m256i __B)
KeccakP-1600-times4-SIMD256.c: ...

Number of similar (compiler,implementation) pairs: 4, namely:
Compiler  Implementations
gcc -m32 -march=barcelona -O2 -fomit-frame-pointer avx2
gcc -m32 -march=barcelona -O3 -fomit-frame-pointer avx2
gcc -m32 -march=barcelona -O -fomit-frame-pointer avx2
gcc -m32 -march=barcelona -Os -fomit-frame-pointer avx2

Compiler output

Implementation: crypto_sign/dilithium3aes/avx2
Compiler: gcc -m32 -march=core-avx-i -O2 -fomit-frame-pointer
KeccakP-1600-times4-SIMD256.c: In file included from /usr/lib/gcc/x86_64-linux-gnu/7/include/immintrin.h:43:0,
KeccakP-1600-times4-SIMD256.c: from KeccakP-1600-times4-SIMD256.c:21:
KeccakP-1600-times4-SIMD256.c: KeccakP-1600-times4-SIMD256.c: In function 'KeccakP1600times4_AddLanesAll':
KeccakP-1600-times4-SIMD256.c: /usr/lib/gcc/x86_64-linux-gnu/7/include/avx2intrin.h:913:1: error: inlining failed in call to always_inline '_mm256_xor_si256': target specific option mismatch
KeccakP-1600-times4-SIMD256.c: _mm256_xor_si256 (__m256i __A, __m256i __B)
KeccakP-1600-times4-SIMD256.c: ^~~~~~~~~~~~~~~~
KeccakP-1600-times4-SIMD256.c: KeccakP-1600-times4-SIMD256.c:55:41: note: called from here
KeccakP-1600-times4-SIMD256.c: #define XOReq256(a, b) a = _mm256_xor_si256(a, b)
KeccakP-1600-times4-SIMD256.c: ^~~~~~~~~~~~~~~~~~~~~~
KeccakP-1600-times4-SIMD256.c: KeccakP-1600-times4-SIMD256.c:143:33: note: in expansion of macro 'XOReq256'
KeccakP-1600-times4-SIMD256.c: XOReq256( stateAsLanes[argIndex+3], lanes3 )
KeccakP-1600-times4-SIMD256.c: ^~~~~~~~
KeccakP-1600-times4-SIMD256.c: KeccakP-1600-times4-SIMD256.c:149:9: note: in expansion of macro 'Xor_In4'
KeccakP-1600-times4-SIMD256.c: Xor_In4( 12 );
KeccakP-1600-times4-SIMD256.c: ^~~~~~~
KeccakP-1600-times4-SIMD256.c: In file included from /usr/lib/gcc/x86_64-linux-gnu/7/include/immintrin.h:43:0,
KeccakP-1600-times4-SIMD256.c: from KeccakP-1600-times4-SIMD256.c:21:
KeccakP-1600-times4-SIMD256.c: /usr/lib/gcc/x86_64-linux-gnu/7/include/avx2intrin.h:913:1: error: inlining failed in call to always_inline '_mm256_xor_si256': target specific option mismatch
KeccakP-1600-times4-SIMD256.c: _mm256_xor_si256 (__m256i __A, __m256i __B)
KeccakP-1600-times4-SIMD256.c: ^~~~~~~~~~~~~~~~
KeccakP-1600-times4-SIMD256.c: KeccakP-1600-times4-SIMD256.c:55:41: note: called from here
KeccakP-1600-times4-SIMD256.c: #define XOReq256(a, b) a = _mm256_xor_si256(a, b)
KeccakP-1600-times4-SIMD256.c: ^~~~~~~~~~~~~~~~~~~~~~
KeccakP-1600-times4-SIMD256.c: KeccakP-1600-times4-SIMD256.c:142:33: note: in expansion of macro 'XOReq256'
KeccakP-1600-times4-SIMD256.c: XOReq256( stateAsLanes[argIndex+2], lanes2 ),\
KeccakP-1600-times4-SIMD256.c: ...

Number of similar (compiler,implementation) pairs: 8, namely:
Compiler  Implementations
gcc -m32 -march=core-avx-i -O2 -fomit-frame-pointer avx2
gcc -m32 -march=core-avx-i -O3 -fomit-frame-pointer avx2
gcc -m32 -march=core-avx-i -O -fomit-frame-pointer avx2
gcc -m32 -march=core-avx-i -Os -fomit-frame-pointer avx2
gcc -m32 -march=corei7-avx -O2 -fomit-frame-pointer avx2
gcc -m32 -march=corei7-avx -O3 -fomit-frame-pointer avx2
gcc -m32 -march=corei7-avx -O -fomit-frame-pointer avx2
gcc -m32 -march=corei7-avx -Os -fomit-frame-pointer avx2

Compiler output

Implementation: crypto_sign/dilithium3aes/avx2
Compiler: gcc -m32 -march=core-avx2 -O2 -fomit-frame-pointer
invntt.s: invntt.s: Assembler messages:
invntt.s: invntt.s:47: Error: bad register name `%rip)'
invntt.s: invntt.s:48: Error: bad register name `%rip)'
invntt.s: invntt.s:49: Error: bad register name `%rip)'
invntt.s: invntt.s:52: Error: bad register name `%rsi)'
invntt.s: invntt.s:53: Error: bad register name `%rsi)'
invntt.s: invntt.s:54: Error: bad register name `%rsi)'
invntt.s: invntt.s:55: Error: bad register name `%rsi)'
invntt.s: invntt.s:58: Error: bad register name `%ymm8'
invntt.s: invntt.s:59: Error: bad register name `%ymm10'
invntt.s: invntt.s:59: Error: bad register name `%ymm10'
invntt.s: invntt.s:61: Error: bad register name `%ymm8'
invntt.s: invntt.s:61: Error: bad register name `%ymm8'
invntt.s: invntt.s:62: Error: bad register name `%ymm10'
invntt.s: invntt.s:62: Error: bad register name `%ymm10'
invntt.s: invntt.s:66: Error: bad register name `%ymm8'
invntt.s: invntt.s:67: Error: bad register name `%ymm10'
invntt.s: invntt.s:70: Error: bad register name `%rdx)'
invntt.s: invntt.s:71: Error: bad register name `%rdx)'
invntt.s: invntt.s:72: Error: bad register name `%ymm12'
invntt.s: invntt.s:73: Error: bad register name `%ymm13'
invntt.s: invntt.s:74: Error: bad register name `%ymm8'
invntt.s: invntt.s:76: Error: bad register name `%ymm12'
invntt.s: invntt.s:77: Error: bad register name `%ymm13'
invntt.s: invntt.s:78: Error: bad register name `%ymm9'
invntt.s: ...

Number of similar (compiler,implementation) pairs: 8, namely:
Compiler  Implementations
gcc -m32 -march=core-avx2 -O2 -fomit-frame-pointer avx2
gcc -m32 -march=core-avx2 -O3 -fomit-frame-pointer avx2
gcc -m32 -march=core-avx2 -O -fomit-frame-pointer avx2
gcc -m32 -march=core-avx2 -Os -fomit-frame-pointer avx2
gcc -m32 -march=native -mtune=native -O2 -fomit-frame-pointer avx2
gcc -m32 -march=native -mtune=native -O3 -fomit-frame-pointer avx2
gcc -m32 -march=native -mtune=native -O -fomit-frame-pointer avx2
gcc -m32 -march=native -mtune=native -Os -fomit-frame-pointer avx2
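
Note on the errors in this group: these option sets do enable AVX2, but -m32 selects 32-bit code generation, where %rip, %rsi, %rdx and the registers %ymm8 and above do not exist, so the assembler rejects invntt.s. As a hedged sketch (the function name avx2_asm_supported is hypothetical and not part of SUPERCOP), a build could test both conditions at compile time using macros that GCC and Clang predefine:

```c
/* Hypothetical helper (not from the benchmarked sources):
 * reports whether this translation unit targets 64-bit x86 with AVX2,
 * the combination that the assembler messages above suggest invntt.s needs. */
int avx2_asm_supported(void)
{
#if defined(__x86_64__) && defined(__AVX2__)
    return 1;   /* 64-bit mode and AVX2 both enabled */
#else
    return 0;   /* e.g. any -m32 build, or a build without AVX2 */
#endif
}
```

Under any of the -m32 option sets in this report the function would return 0, since __x86_64__ is not defined in 32-bit mode.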