Implementation notes: x86, titan0, crypto_sign/dilithium2aes

Computer: titan0
Architecture: x86
CPU ID: GenuineIntel-000306c3-bfebfbff
SUPERCOP version: 20190803
Operation: crypto_sign
Primitive: dilithium2aes
Time  Implementation  Compiler  Benchmark date  SUPERCOP version
4191764  ref  gcc -funroll-loops -m32 -march=athlon -O3 -fomit-frame-pointer  20190805  20190803
4234496  ref  gcc -funroll-loops -m32 -march=athlon -O2 -fomit-frame-pointer  20190805  20190803
4305468  ref  gcc -funroll-loops -m32 -O3 -fomit-frame-pointer  20190805  20190803
4325176  ref  gcc -funroll-loops -m32 -O2 -fomit-frame-pointer  20190805  20190803
4335588  ref  gcc -m32 -march=core-avx2 -Os -fomit-frame-pointer  20190805  20190803
4346264  ref  gcc -m32 -march=athlon -O3 -fomit-frame-pointer  20190805  20190803
4354168  ref  gcc -m32 -march=native -mtune=native -Os -fomit-frame-pointer  20190805  20190803
4386868  ref  gcc -funroll-loops -m32 -march=pentium2 -O -fomit-frame-pointer  20190805  20190803
4388928  ref  gcc -funroll-loops -m32 -march=pentiumpro -O -fomit-frame-pointer  20190805  20190803
4391476  ref  gcc -funroll-loops -m32 -march=pentium3 -O3 -fomit-frame-pointer  20190805  20190803
4401268  ref  gcc -funroll-loops -m32 -march=pentium3 -O -fomit-frame-pointer  20190805  20190803
4404688  ref  gcc -funroll-loops -m32 -march=pentium2 -O3 -fomit-frame-pointer  20190805  20190803
4406320  ref  gcc -funroll-loops -m32 -march=pentiumpro -O3 -fomit-frame-pointer  20190805  20190803
4415680  ref  gcc -funroll-loops -m32 -march=k6 -O3 -fomit-frame-pointer  20190805  20190803
4419860  ref  gcc -funroll-loops -m32 -march=athlon -O -fomit-frame-pointer  20190805  20190803
4426536  ref  gcc -funroll-loops -m32 -march=k6-3 -O3 -fomit-frame-pointer  20190805  20190803
4430484  ref  gcc -m32 -march=athlon -O2 -fomit-frame-pointer  20190805  20190803
4431512  ref  gcc -m32 -march=core-avx-i -Os -fomit-frame-pointer  20190805  20190803
4434408  ref  gcc -funroll-loops -m32 -march=pentium3 -O2 -fomit-frame-pointer  20190805  20190803
4436592  ref  gcc -funroll-loops -m32 -march=pentiumpro -O2 -fomit-frame-pointer  20190805  20190803
4443804  ref  gcc -funroll-loops -m32 -march=pentium2 -O2 -fomit-frame-pointer  20190805  20190803
4448568  ref  gcc -m32 -march=corei7-avx -Os -fomit-frame-pointer  20190805  20190803
4450056  ref  gcc -funroll-loops -m32 -march=k6-2 -O3 -fomit-frame-pointer  20190805  20190803
4454152  ref  gcc -m32 -march=k6-2 -O3 -fomit-frame-pointer  20190805  20190803
4457656  ref  gcc -m32 -march=k6-3 -O3 -fomit-frame-pointer  20190805  20190803
4458468  ref  gcc -funroll-loops -m32 -march=k6-2 -O2 -fomit-frame-pointer  20190805  20190803
4480064  ref  gcc -funroll-loops -m32 -march=k6 -O2 -fomit-frame-pointer  20190805  20190803
4481280  ref  gcc -funroll-loops -m32 -O -fomit-frame-pointer  20190805  20190803
4490268  ref  gcc -funroll-loops -m32 -march=k6-3 -O2 -fomit-frame-pointer  20190805  20190803
4490976  ref  gcc -m32 -march=athlon -O -fomit-frame-pointer  20190805  20190803
4505240  ref  gcc -m32 -march=pentium3 -O -fomit-frame-pointer  20190805  20190803
4506064  ref  gcc -m32 -march=pentiumpro -O -fomit-frame-pointer  20190805  20190803
4513404  ref  gcc -m32 -O3 -fomit-frame-pointer  20190805  20190803
4515084  ref  gcc -m32 -march=pentium2 -O -fomit-frame-pointer  20190805  20190803
4545004  ref  gcc -m32 -march=pentium2 -O3 -fomit-frame-pointer  20190805  20190803
4556876  ref  gcc -funroll-loops -m32 -march=k6 -O -fomit-frame-pointer  20190805  20190803
4565384  ref  gcc -funroll-loops -m32 -march=k6-3 -O -fomit-frame-pointer  20190805  20190803
4566084  ref  gcc -m32 -march=pentium3 -O3 -fomit-frame-pointer  20190805  20190803
4566092  ref  gcc -m32 -O2 -fomit-frame-pointer  20190805  20190803
4573460  ref  gcc -funroll-loops -m32 -march=k6-2 -O -fomit-frame-pointer  20190805  20190803
4575012  ref  gcc -m32 -march=pentiumpro -O3 -fomit-frame-pointer  20190805  20190803
4588072  ref  gcc -m32 -march=k6 -O3 -fomit-frame-pointer  20190805  20190803
4600156  ref  gcc -m32 -march=core2 -O3 -fomit-frame-pointer  20190805  20190803
4602072  ref  gcc -m32 -march=pentium2 -O2 -fomit-frame-pointer  20190805  20190803
4605068  ref  gcc -m32 -march=core2 -msse4.1 -O3 -fomit-frame-pointer  20190805  20190803
4614844  ref  gcc -m32 -march=pentium3 -O2 -fomit-frame-pointer  20190805  20190803
4614936  ref  gcc -m32 -O -fomit-frame-pointer  20190805  20190803
4620760  ref  gcc -m32 -march=core2 -msse4 -O3 -fomit-frame-pointer  20190805  20190803
4622160  ref  gcc -m32 -march=corei7 -O3 -fomit-frame-pointer  20190805  20190803
4628772  ref  gcc -m32 -march=pentiumpro -O2 -fomit-frame-pointer  20190805  20190803
4659628  ref  gcc -m32 -march=k6-3 -O2 -fomit-frame-pointer  20190805  20190803
4664240  ref  gcc -m32 -march=k6-2 -O2 -fomit-frame-pointer  20190805  20190803
4669160  ref  gcc -m32 -march=k6 -O -fomit-frame-pointer  20190805  20190803
4692504  ref  gcc -m32 -march=k6 -O2 -fomit-frame-pointer  20190805  20190803
4698728  ref  gcc -m32 -march=k6-3 -O -fomit-frame-pointer  20190805  20190803
4709916  ref  gcc -m32 -march=k6-2 -O -fomit-frame-pointer  20190805  20190803
4760440  ref  gcc -funroll-loops -m32 -march=pentium4 -O3 -fomit-frame-pointer  20190805  20190803
4774844  ref  gcc -funroll-loops -m32 -march=k6 -Os -fomit-frame-pointer  20190805  20190803
4776964  ref  gcc -funroll-loops -m32 -march=k6-3 -Os -fomit-frame-pointer  20190805  20190803
4782332  ref  gcc -funroll-loops -m32 -march=pentium4 -O -fomit-frame-pointer  20190805  20190803
4782796  ref  gcc -funroll-loops -m32 -march=pentium-m -O -fomit-frame-pointer  20190805  20190803
4785220  ref  gcc -funroll-loops -m32 -march=k8 -O -fomit-frame-pointer  20190805  20190803
4788224  ref  gcc -funroll-loops -m32 -march=k6-2 -Os -fomit-frame-pointer  20190805  20190803
4794532  ref  gcc -m32 -march=core2 -msse4 -O2 -fomit-frame-pointer  20190805  20190803
4808280  ref  gcc -m32 -march=k6 -Os -fomit-frame-pointer  20190805  20190803
4808696  ref  gcc -m32 -march=core2 -O2 -fomit-frame-pointer  20190805  20190803
4813480  ref  gcc -m32 -march=k6-2 -Os -fomit-frame-pointer  20190805  20190803
4818444  ref  gcc -m32 -march=core2 -msse4.1 -O2 -fomit-frame-pointer  20190805  20190803
4819504  ref  gcc -m32 -march=corei7 -O2 -fomit-frame-pointer  20190805  20190803
4824484  ref  gcc -m32 -march=k6-3 -Os -fomit-frame-pointer  20190805  20190803
4833688  ref  gcc -funroll-loops -m32 -march=pentium -Os -fomit-frame-pointer  20190805  20190803
4835176  ref  gcc -funroll-loops -m32 -Os -fomit-frame-pointer  20190805  20190803
4835664  ref  gcc -funroll-loops -m32 -march=barcelona -O -fomit-frame-pointer  20190805  20190803
4837248  ref  gcc -funroll-loops -m32 -march=pentium-mmx -Os -fomit-frame-pointer  20190805  20190803
4837260  ref  gcc -funroll-loops -m32 -march=pentium4 -O2 -fomit-frame-pointer  20190805  20190803
4841932  ref  gcc -m32 -march=pentium4 -O3 -fomit-frame-pointer  20190805  20190803
4851616  ref  gcc -funroll-loops -m32 -march=athlon -Os -fomit-frame-pointer  20190805  20190803
4851716  ref  gcc -m32 -march=pentium -Os -fomit-frame-pointer  20190805  20190803
4857624  ref  gcc -funroll-loops -m32 -march=pentium2 -Os -fomit-frame-pointer  20190805  20190803
4860124  ref  gcc -m32 -march=pentium2 -Os -fomit-frame-pointer  20190805  20190803
4862060  ref  gcc -m32 -march=pentium-mmx -Os -fomit-frame-pointer  20190805  20190803
4864940  ref  gcc -m32 -march=pentiumpro -Os -fomit-frame-pointer  20190805  20190803
4865028  ref  gcc -m32 -march=pentium-m -O -fomit-frame-pointer  20190805  20190803
4867076  ref  gcc -m32 -Os -fomit-frame-pointer  20190805  20190803
4870340  ref  gcc -m32 -march=k8 -O -fomit-frame-pointer  20190805  20190803
4871404  ref  gcc -funroll-loops -m32 -march=pentium3 -Os -fomit-frame-pointer  20190805  20190803
4872332  ref  gcc -m32 -march=pentium3 -Os -fomit-frame-pointer  20190805  20190803
4874044  ref  gcc -m32 -march=native -mtune=native -O3 -fomit-frame-pointer  20190805  20190803
4893372  ref  gcc -m32 -march=athlon -Os -fomit-frame-pointer  20190805  20190803
4894336  ref  gcc -m32 -march=core-avx-i -O3 -fomit-frame-pointer  20190805  20190803
4899048  ref  gcc -funroll-loops -m32 -march=pentiumpro -Os -fomit-frame-pointer  20190805  20190803
4901772  ref  gcc -m32 -march=core-avx2 -O3 -fomit-frame-pointer  20190805  20190803
4921816  ref  gcc -m32 -march=corei7-avx -O3 -fomit-frame-pointer  20190805  20190803
4924828  ref  gcc -m32 -march=core-avx2 -O -fomit-frame-pointer  20190805  20190803
4932876  ref  gcc -m32 -march=core-avx2 -O2 -fomit-frame-pointer  20190805  20190803
4934924  ref  gcc -m32 -march=native -mtune=native -O -fomit-frame-pointer  20190805  20190803
4936000  ref  gcc -m32 -march=pentium4 -O -fomit-frame-pointer  20190805  20190803
4937296  ref  gcc -m32 -march=native -mtune=native -O2 -fomit-frame-pointer  20190805  20190803
4938832  ref  gcc -m32 -march=core2 -msse4.1 -O -fomit-frame-pointer  20190805  20190803
4939040  ref  gcc -m32 -march=core2 -O -fomit-frame-pointer  20190805  20190803
4939648  ref  gcc -m32 -march=corei7-avx -O -fomit-frame-pointer  20190805  20190803
4941816  ref  gcc -m32 -march=core2 -msse4 -O -fomit-frame-pointer  20190805  20190803
4943580  ref  gcc -m32 -march=corei7 -O -fomit-frame-pointer  20190805  20190803
4944408  ref  gcc -m32 -march=barcelona -O -fomit-frame-pointer  20190805  20190803
4950344  ref  gcc -m32 -march=core-avx-i -O -fomit-frame-pointer  20190805  20190803
4956292  ref  gcc -m32 -march=i486 -Os -fomit-frame-pointer  20190805  20190803
4957020  ref  gcc -m32 -march=corei7-avx -O2 -fomit-frame-pointer  20190805  20190803
4961408  ref  gcc -funroll-loops -m32 -march=i386 -Os -fomit-frame-pointer  20190805  20190803
4963168  ref  gcc -funroll-loops -m32 -march=i486 -Os -fomit-frame-pointer  20190805  20190803
4973604  ref  gcc -m32 -march=i386 -Os -fomit-frame-pointer  20190805  20190803
4987036  ref  gcc -m32 -march=core-avx-i -O2 -fomit-frame-pointer  20190805  20190803
5024584  ref  gcc -funroll-loops -m32 -march=pentium-m -Os -fomit-frame-pointer  20190805  20190803
5044368  ref  gcc -funroll-loops -m32 -march=pentium -O3 -fomit-frame-pointer  20190805  20190803
5044760  ref  gcc -funroll-loops -m32 -march=i486 -O3 -fomit-frame-pointer  20190805  20190803
5046844  ref  gcc -funroll-loops -m32 -march=i486 -O2 -fomit-frame-pointer  20190805  20190803
5055088  ref  gcc -funroll-loops -m32 -march=pentium -O -fomit-frame-pointer  20190805  20190803
5063812  ref  gcc -funroll-loops -m32 -march=i486 -O -fomit-frame-pointer  20190805  20190803
5065236  ref  gcc -funroll-loops -m32 -march=pentium-mmx -O -fomit-frame-pointer  20190805  20190803
5068132  ref  gcc -funroll-loops -m32 -march=pentium-mmx -O3 -fomit-frame-pointer  20190805  20190803
5075832  ref  gcc -funroll-loops -m32 -march=pentium -O2 -fomit-frame-pointer  20190805  20190803
5080420  ref  gcc -m32 -march=pentium4 -O2 -fomit-frame-pointer  20190805  20190803
5090160  ref  gcc -funroll-loops -m32 -march=nocona -O -fomit-frame-pointer  20190805  20190803
5091412  ref  gcc -funroll-loops -m32 -march=prescott -O3 -fomit-frame-pointer  20190805  20190803
5099320  ref  gcc -funroll-loops -m32 -march=pentium-mmx -O2 -fomit-frame-pointer  20190805  20190803
5105540  ref  gcc -funroll-loops -m32 -march=nocona -O3 -fomit-frame-pointer  20190805  20190803
5113220  ref  gcc -funroll-loops -m32 -march=nocona -Os -fomit-frame-pointer  20190805  20190803
5121564  ref  gcc -funroll-loops -m32 -march=pentium4 -Os -fomit-frame-pointer  20190805  20190803
5130544  ref  gcc -funroll-loops -m32 -march=prescott -Os -fomit-frame-pointer  20190805  20190803
5130964  ref  gcc -m32 -march=core2 -Os -fomit-frame-pointer  20190805  20190803
5137044  ref  gcc -funroll-loops -m32 -march=prescott -O -fomit-frame-pointer  20190805  20190803
5156936  ref  gcc -m32 -march=pentium-m -Os -fomit-frame-pointer  20190805  20190803
5166612  ref  gcc -m32 -march=core2 -msse4 -Os -fomit-frame-pointer  20190805  20190803
5168464  ref  gcc -m32 -march=prescott -O3 -fomit-frame-pointer  20190805  20190803
5169116  ref  gcc -m32 -march=nocona -O -fomit-frame-pointer  20190805  20190803
5170148  ref  gcc -m32 -march=core2 -msse4.1 -Os -fomit-frame-pointer  20190805  20190803
5173480  ref  gcc -m32 -march=nocona -O3 -fomit-frame-pointer  20190805  20190803
5178692  ref  gcc -m32 -march=prescott -O -fomit-frame-pointer  20190805  20190803
5178736  ref  gcc -m32 -march=corei7 -Os -fomit-frame-pointer  20190805  20190803
5180604  ref  gcc -m32 -march=prescott -Os -fomit-frame-pointer  20190805  20190803
5183532  ref  gcc -funroll-loops -m32 -march=i386 -O -fomit-frame-pointer  20190805  20190803
5185588  ref  gcc -m32 -march=pentium4 -Os -fomit-frame-pointer  20190805  20190803
5189272  ref  gcc -m32 -march=pentium-mmx -O -fomit-frame-pointer  20190805  20190803
5192540  ref  gcc -m32 -march=nocona -Os -fomit-frame-pointer  20190805  20190803
5201940  ref  gcc -m32 -march=pentium -O -fomit-frame-pointer  20190805  20190803
5203924  ref  gcc -m32 -march=i486 -O3 -fomit-frame-pointer  20190805  20190803
5210968  ref  gcc -m32 -march=i486 -O -fomit-frame-pointer  20190805  20190803
5225704  ref  gcc -funroll-loops -m32 -march=prescott -O2 -fomit-frame-pointer  20190805  20190803
5261908  ref  gcc -funroll-loops -m32 -march=nocona -O2 -fomit-frame-pointer  20190805  20190803
5275116  ref  gcc -m32 -march=i486 -O2 -fomit-frame-pointer  20190805  20190803
5277656  ref  gcc -m32 -march=pentium -O3 -fomit-frame-pointer  20190805  20190803
5290092  ref  gcc -m32 -march=pentium-mmx -O3 -fomit-frame-pointer  20190805  20190803
5323252  ref  gcc -m32 -march=pentium -O2 -fomit-frame-pointer  20190805  20190803
5327100  ref  gcc -m32 -march=pentium-mmx -O2 -fomit-frame-pointer  20190805  20190803
5354500  ref  gcc -m32 -march=prescott -O2 -fomit-frame-pointer  20190805  20190803
5361068  ref  gcc -m32 -march=i386 -O -fomit-frame-pointer  20190805  20190803
5382100  ref  gcc -m32 -march=nocona -O2 -fomit-frame-pointer  20190805  20190803
5466508  ref  gcc -funroll-loops -m32 -march=i386 -O3 -fomit-frame-pointer  20190805  20190803
5491312  ref  gcc -funroll-loops -m32 -march=i386 -O2 -fomit-frame-pointer  20190805  20190803
5605684  ref  gcc -m32 -march=i386 -O3 -fomit-frame-pointer  20190805  20190803
5671188  ref  gcc -m32 -march=i386 -O2 -fomit-frame-pointer  20190805  20190803
6096396  ref  gcc -m32 -march=k8 -O2 -fomit-frame-pointer  20190805  20190803
6570860  ref  gcc -funroll-loops -m32 -march=pentium-m -O3 -fomit-frame-pointer  20190805  20190803
6601188  ref  gcc -funroll-loops -m32 -march=barcelona -Os -fomit-frame-pointer  20190805  20190803
6627048  ref  gcc -m32 -march=pentium-m -O3 -fomit-frame-pointer  20190805  20190803
6715964  ref  gcc -m32 -march=barcelona -Os -fomit-frame-pointer  20190805  20190803
6769828  ref  gcc -funroll-loops -m32 -march=pentium-m -O2 -fomit-frame-pointer  20190805  20190803
6798860  ref  gcc -m32 -march=k8 -Os -fomit-frame-pointer  20190805  20190803
6804860  ref  gcc -funroll-loops -m32 -march=k8 -O3 -fomit-frame-pointer  20190805  20190803
6806888  ref  gcc -funroll-loops -m32 -march=k8 -Os -fomit-frame-pointer  20190805  20190803
6935720  ref  gcc -m32 -march=k8 -O3 -fomit-frame-pointer  20190805  20190803
7124936  ref  gcc -funroll-loops -m32 -march=k8 -O2 -fomit-frame-pointer  20190805  20190803
7414728  ref  gcc -m32 -march=pentium-m -O2 -fomit-frame-pointer  20190805  20190803
7716092  ref  gcc -m32 -march=barcelona -O3 -fomit-frame-pointer  20190805  20190803
7767116  ref  gcc -funroll-loops -m32 -march=barcelona -O3 -fomit-frame-pointer  20190805  20190803
8044780  ref  gcc -funroll-loops -m32 -march=barcelona -O2 -fomit-frame-pointer  20190805  20190803
8450580  ref  gcc -m32 -march=barcelona -O2 -fomit-frame-pointer  20190805  20190803
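
The rows of the table above (time, implementation, compiler, benchmark date, SUPERCOP version) can be loaded into records for sorting or filtering. What follows is a minimal sketch and not part of SUPERCOP: the names `Row` and `parse_row` are hypothetical, and the pattern assumes an implementation name made of lowercase letters and digits, a compiler command that starts with `gcc`, and two trailing 8-digit dates. It accepts the fields whether or not whitespace separates them.

```python
import re
from typing import NamedTuple, Optional


class Row(NamedTuple):
    """One benchmark row (hypothetical record type for this sketch)."""
    time: int
    implementation: str
    compiler: str
    benchmark_date: str
    supercop_version: str


# Assumed layout: <time><impl><gcc command><YYYYMMDD><YYYYMMDD>,
# with optional whitespace between the fields.
ROW_RE = re.compile(
    r"^(\d+)\s*"              # time
    r"([a-z][a-z0-9]*?)\s*"   # implementation name, e.g. "ref"
    r"(gcc\b.*?)\s*"          # compiler command line
    r"(\d{8})\s*(\d{8})$"     # benchmark date, SUPERCOP version
)


def parse_row(line: str) -> Optional[Row]:
    """Return a Row for a table line, or None if the line does not match."""
    m = ROW_RE.match(line.strip())
    if m is None:
        return None
    time, impl, compiler, date, version = m.groups()
    return Row(int(time), impl, compiler, date, version)


if __name__ == "__main__":
    # Two rows from the table, one with fused fields and one with spaces.
    sample = [
        "8450580refgcc -m32 -march=barcelona -O2 -fomit-frame-pointer2019080520190803",
        "4191764  ref  gcc -funroll-loops -m32 -march=athlon -O3 -fomit-frame-pointer  20190805  20190803",
    ]
    rows = [r for r in map(parse_row, sample) if r is not None]
    fastest = min(rows, key=lambda r: r.time)
    print(fastest.time, fastest.compiler)
```

Lines that are not table rows (headers, compiler output) simply return `None`, so the whole file can be fed through `parse_row` without pre-filtering.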

Compiler output

Implementation: crypto_sign/dilithium2aes/avx2
Compiler: gcc -funroll-loops -m32 -O2 -fomit-frame-pointer
KeccakP-1600-times4-SIMD256.c: KeccakP-1600-times4-SIMD256.c: In function 'KeccakP1600times4_AddLanesAll':
KeccakP-1600-times4-SIMD256.c: KeccakP-1600-times4-SIMD256.c:135:40: warning: AVX vector return without AVX enabled changes the ABI [-Wpsabi]
KeccakP-1600-times4-SIMD256.c: #define Xor_In4( argIndex ) lanes0 = LOAD256u( curData0[argIndex]),\
KeccakP-1600-times4-SIMD256.c: ^
KeccakP-1600-times4-SIMD256.c: KeccakP-1600-times4-SIMD256.c:146:9: note: in expansion of macro 'Xor_In4'
KeccakP-1600-times4-SIMD256.c: Xor_In4( 0 );
KeccakP-1600-times4-SIMD256.c: ^~~~~~~
KeccakP-1600-times4-SIMD256.c: In file included from /usr/lib/gcc/x86_64-linux-gnu/7/include/immintrin.h:43:0,
KeccakP-1600-times4-SIMD256.c: from KeccakP-1600-times4-SIMD256.c:21:
KeccakP-1600-times4-SIMD256.c: /usr/lib/gcc/x86_64-linux-gnu/7/include/avx2intrin.h:913:1: error: inlining failed in call to always_inline '_mm256_xor_si256': target specific option mismatch
KeccakP-1600-times4-SIMD256.c: _mm256_xor_si256 (__m256i __A, __m256i __B)
KeccakP-1600-times4-SIMD256.c: ^~~~~~~~~~~~~~~~
KeccakP-1600-times4-SIMD256.c: KeccakP-1600-times4-SIMD256.c:55:41: note: called from here
KeccakP-1600-times4-SIMD256.c: #define XOReq256(a, b) a = _mm256_xor_si256(a, b)
KeccakP-1600-times4-SIMD256.c: ^~~~~~~~~~~~~~~~~~~~~~
KeccakP-1600-times4-SIMD256.c: KeccakP-1600-times4-SIMD256.c:143:33: note: in expansion of macro 'XOReq256'
KeccakP-1600-times4-SIMD256.c: XOReq256( stateAsLanes[argIndex+3], lanes3 )
KeccakP-1600-times4-SIMD256.c: ^~~~~~~~
KeccakP-1600-times4-SIMD256.c: KeccakP-1600-times4-SIMD256.c:149:9: note: in expansion of macro 'Xor_In4'
KeccakP-1600-times4-SIMD256.c: Xor_In4( 12 );
KeccakP-1600-times4-SIMD256.c: ^~~~~~~
KeccakP-1600-times4-SIMD256.c: In file included from /usr/lib/gcc/x86_64-linux-gnu/7/include/immintrin.h:43:0,
KeccakP-1600-times4-SIMD256.c: from KeccakP-1600-times4-SIMD256.c:21:
KeccakP-1600-times4-SIMD256.c: /usr/lib/gcc/x86_64-linux-gnu/7/include/avx2intrin.h:913:1: error: inlining failed in call to always_inline '_mm256_xor_si256': target specific option mismatch
KeccakP-1600-times4-SIMD256.c: _mm256_xor_si256 (__m256i __A, __m256i __B)
KeccakP-1600-times4-SIMD256.c: ...

Number of similar (compiler,implementation) pairs: 156, namely:
Compiler  Implementations
gcc -funroll-loops -m32 -O2 -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -O3 -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -O -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -Os -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=athlon -O2 -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=athlon -O3 -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=athlon -O -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=athlon -Os -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=barcelona -O2 -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=barcelona -O3 -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=barcelona -O -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=barcelona -Os -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=i386 -O2 -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=i386 -O3 -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=i386 -O -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=i386 -Os -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=i486 -O2 -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=i486 -O3 -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=i486 -O -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=i486 -Os -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=k6-2 -O2 -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=k6-2 -O3 -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=k6-2 -O -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=k6-2 -Os -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=k6-3 -O2 -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=k6-3 -O3 -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=k6-3 -O -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=k6-3 -Os -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=k6 -O2 -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=k6 -O3 -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=k6 -O -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=k6 -Os -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=k8 -O2 -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=k8 -O3 -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=k8 -O -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=k8 -Os -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=nocona -O2 -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=nocona -O3 -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=nocona -O -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=nocona -Os -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=pentium-m -O2 -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=pentium-m -O3 -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=pentium-m -O -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=pentium-m -Os -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=pentium-mmx -O2 -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=pentium-mmx -O3 -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=pentium-mmx -O -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=pentium-mmx -Os -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=pentium2 -O2 -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=pentium2 -O3 -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=pentium2 -O -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=pentium2 -Os -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=pentium3 -O2 -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=pentium3 -O3 -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=pentium3 -O -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=pentium3 -Os -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=pentium4 -O2 -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=pentium4 -O3 -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=pentium4 -O -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=pentium4 -Os -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=pentium -O2 -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=pentium -O3 -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=pentium -O -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=pentium -Os -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=pentiumpro -O2 -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=pentiumpro -O3 -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=pentiumpro -O -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=pentiumpro -Os -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=prescott -O2 -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=prescott -O3 -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=prescott -O -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=prescott -Os -fomit-frame-pointer avx2
gcc -m32 -O2 -fomit-frame-pointer avx2
gcc -m32 -O3 -fomit-frame-pointer avx2
gcc -m32 -O -fomit-frame-pointer avx2
gcc -m32 -Os -fomit-frame-pointer avx2
gcc -m32 -march=athlon -O2 -fomit-frame-pointer avx2
gcc -m32 -march=athlon -O3 -fomit-frame-pointer avx2
gcc -m32 -march=athlon -O -fomit-frame-pointer avx2
gcc -m32 -march=athlon -Os -fomit-frame-pointer avx2
gcc -m32 -march=core2 -O2 -fomit-frame-pointer avx2
gcc -m32 -march=core2 -O3 -fomit-frame-pointer avx2
gcc -m32 -march=core2 -O -fomit-frame-pointer avx2
gcc -m32 -march=core2 -Os -fomit-frame-pointer avx2
gcc -m32 -march=core2 -msse4.1 -O2 -fomit-frame-pointer avx2
gcc -m32 -march=core2 -msse4.1 -O3 -fomit-frame-pointer avx2
gcc -m32 -march=core2 -msse4.1 -O -fomit-frame-pointer avx2
gcc -m32 -march=core2 -msse4.1 -Os -fomit-frame-pointer avx2
gcc -m32 -march=core2 -msse4 -O2 -fomit-frame-pointer avx2
gcc -m32 -march=core2 -msse4 -O3 -fomit-frame-pointer avx2
gcc -m32 -march=core2 -msse4 -O -fomit-frame-pointer avx2
gcc -m32 -march=core2 -msse4 -Os -fomit-frame-pointer avx2
gcc -m32 -march=corei7 -O2 -fomit-frame-pointer avx2
gcc -m32 -march=corei7 -O3 -fomit-frame-pointer avx2
gcc -m32 -march=corei7 -O -fomit-frame-pointer avx2
gcc -m32 -march=corei7 -Os -fomit-frame-pointer avx2
gcc -m32 -march=i386 -O2 -fomit-frame-pointer avx2
gcc -m32 -march=i386 -O3 -fomit-frame-pointer avx2
gcc -m32 -march=i386 -O -fomit-frame-pointer avx2
gcc -m32 -march=i386 -Os -fomit-frame-pointer avx2
gcc -m32 -march=i486 -O2 -fomit-frame-pointer avx2
gcc -m32 -march=i486 -O3 -fomit-frame-pointer avx2
gcc -m32 -march=i486 -O -fomit-frame-pointer avx2
gcc -m32 -march=i486 -Os -fomit-frame-pointer avx2
gcc -m32 -march=k6-2 -O2 -fomit-frame-pointer avx2
gcc -m32 -march=k6-2 -O3 -fomit-frame-pointer avx2
gcc -m32 -march=k6-2 -O -fomit-frame-pointer avx2
gcc -m32 -march=k6-2 -Os -fomit-frame-pointer avx2
gcc -m32 -march=k6-3 -O2 -fomit-frame-pointer avx2
gcc -m32 -march=k6-3 -O3 -fomit-frame-pointer avx2
gcc -m32 -march=k6-3 -O -fomit-frame-pointer avx2
gcc -m32 -march=k6-3 -Os -fomit-frame-pointer avx2
gcc -m32 -march=k6 -O2 -fomit-frame-pointer avx2
gcc -m32 -march=k6 -O3 -fomit-frame-pointer avx2
gcc -m32 -march=k6 -O -fomit-frame-pointer avx2
gcc -m32 -march=k6 -Os -fomit-frame-pointer avx2
gcc -m32 -march=k8 -O2 -fomit-frame-pointer avx2
gcc -m32 -march=k8 -O3 -fomit-frame-pointer avx2
gcc -m32 -march=k8 -O -fomit-frame-pointer avx2
gcc -m32 -march=k8 -Os -fomit-frame-pointer avx2
gcc -m32 -march=nocona -O2 -fomit-frame-pointer avx2
gcc -m32 -march=nocona -O3 -fomit-frame-pointer avx2
gcc -m32 -march=nocona -O -fomit-frame-pointer avx2
gcc -m32 -march=nocona -Os -fomit-frame-pointer avx2
gcc -m32 -march=pentium-m -O2 -fomit-frame-pointer avx2
gcc -m32 -march=pentium-m -O3 -fomit-frame-pointer avx2
gcc -m32 -march=pentium-m -O -fomit-frame-pointer avx2
gcc -m32 -march=pentium-m -Os -fomit-frame-pointer avx2
gcc -m32 -march=pentium-mmx -O2 -fomit-frame-pointer avx2
gcc -m32 -march=pentium-mmx -O3 -fomit-frame-pointer avx2
gcc -m32 -march=pentium-mmx -O -fomit-frame-pointer avx2
gcc -m32 -march=pentium-mmx -Os -fomit-frame-pointer avx2
gcc -m32 -march=pentium2 -O2 -fomit-frame-pointer avx2
gcc -m32 -march=pentium2 -O3 -fomit-frame-pointer avx2
gcc -m32 -march=pentium2 -O -fomit-frame-pointer avx2
gcc -m32 -march=pentium2 -Os -fomit-frame-pointer avx2
gcc -m32 -march=pentium3 -O2 -fomit-frame-pointer avx2
gcc -m32 -march=pentium3 -O3 -fomit-frame-pointer avx2
gcc -m32 -march=pentium3 -O -fomit-frame-pointer avx2
gcc -m32 -march=pentium3 -Os -fomit-frame-pointer avx2
gcc -m32 -march=pentium4 -O2 -fomit-frame-pointer avx2
gcc -m32 -march=pentium4 -O3 -fomit-frame-pointer avx2
gcc -m32 -march=pentium4 -O -fomit-frame-pointer avx2
gcc -m32 -march=pentium4 -Os -fomit-frame-pointer avx2
gcc -m32 -march=pentium -O2 -fomit-frame-pointer avx2
gcc -m32 -march=pentium -O3 -fomit-frame-pointer avx2
gcc -m32 -march=pentium -O -fomit-frame-pointer avx2
gcc -m32 -march=pentium -Os -fomit-frame-pointer avx2
gcc -m32 -march=pentiumpro -O2 -fomit-frame-pointer avx2
gcc -m32 -march=pentiumpro -O3 -fomit-frame-pointer avx2
gcc -m32 -march=pentiumpro -O -fomit-frame-pointer avx2
gcc -m32 -march=pentiumpro -Os -fomit-frame-pointer avx2
gcc -m32 -march=prescott -O2 -fomit-frame-pointer avx2
gcc -m32 -march=prescott -O3 -fomit-frame-pointer avx2
gcc -m32 -march=prescott -O -fomit-frame-pointer avx2
gcc -m32 -march=prescott -Os -fomit-frame-pointer avx2

Compiler output

Implementation: crypto_sign/dilithium2aes/avx2
Compiler: gcc -m32 -march=barcelona -O2 -fomit-frame-pointer
KeccakP-1600-times4-SIMD256.c: KeccakP-1600-times4-SIMD256.c: In function 'KeccakP1600times4_AddLanesAll':
KeccakP-1600-times4-SIMD256.c: KeccakP-1600-times4-SIMD256.c:135:40: warning: AVX vector return without AVX enabled changes the ABI [-Wpsabi]
KeccakP-1600-times4-SIMD256.c: #define Xor_In4( argIndex ) lanes0 = LOAD256u( curData0[argIndex]),\
KeccakP-1600-times4-SIMD256.c: ^
KeccakP-1600-times4-SIMD256.c: KeccakP-1600-times4-SIMD256.c:146:9: note: in expansion of macro 'Xor_In4'
KeccakP-1600-times4-SIMD256.c: Xor_In4( 0 );
KeccakP-1600-times4-SIMD256.c: ^~~~~~~
KeccakP-1600-times4-SIMD256.c: In file included from /usr/lib/gcc/x86_64-linux-gnu/7/include/immintrin.h:43:0,
KeccakP-1600-times4-SIMD256.c: from KeccakP-1600-times4-SIMD256.c:21:
KeccakP-1600-times4-SIMD256.c: /usr/lib/gcc/x86_64-linux-gnu/7/include/avx2intrin.h:913:1: error: inlining failed in call to always_inline '_mm256_xor_si256': target specific option mismatch
KeccakP-1600-times4-SIMD256.c: _mm256_xor_si256 (__m256i __A, __m256i __B)
KeccakP-1600-times4-SIMD256.c: ^~~~~~~~~~~~~~~~
KeccakP-1600-times4-SIMD256.c: KeccakP-1600-times4-SIMD256.c:55:41: note: called from here
KeccakP-1600-times4-SIMD256.c: #define XOReq256(a, b) a = _mm256_xor_si256(a, b)
KeccakP-1600-times4-SIMD256.c: ^~~~~~~~~~~~~~~~~~~~~~
KeccakP-1600-times4-SIMD256.c: KeccakP-1600-times4-SIMD256.c:143:33: note: in expansion of macro 'XOReq256'
KeccakP-1600-times4-SIMD256.c: XOReq256( stateAsLanes[argIndex+3], lanes3 )
KeccakP-1600-times4-SIMD256.c: ^~~~~~~~
KeccakP-1600-times4-SIMD256.c: KeccakP-1600-times4-SIMD256.c:149:9: note: in expansion of macro 'Xor_In4'
KeccakP-1600-times4-SIMD256.c: Xor_In4( 12 );
KeccakP-1600-times4-SIMD256.c: ^~~~~~~
KeccakP-1600-times4-SIMD256.c: In file included from /usr/lib/gcc/x86_64-linux-gnu/7/include/immintrin.h:43:0,
KeccakP-1600-times4-SIMD256.c: from KeccakP-1600-times4-SIMD256.c:21:
KeccakP-1600-times4-SIMD256.c: /usr/lib/gcc/x86_64-linux-gnu/7/include/avx2intrin.h:913:1: error: inlining failed in call to always_inline '_mm256_xor_si256': target specific option mismatch
KeccakP-1600-times4-SIMD256.c: _mm256_xor_si256 (__m256i __A, __m256i __B)
KeccakP-1600-times4-SIMD256.c: ...
KeccakP-1600-times4-SIMD256.c: KeccakP-1600-times4-SIMD256.c: In function 'KeccakP1600times4_AddLanesAll':
KeccakP-1600-times4-SIMD256.c: KeccakP-1600-times4-SIMD256.c:135:40: warning: AVX vector return without AVX enabled changes the ABI [-Wpsabi]
KeccakP-1600-times4-SIMD256.c: #define Xor_In4( argIndex ) lanes0 = LOAD256u( curData0[argIndex]),\
KeccakP-1600-times4-SIMD256.c: ^
KeccakP-1600-times4-SIMD256.c: KeccakP-1600-times4-SIMD256.c:146:9: note: in expansion of macro 'Xor_In4'
KeccakP-1600-times4-SIMD256.c: Xor_In4( 0 );
KeccakP-1600-times4-SIMD256.c: ^~~~~~~
KeccakP-1600-times4-SIMD256.c: In file included from /usr/lib/gcc/x86_64-linux-gnu/7/include/immintrin.h:43:0,
KeccakP-1600-times4-SIMD256.c: from KeccakP-1600-times4-SIMD256.c:21:
KeccakP-1600-times4-SIMD256.c: /usr/lib/gcc/x86_64-linux-gnu/7/include/avx2intrin.h:913:1: error: inlining failed in call to always_inline '_mm256_xor_si256': target specific option mismatch
KeccakP-1600-times4-SIMD256.c: _mm256_xor_si256 (__m256i __A, __m256i __B)
KeccakP-1600-times4-SIMD256.c: ^~~~~~~~~~~~~~~~
KeccakP-1600-times4-SIMD256.c: KeccakP-1600-times4-SIMD256.c:55:41: note: called from here
KeccakP-1600-times4-SIMD256.c: #define XOReq256(a, b) a = _mm256_xor_si256(a, b)
KeccakP-1600-times4-SIMD256.c: ^~~~~~~~~~~~~~~~~~~~~~
KeccakP-1600-times4-SIMD256.c: KeccakP-1600-times4-SIMD256.c:143:33: note: in expansion of macro 'XOReq256'
KeccakP-1600-times4-SIMD256.c: XOReq256( stateAsLanes[argIndex+3], lanes3 )
KeccakP-1600-times4-SIMD256.c: ^~~~~~~~
KeccakP-1600-times4-SIMD256.c: KeccakP-1600-times4-SIMD256.c:149:9: note: in expansion of macro 'Xor_In4'
KeccakP-1600-times4-SIMD256.c: Xor_In4( 12 );
KeccakP-1600-times4-SIMD256.c: ^~~~~~~
KeccakP-1600-times4-SIMD256.c: In file included from /usr/lib/gcc/x86_64-linux-gnu/7/include/immintrin.h:43:0,
KeccakP-1600-times4-SIMD256.c: from KeccakP-1600-times4-SIMD256.c:21:
KeccakP-1600-times4-SIMD256.c: /usr/lib/gcc/x86_64-linux-gnu/7/include/avx2intrin.h:913:1: error: inlining failed in call to always_inline '_mm256_xor_si256': target specific option mismatch
KeccakP-1600-times4-SIMD256.c: _mm256_xor_si256 (__m256i __A, __m256i __B)
KeccakP-1600-times4-SIMD256.c: ...

Number of similar (compiler,implementation) pairs: 4, namely:
Compiler  Implementations
gcc -m32 -march=barcelona -O2 -fomit-frame-pointer avx2
gcc -m32 -march=barcelona -O3 -fomit-frame-pointer avx2
gcc -m32 -march=barcelona -O -fomit-frame-pointer avx2
gcc -m32 -march=barcelona -Os -fomit-frame-pointer avx2

Compiler output

Implementation: crypto_sign/dilithium2aes/avx2
Compiler: gcc -m32 -march=core-avx-i -O2 -fomit-frame-pointer
KeccakP-1600-times4-SIMD256.c: In file included from /usr/lib/gcc/x86_64-linux-gnu/7/include/immintrin.h:43:0,
KeccakP-1600-times4-SIMD256.c: from KeccakP-1600-times4-SIMD256.c:21:
KeccakP-1600-times4-SIMD256.c: KeccakP-1600-times4-SIMD256.c: In function 'KeccakP1600times4_AddLanesAll':
KeccakP-1600-times4-SIMD256.c: /usr/lib/gcc/x86_64-linux-gnu/7/include/avx2intrin.h:913:1: error: inlining failed in call to always_inline '_mm256_xor_si256': target specific option mismatch
KeccakP-1600-times4-SIMD256.c: _mm256_xor_si256 (__m256i __A, __m256i __B)
KeccakP-1600-times4-SIMD256.c: ^~~~~~~~~~~~~~~~
KeccakP-1600-times4-SIMD256.c: KeccakP-1600-times4-SIMD256.c:55:41: note: called from here
KeccakP-1600-times4-SIMD256.c: #define XOReq256(a, b) a = _mm256_xor_si256(a, b)
KeccakP-1600-times4-SIMD256.c: ^~~~~~~~~~~~~~~~~~~~~~
KeccakP-1600-times4-SIMD256.c: KeccakP-1600-times4-SIMD256.c:143:33: note: in expansion of macro 'XOReq256'
KeccakP-1600-times4-SIMD256.c: XOReq256( stateAsLanes[argIndex+3], lanes3 )
KeccakP-1600-times4-SIMD256.c: ^~~~~~~~
KeccakP-1600-times4-SIMD256.c: KeccakP-1600-times4-SIMD256.c:149:9: note: in expansion of macro 'Xor_In4'
KeccakP-1600-times4-SIMD256.c: Xor_In4( 12 );
KeccakP-1600-times4-SIMD256.c: ^~~~~~~
KeccakP-1600-times4-SIMD256.c: In file included from /usr/lib/gcc/x86_64-linux-gnu/7/include/immintrin.h:43:0,
KeccakP-1600-times4-SIMD256.c: from KeccakP-1600-times4-SIMD256.c:21:
KeccakP-1600-times4-SIMD256.c: /usr/lib/gcc/x86_64-linux-gnu/7/include/avx2intrin.h:913:1: error: inlining failed in call to always_inline '_mm256_xor_si256': target specific option mismatch
KeccakP-1600-times4-SIMD256.c: _mm256_xor_si256 (__m256i __A, __m256i __B)
KeccakP-1600-times4-SIMD256.c: ^~~~~~~~~~~~~~~~
KeccakP-1600-times4-SIMD256.c: KeccakP-1600-times4-SIMD256.c:55:41: note: called from here
KeccakP-1600-times4-SIMD256.c: #define XOReq256(a, b) a = _mm256_xor_si256(a, b)
KeccakP-1600-times4-SIMD256.c: ^~~~~~~~~~~~~~~~~~~~~~
KeccakP-1600-times4-SIMD256.c: KeccakP-1600-times4-SIMD256.c:142:33: note: in expansion of macro 'XOReq256'
KeccakP-1600-times4-SIMD256.c: XOReq256( stateAsLanes[argIndex+2], lanes2 ),\
KeccakP-1600-times4-SIMD256.c: ...

Number of similar (compiler,implementation) pairs: 8, namely:
Compiler  Implementations
gcc -m32 -march=core-avx-i -O2 -fomit-frame-pointer avx2
gcc -m32 -march=core-avx-i -O3 -fomit-frame-pointer avx2
gcc -m32 -march=core-avx-i -O -fomit-frame-pointer avx2
gcc -m32 -march=core-avx-i -Os -fomit-frame-pointer avx2
gcc -m32 -march=corei7-avx -O2 -fomit-frame-pointer avx2
gcc -m32 -march=corei7-avx -O3 -fomit-frame-pointer avx2
gcc -m32 -march=corei7-avx -O -fomit-frame-pointer avx2
gcc -m32 -march=corei7-avx -Os -fomit-frame-pointer avx2
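
The three failure groups above all report the `_mm256_xor_si256` error, which GCC emits when an AVX2 intrinsic is called from code compiled without AVX2 enabled. In the excerpts, the `-Wpsabi` warning appears only for the compilers that also lack plain AVX. The sketch below encodes that reasoning. It is an illustration only: `simd_level` and the small `-march` table are hypothetical and hand-written, cover just the AVX-capable values seen in this log, and are not GCC's own target database. `-march=native` is reported as unknown because it depends on the build host.

```python
from typing import Optional

# Hand-maintained illustration table (assumption of this sketch):
# AVX level implied by the AVX-capable -march values in this log.
MARCH_SIMD = {
    "corei7-avx": "avx",   # Sandy Bridge: AVX, no AVX2
    "core-avx-i": "avx",   # Ivy Bridge: AVX, no AVX2
    "core-avx2": "avx2",   # Haswell: AVX2
}

ORDER = ["none", "avx", "avx2"]


def simd_level(command: str) -> Optional[str]:
    """Return "none", "avx" or "avx2" for a gcc command line.

    Returns None for -march=native, whose meaning depends on the host.
    Any -march value missing from MARCH_SIMD is treated as "none", which
    holds for the other values in this log but not for GCC in general.
    """
    levels = ["none"]
    for flag in command.split():
        if flag == "-march=native":
            return None
        if flag.startswith("-march="):
            levels.append(MARCH_SIMD.get(flag.split("=", 1)[1], "none"))
        elif flag in ("-mavx", "-mavx2"):
            levels.append(flag[2:])  # explicit feature flag
    return max(levels, key=ORDER.index)


if __name__ == "__main__":
    for cmd in (
        "gcc -m32 -march=barcelona -O2 -fomit-frame-pointer",
        "gcc -m32 -march=core-avx-i -O2 -fomit-frame-pointer",
        "gcc -m32 -march=core-avx2 -O2 -fomit-frame-pointer",
    ):
        print(simd_level(cmd), cmd)
```

Under this reading, a level of "none" matches the groups that show both the warning and the error, and "avx" matches the group that shows the error alone.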

Compiler output

Implementation: crypto_sign/dilithium2aes/avx2
Compiler: gcc -m32 -march=core-avx2 -O2 -fomit-frame-pointer
invntt.s: invntt.s: Assembler messages:
invntt.s: invntt.s:47: Error: bad register name `%rip)'
invntt.s: invntt.s:48: Error: bad register name `%rip)'
invntt.s: invntt.s:49: Error: bad register name `%rip)'
invntt.s: invntt.s:52: Error: bad register name `%rsi)'
invntt.s: invntt.s:53: Error: bad register name `%rsi)'
invntt.s: invntt.s:54: Error: bad register name `%rsi)'
invntt.s: invntt.s:55: Error: bad register name `%rsi)'
invntt.s: invntt.s:58: Error: bad register name `%ymm8'
invntt.s: invntt.s:59: Error: bad register name `%ymm10'
invntt.s: invntt.s:59: Error: bad register name `%ymm10'
invntt.s: invntt.s:61: Error: bad register name `%ymm8'
invntt.s: invntt.s:61: Error: bad register name `%ymm8'
invntt.s: invntt.s:62: Error: bad register name `%ymm10'
invntt.s: invntt.s:62: Error: bad register name `%ymm10'
invntt.s: invntt.s:66: Error: bad register name `%ymm8'
invntt.s: invntt.s:67: Error: bad register name `%ymm10'
invntt.s: invntt.s:70: Error: bad register name `%rdx)'
invntt.s: invntt.s:71: Error: bad register name `%rdx)'
invntt.s: invntt.s:72: Error: bad register name `%ymm12'
invntt.s: invntt.s:73: Error: bad register name `%ymm13'
invntt.s: invntt.s:74: Error: bad register name `%ymm8'
invntt.s: invntt.s:76: Error: bad register name `%ymm12'
invntt.s: invntt.s:77: Error: bad register name `%ymm13'
invntt.s: invntt.s:78: Error: bad register name `%ymm9'
invntt.s: ...

Number of similar (compiler,implementation) pairs: 8, namely:
Compiler  Implementations
gcc -m32 -march=core-avx2 -O2 -fomit-frame-pointer avx2
gcc -m32 -march=core-avx2 -O3 -fomit-frame-pointer avx2
gcc -m32 -march=core-avx2 -O -fomit-frame-pointer avx2
gcc -m32 -march=core-avx2 -Os -fomit-frame-pointer avx2
gcc -m32 -march=native -mtune=native -O2 -fomit-frame-pointer avx2
gcc -m32 -march=native -mtune=native -O3 -fomit-frame-pointer avx2
gcc -m32 -march=native -mtune=native -O -fomit-frame-pointer avx2
gcc -m32 -march=native -mtune=native -Os -fomit-frame-pointer avx2
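
The assembler errors in this last group are what one would expect when assembly written for x86-64 is assembled with `-m32`: `%rip`, `%rsi` and `%rdx` are 64-bit registers, and 32-bit mode exposes only `%ymm0` through `%ymm7`. The sketch below pulls the register out of each `bad register name` message and checks it against that rule. The helper names are hypothetical, and the register model is deliberately simplified (no AVX-512 registers, no 8-bit sub-registers such as `%sil`).

```python
import re
from typing import Iterable, List

# Matches the register inside messages like: bad register name `%rip)'
BAD_REG_RE = re.compile(r"bad register name `%([a-z0-9]+)")

# 64-bit-only general-purpose registers: rip, rax..rdx, rsi, rdi, rsp, rbp,
# and r8..r15 with an optional d/w/b size suffix.
GPR64_RE = re.compile(r"^r(?:ip|[a-d]x|[sd]i|[sb]p|(?:[89]|1[0-5])[dwb]?)$")

# xmmN / ymmN vector registers.
VEC_RE = re.compile(r"^[xy]mm(\d+)$")


def needs_64bit_mode(reg: str) -> bool:
    """True if the register name (without '%') exists only in 64-bit mode."""
    if GPR64_RE.match(reg):
        return True
    m = VEC_RE.match(reg)
    # Vector registers 8 and above are unavailable in 32-bit mode.
    return m is not None and int(m.group(1)) >= 8


def bad_registers(log_lines: Iterable[str]) -> List[str]:
    """Extract register names from assembler 'bad register name' errors."""
    found = []
    for line in log_lines:
        m = BAD_REG_RE.search(line)
        if m:
            found.append(m.group(1))
    return found


if __name__ == "__main__":
    log = [
        "invntt.s: invntt.s:47: Error: bad register name `%rip)'",
        "invntt.s: invntt.s:52: Error: bad register name `%rsi)'",
        "invntt.s: invntt.s:58: Error: bad register name `%ymm8'",
        "invntt.s: invntt.s:59: Error: bad register name `%ymm10'",
    ]
    for reg in bad_registers(log):
        print(reg, needs_64bit_mode(reg))
```

Every register named in the excerpt above is classified as 64-bit-only by this check, which is consistent with the avx2 implementation's assembly targeting x86-64 rather than 32-bit x86.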