Implementation notes: x86, titan0, crypto_kem/kyber512

Computer: titan0
Architecture: x86
CPU ID: GenuineIntel-000306c3-bfebfbff
SUPERCOP version: 20190803
Operation: crypto_kem
Primitive: kyber512
Time  Implementation  Compiler  Benchmark date  SUPERCOP version
609776  ref  gcc -funroll-loops -m32 -march=pentium-m -O3 -fomit-frame-pointer  20190810  20190803
626424  ref  gcc -m32 -march=pentium-m -O3 -fomit-frame-pointer  20190810  20190803
626708  ref  gcc -funroll-loops -m32 -march=pentium-m -O2 -fomit-frame-pointer  20190810  20190803
633560  ref  gcc -m32 -march=core2 -msse4 -O3 -fomit-frame-pointer  20190810  20190803
635820  ref  gcc -m32 -march=core2 -msse4.1 -O3 -fomit-frame-pointer  20190810  20190803
637052  ref  gcc -m32 -march=corei7 -O3 -fomit-frame-pointer  20190810  20190803
641104  ref  gcc -m32 -march=core2 -O3 -fomit-frame-pointer  20190810  20190803
651200  ref  gcc -m32 -march=core-avx2 -O3 -fomit-frame-pointer  20190810  20190803
651504  ref  gcc -m32 -march=native -mtune=native -O3 -fomit-frame-pointer  20190810  20190803
668528  ref  gcc -funroll-loops -m32 -march=pentium-m -O -fomit-frame-pointer  20190810  20190803
671736  ref  gcc -m32 -march=native -mtune=native -O2 -fomit-frame-pointer  20190810  20190803
676136  ref  gcc -m32 -march=core-avx2 -O2 -fomit-frame-pointer  20190810  20190803
681024  ref  gcc -funroll-loops -m32 -march=barcelona -O -fomit-frame-pointer  20190810  20190803
683220  ref  gcc -funroll-loops -m32 -march=k8 -O -fomit-frame-pointer  20190810  20190803
684532  ref  gcc -m32 -march=corei7-avx -O2 -fomit-frame-pointer  20190810  20190803
687128  ref  gcc -m32 -march=core-avx-i -O2 -fomit-frame-pointer  20190810  20190803
688640  ref  gcc -m32 -march=core-avx-i -O3 -fomit-frame-pointer  20190810  20190803
689812  ref  gcc -m32 -march=corei7-avx -O3 -fomit-frame-pointer  20190810  20190803
691372  ref  gcc -m32 -march=core2 -msse4 -O2 -fomit-frame-pointer  20190810  20190803
692200  ref  gcc -m32 -march=pentium-m -O2 -fomit-frame-pointer  20190810  20190803
693024  ref  gcc -m32 -march=corei7 -O2 -fomit-frame-pointer  20190810  20190803
694992  ref  gcc -m32 -march=core2 -msse4.1 -O2 -fomit-frame-pointer  20190810  20190803
715716  ref  gcc -m32 -march=core2 -O2 -fomit-frame-pointer  20190810  20190803
718700  ref  gcc -m32 -march=native -mtune=native -O -fomit-frame-pointer  20190810  20190803
719624  ref  gcc -m32 -march=core-avx2 -Os -fomit-frame-pointer  20190810  20190803
720064  ref  gcc -m32 -march=pentium-m -O -fomit-frame-pointer  20190810  20190803
721784  ref  gcc -m32 -march=native -mtune=native -Os -fomit-frame-pointer  20190810  20190803
723936  ref  gcc -m32 -march=core-avx2 -O -fomit-frame-pointer  20190810  20190803
730808  ref  gcc -m32 -march=core-avx-i -Os -fomit-frame-pointer  20190810  20190803
732072  ref  gcc -m32 -march=k8 -O -fomit-frame-pointer  20190810  20190803
733288  ref  gcc -m32 -march=corei7-avx -Os -fomit-frame-pointer  20190810  20190803
737392  ref  gcc -funroll-loops -m32 -march=athlon -O3 -fomit-frame-pointer  20190810  20190803
745896  ref  gcc -m32 -march=core-avx-i -O -fomit-frame-pointer  20190810  20190803
747384  ref  gcc -m32 -march=core2 -msse4.1 -O -fomit-frame-pointer  20190810  20190803
748244  ref  gcc -m32 -march=core2 -msse4 -O -fomit-frame-pointer  20190810  20190803
749816  ref  gcc -m32 -march=barcelona -O -fomit-frame-pointer  20190810  20190803
750972  ref  gcc -m32 -march=core2 -O -fomit-frame-pointer  20190810  20190803
751608  ref  gcc -m32 -march=corei7 -O -fomit-frame-pointer  20190810  20190803
753092  ref  gcc -m32 -march=corei7-avx -O -fomit-frame-pointer  20190810  20190803
753800  ref  gcc -m32 -march=athlon -O3 -fomit-frame-pointer  20190810  20190803
754416  ref  gcc -funroll-loops -m32 -march=prescott -O3 -fomit-frame-pointer  20190810  20190803
755620  ref  gcc -funroll-loops -m32 -march=nocona -O3 -fomit-frame-pointer  20190810  20190803
755664  ref  gcc -funroll-loops -m32 -O3 -fomit-frame-pointer  20190810  20190803
758048  ref  gcc -funroll-loops -m32 -march=pentium2 -O3 -fomit-frame-pointer  20190810  20190803
759060  ref  gcc -funroll-loops -m32 -march=pentium3 -O3 -fomit-frame-pointer  20190810  20190803
759932  ref  gcc -funroll-loops -m32 -march=pentiumpro -O3 -fomit-frame-pointer  20190810  20190803
762356  ref  gcc -funroll-loops -m32 -march=pentium3 -O -fomit-frame-pointer  20190810  20190803
762600  ref  gcc -funroll-loops -m32 -march=athlon -O2 -fomit-frame-pointer  20190810  20190803
762640  ref  gcc -funroll-loops -m32 -march=pentiumpro -O -fomit-frame-pointer  20190810  20190803
763724  ref  gcc -funroll-loops -m32 -march=prescott -O2 -fomit-frame-pointer  20190810  20190803
766408  ref  gcc -funroll-loops -m32 -march=pentium2 -O -fomit-frame-pointer  20190810  20190803
768664  ref  gcc -funroll-loops -m32 -march=nocona -O2 -fomit-frame-pointer  20190810  20190803
770312  ref  gcc -m32 -march=prescott -O3 -fomit-frame-pointer  20190810  20190803
772812  ref  gcc -m32 -march=nocona -O3 -fomit-frame-pointer  20190810  20190803
774348  ref  gcc -funroll-loops -m32 -march=athlon -O -fomit-frame-pointer  20190810  20190803
776228  ref  gcc -funroll-loops -m32 -O2 -fomit-frame-pointer  20190810  20190803
776832  ref  gcc -funroll-loops -m32 -march=pentium2 -O2 -fomit-frame-pointer  20190810  20190803
780424  ref  gcc -funroll-loops -m32 -march=pentium3 -O2 -fomit-frame-pointer  20190810  20190803
780640  ref  gcc -funroll-loops -m32 -march=pentiumpro -O2 -fomit-frame-pointer  20190810  20190803
783968  ref  gcc -funroll-loops -m32 -march=nocona -O -fomit-frame-pointer  20190810  20190803
784920  ref  gcc -m32 -O3 -fomit-frame-pointer  20190810  20190803
785572  ref  gcc -funroll-loops -m32 -O -fomit-frame-pointer  20190810  20190803
788144  ref  gcc -funroll-loops -m32 -march=prescott -O -fomit-frame-pointer  20190810  20190803
789196  ref  gcc -funroll-loops -m32 -march=k6-2 -O -fomit-frame-pointer  20190810  20190803
790000  ref  gcc -m32 -march=pentium3 -O3 -fomit-frame-pointer  20190810  20190803
792204  ref  gcc -funroll-loops -m32 -march=k6-3 -O -fomit-frame-pointer  20190810  20190803
792248  ref  gcc -m32 -march=pentium2 -O3 -fomit-frame-pointer  20190810  20190803
792544  ref  gcc -funroll-loops -m32 -march=k6 -O -fomit-frame-pointer  20190810  20190803
796156  ref  gcc -m32 -march=pentiumpro -O3 -fomit-frame-pointer  20190810  20190803
806712  ref  gcc -funroll-loops -m32 -march=k6-2 -O3 -fomit-frame-pointer  20190810  20190803
807224  ref  gcc -funroll-loops -m32 -march=k6-3 -O3 -fomit-frame-pointer  20190810  20190803
812924  ref  gcc -m32 -march=k6-3 -O3 -fomit-frame-pointer  20190810  20190803
814732  ref  gcc -funroll-loops -m32 -march=k6 -O3 -fomit-frame-pointer  20190810  20190803
815944  ref  gcc -funroll-loops -m32 -march=pentium-m -Os -fomit-frame-pointer  20190810  20190803
818432  ref  gcc -m32 -march=pentium2 -O -fomit-frame-pointer  20190810  20190803
822108  ref  gcc -m32 -march=k6-2 -O3 -fomit-frame-pointer  20190810  20190803
822136  ref  gcc -m32 -march=pentium3 -O -fomit-frame-pointer  20190810  20190803
822704  ref  gcc -m32 -march=pentium-m -Os -fomit-frame-pointer  20190810  20190803
825072  ref  gcc -m32 -march=pentiumpro -O -fomit-frame-pointer  20190810  20190803
827012  ref  gcc -m32 -march=nocona -O2 -fomit-frame-pointer  20190810  20190803
827572  ref  gcc -m32 -march=corei7 -Os -fomit-frame-pointer  20190810  20190803
828576  ref  gcc -funroll-loops -m32 -march=k6 -O2 -fomit-frame-pointer  20190810  20190803
828740  ref  gcc -m32 -march=athlon -O -fomit-frame-pointer  20190810  20190803
829708  ref  gcc -funroll-loops -m32 -march=pentium4 -O3 -fomit-frame-pointer  20190810  20190803
829844  ref  gcc -m32 -march=prescott -O -fomit-frame-pointer  20190810  20190803
829916  ref  gcc -m32 -march=prescott -O2 -fomit-frame-pointer  20190810  20190803
830776  ref  gcc -m32 -march=core2 -msse4 -Os -fomit-frame-pointer  20190810  20190803
831852  ref  gcc -m32 -march=k6 -O3 -fomit-frame-pointer  20190810  20190803
832092  ref  gcc -m32 -march=nocona -O -fomit-frame-pointer  20190810  20190803
832904  ref  gcc -m32 -march=prescott -Os -fomit-frame-pointer  20190810  20190803
833388  ref  gcc -m32 -march=core2 -msse4.1 -Os -fomit-frame-pointer  20190810  20190803
835708  ref  gcc -m32 -march=core2 -Os -fomit-frame-pointer  20190810  20190803
835720  ref  gcc -m32 -march=nocona -Os -fomit-frame-pointer  20190810  20190803
835736  ref  gcc -funroll-loops -m32 -march=nocona -Os -fomit-frame-pointer  20190810  20190803
836980  ref  gcc -m32 -march=pentium4 -O3 -fomit-frame-pointer  20190810  20190803
837844  ref  gcc -funroll-loops -m32 -march=k6-3 -O2 -fomit-frame-pointer  20190810  20190803
837856  ref  gcc -funroll-loops -m32 -march=k6-2 -O2 -fomit-frame-pointer  20190810  20190803
838120  ref  gcc -funroll-loops -m32 -march=pentium4 -Os -fomit-frame-pointer  20190810  20190803
840460  ref  gcc -m32 -march=pentium4 -Os -fomit-frame-pointer  20190810  20190803
840868  ref  gcc -funroll-loops -m32 -march=prescott -Os -fomit-frame-pointer  20190810  20190803
843752  ref  gcc -m32 -march=athlon -O2 -fomit-frame-pointer  20190810  20190803
848360  ref  gcc -m32 -march=pentium2 -O2 -fomit-frame-pointer  20190810  20190803
850732  ref  gcc -m32 -march=pentium3 -O2 -fomit-frame-pointer  20190810  20190803
851056  ref  gcc -funroll-loops -m32 -march=pentium4 -O -fomit-frame-pointer  20190810  20190803
852052  ref  gcc -funroll-loops -m32 -march=pentium4 -O2 -fomit-frame-pointer  20190810  20190803
854264  ref  gcc -funroll-loops -m32 -march=k6-2 -Os -fomit-frame-pointer  20190810  20190803
856548  ref  gcc -m32 -march=pentiumpro -O2 -fomit-frame-pointer  20190810  20190803
856756  ref  gcc -funroll-loops -m32 -march=k6 -Os -fomit-frame-pointer  20190810  20190803
856960  ref  gcc -funroll-loops -m32 -march=k6-3 -Os -fomit-frame-pointer  20190810  20190803
857096  ref  gcc -m32 -O -fomit-frame-pointer  20190810  20190803
857572  ref  gcc -m32 -march=k6-2 -Os -fomit-frame-pointer  20190810  20190803
859824  ref  gcc -m32 -O2 -fomit-frame-pointer  20190810  20190803
860112  ref  gcc -m32 -march=k6-3 -Os -fomit-frame-pointer  20190810  20190803
864944  ref  gcc -m32 -march=k6 -Os -fomit-frame-pointer  20190810  20190803
866732  ref  gcc -funroll-loops -m32 -march=pentium -Os -fomit-frame-pointer  20190810  20190803
870508  ref  gcc -funroll-loops -m32 -march=pentium-mmx -Os -fomit-frame-pointer  20190810  20190803
870692  ref  gcc -funroll-loops -m32 -Os -fomit-frame-pointer  20190810  20190803
871060  ref  gcc -funroll-loops -m32 -march=athlon -Os -fomit-frame-pointer  20190810  20190803
871220  ref  gcc -m32 -march=k6-2 -O -fomit-frame-pointer  20190810  20190803
872564  ref  gcc -m32 -march=k6 -O -fomit-frame-pointer  20190810  20190803
873740  ref  gcc -m32 -march=pentium -Os -fomit-frame-pointer  20190810  20190803
874244  ref  gcc -m32 -march=i386 -Os -fomit-frame-pointer  20190810  20190803
875328  ref  gcc -m32 -march=athlon -Os -fomit-frame-pointer  20190810  20190803
876224  ref  gcc -m32 -march=i486 -Os -fomit-frame-pointer  20190810  20190803
876968  ref  gcc -m32 -march=k6-3 -O -fomit-frame-pointer  20190810  20190803
877784  ref  gcc -m32 -march=pentium-mmx -Os -fomit-frame-pointer  20190810  20190803
878572  ref  gcc -funroll-loops -m32 -march=i386 -Os -fomit-frame-pointer  20190810  20190803
879736  ref  gcc -funroll-loops -m32 -march=i486 -Os -fomit-frame-pointer  20190810  20190803
883224  ref  gcc -m32 -march=pentium2 -Os -fomit-frame-pointer  20190810  20190803
884376  ref  gcc -m32 -Os -fomit-frame-pointer  20190810  20190803
884576  ref  gcc -m32 -march=pentiumpro -Os -fomit-frame-pointer  20190810  20190803
885232  ref  gcc -funroll-loops -m32 -march=pentium2 -Os -fomit-frame-pointer  20190810  20190803
885624  ref  gcc -m32 -march=pentium3 -Os -fomit-frame-pointer  20190810  20190803
891412  ref  gcc -funroll-loops -m32 -march=pentiumpro -Os -fomit-frame-pointer  20190810  20190803
892456  ref  gcc -funroll-loops -m32 -march=pentium3 -Os -fomit-frame-pointer  20190810  20190803
903352  ref  gcc -m32 -march=pentium4 -O -fomit-frame-pointer  20190810  20190803
908008  ref  gcc -m32 -march=k6-2 -O2 -fomit-frame-pointer  20190810  20190803
909748  ref  gcc -funroll-loops -m32 -march=pentium -O -fomit-frame-pointer  20190810  20190803
910124  ref  gcc -m32 -march=pentium4 -O2 -fomit-frame-pointer  20190810  20190803
911708  ref  gcc -m32 -march=k6-3 -O2 -fomit-frame-pointer  20190810  20190803
914980  ref  gcc -funroll-loops -m32 -march=pentium-mmx -O -fomit-frame-pointer  20190810  20190803
916672  ref  gcc -m32 -march=k6 -O2 -fomit-frame-pointer  20190810  20190803
918544  ref  gcc -funroll-loops -m32 -march=i486 -O -fomit-frame-pointer  20190810  20190803
932708  ref  gcc -funroll-loops -m32 -march=i486 -O3 -fomit-frame-pointer  20190810  20190803
936836  ref  gcc -funroll-loops -m32 -march=i386 -O -fomit-frame-pointer  20190810  20190803
951260  ref  gcc -funroll-loops -m32 -march=i386 -O3 -fomit-frame-pointer  20190810  20190803
955424  ref  gcc -funroll-loops -m32 -march=i486 -O2 -fomit-frame-pointer  20190810  20190803
957608  ref  gcc -m32 -march=i486 -O3 -fomit-frame-pointer  20190810  20190803
963692  ref  gcc -funroll-loops -m32 -march=i386 -O2 -fomit-frame-pointer  20190810  20190803
968880  ref  gcc -m32 -march=i386 -O3 -fomit-frame-pointer  20190810  20190803
971572  ref  gcc -m32 -march=pentium -O -fomit-frame-pointer  20190810  20190803
979296  ref  gcc -m32 -march=pentium-mmx -O -fomit-frame-pointer  20190810  20190803
997344  ref  gcc -m32 -march=i386 -O -fomit-frame-pointer  20190810  20190803
1002712  ref  gcc -m32 -march=i486 -O -fomit-frame-pointer  20190810  20190803
1023464  ref  gcc -funroll-loops -m32 -march=pentium -O3 -fomit-frame-pointer  20190810  20190803
1025968  ref  gcc -funroll-loops -m32 -march=pentium-mmx -O3 -fomit-frame-pointer  20190810  20190803
1028680  ref  gcc -m32 -march=i386 -O2 -fomit-frame-pointer  20190810  20190803
1034408  ref  gcc -m32 -march=i486 -O2 -fomit-frame-pointer  20190810  20190803
1042664  ref  gcc -funroll-loops -m32 -march=pentium-mmx -O2 -fomit-frame-pointer  20190810  20190803
1047824  ref  gcc -funroll-loops -m32 -march=pentium -O2 -fomit-frame-pointer  20190810  20190803
1052200  ref  gcc -m32 -march=pentium -O3 -fomit-frame-pointer  20190810  20190803
1061472  ref  gcc -m32 -march=pentium-mmx -O3 -fomit-frame-pointer  20190810  20190803
1107724  ref  gcc -m32 -march=pentium -O2 -fomit-frame-pointer  20190810  20190803
1108532  ref  gcc -m32 -march=pentium-mmx -O2 -fomit-frame-pointer  20190810  20190803
1332600  ref  gcc -funroll-loops -m32 -march=barcelona -O3 -fomit-frame-pointer  20190810  20190803
1349040  ref  gcc -m32 -march=barcelona -O3 -fomit-frame-pointer  20190810  20190803
1377372  ref  gcc -funroll-loops -m32 -march=k8 -O3 -fomit-frame-pointer  20190810  20190803
1378840  ref  gcc -funroll-loops -m32 -march=k8 -O2 -fomit-frame-pointer  20190810  20190803
1383516  ref  gcc -funroll-loops -m32 -march=barcelona -O2 -fomit-frame-pointer  20190810  20190803
1407692  ref  gcc -m32 -march=k8 -O3 -fomit-frame-pointer  20190810  20190803
1453092  ref  gcc -funroll-loops -m32 -march=barcelona -Os -fomit-frame-pointer  20190810  20190803
1454812  ref  gcc -m32 -march=k8 -O2 -fomit-frame-pointer  20190810  20190803
1468112  ref  gcc -m32 -march=barcelona -Os -fomit-frame-pointer  20190810  20190803
1468148  ref  gcc -m32 -march=barcelona -O2 -fomit-frame-pointer  20190810  20190803
1492796  ref  gcc -funroll-loops -m32 -march=k8 -Os -fomit-frame-pointer  20190810  20190803
1494540  ref  gcc -m32 -march=k8 -Os -fomit-frame-pointer  20190810  20190803

Compiler output

Implementation: crypto_kem/kyber512/avx2
Compiler: gcc -funroll-loops -m32 -O2 -fomit-frame-pointer
KeccakP-1600-times4-SIMD256.c: KeccakP-1600-times4-SIMD256.c: In function 'KeccakP1600times4_AddLanesAll':
KeccakP-1600-times4-SIMD256.c: KeccakP-1600-times4-SIMD256.c:135:40: warning: AVX vector return without AVX enabled changes the ABI [-Wpsabi]
KeccakP-1600-times4-SIMD256.c: #define Xor_In4( argIndex ) lanes0 = LOAD256u( curData0[argIndex]),\
KeccakP-1600-times4-SIMD256.c: ^
KeccakP-1600-times4-SIMD256.c: KeccakP-1600-times4-SIMD256.c:146:9: note: in expansion of macro 'Xor_In4'
KeccakP-1600-times4-SIMD256.c: Xor_In4( 0 );
KeccakP-1600-times4-SIMD256.c: ^~~~~~~
KeccakP-1600-times4-SIMD256.c: In file included from /usr/lib/gcc/x86_64-linux-gnu/7/include/immintrin.h:43:0,
KeccakP-1600-times4-SIMD256.c: from KeccakP-1600-times4-SIMD256.c:21:
KeccakP-1600-times4-SIMD256.c: /usr/lib/gcc/x86_64-linux-gnu/7/include/avx2intrin.h:913:1: error: inlining failed in call to always_inline '_mm256_xor_si256': target specific option mismatch
KeccakP-1600-times4-SIMD256.c: _mm256_xor_si256 (__m256i __A, __m256i __B)
KeccakP-1600-times4-SIMD256.c: ^~~~~~~~~~~~~~~~
KeccakP-1600-times4-SIMD256.c: KeccakP-1600-times4-SIMD256.c:55:41: note: called from here
KeccakP-1600-times4-SIMD256.c: #define XOReq256(a, b) a = _mm256_xor_si256(a, b)
KeccakP-1600-times4-SIMD256.c: ^~~~~~~~~~~~~~~~~~~~~~
KeccakP-1600-times4-SIMD256.c: KeccakP-1600-times4-SIMD256.c:143:33: note: in expansion of macro 'XOReq256'
KeccakP-1600-times4-SIMD256.c: XOReq256( stateAsLanes[argIndex+3], lanes3 )
KeccakP-1600-times4-SIMD256.c: ^~~~~~~~
KeccakP-1600-times4-SIMD256.c: KeccakP-1600-times4-SIMD256.c:149:9: note: in expansion of macro 'Xor_In4'
KeccakP-1600-times4-SIMD256.c: Xor_In4( 12 );
KeccakP-1600-times4-SIMD256.c: ^~~~~~~
KeccakP-1600-times4-SIMD256.c: In file included from /usr/lib/gcc/x86_64-linux-gnu/7/include/immintrin.h:43:0,
KeccakP-1600-times4-SIMD256.c: from KeccakP-1600-times4-SIMD256.c:21:
KeccakP-1600-times4-SIMD256.c: /usr/lib/gcc/x86_64-linux-gnu/7/include/avx2intrin.h:913:1: error: inlining failed in call to always_inline '_mm256_xor_si256': target specific option mismatch
KeccakP-1600-times4-SIMD256.c: _mm256_xor_si256 (__m256i __A, __m256i __B)
KeccakP-1600-times4-SIMD256.c: ...
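
GCC reports "target specific option mismatch" when an always_inline intrinsic such as `_mm256_xor_si256` is called from a function compiled without the matching instruction set (here, no AVX2 in the compiler flags). The following is a minimal sketch of one way such code can be made to compile under non-AVX2 flags: enable AVX2 per function with GCC's `target` attribute and dispatch at run time. The names `xor32_avx2`, `xor32_scalar`, and `xor32` are hypothetical and are not part of the kyber512/avx2 sources; this is an illustration, not what the implementation does.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#if defined(__x86_64__) || defined(__i386__)
#include <immintrin.h>

/* Hypothetical helper: AVX2 is enabled for this one function only, so
   _mm256_xor_si256 can be inlined even if the file is built without -mavx2. */
__attribute__((target("avx2")))
static void xor32_avx2(uint8_t *dst, const uint8_t *src)
{
    __m256i a = _mm256_loadu_si256((const __m256i *)dst);
    __m256i b = _mm256_loadu_si256((const __m256i *)src);
    _mm256_storeu_si256((__m256i *)dst, _mm256_xor_si256(a, b));
}
#endif

/* Portable fallback: XOR 32 bytes of src into dst one byte at a time. */
static void xor32_scalar(uint8_t *dst, const uint8_t *src)
{
    for (size_t i = 0; i < 32; i++)
        dst[i] ^= src[i];
}

/* Hypothetical dispatcher: use the AVX2 path only when the CPU reports it. */
static void xor32(uint8_t *dst, const uint8_t *src)
{
    assert(dst != NULL && src != NULL);
#if defined(__x86_64__) || defined(__i386__)
    if (__builtin_cpu_supports("avx2")) {
        xor32_avx2(dst, src);
        return;
    }
#endif
    xor32_scalar(dst, src);
}
```

Calling `xor32(a, b)` on two 32-byte buffers XORs `b` into `a` on either path. The simpler alternative is to build the whole file with `-mavx2`, which none of the compiler lines listed for this failure include.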

Number of similar (compiler,implementation) pairs: 156, namely:
Compiler Implementations
gcc -funroll-loops -m32 -O2 -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -O3 -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -O -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -Os -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=athlon -O2 -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=athlon -O3 -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=athlon -O -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=athlon -Os -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=barcelona -O2 -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=barcelona -O3 -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=barcelona -O -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=barcelona -Os -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=i386 -O2 -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=i386 -O3 -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=i386 -O -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=i386 -Os -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=i486 -O2 -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=i486 -O3 -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=i486 -O -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=i486 -Os -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=k6-2 -O2 -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=k6-2 -O3 -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=k6-2 -O -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=k6-2 -Os -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=k6-3 -O2 -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=k6-3 -O3 -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=k6-3 -O -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=k6-3 -Os -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=k6 -O2 -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=k6 -O3 -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=k6 -O -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=k6 -Os -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=k8 -O2 -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=k8 -O3 -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=k8 -O -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=k8 -Os -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=nocona -O2 -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=nocona -O3 -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=nocona -O -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=nocona -Os -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=pentium-m -O2 -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=pentium-m -O3 -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=pentium-m -O -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=pentium-m -Os -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=pentium-mmx -O2 -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=pentium-mmx -O3 -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=pentium-mmx -O -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=pentium-mmx -Os -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=pentium2 -O2 -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=pentium2 -O3 -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=pentium2 -O -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=pentium2 -Os -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=pentium3 -O2 -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=pentium3 -O3 -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=pentium3 -O -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=pentium3 -Os -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=pentium4 -O2 -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=pentium4 -O3 -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=pentium4 -O -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=pentium4 -Os -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=pentium -O2 -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=pentium -O3 -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=pentium -O -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=pentium -Os -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=pentiumpro -O2 -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=pentiumpro -O3 -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=pentiumpro -O -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=pentiumpro -Os -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=prescott -O2 -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=prescott -O3 -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=prescott -O -fomit-frame-pointer avx2
gcc -funroll-loops -m32 -march=prescott -Os -fomit-frame-pointer avx2
gcc -m32 -O2 -fomit-frame-pointer avx2
gcc -m32 -O3 -fomit-frame-pointer avx2
gcc -m32 -O -fomit-frame-pointer avx2
gcc -m32 -Os -fomit-frame-pointer avx2
gcc -m32 -march=athlon -O2 -fomit-frame-pointer avx2
gcc -m32 -march=athlon -O3 -fomit-frame-pointer avx2
gcc -m32 -march=athlon -O -fomit-frame-pointer avx2
gcc -m32 -march=athlon -Os -fomit-frame-pointer avx2
gcc -m32 -march=core2 -O2 -fomit-frame-pointer avx2
gcc -m32 -march=core2 -O3 -fomit-frame-pointer avx2
gcc -m32 -march=core2 -O -fomit-frame-pointer avx2
gcc -m32 -march=core2 -Os -fomit-frame-pointer avx2
gcc -m32 -march=core2 -msse4.1 -O2 -fomit-frame-pointer avx2
gcc -m32 -march=core2 -msse4.1 -O3 -fomit-frame-pointer avx2
gcc -m32 -march=core2 -msse4.1 -O -fomit-frame-pointer avx2
gcc -m32 -march=core2 -msse4.1 -Os -fomit-frame-pointer avx2
gcc -m32 -march=core2 -msse4 -O2 -fomit-frame-pointer avx2
gcc -m32 -march=core2 -msse4 -O3 -fomit-frame-pointer avx2
gcc -m32 -march=core2 -msse4 -O -fomit-frame-pointer avx2
gcc -m32 -march=core2 -msse4 -Os -fomit-frame-pointer avx2
gcc -m32 -march=corei7 -O2 -fomit-frame-pointer avx2
gcc -m32 -march=corei7 -O3 -fomit-frame-pointer avx2
gcc -m32 -march=corei7 -O -fomit-frame-pointer avx2
gcc -m32 -march=corei7 -Os -fomit-frame-pointer avx2
gcc -m32 -march=i386 -O2 -fomit-frame-pointer avx2
gcc -m32 -march=i386 -O3 -fomit-frame-pointer avx2
gcc -m32 -march=i386 -O -fomit-frame-pointer avx2
gcc -m32 -march=i386 -Os -fomit-frame-pointer avx2
gcc -m32 -march=i486 -O2 -fomit-frame-pointer avx2
gcc -m32 -march=i486 -O3 -fomit-frame-pointer avx2
gcc -m32 -march=i486 -O -fomit-frame-pointer avx2
gcc -m32 -march=i486 -Os -fomit-frame-pointer avx2
gcc -m32 -march=k6-2 -O2 -fomit-frame-pointer avx2
gcc -m32 -march=k6-2 -O3 -fomit-frame-pointer avx2
gcc -m32 -march=k6-2 -O -fomit-frame-pointer avx2
gcc -m32 -march=k6-2 -Os -fomit-frame-pointer avx2
gcc -m32 -march=k6-3 -O2 -fomit-frame-pointer avx2
gcc -m32 -march=k6-3 -O3 -fomit-frame-pointer avx2
gcc -m32 -march=k6-3 -O -fomit-frame-pointer avx2
gcc -m32 -march=k6-3 -Os -fomit-frame-pointer avx2
gcc -m32 -march=k6 -O2 -fomit-frame-pointer avx2
gcc -m32 -march=k6 -O3 -fomit-frame-pointer avx2
gcc -m32 -march=k6 -O -fomit-frame-pointer avx2
gcc -m32 -march=k6 -Os -fomit-frame-pointer avx2
gcc -m32 -march=k8 -O2 -fomit-frame-pointer avx2
gcc -m32 -march=k8 -O3 -fomit-frame-pointer avx2
gcc -m32 -march=k8 -O -fomit-frame-pointer avx2
gcc -m32 -march=k8 -Os -fomit-frame-pointer avx2
gcc -m32 -march=nocona -O2 -fomit-frame-pointer avx2
gcc -m32 -march=nocona -O3 -fomit-frame-pointer avx2
gcc -m32 -march=nocona -O -fomit-frame-pointer avx2
gcc -m32 -march=nocona -Os -fomit-frame-pointer avx2
gcc -m32 -march=pentium-m -O2 -fomit-frame-pointer avx2
gcc -m32 -march=pentium-m -O3 -fomit-frame-pointer avx2
gcc -m32 -march=pentium-m -O -fomit-frame-pointer avx2
gcc -m32 -march=pentium-m -Os -fomit-frame-pointer avx2
gcc -m32 -march=pentium-mmx -O2 -fomit-frame-pointer avx2
gcc -m32 -march=pentium-mmx -O3 -fomit-frame-pointer avx2
gcc -m32 -march=pentium-mmx -O -fomit-frame-pointer avx2
gcc -m32 -march=pentium-mmx -Os -fomit-frame-pointer avx2
gcc -m32 -march=pentium2 -O2 -fomit-frame-pointer avx2
gcc -m32 -march=pentium2 -O3 -fomit-frame-pointer avx2
gcc -m32 -march=pentium2 -O -fomit-frame-pointer avx2
gcc -m32 -march=pentium2 -Os -fomit-frame-pointer avx2
gcc -m32 -march=pentium3 -O2 -fomit-frame-pointer avx2
gcc -m32 -march=pentium3 -O3 -fomit-frame-pointer avx2
gcc -m32 -march=pentium3 -O -fomit-frame-pointer avx2
gcc -m32 -march=pentium3 -Os -fomit-frame-pointer avx2
gcc -m32 -march=pentium4 -O2 -fomit-frame-pointer avx2
gcc -m32 -march=pentium4 -O3 -fomit-frame-pointer avx2
gcc -m32 -march=pentium4 -O -fomit-frame-pointer avx2
gcc -m32 -march=pentium4 -Os -fomit-frame-pointer avx2
gcc -m32 -march=pentium -O2 -fomit-frame-pointer avx2
gcc -m32 -march=pentium -O3 -fomit-frame-pointer avx2
gcc -m32 -march=pentium -O -fomit-frame-pointer avx2
gcc -m32 -march=pentium -Os -fomit-frame-pointer avx2
gcc -m32 -march=pentiumpro -O2 -fomit-frame-pointer avx2
gcc -m32 -march=pentiumpro -O3 -fomit-frame-pointer avx2
gcc -m32 -march=pentiumpro -O -fomit-frame-pointer avx2
gcc -m32 -march=pentiumpro -Os -fomit-frame-pointer avx2
gcc -m32 -march=prescott -O2 -fomit-frame-pointer avx2
gcc -m32 -march=prescott -O3 -fomit-frame-pointer avx2
gcc -m32 -march=prescott -O -fomit-frame-pointer avx2
gcc -m32 -march=prescott -Os -fomit-frame-pointer avx2

Compiler output

Implementation: crypto_kem/kyber512/avx2
Compiler: gcc -m32 -march=barcelona -O2 -fomit-frame-pointer
KeccakP-1600-times4-SIMD256.c: KeccakP-1600-times4-SIMD256.c: In function 'KeccakP1600times4_AddLanesAll':
KeccakP-1600-times4-SIMD256.c: KeccakP-1600-times4-SIMD256.c:135:40: warning: AVX vector return without AVX enabled changes the ABI [-Wpsabi]
KeccakP-1600-times4-SIMD256.c: #define Xor_In4( argIndex ) lanes0 = LOAD256u( curData0[argIndex]),\
KeccakP-1600-times4-SIMD256.c: ^
KeccakP-1600-times4-SIMD256.c: KeccakP-1600-times4-SIMD256.c:146:9: note: in expansion of macro 'Xor_In4'
KeccakP-1600-times4-SIMD256.c: Xor_In4( 0 );
KeccakP-1600-times4-SIMD256.c: ^~~~~~~
KeccakP-1600-times4-SIMD256.c: In file included from /usr/lib/gcc/x86_64-linux-gnu/7/include/immintrin.h:43:0,
KeccakP-1600-times4-SIMD256.c: from KeccakP-1600-times4-SIMD256.c:21:
KeccakP-1600-times4-SIMD256.c: /usr/lib/gcc/x86_64-linux-gnu/7/include/avx2intrin.h:913:1: error: inlining failed in call to always_inline '_mm256_xor_si256': target specific option mismatch
KeccakP-1600-times4-SIMD256.c: _mm256_xor_si256 (__m256i __A, __m256i __B)
KeccakP-1600-times4-SIMD256.c: ^~~~~~~~~~~~~~~~
KeccakP-1600-times4-SIMD256.c: KeccakP-1600-times4-SIMD256.c:55:41: note: called from here
KeccakP-1600-times4-SIMD256.c: #define XOReq256(a, b) a = _mm256_xor_si256(a, b)
KeccakP-1600-times4-SIMD256.c: ^~~~~~~~~~~~~~~~~~~~~~
KeccakP-1600-times4-SIMD256.c: KeccakP-1600-times4-SIMD256.c:143:33: note: in expansion of macro 'XOReq256'
KeccakP-1600-times4-SIMD256.c: XOReq256( stateAsLanes[argIndex+3], lanes3 )
KeccakP-1600-times4-SIMD256.c: ^~~~~~~~
KeccakP-1600-times4-SIMD256.c: KeccakP-1600-times4-SIMD256.c:149:9: note: in expansion of macro 'Xor_In4'
KeccakP-1600-times4-SIMD256.c: Xor_In4( 12 );
KeccakP-1600-times4-SIMD256.c: ^~~~~~~
KeccakP-1600-times4-SIMD256.c: In file included from /usr/lib/gcc/x86_64-linux-gnu/7/include/immintrin.h:43:0,
KeccakP-1600-times4-SIMD256.c: from KeccakP-1600-times4-SIMD256.c:21:
KeccakP-1600-times4-SIMD256.c: /usr/lib/gcc/x86_64-linux-gnu/7/include/avx2intrin.h:913:1: error: inlining failed in call to always_inline '_mm256_xor_si256': target specific option mismatch
KeccakP-1600-times4-SIMD256.c: _mm256_xor_si256 (__m256i __A, __m256i __B)
KeccakP-1600-times4-SIMD256.c: ...
KeccakP-1600-times4-SIMD256.c: KeccakP-1600-times4-SIMD256.c: In function 'KeccakP1600times4_AddLanesAll':
KeccakP-1600-times4-SIMD256.c: KeccakP-1600-times4-SIMD256.c:135:40: warning: AVX vector return without AVX enabled changes the ABI [-Wpsabi]
KeccakP-1600-times4-SIMD256.c: #define Xor_In4( argIndex ) lanes0 = LOAD256u( curData0[argIndex]),\
KeccakP-1600-times4-SIMD256.c: ^
KeccakP-1600-times4-SIMD256.c: KeccakP-1600-times4-SIMD256.c:146:9: note: in expansion of macro 'Xor_In4'
KeccakP-1600-times4-SIMD256.c: Xor_In4( 0 );
KeccakP-1600-times4-SIMD256.c: ^~~~~~~
KeccakP-1600-times4-SIMD256.c: In file included from /usr/lib/gcc/x86_64-linux-gnu/7/include/immintrin.h:43:0,
KeccakP-1600-times4-SIMD256.c: from KeccakP-1600-times4-SIMD256.c:21:
KeccakP-1600-times4-SIMD256.c: /usr/lib/gcc/x86_64-linux-gnu/7/include/avx2intrin.h:913:1: error: inlining failed in call to always_inline '_mm256_xor_si256': target specific option mismatch
KeccakP-1600-times4-SIMD256.c: _mm256_xor_si256 (__m256i __A, __m256i __B)
KeccakP-1600-times4-SIMD256.c: ^~~~~~~~~~~~~~~~
KeccakP-1600-times4-SIMD256.c: KeccakP-1600-times4-SIMD256.c:55:41: note: called from here
KeccakP-1600-times4-SIMD256.c: #define XOReq256(a, b) a = _mm256_xor_si256(a, b)
KeccakP-1600-times4-SIMD256.c: ^~~~~~~~~~~~~~~~~~~~~~
KeccakP-1600-times4-SIMD256.c: KeccakP-1600-times4-SIMD256.c:143:33: note: in expansion of macro 'XOReq256'
KeccakP-1600-times4-SIMD256.c: XOReq256( stateAsLanes[argIndex+3], lanes3 )
KeccakP-1600-times4-SIMD256.c: ^~~~~~~~
KeccakP-1600-times4-SIMD256.c: KeccakP-1600-times4-SIMD256.c:149:9: note: in expansion of macro 'Xor_In4'
KeccakP-1600-times4-SIMD256.c: Xor_In4( 12 );
KeccakP-1600-times4-SIMD256.c: ^~~~~~~
KeccakP-1600-times4-SIMD256.c: In file included from /usr/lib/gcc/x86_64-linux-gnu/7/include/immintrin.h:43:0,
KeccakP-1600-times4-SIMD256.c: from KeccakP-1600-times4-SIMD256.c:21:
KeccakP-1600-times4-SIMD256.c: /usr/lib/gcc/x86_64-linux-gnu/7/include/avx2intrin.h:913:1: error: inlining failed in call to always_inline '_mm256_xor_si256': target specific option mismatch
KeccakP-1600-times4-SIMD256.c: _mm256_xor_si256 (__m256i __A, __m256i __B)
KeccakP-1600-times4-SIMD256.c: ...

Number of similar (compiler,implementation) pairs: 4, namely:
Compiler Implementations
gcc -m32 -march=barcelona -O2 -fomit-frame-pointer avx2
gcc -m32 -march=barcelona -O3 -fomit-frame-pointer avx2
gcc -m32 -march=barcelona -O -fomit-frame-pointer avx2
gcc -m32 -march=barcelona -Os -fomit-frame-pointer avx2

Compiler output

Implementation: crypto_kem/kyber512/avx2
Compiler: gcc -m32 -march=core-avx-i -O2 -fomit-frame-pointer
KeccakP-1600-times4-SIMD256.c: In file included from /usr/lib/gcc/x86_64-linux-gnu/7/include/immintrin.h:43:0,
KeccakP-1600-times4-SIMD256.c: from KeccakP-1600-times4-SIMD256.c:21:
KeccakP-1600-times4-SIMD256.c: KeccakP-1600-times4-SIMD256.c: In function 'KeccakP1600times4_AddLanesAll':
KeccakP-1600-times4-SIMD256.c: /usr/lib/gcc/x86_64-linux-gnu/7/include/avx2intrin.h:913:1: error: inlining failed in call to always_inline '_mm256_xor_si256': target specific option mismatch
KeccakP-1600-times4-SIMD256.c: _mm256_xor_si256 (__m256i __A, __m256i __B)
KeccakP-1600-times4-SIMD256.c: ^~~~~~~~~~~~~~~~
KeccakP-1600-times4-SIMD256.c: KeccakP-1600-times4-SIMD256.c:55:41: note: called from here
KeccakP-1600-times4-SIMD256.c: #define XOReq256(a, b) a = _mm256_xor_si256(a, b)
KeccakP-1600-times4-SIMD256.c: ^~~~~~~~~~~~~~~~~~~~~~
KeccakP-1600-times4-SIMD256.c: KeccakP-1600-times4-SIMD256.c:143:33: note: in expansion of macro 'XOReq256'
KeccakP-1600-times4-SIMD256.c: XOReq256( stateAsLanes[argIndex+3], lanes3 )
KeccakP-1600-times4-SIMD256.c: ^~~~~~~~
KeccakP-1600-times4-SIMD256.c: KeccakP-1600-times4-SIMD256.c:149:9: note: in expansion of macro 'Xor_In4'
KeccakP-1600-times4-SIMD256.c: Xor_In4( 12 );
KeccakP-1600-times4-SIMD256.c: ^~~~~~~
KeccakP-1600-times4-SIMD256.c: In file included from /usr/lib/gcc/x86_64-linux-gnu/7/include/immintrin.h:43:0,
KeccakP-1600-times4-SIMD256.c: from KeccakP-1600-times4-SIMD256.c:21:
KeccakP-1600-times4-SIMD256.c: /usr/lib/gcc/x86_64-linux-gnu/7/include/avx2intrin.h:913:1: error: inlining failed in call to always_inline '_mm256_xor_si256': target specific option mismatch
KeccakP-1600-times4-SIMD256.c: _mm256_xor_si256 (__m256i __A, __m256i __B)
KeccakP-1600-times4-SIMD256.c: ^~~~~~~~~~~~~~~~
KeccakP-1600-times4-SIMD256.c: KeccakP-1600-times4-SIMD256.c:55:41: note: called from here
KeccakP-1600-times4-SIMD256.c: #define XOReq256(a, b) a = _mm256_xor_si256(a, b)
KeccakP-1600-times4-SIMD256.c: ^~~~~~~~~~~~~~~~~~~~~~
KeccakP-1600-times4-SIMD256.c: KeccakP-1600-times4-SIMD256.c:142:33: note: in expansion of macro 'XOReq256'
KeccakP-1600-times4-SIMD256.c: XOReq256( stateAsLanes[argIndex+2], lanes2 ),\
KeccakP-1600-times4-SIMD256.c: ...

Number of similar (compiler,implementation) pairs: 8, namely:
Compiler Implementations
gcc -m32 -march=core-avx-i -O2 -fomit-frame-pointer avx2
gcc -m32 -march=core-avx-i -O3 -fomit-frame-pointer avx2
gcc -m32 -march=core-avx-i -O -fomit-frame-pointer avx2
gcc -m32 -march=core-avx-i -Os -fomit-frame-pointer avx2
gcc -m32 -march=corei7-avx -O2 -fomit-frame-pointer avx2
gcc -m32 -march=corei7-avx -O3 -fomit-frame-pointer avx2
gcc -m32 -march=corei7-avx -O -fomit-frame-pointer avx2
gcc -m32 -march=corei7-avx -Os -fomit-frame-pointer avx2

Compiler output

Implementation: crypto_kem/kyber512/avx2
Compiler: gcc -m32 -march=core-avx2 -O2 -fomit-frame-pointer
basemul.S: basemul.S: Assembler messages:
basemul.S: basemul.S:79: Error: bad register name `%rip)'
basemul.S: basemul.S:80: Error: bad register name `%rip)'
basemul.S: basemul.S:81: Error: bad register name `%rcx)'
basemul.S: basemul.S:84: Error: bad register name `%rsi)'
basemul.S: basemul.S:84: Error: bad register name `%rdx)'
basemul.S: basemul.S:84: Error: bad register name `%rsi)'
basemul.S: basemul.S:84: Error: bad register name `%rdx)'
basemul.S: basemul.S:84: Error: bad register name `%ymm8'
basemul.S: basemul.S:84: Error: bad register name `%ymm8'
basemul.S: basemul.S:84: Error: bad register name `%ymm10'
basemul.S: basemul.S:84: Error: bad register name `%ymm10'
basemul.S: basemul.S:84: Error: bad register name `%ymm9'
basemul.S: basemul.S:84: Error: bad register name `%ymm9'
basemul.S: basemul.S:84: Error: bad register name `%ymm9'
basemul.S: basemul.S:84: Error: bad register name `%ymm9'
basemul.S: basemul.S:84: Error: bad register name `%ymm11'
basemul.S: basemul.S:84: Error: bad register name `%ymm11'
basemul.S: basemul.S:84: Error: bad register name `%ymm11'
basemul.S: basemul.S:84: Error: bad register name `%ymm11'
basemul.S: basemul.S:84: Error: bad register name `%ymm11'
basemul.S: basemul.S:84: Error: bad register name `%ymm13'
basemul.S: basemul.S:84: Error: bad register name `%ymm13'
basemul.S: basemul.S:84: Error: bad register name `%ymm8'
basemul.S: basemul.S:84: Error: bad register name `%ymm8'
basemul.S: ...
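
The registers rejected above (`%rip`, `%rcx`, `%rsi`, `%rdx`, and `%ymm8` through `%ymm13`) exist only in 64-bit mode, so the assembler refuses them when building with `-m32`, even though `-march=core-avx2` enables AVX2. As a minimal sketch, a build could check the target ABI at compile time before selecting 64-bit-only assembly. The helper names `built_for_x86_64` and `select_kem_implementation` are hypothetical and not part of SUPERCOP or the kyber512 sources; the sketch checks only the ABI, not whether the CPU supports AVX2.

```c
#include <stdbool.h>

/* Hypothetical helper: true only when the compiler targets x86-64, where
   %rip-relative addressing, the %r.. registers, and %ymm8..%ymm15 exist. */
static bool built_for_x86_64(void)
{
#if defined(__x86_64__)
    return true;
#else
    return false;   /* e.g. gcc -m32 defines __i386__ instead */
#endif
}

/* Hypothetical selector: fall back to the portable "ref" code when the
   64-bit-only assembly cannot be assembled for this target. */
static const char *select_kem_implementation(void)
{
    return built_for_x86_64() ? "avx2" : "ref";
}
```

Under the `-m32` compiler lines in this report the selector would return "ref", which matches the timing table above, where only `ref` produced measurements.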

Number of similar (compiler,implementation) pairs: 8, namely:
Compiler Implementations
gcc -m32 -march=core-avx2 -O2 -fomit-frame-pointer avx2
gcc -m32 -march=core-avx2 -O3 -fomit-frame-pointer avx2
gcc -m32 -march=core-avx2 -O -fomit-frame-pointer avx2
gcc -m32 -march=core-avx2 -Os -fomit-frame-pointer avx2
gcc -m32 -march=native -mtune=native -O2 -fomit-frame-pointer avx2
gcc -m32 -march=native -mtune=native -O3 -fomit-frame-pointer avx2
gcc -m32 -march=native -mtune=native -O -fomit-frame-pointer avx2
gcc -m32 -march=native -mtune=native -Os -fomit-frame-pointer avx2