Implementation notes: amd64, skylake, crypto_aead/twine80n6t4clocv2

Computer: skylake
Architecture: amd64
CPU ID: GenuineIntel-000506e3-bfebfbff
SUPERCOP version: 20161026
Operation: crypto_aead
Primitive: twine80n6t4clocv2
Time | Implementation | Compiler | Benchmark date | SUPERCOP version
97150 | vperm | gcc -march=native -mtune=native -O3 -fomit-frame-pointer -fwrapv | 20161217 | 20161026
97466 | vperm | gcc -m64 -march=core-avx-i -O3 -fomit-frame-pointer | 20161217 | 20161026
97566 | vperm | gcc -m64 -march=corei7-avx -O3 -fomit-frame-pointer | 20161217 | 20161026
97584 | vperm | gcc -m64 -march=core-avx2 -Os -fomit-frame-pointer | 20161217 | 20161026
97690 | vperm | gcc -m64 -march=core2 -msse4 -O3 -fomit-frame-pointer | 20161217 | 20161026
97734 | vperm | gcc -m64 -march=native -mtune=native -Os -fomit-frame-pointer | 20161217 | 20161026
97766 | vperm | gcc -march=native -mtune=native -Os -fomit-frame-pointer -fwrapv | 20161217 | 20161026
97774 | vperm | gcc -m64 -march=corei7-avx -Os -fomit-frame-pointer | 20161217 | 20161026
97784 | vperm | gcc -m64 -march=core-avx-i -Os -fomit-frame-pointer | 20161217 | 20161026
97822 | vperm | gcc -m64 -march=corei7-avx -O -fomit-frame-pointer | 20161217 | 20161026
97868 | vperm | gcc -march=native -mtune=native -O2 -fomit-frame-pointer -fwrapv | 20161217 | 20161026
97876 | vperm | gcc -m64 -march=core2 -msse4.1 -O2 -fomit-frame-pointer | 20161217 | 20161026
97900 | vperm | gcc -m64 -march=core2 -msse4.1 -O3 -fomit-frame-pointer | 20161217 | 20161026
97922 | vperm | gcc -m64 -march=core-avx2 -O -fomit-frame-pointer | 20161217 | 20161026
97948 | vperm | gcc -m64 -march=native -mtune=native -O2 -fomit-frame-pointer | 20161217 | 20161026
97952 | vperm | gcc -m64 -march=core-avx-i -O -fomit-frame-pointer | 20161217 | 20161026
97974 | vperm | gcc -m64 -march=core-avx2 -O3 -fomit-frame-pointer | 20161217 | 20161026
98208 | vperm | gcc -m64 -march=core2 -O2 -fomit-frame-pointer | 20161217 | 20161026
98260 | vperm | gcc -m64 -march=corei7 -O2 -fomit-frame-pointer | 20161217 | 20161026
98300 | vperm | gcc -march=native -mtune=native -O -fomit-frame-pointer -fwrapv | 20161217 | 20161026
98302 | vperm | gcc -m64 -march=core2 -msse4 -O2 -fomit-frame-pointer | 20161217 | 20161026
98408 | vperm | gcc -m64 -march=native -mtune=native -O -fomit-frame-pointer | 20161217 | 20161026
98446 | vperm | gcc -m64 -march=corei7-avx -O2 -fomit-frame-pointer | 20161217 | 20161026
98448 | vperm | gcc -m64 -march=core-avx-i -O2 -fomit-frame-pointer | 20161217 | 20161026
98572 | vperm | gcc -m64 -march=core-avx2 -O2 -fomit-frame-pointer | 20161217 | 20161026
98610 | vperm | gcc -m64 -march=native -mtune=native -O3 -fomit-frame-pointer | 20161217 | 20161026
98690 | vperm | gcc -m64 -march=core2 -O3 -fomit-frame-pointer | 20161217 | 20161026
98934 | vperm | gcc -m64 -march=corei7 -O3 -fomit-frame-pointer | 20161217 | 20161026
99948 | vperm | clang -O3 -fwrapv -mavx -fomit-frame-pointer -Qunused-arguments | 20161217 | 20161026
100350 | vperm | clang -O3 -fwrapv -mavx2 -fomit-frame-pointer -Qunused-arguments | 20161217 | 20161026
100602 | vperm | clang -O3 -fwrapv -march=x86-64 -mcpu=core-avx2 -mavx2 -maes -mpclmul -fomit-frame-pointer -Qunused-arguments | 20161217 | 20161026
101538 | vperm | clang -O3 -fwrapv -march=native -fomit-frame-pointer -Qunused-arguments | 20161217 | 20161026
101740 | vperm | clang -O3 -fwrapv -mavx -maes -mpclmul -fomit-frame-pointer -Qunused-arguments | 20161217 | 20161026
101770 | vperm | clang -march=native -O3 -fomit-frame-pointer -fwrapv -Qunused-arguments | 20161217 | 20161026
106108 | vperm | gcc -m64 -march=corei7 -O -fomit-frame-pointer | 20161217 | 20161026
106140 | vperm | gcc -m64 -march=core2 -O -fomit-frame-pointer | 20161217 | 20161026
106206 | vperm | gcc -m64 -march=core2 -msse4 -O -fomit-frame-pointer | 20161217 | 20161026
106750 | vperm | gcc -m64 -march=core2 -msse4.1 -O -fomit-frame-pointer | 20161217 | 20161026
116382 | vperm | gcc -m64 -march=core2 -msse4.1 -Os -fomit-frame-pointer | 20161217 | 20161026
116438 | vperm | gcc -m64 -march=core2 -Os -fomit-frame-pointer | 20161217 | 20161026
116778 | vperm | gcc -m64 -march=core2 -msse4 -Os -fomit-frame-pointer | 20161217 | 20161026
118596 | vperm | gcc -m64 -march=corei7 -Os -fomit-frame-pointer | 20161217 | 20161026
760328 | ref | gcc -funroll-loops -march=k8 -O2 -fomit-frame-pointer | 20161217 | 20161026
761092 | ref | gcc -funroll-loops -m64 -march=k8 -O2 -fomit-frame-pointer | 20161217 | 20161026
763212 | ref | gcc -funroll-loops -m64 -march=barcelona -O2 -fomit-frame-pointer | 20161217 | 20161026
764716 | ref | gcc -funroll-loops -march=barcelona -O2 -fomit-frame-pointer | 20161217 | 20161026
765678 | ref | gcc -funroll-loops -O2 -fomit-frame-pointer | 20161217 | 20161026
766444 | ref | gcc -funroll-loops -fno-schedule-insns -O2 -fomit-frame-pointer | 20161217 | 20161026
771852 | ref | gcc -funroll-loops -m64 -march=nocona -O2 -fomit-frame-pointer | 20161217 | 20161026
772264 | ref | gcc -funroll-loops -m64 -O2 -fomit-frame-pointer | 20161217 | 20161026
774810 | ref | gcc -O2 -fomit-frame-pointer | 20161217 | 20161026
774826 | ref | gcc -fno-schedule-insns -O2 -fomit-frame-pointer | 20161217 | 20161026
774860 | ref | gcc -m64 -march=nocona -O2 -fomit-frame-pointer | 20161217 | 20161026
774894 | ref | gcc -march=nocona -O2 -fomit-frame-pointer | 20161217 | 20161026
775052 | ref | gcc -funroll-loops -march=nocona -O2 -fomit-frame-pointer | 20161217 | 20161026
775800 | ref | gcc -m64 -march=core2 -msse4.1 -O2 -fomit-frame-pointer | 20161217 | 20161026
776384 | ref | gcc -m64 -march=core2 -msse4 -O2 -fomit-frame-pointer | 20161217 | 20161026
776640 | ref | gcc -m64 -march=corei7-avx -O2 -fomit-frame-pointer | 20161217 | 20161026
776912 | ref | gcc -m64 -O2 -fomit-frame-pointer | 20161217 | 20161026
777308 | ref | gcc -m64 -march=core-avx-i -O2 -fomit-frame-pointer | 20161217 | 20161026
778854 | ref | gcc -m64 -march=corei7 -O2 -fomit-frame-pointer | 20161217 | 20161026
779454 | ref | gcc -m64 -march=core2 -O2 -fomit-frame-pointer | 20161217 | 20161026
784070 | ref | gcc -m64 -march=native -mtune=native -O2 -fomit-frame-pointer | 20161217 | 20161026
784336 | ref | gcc -m64 -march=core-avx2 -O2 -fomit-frame-pointer | 20161217 | 20161026
787002 | ref | gcc -march=native -mtune=native -O2 -fomit-frame-pointer -fwrapv | 20161217 | 20161026
789526 | ref | gcc -march=barcelona -O2 -fomit-frame-pointer | 20161217 | 20161026
790158 | ref | gcc -m64 -march=barcelona -O2 -fomit-frame-pointer | 20161217 | 20161026
796008 | ref | gcc -m64 -march=k8 -O2 -fomit-frame-pointer | 20161217 | 20161026
799512 | ref | gcc -march=k8 -O2 -fomit-frame-pointer | 20161217 | 20161026
931072 | ref | gcc -funroll-loops -m64 -march=barcelona -O -fomit-frame-pointer | 20161217 | 20161026
931134 | ref | gcc -funroll-loops -O -fomit-frame-pointer | 20161217 | 20161026
931772 | ref | gcc -funroll-loops -march=k8 -O -fomit-frame-pointer | 20161217 | 20161026
932466 | ref | gcc -funroll-loops -m64 -march=k8 -O -fomit-frame-pointer | 20161217 | 20161026
932528 | ref | gcc -funroll-loops -m64 -march=nocona -O -fomit-frame-pointer | 20161217 | 20161026
934032 | ref | gcc -march=k8 -O -fomit-frame-pointer | 20161217 | 20161026
935004 | ref | gcc -funroll-loops -march=nocona -O -fomit-frame-pointer | 20161217 | 20161026
935378 | ref | gcc -O -fomit-frame-pointer | 20161217 | 20161026
935934 | ref | gcc -funroll-loops -march=barcelona -O -fomit-frame-pointer | 20161217 | 20161026
936202 | ref | gcc -m64 -march=core-avx2 -O -fomit-frame-pointer | 20161217 | 20161026
936946 | ref | gcc -m64 -march=corei7-avx -O -fomit-frame-pointer | 20161217 | 20161026
937138 | ref | gcc -funroll-loops -m64 -O -fomit-frame-pointer | 20161217 | 20161026
937502 | ref | gcc -m64 -march=corei7 -O -fomit-frame-pointer | 20161217 | 20161026
937530 | ref | gcc -m64 -march=native -mtune=native -O -fomit-frame-pointer | 20161217 | 20161026
937606 | ref | gcc -march=barcelona -O -fomit-frame-pointer | 20161217 | 20161026
937726 | ref | gcc -m64 -march=core-avx-i -O -fomit-frame-pointer | 20161217 | 20161026
938114 | ref | gcc -m64 -march=core2 -msse4.1 -O -fomit-frame-pointer | 20161217 | 20161026
938394 | ref | gcc -m64 -march=core2 -O -fomit-frame-pointer | 20161217 | 20161026
938488 | ref | gcc -m64 -O -fomit-frame-pointer | 20161217 | 20161026
940346 | ref | gcc -march=native -mtune=native -O -fomit-frame-pointer -fwrapv | 20161217 | 20161026
940678 | ref | gcc -funroll-loops -fno-schedule-insns -O -fomit-frame-pointer | 20161217 | 20161026
941078 | ref | gcc -m64 -march=k8 -O -fomit-frame-pointer | 20161217 | 20161026
942248 | ref | gcc -march=nocona -O -fomit-frame-pointer | 20161217 | 20161026
942750 | ref | gcc -fno-schedule-insns -O -fomit-frame-pointer | 20161217 | 20161026
942762 | ref | gcc -m64 -march=nocona -O -fomit-frame-pointer | 20161217 | 20161026
942816 | ref | gcc -m64 -march=barcelona -O -fomit-frame-pointer | 20161217 | 20161026
948408 | ref | gcc -funroll-loops -m64 -march=k8 -O3 -fomit-frame-pointer | 20161217 | 20161026
949300 | ref | gcc -funroll-loops -march=k8 -O3 -fomit-frame-pointer | 20161217 | 20161026
949586 | ref | gcc -funroll-loops -march=barcelona -O3 -fomit-frame-pointer | 20161217 | 20161026
953386 | ref | gcc -funroll-loops -m64 -march=barcelona -O3 -fomit-frame-pointer | 20161217 | 20161026
964868 | ref | gcc -m64 -march=core2 -msse4 -O -fomit-frame-pointer | 20161217 | 20161026
970892 | ref | gcc -march=k8 -O3 -fomit-frame-pointer | 20161217 | 20161026
970898 | ref | gcc -m64 -march=barcelona -O3 -fomit-frame-pointer | 20161217 | 20161026
971792 | ref | gcc -march=barcelona -O3 -fomit-frame-pointer | 20161217 | 20161026
975754 | ref | gcc -m64 -march=k8 -O3 -fomit-frame-pointer | 20161217 | 20161026
1055478 | ref | gcc -march=native -mtune=native -O3 -fomit-frame-pointer -fwrapv | 20161217 | 20161026
1057064 | ref | gcc -m64 -march=corei7-avx -O3 -fomit-frame-pointer | 20161217 | 20161026
1057102 | ref | gcc -m64 -march=core-avx-i -O3 -fomit-frame-pointer | 20161217 | 20161026
1057242 | ref | gcc -m64 -march=core-avx2 -O3 -fomit-frame-pointer | 20161217 | 20161026
1058042 | ref | gcc -m64 -march=corei7 -O3 -fomit-frame-pointer | 20161217 | 20161026
1058158 | ref | gcc -m64 -march=native -mtune=native -O3 -fomit-frame-pointer | 20161217 | 20161026
1058420 | ref | gcc -m64 -march=core2 -msse4.1 -O3 -fomit-frame-pointer | 20161217 | 20161026
1058514 | ref | gcc -m64 -march=core2 -msse4 -O3 -fomit-frame-pointer | 20161217 | 20161026
1169882 | ref | gcc -funroll-loops -m64 -O3 -fomit-frame-pointer | 20161217 | 20161026
1171958 | ref | gcc -funroll-loops -march=nocona -O3 -fomit-frame-pointer | 20161217 | 20161026
1172742 | ref | gcc -funroll-loops -fno-schedule-insns -O3 -fomit-frame-pointer | 20161217 | 20161026
1174184 | ref | gcc -funroll-loops -O3 -fomit-frame-pointer | 20161217 | 20161026
1186224 | ref | gcc -funroll-loops -m64 -march=nocona -O3 -fomit-frame-pointer | 20161217 | 20161026
1208192 | ref | gcc -m64 -march=nocona -O3 -fomit-frame-pointer | 20161217 | 20161026
1209204 | ref | gcc -m64 -march=core2 -O3 -fomit-frame-pointer | 20161217 | 20161026
1209640 | ref | gcc -m64 -O3 -fomit-frame-pointer | 20161217 | 20161026
1209668 | ref | gcc -march=nocona -O3 -fomit-frame-pointer | 20161217 | 20161026
1209876 | ref | gcc -fno-schedule-insns -O3 -fomit-frame-pointer | 20161217 | 20161026
1210672 | ref | gcc -O3 -fomit-frame-pointer | 20161217 | 20161026
1216066 | ref | gcc -funroll-loops -march=nocona -Os -fomit-frame-pointer | 20161217 | 20161026
1217532 | ref | gcc -funroll-loops -m64 -march=nocona -Os -fomit-frame-pointer | 20161217 | 20161026
1221078 | ref | gcc -funroll-loops -m64 -Os -fomit-frame-pointer | 20161217 | 20161026
1223348 | ref | gcc -funroll-loops -fno-schedule-insns -Os -fomit-frame-pointer | 20161217 | 20161026
1227574 | ref | gcc -funroll-loops -march=barcelona -Os -fomit-frame-pointer | 20161217 | 20161026
1227702 | ref | gcc -funroll-loops -march=k8 -Os -fomit-frame-pointer | 20161217 | 20161026
1230870 | ref | gcc -funroll-loops -m64 -march=barcelona -Os -fomit-frame-pointer | 20161217 | 20161026
1240176 | ref | gcc -funroll-loops -m64 -march=k8 -Os -fomit-frame-pointer | 20161217 | 20161026
1399860 | ref | gcc -funroll-loops -Os -fomit-frame-pointer | 20161217 | 20161026
1598620 | ref | gcc | 20161217 | 20161026
1612502 | ref | cc | 20161217 | 20161026
1615188 | ref | gcc -funroll-loops | 20161217 | 20161026
1710544 | ref | clang -O3 -fomit-frame-pointer -Qunused-arguments | 20161217 | 20161026
1710636 | ref | clang -O3 -fwrapv -mavx -maes -mpclmul -fomit-frame-pointer -Qunused-arguments | 20161217 | 20161026
1711474 | ref | clang -O3 -fwrapv -march=native -fomit-frame-pointer -Qunused-arguments | 20161217 | 20161026
1711752 | ref | clang -mcpu=native -mfpu=neon -O3 -fomit-frame-pointer -fwrapv -Qunused-arguments | 20161217 | 20161026
1714708 | ref | clang -march=native -O3 -fomit-frame-pointer -fwrapv -Qunused-arguments | 20161217 | 20161026
1717504 | ref | clang -O3 -fwrapv -march=x86-64 -mcpu=core-avx2 -mavx2 -maes -mpclmul -fomit-frame-pointer -Qunused-arguments | 20161217 | 20161026
1717838 | ref | clang -mcpu=cortex-a9 -mfpu=neon -O3 -fomit-frame-pointer -fwrapv -Qunused-arguments | 20161217 | 20161026
1719738 | ref | clang -mcpu=cortex-a8 -mfpu=neon -O3 -fomit-frame-pointer -fwrapv -Qunused-arguments | 20161217 | 20161026
1730308 | ref | clang -O3 -fwrapv -mavx2 -fomit-frame-pointer -Qunused-arguments | 20161217 | 20161026
1743692 | ref | clang -O3 -fwrapv -mavx -fomit-frame-pointer -Qunused-arguments | 20161217 | 20161026
1851298 | ref | gcc -march=nocona -Os -fomit-frame-pointer | 20161217 | 20161026
1852892 | ref | gcc -m64 -march=nocona -Os -fomit-frame-pointer | 20161217 | 20161026
1863250 | ref | gcc -march=native -mtune=native -Os -fomit-frame-pointer -fwrapv | 20161217 | 20161026
1869646 | ref | gcc -march=k8 -Os -fomit-frame-pointer | 20161217 | 20161026
1870300 | ref | gcc -m64 -march=core2 -msse4.1 -Os -fomit-frame-pointer | 20161217 | 20161026
1871202 | ref | gcc -m64 -march=core2 -Os -fomit-frame-pointer | 20161217 | 20161026
1871590 | ref | gcc -m64 -march=corei7-avx -Os -fomit-frame-pointer | 20161217 | 20161026
1872740 | ref | gcc -m64 -march=native -mtune=native -Os -fomit-frame-pointer | 20161217 | 20161026
1872876 | ref | gcc -m64 -march=core-avx2 -Os -fomit-frame-pointer | 20161217 | 20161026
1873380 | ref | gcc -m64 -march=core-avx-i -Os -fomit-frame-pointer | 20161217 | 20161026
1873638 | ref | gcc -Os -fomit-frame-pointer | 20161217 | 20161026
1874188 | ref | gcc -m64 -march=barcelona -Os -fomit-frame-pointer | 20161217 | 20161026
1874724 | ref | gcc -m64 -march=k8 -Os -fomit-frame-pointer | 20161217 | 20161026
1874910 | ref | gcc -m64 -march=corei7 -Os -fomit-frame-pointer | 20161217 | 20161026
1875730 | ref | gcc -march=barcelona -Os -fomit-frame-pointer | 20161217 | 20161026
1875770 | ref | gcc -m64 -march=core2 -msse4 -Os -fomit-frame-pointer | 20161217 | 20161026
1876142 | ref | gcc -m64 -Os -fomit-frame-pointer | 20161217 | 20161026
1881156 | ref | gcc -fno-schedule-insns -Os -fomit-frame-pointer | 20161217 | 20161026
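
As a reading aid for the table above, the gap between the two implementations can be expressed as a ratio of Time values. The helper below is only an illustrative sketch: the name `speedup` is hypothetical and not part of SUPERCOP, and it assumes nothing beyond both values being reported in the same unit.

```c
#include <assert.h>

/* Ratio of two Time values taken from the table.
 * A result above 1.0 means the `fast` entry took less time than `slow`.
 * Hypothetical helper for illustration; not part of SUPERCOP. */
double speedup(unsigned long slow, unsigned long fast)
{
    assert(fast != 0); /* a zero Time would make the ratio meaningless */
    return (double)slow / (double)fast;
}
```

With the fastest `ref` entry (760328) and the fastest `vperm` entry (97150), `speedup(760328UL, 97150UL)` comes to roughly 7.8.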

Compiler output

Implementation: crypto_aead/twine80n6t4clocv2/vperm
Compiler: cc
encrypt.c: In file included from twine.h:7:0,
encrypt.c: from encrypt.c:3:
encrypt.c: twine.h: In function 'Encode':
encrypt.c: /usr/lib/gcc/x86_64-pc-linux-gnu/6.2.1/include/tmmintrin.h:136:1: error: inlining failed in call to always_inline '_mm_shuffle_epi8': target specific option mismatch
encrypt.c: _mm_shuffle_epi8 (__m128i __X, __m128i __Y)
encrypt.c: ^~~~~~~~~~~~~~~~
encrypt.c: In file included from encrypt.c:3:0:
encrypt.c: twine.h:174:7: note: called from here
encrypt.c: _tmp = PSHUFB(state, _tmp); \
encrypt.c: ^
encrypt.c: twine.h:228:2: note: in expansion of macro 'twine80_enc'
encrypt.c: twine80_enc(state);
encrypt.c: ^~~~~~~~~~~
encrypt.c: In file included from twine.h:7:0,
encrypt.c: from encrypt.c:3:
encrypt.c: /usr/lib/gcc/x86_64-pc-linux-gnu/6.2.1/include/tmmintrin.h:136:1: error: inlining failed in call to always_inline '_mm_shuffle_epi8': target specific option mismatch
encrypt.c: _mm_shuffle_epi8 (__m128i __X, __m128i __Y)
encrypt.c: ^~~~~~~~~~~~~~~~
encrypt.c: In file included from encrypt.c:3:0:
encrypt.c: twine.h:171:8: note: called from here
encrypt.c: right = PSHUFB(state, right); \
encrypt.c: ^
encrypt.c: twine.h:228:2: note: in expansion of macro 'twine80_enc'
encrypt.c: twine80_enc(state);
encrypt.c: ^~~~~~~~~~~
encrypt.c: ...

Number of similar (compiler,implementation) pairs: 71, namely:
Compiler Implementations
cc vperm
gcc vperm
gcc -O2 -fomit-frame-pointer vperm
gcc -O3 -fomit-frame-pointer vperm
gcc -O -fomit-frame-pointer vperm
gcc -Os -fomit-frame-pointer vperm
gcc -fno-schedule-insns -O2 -fomit-frame-pointer vperm
gcc -fno-schedule-insns -O3 -fomit-frame-pointer vperm
gcc -fno-schedule-insns -O -fomit-frame-pointer vperm
gcc -fno-schedule-insns -Os -fomit-frame-pointer vperm
gcc -funroll-loops vperm
gcc -funroll-loops -O2 -fomit-frame-pointer vperm
gcc -funroll-loops -O3 -fomit-frame-pointer vperm
gcc -funroll-loops -O -fomit-frame-pointer vperm
gcc -funroll-loops -Os -fomit-frame-pointer vperm
gcc -funroll-loops -fno-schedule-insns -O2 -fomit-frame-pointer vperm
gcc -funroll-loops -fno-schedule-insns -O3 -fomit-frame-pointer vperm
gcc -funroll-loops -fno-schedule-insns -O -fomit-frame-pointer vperm
gcc -funroll-loops -fno-schedule-insns -Os -fomit-frame-pointer vperm
gcc -funroll-loops -m64 -O2 -fomit-frame-pointer vperm
gcc -funroll-loops -m64 -O3 -fomit-frame-pointer vperm
gcc -funroll-loops -m64 -O -fomit-frame-pointer vperm
gcc -funroll-loops -m64 -Os -fomit-frame-pointer vperm
gcc -funroll-loops -m64 -march=barcelona -O2 -fomit-frame-pointer vperm
gcc -funroll-loops -m64 -march=barcelona -O3 -fomit-frame-pointer vperm
gcc -funroll-loops -m64 -march=barcelona -O -fomit-frame-pointer vperm
gcc -funroll-loops -m64 -march=barcelona -Os -fomit-frame-pointer vperm
gcc -funroll-loops -m64 -march=k8 -O2 -fomit-frame-pointer vperm
gcc -funroll-loops -m64 -march=k8 -O3 -fomit-frame-pointer vperm
gcc -funroll-loops -m64 -march=k8 -O -fomit-frame-pointer vperm
gcc -funroll-loops -m64 -march=k8 -Os -fomit-frame-pointer vperm
gcc -funroll-loops -m64 -march=nocona -O2 -fomit-frame-pointer vperm
gcc -funroll-loops -m64 -march=nocona -O3 -fomit-frame-pointer vperm
gcc -funroll-loops -m64 -march=nocona -O -fomit-frame-pointer vperm
gcc -funroll-loops -m64 -march=nocona -Os -fomit-frame-pointer vperm
gcc -funroll-loops -march=barcelona -O2 -fomit-frame-pointer vperm
gcc -funroll-loops -march=barcelona -O3 -fomit-frame-pointer vperm
gcc -funroll-loops -march=barcelona -O -fomit-frame-pointer vperm
gcc -funroll-loops -march=barcelona -Os -fomit-frame-pointer vperm
gcc -funroll-loops -march=k8 -O2 -fomit-frame-pointer vperm
gcc -funroll-loops -march=k8 -O3 -fomit-frame-pointer vperm
gcc -funroll-loops -march=k8 -O -fomit-frame-pointer vperm
gcc -funroll-loops -march=k8 -Os -fomit-frame-pointer vperm
gcc -funroll-loops -march=nocona -O2 -fomit-frame-pointer vperm
gcc -funroll-loops -march=nocona -O3 -fomit-frame-pointer vperm
gcc -funroll-loops -march=nocona -O -fomit-frame-pointer vperm
gcc -funroll-loops -march=nocona -Os -fomit-frame-pointer vperm
gcc -m64 -O2 -fomit-frame-pointer vperm
gcc -m64 -O3 -fomit-frame-pointer vperm
gcc -m64 -O -fomit-frame-pointer vperm
gcc -m64 -Os -fomit-frame-pointer vperm
gcc -m64 -march=k8 -O2 -fomit-frame-pointer vperm
gcc -m64 -march=k8 -O3 -fomit-frame-pointer vperm
gcc -m64 -march=k8 -O -fomit-frame-pointer vperm
gcc -m64 -march=k8 -Os -fomit-frame-pointer vperm
gcc -m64 -march=nocona -O2 -fomit-frame-pointer vperm
gcc -m64 -march=nocona -O3 -fomit-frame-pointer vperm
gcc -m64 -march=nocona -O -fomit-frame-pointer vperm
gcc -m64 -march=nocona -Os -fomit-frame-pointer vperm
gcc -march=barcelona -O2 -fomit-frame-pointer vperm
gcc -march=barcelona -O3 -fomit-frame-pointer vperm
gcc -march=barcelona -O -fomit-frame-pointer vperm
gcc -march=barcelona -Os -fomit-frame-pointer vperm
gcc -march=k8 -O2 -fomit-frame-pointer vperm
gcc -march=k8 -O3 -fomit-frame-pointer vperm
gcc -march=k8 -O -fomit-frame-pointer vperm
gcc -march=k8 -Os -fomit-frame-pointer vperm
gcc -march=nocona -O2 -fomit-frame-pointer vperm
gcc -march=nocona -O3 -fomit-frame-pointer vperm
gcc -march=nocona -O -fomit-frame-pointer vperm
gcc -march=nocona -Os -fomit-frame-pointer vperm
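
The "target specific option mismatch" error above arises when `_mm_shuffle_epi8` is used in a function compiled without SSSE3 enabled, which is the case for every option set in this list. One common way to make such code build regardless of the command-line flags is a per-function target attribute combined with a runtime CPU check. The following is a minimal sketch of that technique, not the `vperm` implementation's code: `shuffle16` and its helpers are hypothetical names, and it assumes a GCC-compatible compiler that provides `__attribute__((target(...)))` and `__builtin_cpu_supports`.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#if defined(__x86_64__) || defined(__i386__)
#include <tmmintrin.h> /* SSSE3 intrinsics, including _mm_shuffle_epi8 */
#define HAVE_X86 1
#else
#define HAVE_X86 0
#endif

/* Portable fallback: byte-wise equivalent of a 16-byte PSHUFB. */
static void shuffle16_scalar(uint8_t out[16], const uint8_t src[16],
                             const uint8_t idx[16])
{
    uint8_t tmp[16];
    for (size_t i = 0; i < 16; i++) {
        /* High bit set in the index selects zero, else low 4 bits index src. */
        tmp[i] = (idx[i] & 0x80) ? 0 : src[idx[i] & 0x0f];
    }
    memcpy(out, tmp, sizeof tmp);
}

#if HAVE_X86
/* The target attribute enables SSSE3 for this one function only, so the
 * always_inline intrinsic can be inlined here even when the file is built
 * with flags such as -march=k8 that do not enable SSSE3. */
static __attribute__((target("ssse3")))
void shuffle16_ssse3(uint8_t out[16], const uint8_t src[16],
                     const uint8_t idx[16])
{
    __m128i s = _mm_loadu_si128((const __m128i *)src);
    __m128i x = _mm_loadu_si128((const __m128i *)idx);
    _mm_storeu_si128((__m128i *)out, _mm_shuffle_epi8(s, x));
}
#endif

/* Hypothetical dispatcher: use SSSE3 when the running CPU has it. */
void shuffle16(uint8_t out[16], const uint8_t src[16], const uint8_t idx[16])
{
    assert(out != NULL && src != NULL && idx != NULL);
#if HAVE_X86
    if (__builtin_cpu_supports("ssse3")) {
        shuffle16_ssse3(out, src, idx);
        return;
    }
#endif
    shuffle16_scalar(out, src, idx);
}
```

The alternative is to leave the source unchanged and accept that only option sets enabling SSSE3 (for example `-march=core2` or `-march=native` on this machine) can compile it, which is what the timing table reflects.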

Compiler output

Implementation: crypto_aead/twine80n6t4clocv2/vperm
Compiler: clang -O3 -fomit-frame-pointer -Qunused-arguments
encrypt.c: In file included from encrypt.c:3:
encrypt.c: ./twine.h:227:15: error: always_inline function '_mm_shuffle_epi8' requires target feature 'ssse3', but would be inlined into function 'Encode' that is compiled without support for 'ssse3'
encrypt.c: word state = LOADS(text);
encrypt.c: ^
encrypt.c: ./twine.h:76:26: note: expanded from macro 'LOADS'
encrypt.c: #define LOADS(p) SHUFFLE4(LOAD64(p)) /* load 64-bit word from memory address p, and shuffle it */
encrypt.c: ^
encrypt.c: ./twine.h:81:3: note: expanded from macro 'SHUFFLE4'
encrypt.c: _mm_shuffle_epi8(MASK4L(x), _mm_set_epi8(7, -1, 6, -1, 5, -1, 4, -1, 3, -1, 2, -1, 1, -1, 0, -1)), \
encrypt.c: ^
encrypt.c: ./twine.h:227:15: error: always_inline function '_mm_shuffle_epi8' requires target feature 'ssse3', but would be inlined into function 'Encode' that is compiled without support for 'ssse3'
encrypt.c: ./twine.h:76:26: note: expanded from macro 'LOADS'
encrypt.c: #define LOADS(p) SHUFFLE4(LOAD64(p)) /* load 64-bit word from memory address p, and shuffle it */
encrypt.c: ^
encrypt.c: ./twine.h:82:3: note: expanded from macro 'SHUFFLE4'
encrypt.c: _mm_shuffle_epi8(SHR4(MASK4U(x)), _mm_set_epi8(-1, 7, -1, 6, -1, 5, -1, 4, -1, 3, -1, 2, -1, 1, -1, 0)))
encrypt.c: ^
encrypt.c: ./twine.h:228:2: error: always_inline function '_mm_shuffle_epi8' requires target feature 'ssse3', but would be inlined into function 'Encode' that is compiled without support for 'ssse3'
encrypt.c: twine80_enc(state);
encrypt.c: ^
encrypt.c: ./twine.h:163:9: note: expanded from macro 'twine80_enc'
encrypt.c: left = PSHUFB(state, left); \
encrypt.c: ^
encrypt.c: ./twine.h:70:25: note: expanded from macro 'PSHUFB'
encrypt.c: #define PSHUFB(s,x) _mm_shuffle_epi8((s), (x)) /* return s(x) */
encrypt.c: ...

Number of similar (compiler,implementation) pairs: 4, namely:
Compiler Implementations
clang -O3 -fomit-frame-pointer -Qunused-arguments vperm
clang -mcpu=cortex-a8 -mfpu=neon -O3 -fomit-frame-pointer -fwrapv -Qunused-arguments vperm
clang -mcpu=cortex-a9 -mfpu=neon -O3 -fomit-frame-pointer -fwrapv -Qunused-arguments vperm
clang -mcpu=native -mfpu=neon -O3 -fomit-frame-pointer -fwrapv -Qunused-arguments vperm
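
The `SHUFFLE4` lines quoted in the log above pass index vectors containing `-1` to `_mm_shuffle_epi8`. A scalar model may help in reading them: each output byte is zero when its index byte has the high bit set, and otherwise is the source byte selected by the low four bits of the index. The sketch below models only that shuffle step. `pshufb_model`, `SPREAD_ODD` and `SPREAD_EVEN` are hypothetical names, and the `MASK4L`, `MASK4U` and `SHR4` macros are not modelled because their definitions do not appear in the log.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Scalar model of PSHUFB (_mm_shuffle_epi8) on one 16-byte vector.
 * out must not alias s, since out is written while s is still being read. */
void pshufb_model(uint8_t out[16], const uint8_t s[16], const int8_t x[16])
{
    assert(out != s);
    for (size_t i = 0; i < 16; i++) {
        /* Negative index (high bit set) yields 0; otherwise pick s[x & 15]. */
        out[i] = (x[i] < 0) ? 0 : s[x[i] & 0x0f];
    }
}

/* The two index vectors from the log, written in memory order (byte 0
 * first). _mm_set_epi8 takes its arguments from byte 15 down to byte 0,
 * so each table is the corresponding argument list read right to left. */
const int8_t SPREAD_ODD[16] = {
    -1, 0, -1, 1, -1, 2, -1, 3, -1, 4, -1, 5, -1, 6, -1, 7
};
const int8_t SPREAD_EVEN[16] = {
    0, -1, 1, -1, 2, -1, 3, -1, 4, -1, 5, -1, 6, -1, 7, -1
};
```

Read this way, the first index vector moves source bytes 0 to 7 into the odd byte positions and the second moves them into the even positions, with the remaining bytes cleared in each case.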

Compiler output

Implementation: crypto_aead/twine80n6t4clocv2/vperm
Compiler: gcc -m64 -march=barcelona -O2 -fomit-frame-pointer
encrypt.c: In file included from twine.h:7:0,
encrypt.c: from encrypt.c:3:
encrypt.c: twine.h: In function 'Encode':
encrypt.c: /usr/lib/gcc/x86_64-pc-linux-gnu/6.2.1/include/tmmintrin.h:136:1: error: inlining failed in call to always_inline '_mm_shuffle_epi8': target specific option mismatch
encrypt.c: _mm_shuffle_epi8 (__m128i __X, __m128i __Y)
encrypt.c: ^~~~~~~~~~~~~~~~
encrypt.c: In file included from encrypt.c:3:0:
encrypt.c: twine.h:174:7: note: called from here
encrypt.c: _tmp = PSHUFB(state, _tmp); \
encrypt.c: ^
encrypt.c: twine.h:228:2: note: in expansion of macro 'twine80_enc'
encrypt.c: twine80_enc(state);
encrypt.c: ^~~~~~~~~~~
encrypt.c: In file included from twine.h:7:0,
encrypt.c: from encrypt.c:3:
encrypt.c: /usr/lib/gcc/x86_64-pc-linux-gnu/6.2.1/include/tmmintrin.h:136:1: error: inlining failed in call to always_inline '_mm_shuffle_epi8': target specific option mismatch
encrypt.c: _mm_shuffle_epi8 (__m128i __X, __m128i __Y)
encrypt.c: ^~~~~~~~~~~~~~~~
encrypt.c: In file included from encrypt.c:3:0:
encrypt.c: twine.h:171:8: note: called from here
encrypt.c: right = PSHUFB(state, right); \
encrypt.c: ^
encrypt.c: twine.h:228:2: note: in expansion of macro 'twine80_enc'
encrypt.c: twine80_enc(state);
encrypt.c: ^~~~~~~~~~~
encrypt.c: ...
encrypt.c: In file included from twine.h:7:0,
encrypt.c: from encrypt.c:3:
encrypt.c: twine.h: In function 'Encode':
encrypt.c: /usr/lib/gcc/x86_64-pc-linux-gnu/6.2.1/include/tmmintrin.h:136:1: error: inlining failed in call to always_inline '_mm_shuffle_epi8': target specific option mismatch
encrypt.c: _mm_shuffle_epi8 (__m128i __X, __m128i __Y)
encrypt.c: ^~~~~~~~~~~~~~~~
encrypt.c: In file included from encrypt.c:3:0:
encrypt.c: twine.h:174:7: note: called from here
encrypt.c: _tmp = PSHUFB(state, _tmp); \
encrypt.c: ^
encrypt.c: twine.h:228:2: note: in expansion of macro 'twine80_enc'
encrypt.c: twine80_enc(state);
encrypt.c: ^~~~~~~~~~~
encrypt.c: In file included from twine.h:7:0,
encrypt.c: from encrypt.c:3:
encrypt.c: /usr/lib/gcc/x86_64-pc-linux-gnu/6.2.1/include/tmmintrin.h:136:1: error: inlining failed in call to always_inline '_mm_shuffle_epi8': target specific option mismatch
encrypt.c: _mm_shuffle_epi8 (__m128i __X, __m128i __Y)
encrypt.c: ^~~~~~~~~~~~~~~~
encrypt.c: In file included from encrypt.c:3:0:
encrypt.c: twine.h:171:8: note: called from here
encrypt.c: right = PSHUFB(state, right); \
encrypt.c: ^
encrypt.c: twine.h:228:2: note: in expansion of macro 'twine80_enc'
encrypt.c: twine80_enc(state);
encrypt.c: ^~~~~~~~~~~
encrypt.c: ...

Number of similar (compiler,implementation) pairs: 4, namely:
Compiler Implementations
gcc -m64 -march=barcelona -O2 -fomit-frame-pointer vperm
gcc -m64 -march=barcelona -O3 -fomit-frame-pointer vperm
gcc -m64 -march=barcelona -O -fomit-frame-pointer vperm
gcc -m64 -march=barcelona -Os -fomit-frame-pointer vperm