Implementation notes: amd64, skylake, crypto_aead/scream12v2

Computer: skylake
Architecture: amd64
CPU ID: GenuineIntel-000506e3-bfebfbff
SUPERCOP version: 20161026
Operation: crypto_aead
Primitive: scream12v2
Time  Implementation  Compiler  Benchmark date  SUPERCOP version
67830  sse  gcc -m64 -march=native -mtune=native -O3 -fomit-frame-pointer  20161216  20161026
68686  sse  gcc -m64 -march=core-avx2 -O3 -fomit-frame-pointer  20161216  20161026
68890  sse  gcc -m64 -march=corei7-avx -O3 -fomit-frame-pointer  20161216  20161026
69852  sse  gcc -m64 -march=core-avx-i -O3 -fomit-frame-pointer  20161216  20161026
70046  sse  gcc -march=native -mtune=native -O3 -fomit-frame-pointer -fwrapv  20161216  20161026
73940  sse  gcc -m64 -march=corei7 -O3 -fomit-frame-pointer  20161216  20161026
74216  sse  gcc -m64 -march=core2 -msse4.1 -O3 -fomit-frame-pointer  20161216  20161026
74236  sse  gcc -m64 -march=core2 -msse4 -O3 -fomit-frame-pointer  20161216  20161026
74372  sse  gcc -m64 -march=core2 -O3 -fomit-frame-pointer  20161216  20161026
75242  sse  clang -O3 -fwrapv -march=x86-64 -mcpu=core-avx2 -mavx2 -maes -mpclmul -fomit-frame-pointer -Qunused-arguments  20161216  20161026
75280  sse  clang -O3 -fwrapv -mavx2 -fomit-frame-pointer -Qunused-arguments  20161216  20161026
75694  sse  gcc -m64 -march=corei7-avx -O2 -fomit-frame-pointer  20161216  20161026
75720  sse  gcc -m64 -march=core-avx2 -O2 -fomit-frame-pointer  20161216  20161026
75800  sse  gcc -m64 -march=native -mtune=native -O2 -fomit-frame-pointer  20161216  20161026
76018  sse  gcc -m64 -march=core-avx-i -O2 -fomit-frame-pointer  20161216  20161026
76052  sse  clang -O3 -fwrapv -mavx -fomit-frame-pointer -Qunused-arguments  20161216  20161026
76088  sse  clang -march=native -O3 -fomit-frame-pointer -fwrapv -Qunused-arguments  20161216  20161026
76328  sse  clang -O3 -fwrapv -march=native -fomit-frame-pointer -Qunused-arguments  20161216  20161026
76392  sse  gcc -m64 -march=native -mtune=native -Os -fomit-frame-pointer  20161216  20161026
76440  sse  gcc -m64 -march=core-avx2 -Os -fomit-frame-pointer  20161216  20161026
76460  sse  gcc -m64 -march=corei7-avx -Os -fomit-frame-pointer  20161216  20161026
76586  sse  gcc -m64 -march=core-avx-i -Os -fomit-frame-pointer  20161216  20161026
77916  sse  clang -O3 -fwrapv -mavx -maes -mpclmul -fomit-frame-pointer -Qunused-arguments  20161216  20161026
78762  sse  gcc -march=native -mtune=native -O2 -fomit-frame-pointer -fwrapv  20161216  20161026
80830  sse  gcc -m64 -march=corei7 -O2 -fomit-frame-pointer  20161216  20161026
81018  sse  gcc -m64 -march=core2 -O2 -fomit-frame-pointer  20161216  20161026
81198  sse  gcc -march=native -mtune=native -Os -fomit-frame-pointer -fwrapv  20161216  20161026
81516  sse  gcc -m64 -march=corei7-avx -O -fomit-frame-pointer  20161216  20161026
81572  sse  gcc -march=native -mtune=native -O -fomit-frame-pointer -fwrapv  20161216  20161026
81634  sse  gcc -m64 -march=native -mtune=native -O -fomit-frame-pointer  20161216  20161026
81760  sse  gcc -m64 -march=core-avx-i -O -fomit-frame-pointer  20161216  20161026
81902  sse  gcc -m64 -march=core2 -msse4 -O2 -fomit-frame-pointer  20161216  20161026
81916  sse  gcc -m64 -march=core2 -msse4.1 -O2 -fomit-frame-pointer  20161216  20161026
82802  sse  gcc -m64 -march=core-avx2 -O -fomit-frame-pointer  20161216  20161026
87822  sse  gcc -m64 -march=core2 -msse4 -O -fomit-frame-pointer  20161216  20161026
88136  sse  gcc -m64 -march=core2 -msse4.1 -O -fomit-frame-pointer  20161216  20161026
88242  sse  gcc -m64 -march=core2 -O -fomit-frame-pointer  20161216  20161026
89848  sse  gcc -m64 -march=corei7 -O -fomit-frame-pointer  20161216  20161026
119978  sse  gcc -m64 -march=corei7 -Os -fomit-frame-pointer  20161216  20161026
120536  sse  gcc -m64 -march=core2 -Os -fomit-frame-pointer  20161216  20161026
120548  sse  gcc -m64 -march=core2 -msse4 -Os -fomit-frame-pointer  20161216  20161026
120698  sse  gcc -m64 -march=core2 -msse4.1 -Os -fomit-frame-pointer  20161216  20161026
316024  ref  gcc -funroll-loops -march=nocona -O3 -fomit-frame-pointer  20161216  20161026
316446  ref  gcc -march=k8 -O3 -fomit-frame-pointer  20161216  20161026
316544  ref  gcc -m64 -march=nocona -O3 -fomit-frame-pointer  20161216  20161026
316556  ref  gcc -m64 -march=core2 -msse4 -O3 -fomit-frame-pointer  20161216  20161026
316666  ref  gcc -m64 -march=core2 -O3 -fomit-frame-pointer  20161216  20161026
316704  ref  gcc -funroll-loops -m64 -march=nocona -O3 -fomit-frame-pointer  20161216  20161026
316856  ref  gcc -march=barcelona -O3 -fomit-frame-pointer  20161216  20161026
317000  ref  gcc -m64 -march=core2 -msse4.1 -O3 -fomit-frame-pointer  20161216  20161026
317070  ref  gcc -m64 -march=corei7 -O3 -fomit-frame-pointer  20161216  20161026
317286  ref  gcc -m64 -march=core-avx2 -O3 -fomit-frame-pointer  20161216  20161026
317320  ref  gcc -m64 -march=barcelona -O3 -fomit-frame-pointer  20161216  20161026
317382  ref  gcc -m64 -march=native -mtune=native -O3 -fomit-frame-pointer  20161216  20161026
317860  ref  gcc -m64 -march=core-avx-i -O3 -fomit-frame-pointer  20161216  20161026
318214  ref  gcc -m64 -march=k8 -O3 -fomit-frame-pointer  20161216  20161026
318738  ref  gcc -march=nocona -O3 -fomit-frame-pointer  20161216  20161026
319818  ref  gcc -m64 -march=corei7-avx -O3 -fomit-frame-pointer  20161216  20161026
320088  ref  gcc -funroll-loops -march=k8 -O3 -fomit-frame-pointer  20161216  20161026
321078  ref  gcc -funroll-loops -m64 -march=k8 -O3 -fomit-frame-pointer  20161216  20161026
322022  ref  gcc -funroll-loops -march=k8 -O2 -fomit-frame-pointer  20161216  20161026
322572  ref  gcc -funroll-loops -m64 -march=barcelona -O2 -fomit-frame-pointer  20161216  20161026
322744  ref  gcc -funroll-loops -march=barcelona -O2 -fomit-frame-pointer  20161216  20161026
322866  ref  gcc -funroll-loops -O2 -fomit-frame-pointer  20161216  20161026
322868  ref  gcc -funroll-loops -m64 -O2 -fomit-frame-pointer  20161216  20161026
323088  ref  gcc -funroll-loops -fno-schedule-insns -O2 -fomit-frame-pointer  20161216  20161026
323174  ref  gcc -funroll-loops -m64 -march=k8 -O2 -fomit-frame-pointer  20161216  20161026
323638  ref  gcc -funroll-loops -march=barcelona -O3 -fomit-frame-pointer  20161216  20161026
323722  ref  gcc -funroll-loops -m64 -march=barcelona -O3 -fomit-frame-pointer  20161216  20161026
324000  ref  gcc -funroll-loops -march=nocona -O2 -fomit-frame-pointer  20161216  20161026
324352  ref  gcc -march=native -mtune=native -O3 -fomit-frame-pointer -fwrapv  20161216  20161026
325122  ref  gcc -funroll-loops -m64 -march=nocona -O2 -fomit-frame-pointer  20161216  20161026
333330  ref  gcc -funroll-loops -m64 -O3 -fomit-frame-pointer  20161216  20161026
333636  ref  gcc -O3 -fomit-frame-pointer  20161216  20161026
334124  ref  gcc -fno-schedule-insns -O3 -fomit-frame-pointer  20161216  20161026
334418  ref  gcc -m64 -O3 -fomit-frame-pointer  20161216  20161026
334602  ref  gcc -funroll-loops -m64 -march=nocona -O -fomit-frame-pointer  20161216  20161026
335808  ref  gcc -funroll-loops -O3 -fomit-frame-pointer  20161216  20161026
335896  ref  gcc -funroll-loops -fno-schedule-insns -O3 -fomit-frame-pointer  20161216  20161026
336044  ref  gcc -funroll-loops -march=nocona -O -fomit-frame-pointer  20161216  20161026
337536  ref  gcc -funroll-loops -m64 -O -fomit-frame-pointer  20161216  20161026
337664  ref  gcc -funroll-loops -fno-schedule-insns -O -fomit-frame-pointer  20161216  20161026
338548  ref  gcc -funroll-loops -march=barcelona -O -fomit-frame-pointer  20161216  20161026
338632  ref  gcc -funroll-loops -m64 -march=barcelona -O -fomit-frame-pointer  20161216  20161026
338916  ref  gcc -funroll-loops -march=k8 -O -fomit-frame-pointer  20161216  20161026
339330  ref  gcc -funroll-loops -m64 -march=k8 -O -fomit-frame-pointer  20161216  20161026
339494  ref  gcc -funroll-loops -O -fomit-frame-pointer  20161216  20161026
496948  ref  clang -O3 -fwrapv -march=native -fomit-frame-pointer -Qunused-arguments  20161216  20161026
497402  ref  clang -march=native -O3 -fomit-frame-pointer -fwrapv -Qunused-arguments  20161216  20161026
508754  ref  gcc -m64 -march=k8 -O2 -fomit-frame-pointer  20161216  20161026
508864  ref  gcc -march=k8 -O2 -fomit-frame-pointer  20161216  20161026
509386  ref  gcc -march=nocona -O2 -fomit-frame-pointer  20161216  20161026
509458  ref  gcc -fno-schedule-insns -O2 -fomit-frame-pointer  20161216  20161026
509678  ref  gcc -m64 -O2 -fomit-frame-pointer  20161216  20161026
510078  ref  gcc -m64 -march=barcelona -O2 -fomit-frame-pointer  20161216  20161026
511370  ref  gcc -m64 -march=core2 -O2 -fomit-frame-pointer  20161216  20161026
511420  ref  gcc -march=barcelona -O2 -fomit-frame-pointer  20161216  20161026
511814  ref  gcc -m64 -march=nocona -O2 -fomit-frame-pointer  20161216  20161026
512252  ref  gcc -m64 -march=corei7-avx -O2 -fomit-frame-pointer  20161216  20161026
512314  ref  gcc -m64 -march=core-avx-i -O2 -fomit-frame-pointer  20161216  20161026
512636  ref  gcc -m64 -march=core2 -msse4 -O2 -fomit-frame-pointer  20161216  20161026
512868  ref  gcc -O2 -fomit-frame-pointer  20161216  20161026
513380  ref  gcc -m64 -march=core-avx2 -O2 -fomit-frame-pointer  20161216  20161026
513724  ref  clang -O3 -fwrapv -mavx2 -fomit-frame-pointer -Qunused-arguments  20161216  20161026
513730  ref  gcc -m64 -march=native -mtune=native -O2 -fomit-frame-pointer  20161216  20161026
513948  ref  clang -O3 -fwrapv -mavx -maes -mpclmul -fomit-frame-pointer -Qunused-arguments  20161216  20161026
514138  ref  gcc -march=native -mtune=native -O2 -fomit-frame-pointer -fwrapv  20161216  20161026
514336  ref  clang -O3 -fwrapv -march=x86-64 -mcpu=core-avx2 -mavx2 -maes -mpclmul -fomit-frame-pointer -Qunused-arguments  20161216  20161026
514434  ref  gcc -m64 -march=corei7 -O2 -fomit-frame-pointer  20161216  20161026
514970  ref  clang -O3 -fwrapv -mavx -fomit-frame-pointer -Qunused-arguments  20161216  20161026
515924  ref  gcc -m64 -march=core2 -msse4.1 -O2 -fomit-frame-pointer  20161216  20161026
526186  ref  gcc -m64 -march=nocona -O -fomit-frame-pointer  20161216  20161026
528606  ref  gcc -march=nocona -O -fomit-frame-pointer  20161216  20161026
532524  ref  gcc -m64 -march=core-avx2 -O -fomit-frame-pointer  20161216  20161026
533330  ref  gcc -O -fomit-frame-pointer  20161216  20161026
533912  ref  gcc -march=k8 -O -fomit-frame-pointer  20161216  20161026
535068  ref  gcc -m64 -O -fomit-frame-pointer  20161216  20161026
535234  ref  gcc -m64 -march=native -mtune=native -O -fomit-frame-pointer  20161216  20161026
535320  ref  gcc -m64 -march=k8 -O -fomit-frame-pointer  20161216  20161026
535554  ref  gcc -m64 -march=core-avx-i -O -fomit-frame-pointer  20161216  20161026
535608  ref  gcc -m64 -march=core2 -msse4.1 -O -fomit-frame-pointer  20161216  20161026
536160  ref  gcc -march=native -mtune=native -O -fomit-frame-pointer -fwrapv  20161216  20161026
536238  ref  gcc -m64 -march=core2 -O -fomit-frame-pointer  20161216  20161026
536378  ref  gcc -m64 -march=core2 -msse4 -O -fomit-frame-pointer  20161216  20161026
536694  ref  gcc -m64 -march=corei7-avx -O -fomit-frame-pointer  20161216  20161026
537748  ref  gcc -m64 -march=corei7 -O -fomit-frame-pointer  20161216  20161026
538106  ref  gcc -m64 -march=barcelona -O -fomit-frame-pointer  20161216  20161026
539256  ref  gcc -march=barcelona -O -fomit-frame-pointer  20161216  20161026
541646  ref  gcc -fno-schedule-insns -O -fomit-frame-pointer  20161216  20161026
547800  ref  clang -mcpu=cortex-a9 -mfpu=neon -O3 -fomit-frame-pointer -fwrapv -Qunused-arguments  20161216  20161026
548640  ref  clang -mcpu=cortex-a8 -mfpu=neon -O3 -fomit-frame-pointer -fwrapv -Qunused-arguments  20161216  20161026
549480  ref  clang -mcpu=native -mfpu=neon -O3 -fomit-frame-pointer -fwrapv -Qunused-arguments  20161216  20161026
549638  ref  clang -O3 -fomit-frame-pointer -Qunused-arguments  20161216  20161026
767776  ref  gcc -fno-schedule-insns -Os -fomit-frame-pointer  20161216  20161026
768142  ref  gcc -march=nocona -Os -fomit-frame-pointer  20161216  20161026
768170  ref  gcc -m64 -Os -fomit-frame-pointer  20161216  20161026
768422  ref  gcc -march=k8 -Os -fomit-frame-pointer  20161216  20161026
770662  ref  gcc -march=barcelona -Os -fomit-frame-pointer  20161216  20161026
771038  ref  gcc -Os -fomit-frame-pointer  20161216  20161026
772010  ref  gcc -m64 -march=k8 -Os -fomit-frame-pointer  20161216  20161026
772182  ref  gcc -m64 -march=corei7-avx -Os -fomit-frame-pointer  20161216  20161026
773632  ref  gcc -m64 -march=core2 -msse4.1 -Os -fomit-frame-pointer  20161216  20161026
774340  ref  gcc -m64 -march=corei7 -Os -fomit-frame-pointer  20161216  20161026
774548  ref  gcc -march=native -mtune=native -Os -fomit-frame-pointer -fwrapv  20161216  20161026
774606  ref  gcc -m64 -march=nocona -Os -fomit-frame-pointer  20161216  20161026
774628  ref  gcc -m64 -march=barcelona -Os -fomit-frame-pointer  20161216  20161026
774986  ref  gcc -m64 -march=native -mtune=native -Os -fomit-frame-pointer  20161216  20161026
775092  ref  gcc -m64 -march=core2 -msse4 -Os -fomit-frame-pointer  20161216  20161026
775418  ref  gcc -m64 -march=core-avx2 -Os -fomit-frame-pointer  20161216  20161026
777982  ref  gcc -m64 -march=core-avx-i -Os -fomit-frame-pointer  20161216  20161026
780074  ref  gcc -m64 -march=core2 -Os -fomit-frame-pointer  20161216  20161026
919376  ref  gcc -funroll-loops -m64 -Os -fomit-frame-pointer  20161216  20161026
919478  ref  gcc -funroll-loops -Os -fomit-frame-pointer  20161216  20161026
920272  ref  gcc -funroll-loops -m64 -march=k8 -Os -fomit-frame-pointer  20161216  20161026
920360  ref  gcc -funroll-loops -march=nocona -Os -fomit-frame-pointer  20161216  20161026
920634  ref  gcc -funroll-loops -fno-schedule-insns -Os -fomit-frame-pointer  20161216  20161026
921134  ref  gcc -funroll-loops -march=k8 -Os -fomit-frame-pointer  20161216  20161026
922708  ref  gcc -funroll-loops -m64 -march=nocona -Os -fomit-frame-pointer  20161216  20161026
923832  ref  gcc -funroll-loops -m64 -march=barcelona -Os -fomit-frame-pointer  20161216  20161026
929718  ref  gcc -funroll-loops -march=barcelona -Os -fomit-frame-pointer  20161216  20161026
1789692  ref  gcc  20161216  20161026
1791824  ref  gcc -funroll-loops  20161216  20161026
1797236  ref  cc  20161216  20161026

Compiler output

Implementation: crypto_aead/scream12v2/sse
Compiler: cc
scream.c: scream.c: In function 'LBox16P':
scream.c: scream.c:185:10: warning: implicit declaration of function '__builtin_ia32_pshufb128' [-Wimplicit-function-declaration]
scream.c: A = __builtin_ia32_pshufb128(table, t0);
scream.c: ^~~~~~~~~~~~~~~~~~~~~~~~
scream.c: scream.c:185:8: error: incompatible types when assigning to type 'v16qi {aka __vector(16) char}' from type 'int'
scream.c: A = __builtin_ia32_pshufb128(table, t0);
scream.c: ^
scream.c: scream.c:186:8: error: incompatible types when assigning to type 'v16qi {aka __vector(16) char}' from type 'int'
scream.c: C = __builtin_ia32_pshufb128(table, t1);
scream.c: ^
scream.c: scream.c:190:8: error: incompatible types when assigning to type 'v16qi {aka __vector(16) char}' from type 'int'
scream.c: B = __builtin_ia32_pshufb128(table, t0);
scream.c: ^
scream.c: scream.c:191:8: error: incompatible types when assigning to type 'v16qi {aka __vector(16) char}' from type 'int'
scream.c: D = __builtin_ia32_pshufb128(table, t1);
scream.c: ^
scream.c: scream.c:198:7: error: conversion of scalar 'int' to vector 'v16qi {aka __vector(16) char}' involves truncation
scream.c: A ^= __builtin_ia32_pshufb128(table, in[0]);
scream.c: ^~
scream.c: scream.c:199:7: error: conversion of scalar 'int' to vector 'v16qi {aka __vector(16) char}' involves truncation
scream.c: C ^= __builtin_ia32_pshufb128(table, in[2]);
scream.c: ^~
scream.c: scream.c:203:7: error: conversion of scalar 'int' to vector 'v16qi {aka __vector(16) char}' involves truncation
scream.c: B ^= __builtin_ia32_pshufb128(table, in[0]);
scream.c: ^~
scream.c: ...

Number of similar (compiler,implementation) pairs: 71, namely:
Compiler  Implementations
cc sse
gcc sse
gcc -O2 -fomit-frame-pointer sse
gcc -O3 -fomit-frame-pointer sse
gcc -O -fomit-frame-pointer sse
gcc -Os -fomit-frame-pointer sse
gcc -fno-schedule-insns -O2 -fomit-frame-pointer sse
gcc -fno-schedule-insns -O3 -fomit-frame-pointer sse
gcc -fno-schedule-insns -O -fomit-frame-pointer sse
gcc -fno-schedule-insns -Os -fomit-frame-pointer sse
gcc -funroll-loops sse
gcc -funroll-loops -O2 -fomit-frame-pointer sse
gcc -funroll-loops -O3 -fomit-frame-pointer sse
gcc -funroll-loops -O -fomit-frame-pointer sse
gcc -funroll-loops -Os -fomit-frame-pointer sse
gcc -funroll-loops -fno-schedule-insns -O2 -fomit-frame-pointer sse
gcc -funroll-loops -fno-schedule-insns -O3 -fomit-frame-pointer sse
gcc -funroll-loops -fno-schedule-insns -O -fomit-frame-pointer sse
gcc -funroll-loops -fno-schedule-insns -Os -fomit-frame-pointer sse
gcc -funroll-loops -m64 -O2 -fomit-frame-pointer sse
gcc -funroll-loops -m64 -O3 -fomit-frame-pointer sse
gcc -funroll-loops -m64 -O -fomit-frame-pointer sse
gcc -funroll-loops -m64 -Os -fomit-frame-pointer sse
gcc -funroll-loops -m64 -march=barcelona -O2 -fomit-frame-pointer sse
gcc -funroll-loops -m64 -march=barcelona -O3 -fomit-frame-pointer sse
gcc -funroll-loops -m64 -march=barcelona -O -fomit-frame-pointer sse
gcc -funroll-loops -m64 -march=barcelona -Os -fomit-frame-pointer sse
gcc -funroll-loops -m64 -march=k8 -O2 -fomit-frame-pointer sse
gcc -funroll-loops -m64 -march=k8 -O3 -fomit-frame-pointer sse
gcc -funroll-loops -m64 -march=k8 -O -fomit-frame-pointer sse
gcc -funroll-loops -m64 -march=k8 -Os -fomit-frame-pointer sse
gcc -funroll-loops -m64 -march=nocona -O2 -fomit-frame-pointer sse
gcc -funroll-loops -m64 -march=nocona -O3 -fomit-frame-pointer sse
gcc -funroll-loops -m64 -march=nocona -O -fomit-frame-pointer sse
gcc -funroll-loops -m64 -march=nocona -Os -fomit-frame-pointer sse
gcc -funroll-loops -march=barcelona -O2 -fomit-frame-pointer sse
gcc -funroll-loops -march=barcelona -O3 -fomit-frame-pointer sse
gcc -funroll-loops -march=barcelona -O -fomit-frame-pointer sse
gcc -funroll-loops -march=barcelona -Os -fomit-frame-pointer sse
gcc -funroll-loops -march=k8 -O2 -fomit-frame-pointer sse
gcc -funroll-loops -march=k8 -O3 -fomit-frame-pointer sse
gcc -funroll-loops -march=k8 -O -fomit-frame-pointer sse
gcc -funroll-loops -march=k8 -Os -fomit-frame-pointer sse
gcc -funroll-loops -march=nocona -O2 -fomit-frame-pointer sse
gcc -funroll-loops -march=nocona -O3 -fomit-frame-pointer sse
gcc -funroll-loops -march=nocona -O -fomit-frame-pointer sse
gcc -funroll-loops -march=nocona -Os -fomit-frame-pointer sse
gcc -m64 -O2 -fomit-frame-pointer sse
gcc -m64 -O3 -fomit-frame-pointer sse
gcc -m64 -O -fomit-frame-pointer sse
gcc -m64 -Os -fomit-frame-pointer sse
gcc -m64 -march=k8 -O2 -fomit-frame-pointer sse
gcc -m64 -march=k8 -O3 -fomit-frame-pointer sse
gcc -m64 -march=k8 -O -fomit-frame-pointer sse
gcc -m64 -march=k8 -Os -fomit-frame-pointer sse
gcc -m64 -march=nocona -O2 -fomit-frame-pointer sse
gcc -m64 -march=nocona -O3 -fomit-frame-pointer sse
gcc -m64 -march=nocona -O -fomit-frame-pointer sse
gcc -m64 -march=nocona -Os -fomit-frame-pointer sse
gcc -march=barcelona -O2 -fomit-frame-pointer sse
gcc -march=barcelona -O3 -fomit-frame-pointer sse
gcc -march=barcelona -O -fomit-frame-pointer sse
gcc -march=barcelona -Os -fomit-frame-pointer sse
gcc -march=k8 -O2 -fomit-frame-pointer sse
gcc -march=k8 -O3 -fomit-frame-pointer sse
gcc -march=k8 -O -fomit-frame-pointer sse
gcc -march=k8 -Os -fomit-frame-pointer sse
gcc -march=nocona -O2 -fomit-frame-pointer sse
gcc -march=nocona -O3 -fomit-frame-pointer sse
gcc -march=nocona -O -fomit-frame-pointer sse
gcc -march=nocona -Os -fomit-frame-pointer sse
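
The failures above share one cause: `__builtin_ia32_pshufb128` is only available when the compiler targets SSSE3, and none of the listed option sets (plain `gcc`, `-march=k8`, `-march=nocona`, `-march=barcelona`) enable it. GCC then treats the call as an implicitly declared function returning `int`, which produces the type errors shown. The following is a minimal sketch, not the scream12v2 author's code, of how a 16-byte table lookup could be guarded so that it still compiles without SSSE3. The function name `shuffle_bytes16` is hypothetical, and the scalar branch assumes the documented `pshufb` behaviour (an index byte with its top bit set yields zero, otherwise the low four bits select a table byte).

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#if defined(__SSSE3__)
#include <tmmintrin.h> /* _mm_shuffle_epi8, the intrinsic form of pshufb */
#endif

/* Hypothetical helper (not from scream.c): byte-wise table lookup with
 * pshufb semantics. out may alias table or idx. */
void shuffle_bytes16(uint8_t out[16], const uint8_t table[16],
                     const uint8_t idx[16])
{
    assert(out != NULL && table != NULL && idx != NULL);
#if defined(__SSSE3__)
    /* SSSE3 path: only compiled when the target enables the feature. */
    __m128i t = _mm_loadu_si128((const __m128i *)table);
    __m128i i = _mm_loadu_si128((const __m128i *)idx);
    _mm_storeu_si128((__m128i *)out, _mm_shuffle_epi8(t, i));
#else
    /* Portable fallback: same result, one byte at a time. */
    uint8_t tmp[16];
    for (int k = 0; k < 16; k++)
        tmp[k] = (idx[k] & 0x80) ? 0 : table[idx[k] & 0x0f];
    memcpy(out, tmp, sizeof tmp);
#endif
}
```

A guard of this kind trades speed for portability: the scalar branch would likely run closer to the `ref` timings than to the `sse` ones, so it is a way to keep a build from failing rather than a replacement for compiling with `-mssse3` or a suitable `-march`.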

Compiler output

Implementation: crypto_aead/scream12v2/sse
Compiler: clang -O3 -fomit-frame-pointer -Qunused-arguments
scream.c: scream.c:185:10: error: '__builtin_ia32_pshufb128' needs target feature ssse3
scream.c: A = __builtin_ia32_pshufb128(table, t0);
scream.c: ^
scream.c: scream.c:186:10: error: '__builtin_ia32_pshufb128' needs target feature ssse3
scream.c: C = __builtin_ia32_pshufb128(table, t1);
scream.c: ^
scream.c: scream.c:190:10: error: '__builtin_ia32_pshufb128' needs target feature ssse3
scream.c: B = __builtin_ia32_pshufb128(table, t0);
scream.c: ^
scream.c: scream.c:191:10: error: '__builtin_ia32_pshufb128' needs target feature ssse3
scream.c: D = __builtin_ia32_pshufb128(table, t1);
scream.c: ^
scream.c: scream.c:198:10: error: '__builtin_ia32_pshufb128' needs target feature ssse3
scream.c: A ^= __builtin_ia32_pshufb128(table, in[0]);
scream.c: ^
scream.c: scream.c:199:10: error: '__builtin_ia32_pshufb128' needs target feature ssse3
scream.c: C ^= __builtin_ia32_pshufb128(table, in[2]);
scream.c: ^
scream.c: scream.c:203:10: error: '__builtin_ia32_pshufb128' needs target feature ssse3
scream.c: B ^= __builtin_ia32_pshufb128(table, in[0]);
scream.c: ^
scream.c: scream.c:204:10: error: '__builtin_ia32_pshufb128' needs target feature ssse3
scream.c: D ^= __builtin_ia32_pshufb128(table, in[2]);
scream.c: ^
scream.c: scream.c:211:10: error: '__builtin_ia32_pshufb128' needs target feature ssse3
scream.c: ...

Number of similar (compiler,implementation) pairs: 4, namely:
Compiler  Implementations
clang -O3 -fomit-frame-pointer -Qunused-arguments sse
clang -mcpu=cortex-a8 -mfpu=neon -O3 -fomit-frame-pointer -fwrapv -Qunused-arguments sse
clang -mcpu=cortex-a9 -mfpu=neon -O3 -fomit-frame-pointer -fwrapv -Qunused-arguments sse
clang -mcpu=native -mfpu=neon -O3 -fomit-frame-pointer -fwrapv -Qunused-arguments sse

Compiler output

Implementation: crypto_aead/scream12v2/sse
Compiler: gcc -m64 -march=barcelona -O2 -fomit-frame-pointer
scream.c: scream.c: In function 'LBox16P':
scream.c: scream.c:185:10: warning: implicit declaration of function '__builtin_ia32_pshufb128' [-Wimplicit-function-declaration]
scream.c: A = __builtin_ia32_pshufb128(table, t0);
scream.c: ^~~~~~~~~~~~~~~~~~~~~~~~
scream.c: scream.c:185:8: error: incompatible types when assigning to type 'v16qi {aka __vector(16) char}' from type 'int'
scream.c: A = __builtin_ia32_pshufb128(table, t0);
scream.c: ^
scream.c: scream.c:186:8: error: incompatible types when assigning to type 'v16qi {aka __vector(16) char}' from type 'int'
scream.c: C = __builtin_ia32_pshufb128(table, t1);
scream.c: ^
scream.c: scream.c:190:8: error: incompatible types when assigning to type 'v16qi {aka __vector(16) char}' from type 'int'
scream.c: B = __builtin_ia32_pshufb128(table, t0);
scream.c: ^
scream.c: scream.c:191:8: error: incompatible types when assigning to type 'v16qi {aka __vector(16) char}' from type 'int'
scream.c: D = __builtin_ia32_pshufb128(table, t1);
scream.c: ^
scream.c: scream.c:198:7: error: conversion of scalar 'int' to vector 'v16qi {aka __vector(16) char}' involves truncation
scream.c: A ^= __builtin_ia32_pshufb128(table, in[0]);
scream.c: ^~
scream.c: scream.c:199:7: error: conversion of scalar 'int' to vector 'v16qi {aka __vector(16) char}' involves truncation
scream.c: C ^= __builtin_ia32_pshufb128(table, in[2]);
scream.c: ^~
scream.c: scream.c:203:7: error: conversion of scalar 'int' to vector 'v16qi {aka __vector(16) char}' involves truncation
scream.c: B ^= __builtin_ia32_pshufb128(table, in[0]);
scream.c: ^~
scream.c: ...

Number of similar (compiler,implementation) pairs: 4, namely:
Compiler  Implementations
gcc -m64 -march=barcelona -O2 -fomit-frame-pointer sse
gcc -m64 -march=barcelona -O3 -fomit-frame-pointer sse
gcc -m64 -march=barcelona -O -fomit-frame-pointer sse
gcc -m64 -march=barcelona -Os -fomit-frame-pointer sse