Implementation notes: amd64, skylake, crypto_aead/scream10v2

Computer: skylake
Architecture: amd64
CPU ID: GenuineIntel-000506e3-bfebfbff
SUPERCOP version: 20161026
Operation: crypto_aead
Primitive: scream10v2
Time  Implementation  Compiler  Benchmark date  SUPERCOP version
57690  sse  gcc -m64 -march=native -mtune=native -O3 -fomit-frame-pointer  20161216  20161026
57792  sse  gcc -m64 -march=core-avx-i -O3 -fomit-frame-pointer  20161216  20161026
58100  sse  gcc -m64 -march=corei7-avx -O3 -fomit-frame-pointer  20161216  20161026
58514  sse  gcc -m64 -march=core-avx2 -O3 -fomit-frame-pointer  20161216  20161026
59888  sse  gcc -march=native -mtune=native -O3 -fomit-frame-pointer -fwrapv  20161216  20161026
62944  sse  gcc -m64 -march=corei7 -O3 -fomit-frame-pointer  20161216  20161026
63206  sse  gcc -m64 -march=core2 -msse4.1 -O3 -fomit-frame-pointer  20161216  20161026
63478  sse  gcc -m64 -march=core2 -O3 -fomit-frame-pointer  20161216  20161026
63740  sse  clang -O3 -fwrapv -march=x86-64 -mcpu=core-avx2 -mavx2 -maes -mpclmul -fomit-frame-pointer -Qunused-arguments  20161216  20161026
63772  sse  clang -O3 -fwrapv -march=native -fomit-frame-pointer -Qunused-arguments  20161216  20161026
63870  sse  clang -O3 -fwrapv -mavx -fomit-frame-pointer -Qunused-arguments  20161216  20161026
64586  sse  gcc -m64 -march=native -mtune=native -O2 -fomit-frame-pointer  20161216  20161026
64656  sse  clang -O3 -fwrapv -mavx2 -fomit-frame-pointer -Qunused-arguments  20161216  20161026
64810  sse  gcc -m64 -march=core-avx2 -O2 -fomit-frame-pointer  20161216  20161026
64822  sse  gcc -m64 -march=core2 -msse4 -O3 -fomit-frame-pointer  20161216  20161026
64932  sse  gcc -m64 -march=corei7-avx -O2 -fomit-frame-pointer  20161216  20161026
65008  sse  gcc -m64 -march=native -mtune=native -Os -fomit-frame-pointer  20161216  20161026
65154  sse  gcc -m64 -march=core-avx-i -Os -fomit-frame-pointer  20161216  20161026
65196  sse  gcc -m64 -march=corei7-avx -Os -fomit-frame-pointer  20161216  20161026
65348  sse  gcc -m64 -march=core-avx2 -Os -fomit-frame-pointer  20161216  20161026
65658  sse  gcc -m64 -march=core-avx-i -O2 -fomit-frame-pointer  20161216  20161026
66788  sse  gcc -march=native -mtune=native -O2 -fomit-frame-pointer -fwrapv  20161216  20161026
68092  sse  gcc -march=native -mtune=native -Os -fomit-frame-pointer -fwrapv  20161216  20161026
68602  sse  gcc -m64 -march=core2 -O2 -fomit-frame-pointer  20161216  20161026
68714  sse  gcc -m64 -march=corei7 -O2 -fomit-frame-pointer  20161216  20161026
68874  sse  gcc -m64 -march=core2 -msse4 -O2 -fomit-frame-pointer  20161216  20161026
69330  sse  gcc -m64 -march=core-avx2 -O -fomit-frame-pointer  20161216  20161026
69394  sse  gcc -m64 -march=corei7-avx -O -fomit-frame-pointer  20161216  20161026
69540  sse  gcc -march=native -mtune=native -O -fomit-frame-pointer -fwrapv  20161216  20161026
69556  sse  gcc -m64 -march=core-avx-i -O -fomit-frame-pointer  20161216  20161026
70334  sse  gcc -m64 -march=core2 -msse4.1 -O2 -fomit-frame-pointer  20161216  20161026
72124  sse  gcc -m64 -march=native -mtune=native -O -fomit-frame-pointer  20161216  20161026
72678  sse  clang -march=native -O3 -fomit-frame-pointer -fwrapv -Qunused-arguments  20161216  20161026
74796  sse  gcc -m64 -march=core2 -msse4 -O -fomit-frame-pointer  20161216  20161026
75016  sse  gcc -m64 -march=core2 -msse4.1 -O -fomit-frame-pointer  20161216  20161026
75512  sse  gcc -m64 -march=core2 -O -fomit-frame-pointer  20161216  20161026
76020  sse  gcc -m64 -march=corei7 -O -fomit-frame-pointer  20161216  20161026
78066  sse  clang -O3 -fwrapv -mavx -maes -mpclmul -fomit-frame-pointer -Qunused-arguments  20161216  20161026
101570  sse  gcc -m64 -march=corei7 -Os -fomit-frame-pointer  20161216  20161026
102068  sse  gcc -m64 -march=core2 -msse4 -Os -fomit-frame-pointer  20161216  20161026
102168  sse  gcc -m64 -march=core2 -Os -fomit-frame-pointer  20161216  20161026
102234  sse  gcc -m64 -march=core2 -msse4.1 -Os -fomit-frame-pointer  20161216  20161026
270502  ref  gcc -funroll-loops -march=nocona -O3 -fomit-frame-pointer  20161216  20161026
270600  ref  gcc -funroll-loops -m64 -march=nocona -O3 -fomit-frame-pointer  20161216  20161026
270744  ref  gcc -m64 -march=barcelona -O3 -fomit-frame-pointer  20161216  20161026
270900  ref  gcc -march=barcelona -O3 -fomit-frame-pointer  20161216  20161026
271032  ref  gcc -m64 -march=core2 -msse4 -O3 -fomit-frame-pointer  20161216  20161026
271162  ref  gcc -m64 -march=core2 -O3 -fomit-frame-pointer  20161216  20161026
271266  ref  gcc -m64 -march=core2 -msse4.1 -O3 -fomit-frame-pointer  20161216  20161026
271318  ref  gcc -m64 -march=core-avx2 -O3 -fomit-frame-pointer  20161216  20161026
271522  ref  gcc -march=k8 -O3 -fomit-frame-pointer  20161216  20161026
271562  ref  gcc -march=nocona -O3 -fomit-frame-pointer  20161216  20161026
271612  ref  gcc -m64 -march=native -mtune=native -O3 -fomit-frame-pointer  20161216  20161026
271948  ref  gcc -m64 -march=k8 -O3 -fomit-frame-pointer  20161216  20161026
272400  ref  gcc -m64 -march=nocona -O3 -fomit-frame-pointer  20161216  20161026
272514  ref  gcc -m64 -march=corei7-avx -O3 -fomit-frame-pointer  20161216  20161026
272850  ref  gcc -m64 -march=corei7 -O3 -fomit-frame-pointer  20161216  20161026
273612  ref  gcc -m64 -march=core-avx-i -O3 -fomit-frame-pointer  20161216  20161026
274208  ref  gcc -funroll-loops -march=k8 -O3 -fomit-frame-pointer  20161216  20161026
274322  ref  gcc -funroll-loops -m64 -march=k8 -O3 -fomit-frame-pointer  20161216  20161026
275726  ref  gcc -funroll-loops -m64 -march=barcelona -O3 -fomit-frame-pointer  20161216  20161026
275954  ref  gcc -march=native -mtune=native -O3 -fomit-frame-pointer -fwrapv  20161216  20161026
276504  ref  gcc -funroll-loops -march=barcelona -O3 -fomit-frame-pointer  20161216  20161026
276852  ref  gcc -funroll-loops -O2 -fomit-frame-pointer  20161216  20161026
276894  ref  gcc -funroll-loops -m64 -march=k8 -O2 -fomit-frame-pointer  20161216  20161026
276900  ref  gcc -funroll-loops -fno-schedule-insns -O2 -fomit-frame-pointer  20161216  20161026
276956  ref  gcc -funroll-loops -m64 -march=barcelona -O2 -fomit-frame-pointer  20161216  20161026
277008  ref  gcc -funroll-loops -march=k8 -O2 -fomit-frame-pointer  20161216  20161026
277378  ref  gcc -funroll-loops -m64 -O2 -fomit-frame-pointer  20161216  20161026
277500  ref  gcc -funroll-loops -march=barcelona -O2 -fomit-frame-pointer  20161216  20161026
277934  ref  gcc -funroll-loops -march=nocona -O2 -fomit-frame-pointer  20161216  20161026
278748  ref  gcc -funroll-loops -m64 -march=nocona -O2 -fomit-frame-pointer  20161216  20161026
287430  ref  gcc -funroll-loops -m64 -march=nocona -O -fomit-frame-pointer  20161216  20161026
288422  ref  gcc -funroll-loops -march=nocona -O -fomit-frame-pointer  20161216  20161026
288722  ref  gcc -funroll-loops -m64 -O3 -fomit-frame-pointer  20161216  20161026
288808  ref  gcc -m64 -O3 -fomit-frame-pointer  20161216  20161026
289134  ref  gcc -funroll-loops -fno-schedule-insns -O3 -fomit-frame-pointer  20161216  20161026
289640  ref  gcc -funroll-loops -m64 -O -fomit-frame-pointer  20161216  20161026
289926  ref  gcc -funroll-loops -O -fomit-frame-pointer  20161216  20161026
290032  ref  gcc -funroll-loops -m64 -march=barcelona -O -fomit-frame-pointer  20161216  20161026
290366  ref  gcc -funroll-loops -fno-schedule-insns -O -fomit-frame-pointer  20161216  20161026
290922  ref  gcc -funroll-loops -march=barcelona -O -fomit-frame-pointer  20161216  20161026
291674  ref  gcc -funroll-loops -march=k8 -O -fomit-frame-pointer  20161216  20161026
292056  ref  gcc -fno-schedule-insns -O3 -fomit-frame-pointer  20161216  20161026
292370  ref  gcc -O3 -fomit-frame-pointer  20161216  20161026
292712  ref  gcc -funroll-loops -O3 -fomit-frame-pointer  20161216  20161026
292912  ref  gcc -funroll-loops -m64 -march=k8 -O -fomit-frame-pointer  20161216  20161026
420332  ref  clang -march=native -O3 -fomit-frame-pointer -fwrapv -Qunused-arguments  20161216  20161026
420348  ref  clang -O3 -fwrapv -march=native -fomit-frame-pointer -Qunused-arguments  20161216  20161026
431768  ref  gcc -march=k8 -O2 -fomit-frame-pointer  20161216  20161026
432470  ref  gcc -march=barcelona -O2 -fomit-frame-pointer  20161216  20161026
432542  ref  gcc -fno-schedule-insns -O2 -fomit-frame-pointer  20161216  20161026
432578  ref  gcc -m64 -march=nocona -O2 -fomit-frame-pointer  20161216  20161026
432858  ref  gcc -m64 -march=corei7 -O2 -fomit-frame-pointer  20161216  20161026
432940  ref  gcc -O2 -fomit-frame-pointer  20161216  20161026
433150  ref  gcc -march=nocona -O2 -fomit-frame-pointer  20161216  20161026
433250  ref  gcc -m64 -march=barcelona -O2 -fomit-frame-pointer  20161216  20161026
433512  ref  gcc -m64 -march=corei7-avx -O2 -fomit-frame-pointer  20161216  20161026
433570  ref  gcc -m64 -march=core2 -O2 -fomit-frame-pointer  20161216  20161026
433596  ref  gcc -m64 -march=k8 -O2 -fomit-frame-pointer  20161216  20161026
434212  ref  gcc -m64 -march=core2 -msse4 -O2 -fomit-frame-pointer  20161216  20161026
434404  ref  clang -O3 -fwrapv -mavx -fomit-frame-pointer -Qunused-arguments  20161216  20161026
434432  ref  gcc -m64 -march=core-avx-i -O2 -fomit-frame-pointer  20161216  20161026
434666  ref  clang -O3 -fwrapv -mavx2 -fomit-frame-pointer -Qunused-arguments  20161216  20161026
435080  ref  clang -O3 -fwrapv -mavx -maes -mpclmul -fomit-frame-pointer -Qunused-arguments  20161216  20161026
435112  ref  clang -O3 -fwrapv -march=x86-64 -mcpu=core-avx2 -mavx2 -maes -mpclmul -fomit-frame-pointer -Qunused-arguments  20161216  20161026
435200  ref  gcc -m64 -O2 -fomit-frame-pointer  20161216  20161026
435574  ref  gcc -m64 -march=core-avx2 -O2 -fomit-frame-pointer  20161216  20161026
435780  ref  gcc -m64 -march=core2 -msse4.1 -O2 -fomit-frame-pointer  20161216  20161026
439150  ref  gcc -march=native -mtune=native -O2 -fomit-frame-pointer -fwrapv  20161216  20161026
439348  ref  gcc -m64 -march=native -mtune=native -O2 -fomit-frame-pointer  20161216  20161026
446950  ref  gcc -m64 -march=nocona -O -fomit-frame-pointer  20161216  20161026
448738  ref  gcc -march=nocona -O -fomit-frame-pointer  20161216  20161026
451646  ref  gcc -m64 -march=core-avx2 -O -fomit-frame-pointer  20161216  20161026
452846  ref  gcc -O -fomit-frame-pointer  20161216  20161026
453086  ref  gcc -m64 -march=k8 -O -fomit-frame-pointer  20161216  20161026
453740  ref  gcc -march=native -mtune=native -O -fomit-frame-pointer -fwrapv  20161216  20161026
454162  ref  gcc -m64 -march=core2 -msse4 -O -fomit-frame-pointer  20161216  20161026
454344  ref  gcc -m64 -march=corei7-avx -O -fomit-frame-pointer  20161216  20161026
454524  ref  gcc -march=k8 -O -fomit-frame-pointer  20161216  20161026
454748  ref  gcc -m64 -march=core2 -O -fomit-frame-pointer  20161216  20161026
455100  ref  gcc -m64 -O -fomit-frame-pointer  20161216  20161026
455212  ref  gcc -fno-schedule-insns -O -fomit-frame-pointer  20161216  20161026
455374  ref  gcc -m64 -march=corei7 -O -fomit-frame-pointer  20161216  20161026
455516  ref  gcc -m64 -march=core2 -msse4.1 -O -fomit-frame-pointer  20161216  20161026
455658  ref  gcc -m64 -march=barcelona -O -fomit-frame-pointer  20161216  20161026
455730  ref  gcc -m64 -march=core-avx-i -O -fomit-frame-pointer  20161216  20161026
456868  ref  gcc -march=barcelona -O -fomit-frame-pointer  20161216  20161026
457186  ref  gcc -m64 -march=native -mtune=native -O -fomit-frame-pointer  20161216  20161026
463070  ref  clang -mcpu=cortex-a8 -mfpu=neon -O3 -fomit-frame-pointer -fwrapv -Qunused-arguments  20161216  20161026
463424  ref  clang -mcpu=cortex-a9 -mfpu=neon -O3 -fomit-frame-pointer -fwrapv -Qunused-arguments  20161216  20161026
466946  ref  clang -mcpu=native -mfpu=neon -O3 -fomit-frame-pointer -fwrapv -Qunused-arguments  20161216  20161026
473978  ref  clang -O3 -fomit-frame-pointer -Qunused-arguments  20161216  20161026
646780  ref  gcc -march=k8 -Os -fomit-frame-pointer  20161216  20161026
646898  ref  gcc -march=nocona -Os -fomit-frame-pointer  20161216  20161026
647436  ref  gcc -march=barcelona -Os -fomit-frame-pointer  20161216  20161026
647814  ref  gcc -fno-schedule-insns -Os -fomit-frame-pointer  20161216  20161026
647880  ref  gcc -m64 -Os -fomit-frame-pointer  20161216  20161026
649102  ref  gcc -m64 -march=native -mtune=native -Os -fomit-frame-pointer  20161216  20161026
649978  ref  gcc -march=native -mtune=native -Os -fomit-frame-pointer -fwrapv  20161216  20161026
650278  ref  gcc -m64 -march=corei7-avx -Os -fomit-frame-pointer  20161216  20161026
651108  ref  gcc -m64 -march=corei7 -Os -fomit-frame-pointer  20161216  20161026
651846  ref  gcc -Os -fomit-frame-pointer  20161216  20161026
652152  ref  gcc -m64 -march=k8 -Os -fomit-frame-pointer  20161216  20161026
652704  ref  gcc -m64 -march=core2 -msse4 -Os -fomit-frame-pointer  20161216  20161026
652750  ref  gcc -m64 -march=nocona -Os -fomit-frame-pointer  20161216  20161026
653664  ref  gcc -m64 -march=core2 -msse4.1 -Os -fomit-frame-pointer  20161216  20161026
654000  ref  gcc -m64 -march=core-avx-i -Os -fomit-frame-pointer  20161216  20161026
655204  ref  gcc -m64 -march=core2 -Os -fomit-frame-pointer  20161216  20161026
655438  ref  gcc -m64 -march=barcelona -Os -fomit-frame-pointer  20161216  20161026
655762  ref  gcc -m64 -march=core-avx2 -Os -fomit-frame-pointer  20161216  20161026
773492  ref  gcc -funroll-loops -march=k8 -Os -fomit-frame-pointer  20161216  20161026
773808  ref  gcc -funroll-loops -Os -fomit-frame-pointer  20161216  20161026
774722  ref  gcc -funroll-loops -m64 -march=nocona -Os -fomit-frame-pointer  20161216  20161026
774896  ref  gcc -funroll-loops -march=barcelona -Os -fomit-frame-pointer  20161216  20161026
775258  ref  gcc -funroll-loops -m64 -march=barcelona -Os -fomit-frame-pointer  20161216  20161026
775660  ref  gcc -funroll-loops -march=nocona -Os -fomit-frame-pointer  20161216  20161026
776666  ref  gcc -funroll-loops -m64 -march=k8 -Os -fomit-frame-pointer  20161216  20161026
778186  ref  gcc -funroll-loops -fno-schedule-insns -Os -fomit-frame-pointer  20161216  20161026
780422  ref  gcc -funroll-loops -m64 -Os -fomit-frame-pointer  20161216  20161026
1519340  ref  cc  20161216  20161026
1524550  ref  gcc  20161216  20161026
1530144  ref  gcc -funroll-loops  20161216  20161026

Compiler output

Implementation: crypto_aead/scream10v2/sse
Compiler: cc
scream.c: scream.c: In function 'LBox16P':
scream.c: scream.c:185:10: warning: implicit declaration of function '__builtin_ia32_pshufb128' [-Wimplicit-function-declaration]
scream.c: A = __builtin_ia32_pshufb128(table, t0);
scream.c: ^~~~~~~~~~~~~~~~~~~~~~~~
scream.c: scream.c:185:8: error: incompatible types when assigning to type 'v16qi {aka __vector(16) char}' from type 'int'
scream.c: A = __builtin_ia32_pshufb128(table, t0);
scream.c: ^
scream.c: scream.c:186:8: error: incompatible types when assigning to type 'v16qi {aka __vector(16) char}' from type 'int'
scream.c: C = __builtin_ia32_pshufb128(table, t1);
scream.c: ^
scream.c: scream.c:190:8: error: incompatible types when assigning to type 'v16qi {aka __vector(16) char}' from type 'int'
scream.c: B = __builtin_ia32_pshufb128(table, t0);
scream.c: ^
scream.c: scream.c:191:8: error: incompatible types when assigning to type 'v16qi {aka __vector(16) char}' from type 'int'
scream.c: D = __builtin_ia32_pshufb128(table, t1);
scream.c: ^
scream.c: scream.c:198:7: error: conversion of scalar 'int' to vector 'v16qi {aka __vector(16) char}' involves truncation
scream.c: A ^= __builtin_ia32_pshufb128(table, in[0]);
scream.c: ^~
scream.c: scream.c:199:7: error: conversion of scalar 'int' to vector 'v16qi {aka __vector(16) char}' involves truncation
scream.c: C ^= __builtin_ia32_pshufb128(table, in[2]);
scream.c: ^~
scream.c: scream.c:203:7: error: conversion of scalar 'int' to vector 'v16qi {aka __vector(16) char}' involves truncation
scream.c: B ^= __builtin_ia32_pshufb128(table, in[0]);
scream.c: ^~
scream.c: ...

Number of similar (compiler,implementation) pairs: 71, namely:
Compiler  Implementations
cc sse
gcc sse
gcc -O2 -fomit-frame-pointer sse
gcc -O3 -fomit-frame-pointer sse
gcc -O -fomit-frame-pointer sse
gcc -Os -fomit-frame-pointer sse
gcc -fno-schedule-insns -O2 -fomit-frame-pointer sse
gcc -fno-schedule-insns -O3 -fomit-frame-pointer sse
gcc -fno-schedule-insns -O -fomit-frame-pointer sse
gcc -fno-schedule-insns -Os -fomit-frame-pointer sse
gcc -funroll-loops sse
gcc -funroll-loops -O2 -fomit-frame-pointer sse
gcc -funroll-loops -O3 -fomit-frame-pointer sse
gcc -funroll-loops -O -fomit-frame-pointer sse
gcc -funroll-loops -Os -fomit-frame-pointer sse
gcc -funroll-loops -fno-schedule-insns -O2 -fomit-frame-pointer sse
gcc -funroll-loops -fno-schedule-insns -O3 -fomit-frame-pointer sse
gcc -funroll-loops -fno-schedule-insns -O -fomit-frame-pointer sse
gcc -funroll-loops -fno-schedule-insns -Os -fomit-frame-pointer sse
gcc -funroll-loops -m64 -O2 -fomit-frame-pointer sse
gcc -funroll-loops -m64 -O3 -fomit-frame-pointer sse
gcc -funroll-loops -m64 -O -fomit-frame-pointer sse
gcc -funroll-loops -m64 -Os -fomit-frame-pointer sse
gcc -funroll-loops -m64 -march=barcelona -O2 -fomit-frame-pointer sse
gcc -funroll-loops -m64 -march=barcelona -O3 -fomit-frame-pointer sse
gcc -funroll-loops -m64 -march=barcelona -O -fomit-frame-pointer sse
gcc -funroll-loops -m64 -march=barcelona -Os -fomit-frame-pointer sse
gcc -funroll-loops -m64 -march=k8 -O2 -fomit-frame-pointer sse
gcc -funroll-loops -m64 -march=k8 -O3 -fomit-frame-pointer sse
gcc -funroll-loops -m64 -march=k8 -O -fomit-frame-pointer sse
gcc -funroll-loops -m64 -march=k8 -Os -fomit-frame-pointer sse
gcc -funroll-loops -m64 -march=nocona -O2 -fomit-frame-pointer sse
gcc -funroll-loops -m64 -march=nocona -O3 -fomit-frame-pointer sse
gcc -funroll-loops -m64 -march=nocona -O -fomit-frame-pointer sse
gcc -funroll-loops -m64 -march=nocona -Os -fomit-frame-pointer sse
gcc -funroll-loops -march=barcelona -O2 -fomit-frame-pointer sse
gcc -funroll-loops -march=barcelona -O3 -fomit-frame-pointer sse
gcc -funroll-loops -march=barcelona -O -fomit-frame-pointer sse
gcc -funroll-loops -march=barcelona -Os -fomit-frame-pointer sse
gcc -funroll-loops -march=k8 -O2 -fomit-frame-pointer sse
gcc -funroll-loops -march=k8 -O3 -fomit-frame-pointer sse
gcc -funroll-loops -march=k8 -O -fomit-frame-pointer sse
gcc -funroll-loops -march=k8 -Os -fomit-frame-pointer sse
gcc -funroll-loops -march=nocona -O2 -fomit-frame-pointer sse
gcc -funroll-loops -march=nocona -O3 -fomit-frame-pointer sse
gcc -funroll-loops -march=nocona -O -fomit-frame-pointer sse
gcc -funroll-loops -march=nocona -Os -fomit-frame-pointer sse
gcc -m64 -O2 -fomit-frame-pointer sse
gcc -m64 -O3 -fomit-frame-pointer sse
gcc -m64 -O -fomit-frame-pointer sse
gcc -m64 -Os -fomit-frame-pointer sse
gcc -m64 -march=k8 -O2 -fomit-frame-pointer sse
gcc -m64 -march=k8 -O3 -fomit-frame-pointer sse
gcc -m64 -march=k8 -O -fomit-frame-pointer sse
gcc -m64 -march=k8 -Os -fomit-frame-pointer sse
gcc -m64 -march=nocona -O2 -fomit-frame-pointer sse
gcc -m64 -march=nocona -O3 -fomit-frame-pointer sse
gcc -m64 -march=nocona -O -fomit-frame-pointer sse
gcc -m64 -march=nocona -Os -fomit-frame-pointer sse
gcc -march=barcelona -O2 -fomit-frame-pointer sse
gcc -march=barcelona -O3 -fomit-frame-pointer sse
gcc -march=barcelona -O -fomit-frame-pointer sse
gcc -march=barcelona -Os -fomit-frame-pointer sse
gcc -march=k8 -O2 -fomit-frame-pointer sse
gcc -march=k8 -O3 -fomit-frame-pointer sse
gcc -march=k8 -O -fomit-frame-pointer sse
gcc -march=k8 -Os -fomit-frame-pointer sse
gcc -march=nocona -O2 -fomit-frame-pointer sse
gcc -march=nocona -O3 -fomit-frame-pointer sse
gcc -march=nocona -O -fomit-frame-pointer sse
gcc -march=nocona -Os -fomit-frame-pointer sse

Compiler output

Implementation: crypto_aead/scream10v2/sse
Compiler: clang -O3 -fomit-frame-pointer -Qunused-arguments
scream.c: scream.c:185:10: error: '__builtin_ia32_pshufb128' needs target feature ssse3
scream.c: A = __builtin_ia32_pshufb128(table, t0);
scream.c: ^
scream.c: scream.c:186:10: error: '__builtin_ia32_pshufb128' needs target feature ssse3
scream.c: C = __builtin_ia32_pshufb128(table, t1);
scream.c: ^
scream.c: scream.c:190:10: error: '__builtin_ia32_pshufb128' needs target feature ssse3
scream.c: B = __builtin_ia32_pshufb128(table, t0);
scream.c: ^
scream.c: scream.c:191:10: error: '__builtin_ia32_pshufb128' needs target feature ssse3
scream.c: D = __builtin_ia32_pshufb128(table, t1);
scream.c: ^
scream.c: scream.c:198:10: error: '__builtin_ia32_pshufb128' needs target feature ssse3
scream.c: A ^= __builtin_ia32_pshufb128(table, in[0]);
scream.c: ^
scream.c: scream.c:199:10: error: '__builtin_ia32_pshufb128' needs target feature ssse3
scream.c: C ^= __builtin_ia32_pshufb128(table, in[2]);
scream.c: ^
scream.c: scream.c:203:10: error: '__builtin_ia32_pshufb128' needs target feature ssse3
scream.c: B ^= __builtin_ia32_pshufb128(table, in[0]);
scream.c: ^
scream.c: scream.c:204:10: error: '__builtin_ia32_pshufb128' needs target feature ssse3
scream.c: D ^= __builtin_ia32_pshufb128(table, in[2]);
scream.c: ^
scream.c: scream.c:211:10: error: '__builtin_ia32_pshufb128' needs target feature ssse3
scream.c: ...

Number of similar (compiler,implementation) pairs: 4, namely:
Compiler  Implementations
clang -O3 -fomit-frame-pointer -Qunused-arguments sse
clang -mcpu=cortex-a8 -mfpu=neon -O3 -fomit-frame-pointer -fwrapv -Qunused-arguments sse
clang -mcpu=cortex-a9 -mfpu=neon -O3 -fomit-frame-pointer -fwrapv -Qunused-arguments sse
clang -mcpu=native -mfpu=neon -O3 -fomit-frame-pointer -fwrapv -Qunused-arguments sse
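
The clang diagnostics above say that '__builtin_ia32_pshufb128' needs the ssse3 target feature, which none of the failing flag sets enable. The following is a minimal sketch, not taken from scream.c, of a 16-entry byte table lookup in the style of those calls. The names lookup16 and lookup16_ref are hypothetical. It assumes a reasonably recent GCC or clang on x86 that provides <tmmintrin.h> and the target function attribute, and it uses the _mm_shuffle_epi8 intrinsic rather than the raw builtin.

```c
#include <stdint.h>
#include <tmmintrin.h>  /* SSSE3 intrinsics, including _mm_shuffle_epi8 (PSHUFB) */

/* Scalar model of PSHUFB: a set high bit in the index byte yields 0,
   otherwise the low 4 bits of the index select a byte from the table. */
void lookup16_ref(const uint8_t table[16], const uint8_t idx[16], uint8_t out[16])
{
    for (int i = 0; i < 16; i++)
        out[i] = (idx[i] & 0x80) ? 0 : table[idx[i] & 0x0F];
}

/* Same lookup using the SSSE3 instruction. The target attribute enables
   SSSE3 for this one function, so the file can be compiled without
   -mssse3 or an SSSE3-capable -march value (assumption: the compiler
   supports per-function target attributes). */
__attribute__((target("ssse3")))
void lookup16(const uint8_t table[16], const uint8_t idx[16], uint8_t out[16])
{
    __m128i t = _mm_loadu_si128((const __m128i *)table);
    __m128i x = _mm_loadu_si128((const __m128i *)idx);
    _mm_storeu_si128((__m128i *)out, _mm_shuffle_epi8(t, x));
}
```

Both functions should produce the same 16 output bytes for any table and index vector. The resulting code still requires a CPU with SSSE3 at run time; the attribute only changes what the compiler accepts.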

Compiler output

Implementation: crypto_aead/scream10v2/sse
Compiler: gcc -m64 -march=barcelona -O2 -fomit-frame-pointer
scream.c: scream.c: In function 'LBox16P':
scream.c: scream.c:185:10: warning: implicit declaration of function '__builtin_ia32_pshufb128' [-Wimplicit-function-declaration]
scream.c: A = __builtin_ia32_pshufb128(table, t0);
scream.c: ^~~~~~~~~~~~~~~~~~~~~~~~
scream.c: scream.c:185:8: error: incompatible types when assigning to type 'v16qi {aka __vector(16) char}' from type 'int'
scream.c: A = __builtin_ia32_pshufb128(table, t0);
scream.c: ^
scream.c: scream.c:186:8: error: incompatible types when assigning to type 'v16qi {aka __vector(16) char}' from type 'int'
scream.c: C = __builtin_ia32_pshufb128(table, t1);
scream.c: ^
scream.c: scream.c:190:8: error: incompatible types when assigning to type 'v16qi {aka __vector(16) char}' from type 'int'
scream.c: B = __builtin_ia32_pshufb128(table, t0);
scream.c: ^
scream.c: scream.c:191:8: error: incompatible types when assigning to type 'v16qi {aka __vector(16) char}' from type 'int'
scream.c: D = __builtin_ia32_pshufb128(table, t1);
scream.c: ^
scream.c: scream.c:198:7: error: conversion of scalar 'int' to vector 'v16qi {aka __vector(16) char}' involves truncation
scream.c: A ^= __builtin_ia32_pshufb128(table, in[0]);
scream.c: ^~
scream.c: scream.c:199:7: error: conversion of scalar 'int' to vector 'v16qi {aka __vector(16) char}' involves truncation
scream.c: C ^= __builtin_ia32_pshufb128(table, in[2]);
scream.c: ^~
scream.c: scream.c:203:7: error: conversion of scalar 'int' to vector 'v16qi {aka __vector(16) char}' involves truncation
scream.c: B ^= __builtin_ia32_pshufb128(table, in[0]);
scream.c: ^~
scream.c: ...

Number of similar (compiler,implementation) pairs: 4, namely:
Compiler  Implementations
gcc -m64 -march=barcelona -O2 -fomit-frame-pointer sse
gcc -m64 -march=barcelona -O3 -fomit-frame-pointer sse
gcc -m64 -march=barcelona -O -fomit-frame-pointer sse
gcc -m64 -march=barcelona -Os -fomit-frame-pointer sse