Implementation notes: amd64, skylake, crypto_aead/scream10v3

Computer: skylake
Architecture: amd64
CPU ID: GenuineIntel-000506e3-bfebfbff
SUPERCOP version: 20161026
Operation: crypto_aead
Primitive: scream10v3
Time | Implementation | Compiler | Benchmark date | SUPERCOP version
59974 | sse | gcc -m64 -march=core-avx-i -O3 -fomit-frame-pointer | 20161216 | 20161026
59978 | sse | gcc -m64 -march=corei7-avx -O3 -fomit-frame-pointer | 20161216 | 20161026
60294 | sse | gcc -m64 -march=core-avx2 -O3 -fomit-frame-pointer | 20161216 | 20161026
60424 | sse | gcc -m64 -march=native -mtune=native -O3 -fomit-frame-pointer | 20161216 | 20161026
61024 | sse | gcc -march=native -mtune=native -O3 -fomit-frame-pointer -fwrapv | 20161216 | 20161026
63824 | sse | gcc -m64 -march=core2 -msse4.1 -O3 -fomit-frame-pointer | 20161216 | 20161026
63898 | sse | gcc -m64 -march=core2 -O3 -fomit-frame-pointer | 20161216 | 20161026
64300 | sse | gcc -m64 -march=core2 -msse4 -O3 -fomit-frame-pointer | 20161216 | 20161026
64412 | sse | gcc -m64 -march=corei7 -O3 -fomit-frame-pointer | 20161216 | 20161026
65028 | sse | clang -O3 -fwrapv -march=native -fomit-frame-pointer -Qunused-arguments | 20161216 | 20161026
65242 | sse | clang -O3 -fwrapv -march=x86-64 -mcpu=core-avx2 -mavx2 -maes -mpclmul -fomit-frame-pointer -Qunused-arguments | 20161216 | 20161026
65352 | sse | gcc -m64 -march=native -mtune=native -O2 -fomit-frame-pointer | 20161216 | 20161026
65412 | sse | clang -O3 -fwrapv -mavx2 -fomit-frame-pointer -Qunused-arguments | 20161216 | 20161026
65516 | sse | gcc -m64 -march=core-avx2 -O2 -fomit-frame-pointer | 20161216 | 20161026
65882 | sse | gcc -m64 -march=corei7-avx -O2 -fomit-frame-pointer | 20161216 | 20161026
66142 | sse | gcc -m64 -march=core-avx2 -Os -fomit-frame-pointer | 20161216 | 20161026
66164 | sse | gcc -m64 -march=native -mtune=native -Os -fomit-frame-pointer | 20161216 | 20161026
66232 | sse | gcc -m64 -march=core-avx-i -Os -fomit-frame-pointer | 20161216 | 20161026
66518 | sse | clang -O3 -fwrapv -mavx -fomit-frame-pointer -Qunused-arguments | 20161216 | 20161026
66706 | sse | gcc -m64 -march=core-avx-i -O2 -fomit-frame-pointer | 20161216 | 20161026
66732 | sse | clang -O3 -fwrapv -mavx -maes -mpclmul -fomit-frame-pointer -Qunused-arguments | 20161216 | 20161026
67044 | sse | gcc -m64 -march=corei7-avx -Os -fomit-frame-pointer | 20161216 | 20161026
67348 | sse | clang -march=native -O3 -fomit-frame-pointer -fwrapv -Qunused-arguments | 20161216 | 20161026
68404 | sse | gcc -march=native -mtune=native -O2 -fomit-frame-pointer -fwrapv | 20161216 | 20161026
69314 | sse | gcc -m64 -march=corei7 -O2 -fomit-frame-pointer | 20161216 | 20161026
69386 | sse | gcc -m64 -march=core2 -O2 -fomit-frame-pointer | 20161216 | 20161026
69874 | sse | gcc -march=native -mtune=native -Os -fomit-frame-pointer -fwrapv | 20161216 | 20161026
70268 | sse | gcc -m64 -march=core2 -msse4.1 -O2 -fomit-frame-pointer | 20161216 | 20161026
71010 | sse | gcc -march=native -mtune=native -O -fomit-frame-pointer -fwrapv | 20161216 | 20161026
71036 | sse | gcc -m64 -march=corei7-avx -O -fomit-frame-pointer | 20161216 | 20161026
71150 | sse | gcc -m64 -march=native -mtune=native -O -fomit-frame-pointer | 20161216 | 20161026
71212 | sse | gcc -m64 -march=core-avx2 -O -fomit-frame-pointer | 20161216 | 20161026
71358 | sse | gcc -m64 -march=core2 -msse4 -O2 -fomit-frame-pointer | 20161216 | 20161026
71442 | sse | gcc -m64 -march=core-avx-i -O -fomit-frame-pointer | 20161216 | 20161026
74986 | sse | gcc -m64 -march=core2 -msse4.1 -O -fomit-frame-pointer | 20161216 | 20161026
75362 | sse | gcc -m64 -march=core2 -msse4 -O -fomit-frame-pointer | 20161216 | 20161026
76224 | sse | gcc -m64 -march=corei7 -O -fomit-frame-pointer | 20161216 | 20161026
76500 | sse | gcc -m64 -march=core2 -O -fomit-frame-pointer | 20161216 | 20161026
102432 | sse | gcc -m64 -march=corei7 -Os -fomit-frame-pointer | 20161216 | 20161026
102508 | sse | gcc -m64 -march=core2 -Os -fomit-frame-pointer | 20161216 | 20161026
102678 | sse | gcc -m64 -march=core2 -msse4 -Os -fomit-frame-pointer | 20161216 | 20161026
103088 | sse | gcc -m64 -march=core2 -msse4.1 -Os -fomit-frame-pointer | 20161216 | 20161026
275398 | ref | gcc -funroll-loops -march=nocona -O3 -fomit-frame-pointer | 20161216 | 20161026
275786 | ref | gcc -funroll-loops -m64 -march=nocona -O3 -fomit-frame-pointer | 20161216 | 20161026
277660 | ref | gcc -march=nocona -O3 -fomit-frame-pointer | 20161216 | 20161026
278236 | ref | gcc -m64 -march=core2 -msse4 -O3 -fomit-frame-pointer | 20161216 | 20161026
278526 | ref | gcc -m64 -march=corei7 -O3 -fomit-frame-pointer | 20161216 | 20161026
279876 | ref | gcc -m64 -march=core2 -O3 -fomit-frame-pointer | 20161216 | 20161026
280000 | ref | gcc -m64 -march=core2 -msse4.1 -O3 -fomit-frame-pointer | 20161216 | 20161026
280606 | ref | gcc -m64 -march=nocona -O3 -fomit-frame-pointer | 20161216 | 20161026
280630 | ref | gcc -m64 -march=core-avx-i -O3 -fomit-frame-pointer | 20161216 | 20161026
281432 | ref | gcc -m64 -march=corei7-avx -O3 -fomit-frame-pointer | 20161216 | 20161026
281552 | ref | gcc -m64 -march=native -mtune=native -O3 -fomit-frame-pointer | 20161216 | 20161026
282802 | ref | gcc -march=k8 -O3 -fomit-frame-pointer | 20161216 | 20161026
283130 | ref | gcc -m64 -march=core-avx2 -O3 -fomit-frame-pointer | 20161216 | 20161026
283596 | ref | gcc -m64 -march=barcelona -O3 -fomit-frame-pointer | 20161216 | 20161026
283958 | ref | gcc -funroll-loops -O2 -fomit-frame-pointer | 20161216 | 20161026
283970 | ref | gcc -funroll-loops -fno-schedule-insns -O2 -fomit-frame-pointer | 20161216 | 20161026
283972 | ref | gcc -funroll-loops -march=k8 -O2 -fomit-frame-pointer | 20161216 | 20161026
284016 | ref | gcc -funroll-loops -m64 -march=barcelona -O2 -fomit-frame-pointer | 20161216 | 20161026
284026 | ref | gcc -m64 -march=k8 -O3 -fomit-frame-pointer | 20161216 | 20161026
284222 | ref | gcc -funroll-loops -march=nocona -O2 -fomit-frame-pointer | 20161216 | 20161026
284878 | ref | gcc -march=barcelona -O3 -fomit-frame-pointer | 20161216 | 20161026
285180 | ref | gcc -funroll-loops -m64 -march=k8 -O2 -fomit-frame-pointer | 20161216 | 20161026
285340 | ref | gcc -funroll-loops -m64 -march=nocona -O2 -fomit-frame-pointer | 20161216 | 20161026
285792 | ref | gcc -funroll-loops -m64 -O2 -fomit-frame-pointer | 20161216 | 20161026
285870 | ref | gcc -funroll-loops -march=barcelona -O3 -fomit-frame-pointer | 20161216 | 20161026
286168 | ref | gcc -funroll-loops -march=barcelona -O2 -fomit-frame-pointer | 20161216 | 20161026
286186 | ref | gcc -funroll-loops -march=k8 -O3 -fomit-frame-pointer | 20161216 | 20161026
286238 | ref | gcc -funroll-loops -m64 -march=barcelona -O3 -fomit-frame-pointer | 20161216 | 20161026
287598 | ref | gcc -march=native -mtune=native -O3 -fomit-frame-pointer -fwrapv | 20161216 | 20161026
288442 | ref | gcc -funroll-loops -m64 -march=k8 -O3 -fomit-frame-pointer | 20161216 | 20161026
289344 | ref | gcc -funroll-loops -m64 -O -fomit-frame-pointer | 20161216 | 20161026
289412 | ref | gcc -funroll-loops -march=barcelona -O -fomit-frame-pointer | 20161216 | 20161026
289582 | ref | gcc -funroll-loops -m64 -march=nocona -O -fomit-frame-pointer | 20161216 | 20161026
289720 | ref | gcc -funroll-loops -fno-schedule-insns -O -fomit-frame-pointer | 20161216 | 20161026
289772 | ref | gcc -funroll-loops -m64 -march=barcelona -O -fomit-frame-pointer | 20161216 | 20161026
290252 | ref | gcc -funroll-loops -march=k8 -O -fomit-frame-pointer | 20161216 | 20161026
290336 | ref | gcc -funroll-loops -march=nocona -O -fomit-frame-pointer | 20161216 | 20161026
291276 | ref | gcc -funroll-loops -O -fomit-frame-pointer | 20161216 | 20161026
292068 | ref | gcc -funroll-loops -m64 -march=k8 -O -fomit-frame-pointer | 20161216 | 20161026
295254 | ref | gcc -funroll-loops -fno-schedule-insns -O3 -fomit-frame-pointer | 20161216 | 20161026
296544 | ref | gcc -funroll-loops -O3 -fomit-frame-pointer | 20161216 | 20161026
297558 | ref | gcc -fno-schedule-insns -O3 -fomit-frame-pointer | 20161216 | 20161026
297726 | ref | gcc -O3 -fomit-frame-pointer | 20161216 | 20161026
297752 | ref | gcc -m64 -O3 -fomit-frame-pointer | 20161216 | 20161026
299628 | ref | gcc -funroll-loops -m64 -O3 -fomit-frame-pointer | 20161216 | 20161026
410662 | ref | clang -O3 -fwrapv -mavx2 -fomit-frame-pointer -Qunused-arguments | 20161216 | 20161026
410710 | ref | clang -O3 -fwrapv -march=x86-64 -mcpu=core-avx2 -mavx2 -maes -mpclmul -fomit-frame-pointer -Qunused-arguments | 20161216 | 20161026
411054 | ref | clang -O3 -fwrapv -mavx -fomit-frame-pointer -Qunused-arguments | 20161216 | 20161026
413268 | ref | clang -O3 -fwrapv -march=native -fomit-frame-pointer -Qunused-arguments | 20161216 | 20161026
415096 | ref | clang -O3 -fwrapv -mavx -maes -mpclmul -fomit-frame-pointer -Qunused-arguments | 20161216 | 20161026
415864 | ref | clang -march=native -O3 -fomit-frame-pointer -fwrapv -Qunused-arguments | 20161216 | 20161026
425054 | ref | gcc -march=barcelona -O2 -fomit-frame-pointer | 20161216 | 20161026
425470 | ref | gcc -m64 -march=barcelona -O2 -fomit-frame-pointer | 20161216 | 20161026
427466 | ref | gcc -march=k8 -O2 -fomit-frame-pointer | 20161216 | 20161026
427488 | ref | gcc -m64 -march=nocona -O2 -fomit-frame-pointer | 20161216 | 20161026
427708 | ref | gcc -m64 -march=k8 -O2 -fomit-frame-pointer | 20161216 | 20161026
428526 | ref | gcc -m64 -march=core-avx-i -O2 -fomit-frame-pointer | 20161216 | 20161026
429014 | ref | gcc -m64 -march=corei7-avx -O2 -fomit-frame-pointer | 20161216 | 20161026
429048 | ref | gcc -m64 -march=core2 -O2 -fomit-frame-pointer | 20161216 | 20161026
430128 | ref | gcc -m64 -march=core2 -msse4.1 -O2 -fomit-frame-pointer | 20161216 | 20161026
430532 | ref | gcc -m64 -march=native -mtune=native -O2 -fomit-frame-pointer | 20161216 | 20161026
430686 | ref | gcc -march=native -mtune=native -O2 -fomit-frame-pointer -fwrapv | 20161216 | 20161026
430840 | ref | gcc -m64 -march=core2 -msse4 -O2 -fomit-frame-pointer | 20161216 | 20161026
430884 | ref | gcc -O2 -fomit-frame-pointer | 20161216 | 20161026
430896 | ref | gcc -m64 -march=corei7 -O2 -fomit-frame-pointer | 20161216 | 20161026
431386 | ref | gcc -m64 -march=core-avx2 -O2 -fomit-frame-pointer | 20161216 | 20161026
431844 | ref | gcc -fno-schedule-insns -O2 -fomit-frame-pointer | 20161216 | 20161026
431990 | ref | gcc -m64 -O2 -fomit-frame-pointer | 20161216 | 20161026
433480 | ref | gcc -m64 -march=core-avx2 -O -fomit-frame-pointer | 20161216 | 20161026
433594 | ref | gcc -march=nocona -O2 -fomit-frame-pointer | 20161216 | 20161026
433974 | ref | gcc -m64 -march=k8 -O -fomit-frame-pointer | 20161216 | 20161026
434994 | ref | gcc -march=barcelona -O -fomit-frame-pointer | 20161216 | 20161026
435422 | ref | gcc -O -fomit-frame-pointer | 20161216 | 20161026
435592 | ref | gcc -m64 -march=corei7 -O -fomit-frame-pointer | 20161216 | 20161026
436114 | ref | gcc -fno-schedule-insns -O -fomit-frame-pointer | 20161216 | 20161026
436182 | ref | gcc -m64 -march=barcelona -O -fomit-frame-pointer | 20161216 | 20161026
436262 | ref | gcc -march=k8 -O -fomit-frame-pointer | 20161216 | 20161026
436968 | ref | gcc -march=native -mtune=native -O -fomit-frame-pointer -fwrapv | 20161216 | 20161026
437678 | ref | gcc -m64 -march=core-avx-i -O -fomit-frame-pointer | 20161216 | 20161026
437802 | ref | gcc -m64 -march=corei7-avx -O -fomit-frame-pointer | 20161216 | 20161026
437896 | ref | gcc -m64 -march=core2 -msse4.1 -O -fomit-frame-pointer | 20161216 | 20161026
438020 | ref | gcc -m64 -march=core2 -msse4 -O -fomit-frame-pointer | 20161216 | 20161026
438228 | ref | gcc -m64 -O -fomit-frame-pointer | 20161216 | 20161026
438390 | ref | gcc -m64 -march=core2 -O -fomit-frame-pointer | 20161216 | 20161026
440382 | ref | gcc -m64 -march=native -mtune=native -O -fomit-frame-pointer | 20161216 | 20161026
442352 | ref | gcc -m64 -march=nocona -O -fomit-frame-pointer | 20161216 | 20161026
445280 | ref | gcc -march=nocona -O -fomit-frame-pointer | 20161216 | 20161026
449700 | ref | clang -mcpu=cortex-a8 -mfpu=neon -O3 -fomit-frame-pointer -fwrapv -Qunused-arguments | 20161216 | 20161026
451752 | ref | clang -mcpu=cortex-a9 -mfpu=neon -O3 -fomit-frame-pointer -fwrapv -Qunused-arguments | 20161216 | 20161026
452636 | ref | clang -mcpu=native -mfpu=neon -O3 -fomit-frame-pointer -fwrapv -Qunused-arguments | 20161216 | 20161026
452692 | ref | clang -O3 -fomit-frame-pointer -Qunused-arguments | 20161216 | 20161026
670936 | ref | gcc -fno-schedule-insns -Os -fomit-frame-pointer | 20161216 | 20161026
671576 | ref | gcc -march=k8 -Os -fomit-frame-pointer | 20161216 | 20161026
673552 | ref | gcc -m64 -Os -fomit-frame-pointer | 20161216 | 20161026
675240 | ref | gcc -m64 -march=core2 -msse4.1 -Os -fomit-frame-pointer | 20161216 | 20161026
675316 | ref | gcc -m64 -march=core2 -msse4 -Os -fomit-frame-pointer | 20161216 | 20161026
676552 | ref | gcc -Os -fomit-frame-pointer | 20161216 | 20161026
676654 | ref | gcc -march=barcelona -Os -fomit-frame-pointer | 20161216 | 20161026
677922 | ref | gcc -march=nocona -Os -fomit-frame-pointer | 20161216 | 20161026
677976 | ref | gcc -m64 -march=core2 -Os -fomit-frame-pointer | 20161216 | 20161026
678116 | ref | gcc -m64 -march=corei7 -Os -fomit-frame-pointer | 20161216 | 20161026
678242 | ref | gcc -m64 -march=core-avx2 -Os -fomit-frame-pointer | 20161216 | 20161026
679364 | ref | gcc -m64 -march=native -mtune=native -Os -fomit-frame-pointer | 20161216 | 20161026
679876 | ref | gcc -m64 -march=core-avx-i -Os -fomit-frame-pointer | 20161216 | 20161026
680294 | ref | gcc -m64 -march=barcelona -Os -fomit-frame-pointer | 20161216 | 20161026
680868 | ref | gcc -march=native -mtune=native -Os -fomit-frame-pointer -fwrapv | 20161216 | 20161026
681114 | ref | gcc -m64 -march=k8 -Os -fomit-frame-pointer | 20161216 | 20161026
684252 | ref | gcc -m64 -march=nocona -Os -fomit-frame-pointer | 20161216 | 20161026
684520 | ref | gcc -m64 -march=corei7-avx -Os -fomit-frame-pointer | 20161216 | 20161026
857246 | ref | gcc -funroll-loops -m64 -march=k8 -Os -fomit-frame-pointer | 20161216 | 20161026
858102 | ref | gcc -funroll-loops -march=barcelona -Os -fomit-frame-pointer | 20161216 | 20161026
858238 | ref | gcc -funroll-loops -Os -fomit-frame-pointer | 20161216 | 20161026
860076 | ref | gcc -funroll-loops -march=k8 -Os -fomit-frame-pointer | 20161216 | 20161026
860794 | ref | gcc -funroll-loops -march=nocona -Os -fomit-frame-pointer | 20161216 | 20161026
861344 | ref | gcc -funroll-loops -m64 -march=nocona -Os -fomit-frame-pointer | 20161216 | 20161026
861378 | ref | gcc -funroll-loops -m64 -march=barcelona -Os -fomit-frame-pointer | 20161216 | 20161026
862620 | ref | gcc -funroll-loops -fno-schedule-insns -Os -fomit-frame-pointer | 20161216 | 20161026
864390 | ref | gcc -funroll-loops -m64 -Os -fomit-frame-pointer | 20161216 | 20161026
1476876 | ref | gcc | 20161216 | 20161026
1477772 | ref | cc | 20161216 | 20161026
1496270 | ref | gcc -funroll-loops | 20161216 | 20161026
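
The gap between the two implementations can be read directly off the table: the fastest sse entry is 59974 and the fastest ref entry is 275398. As a rough illustration of that comparison, the sketch below computes the ratio of the two. The helper names `speedup_ratio` and `print_scream10v3_speedup` are hypothetical and not part of SUPERCOP, and since the table does not state the unit of the Time column, only the unitless ratio is reported.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: ratio of a baseline time to an optimized time.
 * Both arguments are Time values as listed in the table above. */
double speedup_ratio(unsigned long baseline, unsigned long optimized)
{
    assert(optimized != 0);
    return (double)baseline / (double)optimized;
}

/* Prints the ratio between the fastest ref and fastest sse entries. */
void print_scream10v3_speedup(void)
{
    const unsigned long fastest_ref = 275398UL; /* ref, gcc -funroll-loops -march=nocona -O3 */
    const unsigned long fastest_sse = 59974UL;  /* sse, gcc -m64 -march=core-avx-i -O3 */

    printf("fastest ref / fastest sse = %.2f\n",
           speedup_ratio(fastest_ref, fastest_sse));
}
```

Calling `print_scream10v3_speedup()` from a `main` function prints the ratio. The same helper can be applied to any other pair of rows, for example the unoptimized `gcc` build against the best `-O3` build of ref.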

Compiler output

Implementation: crypto_aead/scream10v3/sse
Compiler: cc
scream.c: scream.c: In function 'LBox16P':
scream.c: scream.c:202:10: warning: implicit declaration of function '__builtin_ia32_pshufb128' [-Wimplicit-function-declaration]
scream.c: A = __builtin_ia32_pshufb128(table, t0);
scream.c: ^~~~~~~~~~~~~~~~~~~~~~~~
scream.c: scream.c:202:8: error: incompatible types when assigning to type 'v16qi {aka __vector(16) char}' from type 'int'
scream.c: A = __builtin_ia32_pshufb128(table, t0);
scream.c: ^
scream.c: scream.c:203:8: error: incompatible types when assigning to type 'v16qi {aka __vector(16) char}' from type 'int'
scream.c: C = __builtin_ia32_pshufb128(table, t1);
scream.c: ^
scream.c: scream.c:207:8: error: incompatible types when assigning to type 'v16qi {aka __vector(16) char}' from type 'int'
scream.c: B = __builtin_ia32_pshufb128(table, t0);
scream.c: ^
scream.c: scream.c:208:8: error: incompatible types when assigning to type 'v16qi {aka __vector(16) char}' from type 'int'
scream.c: D = __builtin_ia32_pshufb128(table, t1);
scream.c: ^
scream.c: scream.c:215:7: error: conversion of scalar 'int' to vector 'v16qi {aka __vector(16) char}' involves truncation
scream.c: A ^= __builtin_ia32_pshufb128(table, in[0]);
scream.c: ^~
scream.c: scream.c:216:7: error: conversion of scalar 'int' to vector 'v16qi {aka __vector(16) char}' involves truncation
scream.c: C ^= __builtin_ia32_pshufb128(table, in[2]);
scream.c: ^~
scream.c: scream.c:220:7: error: conversion of scalar 'int' to vector 'v16qi {aka __vector(16) char}' involves truncation
scream.c: B ^= __builtin_ia32_pshufb128(table, in[0]);
scream.c: ^~
scream.c: ...

Number of similar (compiler,implementation) pairs: 71, namely:
Compiler Implementations
cc sse
gcc sse
gcc -O2 -fomit-frame-pointer sse
gcc -O3 -fomit-frame-pointer sse
gcc -O -fomit-frame-pointer sse
gcc -Os -fomit-frame-pointer sse
gcc -fno-schedule-insns -O2 -fomit-frame-pointer sse
gcc -fno-schedule-insns -O3 -fomit-frame-pointer sse
gcc -fno-schedule-insns -O -fomit-frame-pointer sse
gcc -fno-schedule-insns -Os -fomit-frame-pointer sse
gcc -funroll-loops sse
gcc -funroll-loops -O2 -fomit-frame-pointer sse
gcc -funroll-loops -O3 -fomit-frame-pointer sse
gcc -funroll-loops -O -fomit-frame-pointer sse
gcc -funroll-loops -Os -fomit-frame-pointer sse
gcc -funroll-loops -fno-schedule-insns -O2 -fomit-frame-pointer sse
gcc -funroll-loops -fno-schedule-insns -O3 -fomit-frame-pointer sse
gcc -funroll-loops -fno-schedule-insns -O -fomit-frame-pointer sse
gcc -funroll-loops -fno-schedule-insns -Os -fomit-frame-pointer sse
gcc -funroll-loops -m64 -O2 -fomit-frame-pointer sse
gcc -funroll-loops -m64 -O3 -fomit-frame-pointer sse
gcc -funroll-loops -m64 -O -fomit-frame-pointer sse
gcc -funroll-loops -m64 -Os -fomit-frame-pointer sse
gcc -funroll-loops -m64 -march=barcelona -O2 -fomit-frame-pointer sse
gcc -funroll-loops -m64 -march=barcelona -O3 -fomit-frame-pointer sse
gcc -funroll-loops -m64 -march=barcelona -O -fomit-frame-pointer sse
gcc -funroll-loops -m64 -march=barcelona -Os -fomit-frame-pointer sse
gcc -funroll-loops -m64 -march=k8 -O2 -fomit-frame-pointer sse
gcc -funroll-loops -m64 -march=k8 -O3 -fomit-frame-pointer sse
gcc -funroll-loops -m64 -march=k8 -O -fomit-frame-pointer sse
gcc -funroll-loops -m64 -march=k8 -Os -fomit-frame-pointer sse
gcc -funroll-loops -m64 -march=nocona -O2 -fomit-frame-pointer sse
gcc -funroll-loops -m64 -march=nocona -O3 -fomit-frame-pointer sse
gcc -funroll-loops -m64 -march=nocona -O -fomit-frame-pointer sse
gcc -funroll-loops -m64 -march=nocona -Os -fomit-frame-pointer sse
gcc -funroll-loops -march=barcelona -O2 -fomit-frame-pointer sse
gcc -funroll-loops -march=barcelona -O3 -fomit-frame-pointer sse
gcc -funroll-loops -march=barcelona -O -fomit-frame-pointer sse
gcc -funroll-loops -march=barcelona -Os -fomit-frame-pointer sse
gcc -funroll-loops -march=k8 -O2 -fomit-frame-pointer sse
gcc -funroll-loops -march=k8 -O3 -fomit-frame-pointer sse
gcc -funroll-loops -march=k8 -O -fomit-frame-pointer sse
gcc -funroll-loops -march=k8 -Os -fomit-frame-pointer sse
gcc -funroll-loops -march=nocona -O2 -fomit-frame-pointer sse
gcc -funroll-loops -march=nocona -O3 -fomit-frame-pointer sse
gcc -funroll-loops -march=nocona -O -fomit-frame-pointer sse
gcc -funroll-loops -march=nocona -Os -fomit-frame-pointer sse
gcc -m64 -O2 -fomit-frame-pointer sse
gcc -m64 -O3 -fomit-frame-pointer sse
gcc -m64 -O -fomit-frame-pointer sse
gcc -m64 -Os -fomit-frame-pointer sse
gcc -m64 -march=k8 -O2 -fomit-frame-pointer sse
gcc -m64 -march=k8 -O3 -fomit-frame-pointer sse
gcc -m64 -march=k8 -O -fomit-frame-pointer sse
gcc -m64 -march=k8 -Os -fomit-frame-pointer sse
gcc -m64 -march=nocona -O2 -fomit-frame-pointer sse
gcc -m64 -march=nocona -O3 -fomit-frame-pointer sse
gcc -m64 -march=nocona -O -fomit-frame-pointer sse
gcc -m64 -march=nocona -Os -fomit-frame-pointer sse
gcc -march=barcelona -O2 -fomit-frame-pointer sse
gcc -march=barcelona -O3 -fomit-frame-pointer sse
gcc -march=barcelona -O -fomit-frame-pointer sse
gcc -march=barcelona -Os -fomit-frame-pointer sse
gcc -march=k8 -O2 -fomit-frame-pointer sse
gcc -march=k8 -O3 -fomit-frame-pointer sse
gcc -march=k8 -O -fomit-frame-pointer sse
gcc -march=k8 -Os -fomit-frame-pointer sse
gcc -march=nocona -O2 -fomit-frame-pointer sse
gcc -march=nocona -O3 -fomit-frame-pointer sse
gcc -march=nocona -O -fomit-frame-pointer sse
gcc -march=nocona -Os -fomit-frame-pointer sse
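
The diagnostics above are consistent with SSSE3 not being enabled for these builds: none of the listed gcc option sets includes an SSSE3-capable `-march` value (k8, nocona and barcelona all predate or omit SSSE3), so `__builtin_ia32_pshufb128` is not declared and gcc falls back to an implicit `int` declaration. One common way to keep such code compiling everywhere is a compile-time guard on the `__SSSE3__` macro with a scalar fallback. The following is a minimal sketch of that idea, not the code in scream.c; the function name `shuffle16_portable` and its byte-array interface are assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#ifdef __SSSE3__
#include <tmmintrin.h> /* _mm_shuffle_epi8, the intrinsic form of pshufb */
#endif

/* Hypothetical helper: 16-way byte table lookup with pshufb semantics.
 * For each lane k: out[k] = 0 if bit 7 of idx[k] is set,
 * otherwise out[k] = table[idx[k] & 0x0F]. */
void shuffle16_portable(uint8_t out[16], const uint8_t table[16],
                        const uint8_t idx[16])
{
    assert(out != NULL && table != NULL && idx != NULL);
#ifdef __SSSE3__
    /* SSSE3 enabled at compile time: use the vector instruction. */
    __m128i t = _mm_loadu_si128((const __m128i *)table);
    __m128i i = _mm_loadu_si128((const __m128i *)idx);
    _mm_storeu_si128((__m128i *)out, _mm_shuffle_epi8(t, i));
#else
    /* SSSE3 not enabled: scalar loop with the same result. */
    for (int k = 0; k < 16; k++)
        out[k] = (idx[k] & 0x80) ? 0 : table[idx[k] & 0x0F];
#endif
}
```

With this shape, a build using `-march=k8` or no `-march` at all takes the scalar path instead of failing, while builds with `-march=core2` or later take the vector path.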

Compiler output

Implementation: crypto_aead/scream10v3/sse
Compiler: clang -O3 -fomit-frame-pointer -Qunused-arguments
scream.c: scream.c:202:10: error: '__builtin_ia32_pshufb128' needs target feature ssse3
scream.c: A = __builtin_ia32_pshufb128(table, t0);
scream.c: ^
scream.c: scream.c:203:10: error: '__builtin_ia32_pshufb128' needs target feature ssse3
scream.c: C = __builtin_ia32_pshufb128(table, t1);
scream.c: ^
scream.c: scream.c:207:10: error: '__builtin_ia32_pshufb128' needs target feature ssse3
scream.c: B = __builtin_ia32_pshufb128(table, t0);
scream.c: ^
scream.c: scream.c:208:10: error: '__builtin_ia32_pshufb128' needs target feature ssse3
scream.c: D = __builtin_ia32_pshufb128(table, t1);
scream.c: ^
scream.c: scream.c:215:10: error: '__builtin_ia32_pshufb128' needs target feature ssse3
scream.c: A ^= __builtin_ia32_pshufb128(table, in[0]);
scream.c: ^
scream.c: scream.c:216:10: error: '__builtin_ia32_pshufb128' needs target feature ssse3
scream.c: C ^= __builtin_ia32_pshufb128(table, in[2]);
scream.c: ^
scream.c: scream.c:220:10: error: '__builtin_ia32_pshufb128' needs target feature ssse3
scream.c: B ^= __builtin_ia32_pshufb128(table, in[0]);
scream.c: ^
scream.c: scream.c:221:10: error: '__builtin_ia32_pshufb128' needs target feature ssse3
scream.c: D ^= __builtin_ia32_pshufb128(table, in[2]);
scream.c: ^
scream.c: scream.c:228:10: error: '__builtin_ia32_pshufb128' needs target feature ssse3
scream.c: ...

Number of similar (compiler,implementation) pairs: 4, namely:
Compiler Implementations
clang -O3 -fomit-frame-pointer -Qunused-arguments sse
clang -mcpu=cortex-a8 -mfpu=neon -O3 -fomit-frame-pointer -fwrapv -Qunused-arguments sse
clang -mcpu=cortex-a9 -mfpu=neon -O3 -fomit-frame-pointer -fwrapv -Qunused-arguments sse
clang -mcpu=native -mfpu=neon -O3 -fomit-frame-pointer -fwrapv -Qunused-arguments sse
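
The clang message states the cause directly: the builtin needs the ssse3 target feature, which these option sets do not enable. Besides adding `-mssse3` to the flags, recent gcc and clang releases allow the feature to be enabled for a single function with a `target` attribute, combined with a runtime check through `__builtin_cpu_supports`. The sketch below illustrates that approach under those assumptions; it is not taken from scream.c, and the names `table_lookup16`, `table_lookup16_ssse3` and `table_lookup16_scalar` are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#if (defined(__x86_64__) || defined(__i386__)) && defined(__GNUC__)
#include <tmmintrin.h>
#define HAVE_X86_SSSE3_DISPATCH 1

/* Compiled with SSSE3 enabled for this function only, regardless of the
 * command-line flags (assumes a compiler with target-attribute support). */
__attribute__((target("ssse3")))
static void table_lookup16_ssse3(uint8_t out[16], const uint8_t table[16],
                                 const uint8_t idx[16])
{
    __m128i t = _mm_loadu_si128((const __m128i *)table);
    __m128i i = _mm_loadu_si128((const __m128i *)idx);
    _mm_storeu_si128((__m128i *)out, _mm_shuffle_epi8(t, i));
}
#endif

/* Scalar version with pshufb semantics: a set high bit zeroes the lane. */
void table_lookup16_scalar(uint8_t out[16], const uint8_t table[16],
                           const uint8_t idx[16])
{
    for (int k = 0; k < 16; k++)
        out[k] = (idx[k] & 0x80) ? 0 : table[idx[k] & 0x0F];
}

/* Picks the SSSE3 path when the running CPU reports support for it. */
void table_lookup16(uint8_t out[16], const uint8_t table[16],
                    const uint8_t idx[16])
{
    assert(out != NULL && table != NULL && idx != NULL);
#ifdef HAVE_X86_SSSE3_DISPATCH
    if (__builtin_cpu_supports("ssse3")) {
        table_lookup16_ssse3(out, table, idx);
        return;
    }
#endif
    table_lookup16_scalar(out, table, idx);
}
```

The runtime check costs a branch per call, so in a hot loop it would normally be hoisted out or resolved once into a function pointer.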

Compiler output

Implementation: crypto_aead/scream10v3/sse
Compiler: gcc -m64 -march=barcelona -O2 -fomit-frame-pointer
scream.c: scream.c: In function 'LBox16P':
scream.c: scream.c:202:10: warning: implicit declaration of function '__builtin_ia32_pshufb128' [-Wimplicit-function-declaration]
scream.c: A = __builtin_ia32_pshufb128(table, t0);
scream.c: ^~~~~~~~~~~~~~~~~~~~~~~~
scream.c: scream.c:202:8: error: incompatible types when assigning to type 'v16qi {aka __vector(16) char}' from type 'int'
scream.c: A = __builtin_ia32_pshufb128(table, t0);
scream.c: ^
scream.c: scream.c:203:8: error: incompatible types when assigning to type 'v16qi {aka __vector(16) char}' from type 'int'
scream.c: C = __builtin_ia32_pshufb128(table, t1);
scream.c: ^
scream.c: scream.c:207:8: error: incompatible types when assigning to type 'v16qi {aka __vector(16) char}' from type 'int'
scream.c: B = __builtin_ia32_pshufb128(table, t0);
scream.c: ^
scream.c: scream.c:208:8: error: incompatible types when assigning to type 'v16qi {aka __vector(16) char}' from type 'int'
scream.c: D = __builtin_ia32_pshufb128(table, t1);
scream.c: ^
scream.c: scream.c:215:7: error: conversion of scalar 'int' to vector 'v16qi {aka __vector(16) char}' involves truncation
scream.c: A ^= __builtin_ia32_pshufb128(table, in[0]);
scream.c: ^~
scream.c: scream.c:216:7: error: conversion of scalar 'int' to vector 'v16qi {aka __vector(16) char}' involves truncation
scream.c: C ^= __builtin_ia32_pshufb128(table, in[2]);
scream.c: ^~
scream.c: scream.c:220:7: error: conversion of scalar 'int' to vector 'v16qi {aka __vector(16) char}' involves truncation
scream.c: B ^= __builtin_ia32_pshufb128(table, in[0]);
scream.c: ^~
scream.c: ...

Number of similar (compiler,implementation) pairs: 4, namely:
Compiler Implementations
gcc -m64 -march=barcelona -O2 -fomit-frame-pointer sse
gcc -m64 -march=barcelona -O3 -fomit-frame-pointer sse
gcc -m64 -march=barcelona -O -fomit-frame-pointer sse
gcc -m64 -march=barcelona -Os -fomit-frame-pointer sse