Implementation notes: amd64, cel02, crypto_stream/speck128256ctr

Computer: cel02
Architecture: amd64
CPU ID: GenuineIntel-00050657-bfebfbff
SUPERCOP version: 20201130
Operation: crypto_stream
Primitive: speck128256ctr
Time  Object size  Test size      Implementation  Compiler                                                                             Benchmark date  SUPERCOP version
1658  37859 0 0    50924 816 856  T:avx512        gcc_-march=native_-mtune=native_-O_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE          20201211        20201130
1682  39279 0 0    55933 824 888  T:avx512        gcc_-march=native_-mtune=native_-O3_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE         20201211        20201130
1688  38423 0 0    51844 816 856  T:avx512        gcc_-march=native_-mtune=native_-O2_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE         20201211        20201130
1732  37634 0 0    49536 800 824  T:avx512        gcc_-march=native_-mtune=native_-Os_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE         20201211        20201130
2128  35238 0 0    48260 816 856  T:avx2          gcc_-march=native_-mtune=native_-O_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE          20201211        20201130
2160  36382 0 0    49764 816 856  T:avx2          gcc_-march=native_-mtune=native_-O2_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE         20201211        20201130
3498  35087 0 0    47000 800 824  T:avx2          gcc_-march=native_-mtune=native_-Os_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE         20201211        20201130
3788  37305 0 0    53933 824 888  T:avx2          gcc_-march=native_-mtune=native_-O3_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE         20201211        20201130
4006  42530 0 0    54548 792 800  T:avx2          clang_-march=native_-Os_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE  20201211        20201130
4754  36579 0 0    48548 792 800  T:sse4          clang_-march=native_-Os_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE  20201211        20201130
6824  30151 0 0    42056 800 824  T:sse4          gcc_-march=native_-mtune=native_-Os_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE         20201211        20201130
7202  30056 0 0    43092 816 856  T:sse4          gcc_-march=native_-mtune=native_-O_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE          20201211        20201130
7462  29660 0 0    43044 816 856  T:sse4          gcc_-march=native_-mtune=native_-O2_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE         20201211        20201130
7702  43644 0 0    60301 824 888  T:sse4          gcc_-march=native_-mtune=native_-O3_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE         20201211        20201130

Compiler output

Implementation: T:avx2
Security model: timingleaks
Compiler: clang -march=native -O2 -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE
try.c: fatal error: error in backend: Cannot select: 0x55d55e385370: v4i64 = X86ISD::VTRUNC 0x55d55e385240
try.c: 0x55d55e385240: v16i32 = vselect 0x55d55e3a7dc0, 0x55d55e326be0, 0x55d55e385110
try.c: 0x55d55e3a7dc0: v4i1 = X86ISD::PCMPGTM 0x55d55e37e920, 0x55d55e37a4b0
try.c: 0x55d55e37e920: v4i64 = X86ISD::VBROADCAST 0x55d55e33a9a0
try.c: 0x55d55e33a9a0: i64,ch = load<LD8[%lsr.iv6971]> 0x55d55e28f9b0, 0x55d55e348e10, undef:i64
try.c: 0x55d55e348e10: i64,ch = CopyFromReg 0x55d55e28f9b0, Register:i64 %vreg50
try.c: 0x55d55e37a710: i64 = Register %vreg50
try.c: 0x55d55e325250: i64 = undef
try.c: 0x55d55e37a4b0: v4i64,ch = CopyFromReg 0x55d55e28f9b0, Register:v4i64 %vreg13
try.c: 0x55d55e37f170: v4i64 = Register %vreg13
try.c: 0x55d55e326be0: v16i32 = X86ISD::VBROADCAST 0x55d55e37eb80
try.c: 0x55d55e37eb80: i32,ch = load<LD4[ConstantPool]> 0x55d55e28f9b0, 0x55d55e339f80, undef:i64
try.c: 0x55d55e339f80: i64 = X86ISD::WrapperRIP TargetConstantPool:i64<i32 1> 0
try.c: 0x55d55e36d2e0: i64 = TargetConstantPool<i32 1> 0
try.c: 0x55d55e325250: i64 = undef
try.c: 0x55d55e385110: v16i32 = BUILD_VECTOR Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>
try.c: 0x55d55e384fe0: i32 = Constant<0>
try.c: 0x55d55e384fe0: i32 = Constant<0>
try.c: 0x55d55e384fe0: i32 = Constant<0>
try.c: 0x55d55e384fe0: i32 = Constant<0>
try.c: 0x55d55e384fe0: i32 = Constant<0>
try.c: 0x55d55e384fe0: i32 = Constant<0>
try.c: 0x55d55e384fe0: i32 = Constant<0>
try.c: 0x55d55e384fe0: i32 = Constant<0>
try.c: 0x55d55e384fe0: i32 = Constant<0>
try.c: ...

Number of similar (compiler,implementation) pairs: 1, namely:
Compiler Implementations
clang -march=native -O2 -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE T:avx2

Compiler output

Implementation: T:avx2
Security model: timingleaks
Compiler: clang -march=native -O3 -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE
try.c: fatal error: error in backend: Cannot select: 0x5623b522a320: v4i64 = X86ISD::VTRUNC 0x5623b522a1f0
try.c: 0x5623b522a1f0: v16i32 = vselect 0x5623b5227d10, 0x5623b51c5280, 0x5623b522a0c0
try.c: 0x5623b5227d10: v4i1 = X86ISD::PCMPGTM 0x5623b5213f40, 0x5623b520fd10
try.c: 0x5623b5213f40: v4i64 = X86ISD::VBROADCAST 0x5623b51c5740
try.c: 0x5623b51c5740: i64,ch = load<LD8[%lsr.iv6971]> 0x5623b510ea10, 0x5623b51b5f10, undef:i64
try.c: 0x5623b51b5f10: i64,ch = CopyFromReg 0x5623b510ea10, Register:i64 %vreg50
try.c: 0x5623b520ff70: i64 = Register %vreg50
try.c: 0x5623b518d230: i64 = undef
try.c: 0x5623b520fd10: v4i64,ch = CopyFromReg 0x5623b510ea10, Register:v4i64 %vreg13
try.c: 0x5623b5214790: v4i64 = Register %vreg13
try.c: 0x5623b51c5280: v16i32 = X86ISD::VBROADCAST 0x5623b52141a0
try.c: 0x5623b52141a0: i32,ch = load<LD4[ConstantPool]> 0x5623b510ea10, 0x5623b51b44e0, undef:i64
try.c: 0x5623b51b44e0: i64 = X86ISD::WrapperRIP TargetConstantPool:i64<i32 1> 0
try.c: 0x5623b518dbb0: i64 = TargetConstantPool<i32 1> 0
try.c: 0x5623b518d230: i64 = undef
try.c: 0x5623b522a0c0: v16i32 = BUILD_VECTOR Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>
try.c: 0x5623b5229f90: i32 = Constant<0>
try.c: 0x5623b5229f90: i32 = Constant<0>
try.c: 0x5623b5229f90: i32 = Constant<0>
try.c: 0x5623b5229f90: i32 = Constant<0>
try.c: 0x5623b5229f90: i32 = Constant<0>
try.c: 0x5623b5229f90: i32 = Constant<0>
try.c: 0x5623b5229f90: i32 = Constant<0>
try.c: 0x5623b5229f90: i32 = Constant<0>
try.c: 0x5623b5229f90: i32 = Constant<0>
try.c: ...

Number of similar (compiler,implementation) pairs: 1, namely:
Compiler Implementations
clang -march=native -O3 -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE T:avx2

Compiler output

Implementation: T:avx2
Security model: timingleaks
Compiler: clang -march=native -O -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE
try.c: fatal error: error in backend: Cannot select: 0x559fe374b000: v4i64 = X86ISD::VTRUNC 0x559fe374aed0
try.c: 0x559fe374aed0: v16i32 = vselect 0x559fe3736ae0, 0x559fe36d5640, 0x559fe374ada0
try.c: 0x559fe3736ae0: v4i1 = X86ISD::PCMPGTM 0x559fe3730680, 0x559fe372c210
try.c: 0x559fe3730680: v4i64 = X86ISD::VBROADCAST 0x559fe36de710
try.c: 0x559fe36de710: i64,ch = load<LD8[%lsr.iv6971]> 0x559fe3641950, 0x559fe3727070, undef:i64
try.c: 0x559fe3727070: i64,ch = CopyFromReg 0x559fe3641950, Register:i64 %vreg50
try.c: 0x559fe372c470: i64 = Register %vreg50
try.c: 0x559fe36d3cb0: i64 = undef
try.c: 0x559fe372c210: v4i64,ch = CopyFromReg 0x559fe3641950, Register:v4i64 %vreg13
try.c: 0x559fe3730ed0: v4i64 = Register %vreg13
try.c: 0x559fe36d5640: v16i32 = X86ISD::VBROADCAST 0x559fe37308e0
try.c: 0x559fe37308e0: i32,ch = load<LD4[ConstantPool]> 0x559fe3641950, 0x559fe36ddcf0, undef:i64
try.c: 0x559fe36ddcf0: i64 = X86ISD::WrapperRIP TargetConstantPool:i64<i32 1> 0
try.c: 0x559fe3716470: i64 = TargetConstantPool<i32 1> 0
try.c: 0x559fe36d3cb0: i64 = undef
try.c: 0x559fe374ada0: v16i32 = BUILD_VECTOR Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>
try.c: 0x559fe374ac70: i32 = Constant<0>
try.c: 0x559fe374ac70: i32 = Constant<0>
try.c: 0x559fe374ac70: i32 = Constant<0>
try.c: 0x559fe374ac70: i32 = Constant<0>
try.c: 0x559fe374ac70: i32 = Constant<0>
try.c: 0x559fe374ac70: i32 = Constant<0>
try.c: 0x559fe374ac70: i32 = Constant<0>
try.c: 0x559fe374ac70: i32 = Constant<0>
try.c: 0x559fe374ac70: i32 = Constant<0>
try.c: ...

Number of similar (compiler,implementation) pairs: 1, namely:
Compiler Implementations
clang -march=native -O -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE T:avx2

Compiler output

Implementation: T:avx2
Security model: timingleaks
Compiler: clang -mcpu=native -O3 -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE
stream.c: stream.c:301:3: error: always_inline function '_mm256_set_epi64x' requires target feature 'sse4.2', but would be inlined into function 'ExpandKey' that is compiled without support for 'sse4.2'
stream.c: EK(A,B,C,D,rk,key);
stream.c: ^
stream.c: ./Speck128256AVX2.h:55:28: note: expanded from macro 'EK'
stream.c: #define EK(A,B,C,D,k,key) (RK(B,A,k,key,0), RK(C,A,k,key,1), RK(D,A,k,key,2), RK(B,A,k,key,3), RK(C,A,k,key,4), RK(D,A,k,key,5), RK(B,A,k,key,6), \
stream.c: ^
stream.c: ./Speck128256AVX2.h:53:28: note: expanded from macro 'RK'
stream.c: #define RK(X,Y,k,key,i) (SET1(k[i],Y), key[i]=Y, X=RCS(X,8), X+=Y, X^=i, Y=LCS(Y,3), Y^=X)
stream.c: ^
stream.c: ./Intrinsics_AVX2_128block.h:25:22: note: expanded from macro 'SET1'
stream.c: #define SET1(X,c) (X=SET(c,c,c,c))
stream.c: ^
stream.c: ./Intrinsics_AVX2_128block.h:24:13: note: expanded from macro 'SET'
stream.c: #define SET _mm256_set_epi64x
stream.c: ^
stream.c: stream.c:301:3: error: always_inline function '_mm256_set_epi64x' requires target feature 'sse4.2', but would be inlined into function 'ExpandKey' that is compiled without support for 'sse4.2'
stream.c: ./Speck128256AVX2.h:55:46: note: expanded from macro 'EK'
stream.c: #define EK(A,B,C,D,k,key) (RK(B,A,k,key,0), RK(C,A,k,key,1), RK(D,A,k,key,2), RK(B,A,k,key,3), RK(C,A,k,key,4), RK(D,A,k,key,5), RK(B,A,k,key,6), \
stream.c: ^
stream.c: ./Speck128256AVX2.h:53:28: note: expanded from macro 'RK'
stream.c: #define RK(X,Y,k,key,i) (SET1(k[i],Y), key[i]=Y, X=RCS(X,8), X+=Y, X^=i, Y=LCS(Y,3), Y^=X)
stream.c: ^
stream.c: ./Intrinsics_AVX2_128block.h:25:22: note: expanded from macro 'SET1'
stream.c: #define SET1(X,c) (X=SET(c,c,c,c))
stream.c: ^
stream.c: ...

Number of similar (compiler,implementation) pairs: 1, namely:
Compiler Implementations
clang -mcpu=native -O3 -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE T:avx2
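
The diagnostic above says that `ExpandKey` was compiled without the 'sse4.2' target feature under this flag set; the T:sse4 output for the same flags reports the same problem for 'ssse3'. One way to see which features a flag set actually enables is to test the compiler's predefined macros. The following is a minimal sketch and is not part of SUPERCOP or of the benchmarked implementations. The function name `enabled_features` is hypothetical; `__SSSE3__`, `__SSE4_2__`, `__AVX2__` and `__AVX512F__` are the macros GCC and clang predefine when the corresponding feature is on.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical helper: returns a space-separated list of the x86 features
   enabled for the translation unit this function is compiled in. */
const char *enabled_features(void)
{
    static char buf[64];
    buf[0] = '\0';
#ifdef __SSSE3__
    strcat(buf, "ssse3 ");    /* feature named in the _mm_shuffle_epi8 diagnostic */
#endif
#ifdef __SSE4_2__
    strcat(buf, "sse4.2 ");   /* feature named in the _mm256_set_epi64x diagnostic */
#endif
#ifdef __AVX2__
    strcat(buf, "avx2 ");
#endif
#ifdef __AVX512F__
    strcat(buf, "avx512f ");
#endif
    /* The longest possible list is far shorter than the buffer. */
    assert(strlen(buf) < sizeof buf);
    return buf;
}
```

Compiling this once per flag set from the report and printing the result would show whether the features the intrinsics require are on. An empty string means none of the four is enabled. The sketch does not explain why a given flag set leaves a feature off.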

Compiler output

Implementation: T:avx512
Security model: timingleaks
Compiler: clang -march=native -O2 -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE
stream.c: stream.c:139:5: warning: implicit declaration of function '_mm512_set_epi64' is invalid in C99 [-Wimplicit-function-declaration]
stream.c: SET1(X[0],nonce[1]); SET8(Y[0],nonce[0]);
stream.c: ^
stream.c: ./Intrinsics_AVX512_128block.h:25:22: note: expanded from macro 'SET1'
stream.c: #define SET1(X,c) (X=SET(c,c,c,c,c,c,c,c))
stream.c: ^
stream.c: ./Intrinsics_AVX512_128block.h:16:13: note: expanded from macro 'SET'
stream.c: #define SET _mm512_set_epi64
stream.c: ^
stream.c: stream.c:139:5: error: assigning to '__m512i' (vector of 8 'long long' values) from incompatible type 'int'
stream.c: SET1(X[0],nonce[1]); SET8(Y[0],nonce[0]);
stream.c: ^~~~~~~~~~~~~~~~~~~
stream.c: ./Intrinsics_AVX512_128block.h:25:21: note: expanded from macro 'SET1'
stream.c: #define SET1(X,c) (X=SET(c,c,c,c,c,c,c,c))
stream.c: ^~~~~~~~~~~~~~~~~~~~~
stream.c: stream.c:139:26: error: assigning to '__m512i' (vector of 8 'long long' values) from incompatible type 'int'
stream.c: SET1(X[0],nonce[1]); SET8(Y[0],nonce[0]);
stream.c: ^~~~~~~~~~~~~~~~~~~
stream.c: ./Intrinsics_AVX512_128block.h:26:21: note: expanded from macro 'SET8'
stream.c: #define SET8(X,c) (X=SET(c,c,c,c,c,c,c,c), X=ADD(X,_q8))
stream.c: ^~~~~~~~~~~~~~~~~~~~~
stream.c: stream.c:139:26: error: passing 'int' to parameter of incompatible type '__m512i' (vector of 8 'long long' values)
stream.c: SET1(X[0],nonce[1]); SET8(Y[0],nonce[0]);
stream.c: ^~~~~~~~~~~~~~~~~~~
stream.c: ./Intrinsics_AVX512_128block.h:26:52: note: expanded from macro 'SET8'
stream.c: ...

Number of similar (compiler,implementation) pairs: 5, namely:
Compiler Implementations
clang -march=native -O2 -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE T:avx512
clang -march=native -O3 -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE T:avx512
clang -march=native -O -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE T:avx512
clang -march=native -Os -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE T:avx512
clang -mcpu=native -O3 -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE T:avx512
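
The warning above indicates that this clang's headers do not declare `_mm512_set_epi64`, so the call is treated as returning `int` and the assignments to `__m512i` fail. The quoted `SET1` macro passes the same value to all eight lanes, which is a broadcast. A minimal sketch of that broadcast written with `_mm512_set1_epi64` follows. This is an assumed alternative, not the authors' code; the name `broadcast8` is hypothetical, and whether the clang version used here declares `_mm512_set1_epi64` was not checked. A scalar fallback keeps the sketch compilable without AVX-512.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#if defined(__AVX512F__)
#include <immintrin.h>
#endif

/* Hypothetical helper: write the 64-bit value c into all eight lanes of out. */
void broadcast8(uint64_t out[8], uint64_t c)
{
    assert(out != NULL);
#if defined(__AVX512F__)
    /* One broadcast replaces the eight-argument SET(c,c,c,c,c,c,c,c). */
    __m512i v = _mm512_set1_epi64((long long)c);
    _mm512_storeu_si512((void *)out, v);
#else
    /* Scalar fallback with the same observable result. */
    for (int i = 0; i < 8; i++)
        out[i] = c;
#endif
}
```

The quoted `SET8` macro also adds a per-lane constant `_q8` after the broadcast; that step is not covered here.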

Compiler output

Implementation: T:sse4
Security model: timingleaks
Compiler: clang -march=native -O2 -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE
try.c: fatal error: error in backend: Cannot select: 0x55bed0e4fae0: v4i64 = X86ISD::VTRUNC 0x55bed0e4f9b0
try.c: 0x55bed0e4f9b0: v16i32 = vselect 0x55bed0e47950, 0x55bed0ded380, 0x55bed0e4f880
try.c: 0x55bed0e47950: v4i1 = X86ISD::PCMPGTM 0x55bed0e46940, 0x55bed0e424d0
try.c: 0x55bed0e46940: v4i64 = X86ISD::VBROADCAST 0x55bed0dfcc60
try.c: 0x55bed0dfcc60: i64,ch = load<LD8[%lsr.iv6971]> 0x55bed0d57950, 0x55bed0e3d330, undef:i64
try.c: 0x55bed0e3d330: i64,ch = CopyFromReg 0x55bed0d57950, Register:i64 %vreg50
try.c: 0x55bed0e42730: i64 = Register %vreg50
try.c: 0x55bed0dfe130: i64 = undef
try.c: 0x55bed0e424d0: v4i64,ch = CopyFromReg 0x55bed0d57950, Register:v4i64 %vreg13
try.c: 0x55bed0e47190: v4i64 = Register %vreg13
try.c: 0x55bed0ded380: v16i32 = X86ISD::VBROADCAST 0x55bed0e46ba0
try.c: 0x55bed0e46ba0: i32,ch = load<LD4[ConstantPool]> 0x55bed0d57950, 0x55bed0dfc240, undef:i64
try.c: 0x55bed0dfc240: i64 = X86ISD::WrapperRIP TargetConstantPool:i64<i32 1> 0
try.c: 0x55bed0e39dc0: i64 = TargetConstantPool<i32 1> 0
try.c: 0x55bed0dfe130: i64 = undef
try.c: 0x55bed0e4f880: v16i32 = BUILD_VECTOR Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>
try.c: 0x55bed0e4f750: i32 = Constant<0>
try.c: 0x55bed0e4f750: i32 = Constant<0>
try.c: 0x55bed0e4f750: i32 = Constant<0>
try.c: 0x55bed0e4f750: i32 = Constant<0>
try.c: 0x55bed0e4f750: i32 = Constant<0>
try.c: 0x55bed0e4f750: i32 = Constant<0>
try.c: 0x55bed0e4f750: i32 = Constant<0>
try.c: 0x55bed0e4f750: i32 = Constant<0>
try.c: 0x55bed0e4f750: i32 = Constant<0>
try.c: ...

Number of similar (compiler,implementation) pairs: 1, namely:
Compiler Implementations
clang -march=native -O2 -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE T:sse4

Compiler output

Implementation: T:sse4
Security model: timingleaks
Compiler: clang -march=native -O3 -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE
try.c: fatal error: error in backend: Cannot select: 0x5636b7f297b0: v4i64 = X86ISD::VTRUNC 0x5636b7f29680
try.c: 0x5636b7f29680: v16i32 = vselect 0x5636b7f271a0, 0x5636b7eaa220, 0x5636b7f29550
try.c: 0x5636b7f271a0: v4i1 = X86ISD::PCMPGTM 0x5636b7f129d0, 0x5636b7f0e560
try.c: 0x5636b7f129d0: v4i64 = X86ISD::VBROADCAST 0x5636b7eaa6e0
try.c: 0x5636b7eaa6e0: i64,ch = load<LD8[%lsr.iv6971]> 0x5636b7e0ba40, 0x5636b7eaf000, undef:i64
try.c: 0x5636b7eaf000: i64,ch = CopyFromReg 0x5636b7e0ba40, Register:i64 %vreg50
try.c: 0x5636b7f0e7c0: i64 = Register %vreg50
try.c: 0x5636b7eb29b0: i64 = undef
try.c: 0x5636b7f0e560: v4i64,ch = CopyFromReg 0x5636b7e0ba40, Register:v4i64 %vreg13
try.c: 0x5636b7f13220: v4i64 = Register %vreg13
try.c: 0x5636b7eaa220: v16i32 = X86ISD::VBROADCAST 0x5636b7f12c30
try.c: 0x5636b7f12c30: i32,ch = load<LD4[ConstantPool]> 0x5636b7e0ba40, 0x5636b7eacdc0, undef:i64
try.c: 0x5636b7eacdc0: i64 = X86ISD::WrapperRIP TargetConstantPool:i64<i32 1> 0
try.c: 0x5636b7eb3330: i64 = TargetConstantPool<i32 1> 0
try.c: 0x5636b7eb29b0: i64 = undef
try.c: 0x5636b7f29550: v16i32 = BUILD_VECTOR Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>
try.c: 0x5636b7f29420: i32 = Constant<0>
try.c: 0x5636b7f29420: i32 = Constant<0>
try.c: 0x5636b7f29420: i32 = Constant<0>
try.c: 0x5636b7f29420: i32 = Constant<0>
try.c: 0x5636b7f29420: i32 = Constant<0>
try.c: 0x5636b7f29420: i32 = Constant<0>
try.c: 0x5636b7f29420: i32 = Constant<0>
try.c: 0x5636b7f29420: i32 = Constant<0>
try.c: 0x5636b7f29420: i32 = Constant<0>
try.c: ...

Number of similar (compiler,implementation) pairs: 1, namely:
Compiler Implementations
clang -march=native -O3 -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE T:sse4

Compiler output

Implementation: T:sse4
Security model: timingleaks
Compiler: clang -march=native -O -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE
try.c: fatal error: error in backend: Cannot select: 0x55bdcfc72270: v4i64 = X86ISD::VTRUNC 0x55bdcfc72140
try.c: 0x55bdcfc72140: v16i32 = vselect 0x55bdcfc5e650, 0x55bdcfbff2a0, 0x55bdcfc72010
try.c: 0x55bdcfc5e650: v4i1 = X86ISD::PCMPGTM 0x55bdcfc56a90, 0x55bdcfc52620
try.c: 0x55bdcfc56a90: v4i64 = X86ISD::VBROADCAST 0x55bdcfbfc440
try.c: 0x55bdcfbfc440: i64,ch = load<LD8[%lsr.iv6971]> 0x55bdcfb67950, 0x55bdcfc41410, undef:i64
try.c: 0x55bdcfc41410: i64,ch = CopyFromReg 0x55bdcfb67950, Register:i64 %vreg50
try.c: 0x55bdcfc52880: i64 = Register %vreg50
try.c: 0x55bdcfbfd910: i64 = undef
try.c: 0x55bdcfc52620: v4i64,ch = CopyFromReg 0x55bdcfb67950, Register:v4i64 %vreg13
try.c: 0x55bdcfc572e0: v4i64 = Register %vreg13
try.c: 0x55bdcfbff2a0: v16i32 = X86ISD::VBROADCAST 0x55bdcfc56cf0
try.c: 0x55bdcfc56cf0: i32,ch = load<LD4[ConstantPool]> 0x55bdcfb67950, 0x55bdcfbfa780, undef:i64
try.c: 0x55bdcfbfa780: i64 = X86ISD::WrapperRIP TargetConstantPool:i64<i32 1> 0
try.c: 0x55bdcfc3c1e0: i64 = TargetConstantPool<i32 1> 0
try.c: 0x55bdcfbfd910: i64 = undef
try.c: 0x55bdcfc72010: v16i32 = BUILD_VECTOR Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>
try.c: 0x55bdcfc71ee0: i32 = Constant<0>
try.c: 0x55bdcfc71ee0: i32 = Constant<0>
try.c: 0x55bdcfc71ee0: i32 = Constant<0>
try.c: 0x55bdcfc71ee0: i32 = Constant<0>
try.c: 0x55bdcfc71ee0: i32 = Constant<0>
try.c: 0x55bdcfc71ee0: i32 = Constant<0>
try.c: 0x55bdcfc71ee0: i32 = Constant<0>
try.c: 0x55bdcfc71ee0: i32 = Constant<0>
try.c: 0x55bdcfc71ee0: i32 = Constant<0>
try.c: ...

Number of similar (compiler,implementation) pairs: 1, namely:
Compiler Implementations
clang -march=native -O -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE T:sse4

Compiler output

Implementation: T:sse4
Security model: timingleaks
Compiler: clang -mcpu=native -O3 -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE
stream.c: stream.c:118:21: error: always_inline function '_mm_shuffle_epi8' requires target feature 'ssse3', but would be inlined into function 'Encrypt' that is compiled without support for 'ssse3'
stream.c: if (numbytes==32) Enc(X,Y,rk,2);
stream.c: ^
stream.c: ./Speck128256SSE4.h:42:23: note: expanded from macro 'Enc'
stream.c: #define Enc(X,Y,k,n) (Rx##n(X,Y,k[0]), Rx##n(X,Y,k[1]), Rx##n(X,Y,k[2]), Rx##n(X,Y,k[3]), Rx##n(X,Y,k[4]), Rx##n(X,Y,k[5]), Rx##n(X,Y,k[6]), Rx##n(X,Y,k[7]), \
stream.c: ^
stream.c: <scratch space>:74:1: note: expanded from here
stream.c: Rx2
stream.c: ^
stream.c: ./Speck128256SSE4.h:25:21: note: expanded from macro 'Rx2'
stream.c: #define Rx2(X,Y,k) (R(X[0],Y[0],k))
stream.c: ^
stream.c: ./Speck128256SSE4.h:23:29: note: expanded from macro 'R'
stream.c: #define R(X,Y,k) (X=XOR(ADD(ROR8(X),Y),k), Y=XOR(ROL(Y,3),X))
stream.c: ^
stream.c: ./Intrinsics_SSE4_128block.h:41:19: note: expanded from macro 'ROR8'
stream.c: #define ROR8(X) (SHFL(X,R8))
stream.c: ^
stream.c: ./Intrinsics_SSE4_128block.h:37:14: note: expanded from macro 'SHFL'
stream.c: #define SHFL _mm_shuffle_epi8
stream.c: ^
stream.c: stream.c:118:21: error: always_inline function '_mm_shuffle_epi8' requires target feature 'ssse3', but would be inlined into function 'Encrypt' that is compiled without support for 'ssse3'
stream.c: ./Speck128256SSE4.h:42:41: note: expanded from macro 'Enc'
stream.c: #define Enc(X,Y,k,n) (Rx##n(X,Y,k[0]), Rx##n(X,Y,k[1]), Rx##n(X,Y,k[2]), Rx##n(X,Y,k[3]), Rx##n(X,Y,k[4]), Rx##n(X,Y,k[5]), Rx##n(X,Y,k[6]), Rx##n(X,Y,k[7]), \
stream.c: ^
stream.c: ...

Number of similar (compiler,implementation) pairs: 1, namely:
Compiler Implementations
clang -mcpu=native -O3 -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE T:sse4
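
The `R` macro quoted in the diagnostics above and the `RK` macro quoted in the T:avx2 output for the same flags spell out the Speck round and key-schedule step on vector registers. A scalar reading of those two macros can be sketched as follows. This is a sketch for orientation, not the benchmarked code. All function names are hypothetical, the round count of 34 comes from the Speck128/256 specification rather than from this log, and how the four key words are loaded from the key bytes is not shown in the log, so they are taken as plain parameters.

```c
#include <assert.h>
#include <stdint.h>

#define SPECK128256_ROUNDS 34  /* from the Speck specification, not from the log */

/* 64-bit rotations; r must be in 1..63 to avoid an undefined shift. */
static inline uint64_t rol64(uint64_t x, unsigned r)
{
    assert(r > 0 && r < 64);
    return (x << r) | (x >> (64 - r));
}

static inline uint64_t ror64(uint64_t x, unsigned r)
{
    assert(r > 0 && r < 64);
    return (x >> r) | (x << (64 - r));
}

/* One round, following R(X,Y,k): X=XOR(ADD(ROR8(X),Y),k), Y=XOR(ROL(Y,3),X). */
void speck_round(uint64_t *x, uint64_t *y, uint64_t k)
{
    *x = (ror64(*x, 8) + *y) ^ k;
    *y = rol64(*y, 3) ^ *x;
}

/* Inverse of speck_round: undo the Y update first, then the X update. */
void speck_unround(uint64_t *x, uint64_t *y, uint64_t k)
{
    *y = ror64(*y ^ *x, 3);
    *x = rol64((*x ^ k) - *y, 8);
}

/* Key schedule, following RK(X,Y,k,key,i): the round key is the current Y,
   then X=ROR(X,8), X+=Y, X^=i, Y=ROL(Y,3), Y^=X. As in the quoted EK macro,
   a plays the role of Y in every step while b, c, d take turns as X. */
void speck128256_expand(uint64_t a, uint64_t b, uint64_t c, uint64_t d,
                        uint64_t rk[SPECK128256_ROUNDS])
{
    uint64_t l[3] = { b, c, d };
    for (unsigned i = 0; i < SPECK128256_ROUNDS; i++) {
        uint64_t *x = &l[i % 3];
        rk[i] = a;
        *x = (ror64(*x, 8) + a) ^ i;
        a = rol64(a, 3) ^ *x;
    }
}

/* Encrypt one 128-bit block held as two 64-bit words. */
void speck128256_encrypt(uint64_t *x, uint64_t *y, const uint64_t *rk)
{
    for (unsigned i = 0; i < SPECK128256_ROUNDS; i++)
        speck_round(x, y, rk[i]);
}

/* Decrypt by applying the inverse rounds in reverse key order. */
void speck128256_decrypt(uint64_t *x, uint64_t *y, const uint64_t *rk)
{
    for (unsigned i = SPECK128256_ROUNDS; i > 0; i--)
        speck_unround(x, y, rk[i - 1]);
}
```

The vector implementations apply the same operations to several blocks at once. `ROR8` is expressed there as a byte shuffle (`_mm_shuffle_epi8` in the T:sse4 headers), which is why that intrinsic's target feature appears in the error.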