Implementation notes: amd64, cel02, crypto_stream/speck128128ctr

Computer: cel02
Architecture: amd64
CPU ID: GenuineIntel-00050657-bfebfbff
SUPERCOP version: 20201130
Operation: crypto_stream
Primitive: speck128128ctr
Time  Object size  Test size      Implementation  Compiler                                                                             Benchmark date  SUPERCOP version
1152  36728 0 0    53349 824 888  T:avx512        gcc_-march=native_-mtune=native_-O3_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE         20201211        20201130
1286  35945 0 0    49340 816 856  T:avx512        gcc_-march=native_-mtune=native_-O2_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE         20201211        20201130
1508  35467 0 0    48492 816 856  T:avx512        gcc_-march=native_-mtune=native_-O_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE          20201211        20201130
1676  35260 0 0    47160 800 824  T:avx512        gcc_-march=native_-mtune=native_-Os_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE         20201211        20201130
2008  33053 0 0    46036 816 856  T:avx2          gcc_-march=native_-mtune=native_-O_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE          20201211        20201130
2034  34933 0 0    51525 824 888  T:avx2          gcc_-march=native_-mtune=native_-O3_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE         20201211        20201130
3794  32876 0 0    44776 800 824  T:avx2          gcc_-march=native_-mtune=native_-Os_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE         20201211        20201130
4296  40047 0 0    52060 792 800  T:avx2          clang_-march=native_-Os_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE  20201211        20201130
4648  34084 0 0    47436 816 856  T:avx2          gcc_-march=native_-mtune=native_-O2_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE         20201211        20201130
6308  34407 0 0    46364 792 800  T:sse4          clang_-march=native_-Os_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE  20201211        20201130
6472  40898 0 0    57517 824 888  T:sse4          gcc_-march=native_-mtune=native_-O3_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE         20201211        20201130
6650  27708 0 0    41076 816 856  T:sse4          gcc_-march=native_-mtune=native_-O2_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE         20201211        20201130
7216  28065 0 0    39976 800 824  T:sse4          gcc_-march=native_-mtune=native_-Os_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE         20201211        20201130
7970  28096 0 0    41092 816 856  T:sse4          gcc_-march=native_-mtune=native_-O_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE          20201211        20201130

Compiler output

Implementation: T:avx2
Security model: timingleaks
Compiler: clang -march=native -O2 -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE
try.c: fatal error: error in backend: Cannot select: 0x55e9df602fa0: v4i64 = X86ISD::VTRUNC 0x55e9df602e70
try.c: 0x55e9df602e70: v16i32 = vselect 0x55e9df619510, 0x55e9df59f850, 0x55e9df602d40
try.c: 0x55e9df619510: v4i1 = X86ISD::PCMPGTM 0x55e9df5fd970, 0x55e9df5f9500
try.c: 0x55e9df5fd970: v4i64 = X86ISD::VBROADCAST 0x55e9df5c4570
try.c: 0x55e9df5c4570: i64,ch = load<LD8[%lsr.iv6971]> 0x55e9df50e900, 0x55e9df5f04c0, undef:i64
try.c: 0x55e9df5f04c0: i64,ch = CopyFromReg 0x55e9df50e900, Register:i64 %vreg50
try.c: 0x55e9df5f9760: i64 = Register %vreg50
try.c: 0x55e9df59dec0: i64 = undef
try.c: 0x55e9df5f9500: v4i64,ch = CopyFromReg 0x55e9df50e900, Register:v4i64 %vreg13
try.c: 0x55e9df5fe1c0: v4i64 = Register %vreg13
try.c: 0x55e9df59f850: v16i32 = X86ISD::VBROADCAST 0x55e9df5fdbd0
try.c: 0x55e9df5fdbd0: i32,ch = load<LD4[ConstantPool]> 0x55e9df50e900, 0x55e9df5c3b50, undef:i64
try.c: 0x55e9df5c3b50: i64 = X86ISD::WrapperRIP TargetConstantPool:i64<i32 1> 0
try.c: 0x55e9df5a3c50: i64 = TargetConstantPool<i32 1> 0
try.c: 0x55e9df59dec0: i64 = undef
try.c: 0x55e9df602d40: v16i32 = BUILD_VECTOR Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>
try.c: 0x55e9df602c10: i32 = Constant<0>
try.c: 0x55e9df602c10: i32 = Constant<0>
try.c: 0x55e9df602c10: i32 = Constant<0>
try.c: 0x55e9df602c10: i32 = Constant<0>
try.c: 0x55e9df602c10: i32 = Constant<0>
try.c: 0x55e9df602c10: i32 = Constant<0>
try.c: 0x55e9df602c10: i32 = Constant<0>
try.c: 0x55e9df602c10: i32 = Constant<0>
try.c: 0x55e9df602c10: i32 = Constant<0>
try.c: ...

Number of similar (compiler,implementation) pairs: 1, namely:
Compiler                                                                             Implementations
clang -march=native -O2 -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE  T:avx2

Compiler output

Implementation: T:avx2
Security model: timingleaks
Compiler: clang -march=native -O3 -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE
try.c: fatal error: error in backend: Cannot select: 0x562b3c0b75d0: v4i64 = X86ISD::VTRUNC 0x562b3c0b74a0
try.c: 0x562b3c0b74a0: v16i32 = vselect 0x562b3c0bc900, 0x562b3c0391f0, 0x562b3c0b7370
try.c: 0x562b3c0bc900: v4i1 = X86ISD::PCMPGTM 0x562b3c09d7d0, 0x562b3c099360
try.c: 0x562b3c09d7d0: v4i64 = X86ISD::VBROADCAST 0x562b3c0396b0
try.c: 0x562b3c0396b0: i64,ch = load<LD8[%lsr.iv6971]> 0x562b3bf97a30, 0x562b3c032fa0, undef:i64
try.c: 0x562b3c032fa0: i64,ch = CopyFromReg 0x562b3bf97a30, Register:i64 %vreg50
try.c: 0x562b3c0995c0: i64 = Register %vreg50
try.c: 0x562b3c047130: i64 = undef
try.c: 0x562b3c099360: v4i64,ch = CopyFromReg 0x562b3bf97a30, Register:v4i64 %vreg13
try.c: 0x562b3c09e020: v4i64 = Register %vreg13
try.c: 0x562b3c0391f0: v16i32 = X86ISD::VBROADCAST 0x562b3c09da30
try.c: 0x562b3c09da30: i32,ch = load<LD4[ConstantPool]> 0x562b3bf97a30, 0x562b3c030d60, undef:i64
try.c: 0x562b3c030d60: i64 = X86ISD::WrapperRIP TargetConstantPool:i64<i32 1> 0
try.c: 0x562b3c047ab0: i64 = TargetConstantPool<i32 1> 0
try.c: 0x562b3c047130: i64 = undef
try.c: 0x562b3c0b7370: v16i32 = BUILD_VECTOR Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>
try.c: 0x562b3c0b7240: i32 = Constant<0>
try.c: 0x562b3c0b7240: i32 = Constant<0>
try.c: 0x562b3c0b7240: i32 = Constant<0>
try.c: 0x562b3c0b7240: i32 = Constant<0>
try.c: 0x562b3c0b7240: i32 = Constant<0>
try.c: 0x562b3c0b7240: i32 = Constant<0>
try.c: 0x562b3c0b7240: i32 = Constant<0>
try.c: 0x562b3c0b7240: i32 = Constant<0>
try.c: 0x562b3c0b7240: i32 = Constant<0>
try.c: ...

Number of similar (compiler,implementation) pairs: 1, namely:
Compiler                                                                             Implementations
clang -march=native -O3 -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE  T:avx2

Compiler output

Implementation: T:avx2
Security model: timingleaks
Compiler: clang -march=native -O -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE
try.c: fatal error: error in backend: Cannot select: 0x56349d761d60: v4i64 = X86ISD::VTRUNC 0x56349d761c30
try.c: 0x56349d761c30: v16i32 = vselect 0x56349d75ab20, 0x56349d6ff590, 0x56349d761b00
try.c: 0x56349d75ab20: v4i1 = X86ISD::PCMPGTM 0x56349d757af0, 0x56349d753680
try.c: 0x56349d757af0: v4i64 = X86ISD::VBROADCAST 0x56349d6fb630
try.c: 0x56349d6fb630: i64,ch = load<LD8[%lsr.iv6971]> 0x56349d6689c0, 0x56349d6b91f0, undef:i64
try.c: 0x56349d6b91f0: i64,ch = CopyFromReg 0x56349d6689c0, Register:i64 %vreg50
try.c: 0x56349d7538e0: i64 = Register %vreg50
try.c: 0x56349d6fdc00: i64 = undef
try.c: 0x56349d753680: v4i64,ch = CopyFromReg 0x56349d6689c0, Register:v4i64 %vreg13
try.c: 0x56349d758340: v4i64 = Register %vreg13
try.c: 0x56349d6ff590: v16i32 = X86ISD::VBROADCAST 0x56349d757d50
try.c: 0x56349d757d50: i32,ch = load<LD4[ConstantPool]> 0x56349d6689c0, 0x56349d6fac10, undef:i64
try.c: 0x56349d6fac10: i64 = X86ISD::WrapperRIP TargetConstantPool:i64<i32 1> 0
try.c: 0x56349d741710: i64 = TargetConstantPool<i32 1> 0
try.c: 0x56349d6fdc00: i64 = undef
try.c: 0x56349d761b00: v16i32 = BUILD_VECTOR Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>
try.c: 0x56349d7619d0: i32 = Constant<0>
try.c: 0x56349d7619d0: i32 = Constant<0>
try.c: 0x56349d7619d0: i32 = Constant<0>
try.c: 0x56349d7619d0: i32 = Constant<0>
try.c: 0x56349d7619d0: i32 = Constant<0>
try.c: 0x56349d7619d0: i32 = Constant<0>
try.c: 0x56349d7619d0: i32 = Constant<0>
try.c: 0x56349d7619d0: i32 = Constant<0>
try.c: 0x56349d7619d0: i32 = Constant<0>
try.c: ...

Number of similar (compiler,implementation) pairs: 1, namely:
Compiler                                                                             Implementations
clang -march=native -O -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE   T:avx2

Compiler output

Implementation: T:avx2
Security model: timingleaks
Compiler: clang -mcpu=native -O3 -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE
stream.c: stream.c:288:3: error: always_inline function '_mm256_set_epi64x' requires target feature 'sse4.2', but would be inlined into function 'ExpandKey' that is compiled without support for 'sse4.2'
stream.c: EK(A,B,rk,key);
stream.c: ^
stream.c: ./Speck128128AVX2.h:54:24: note: expanded from macro 'EK'
stream.c: #define EK(A,B,k,key) (RK(B,A,k,key,0), RK(B,A,k,key,1), RK(B,A,k,key,2), RK(B,A,k,key,3), RK(B,A,k,key,4), RK(B,A,k,key,5), RK(B,A,k,key,6), \
stream.c: ^
stream.c: ./Speck128128AVX2.h:52:28: note: expanded from macro 'RK'
stream.c: #define RK(X,Y,k,key,i) (SET1(k[i],Y), key[i]=Y, X=RCS(X,8), X+=Y, X^=i, Y=LCS(Y,3), Y^=X)
stream.c: ^
stream.c: ./Intrinsics_AVX2_128block.h:25:22: note: expanded from macro 'SET1'
stream.c: #define SET1(X,c) (X=SET(c,c,c,c))
stream.c: ^
stream.c: ./Intrinsics_AVX2_128block.h:24:13: note: expanded from macro 'SET'
stream.c: #define SET _mm256_set_epi64x
stream.c: ^
stream.c: stream.c:288:3: error: always_inline function '_mm256_set_epi64x' requires target feature 'sse4.2', but would be inlined into function 'ExpandKey' that is compiled without support for 'sse4.2'
stream.c: ./Speck128128AVX2.h:54:42: note: expanded from macro 'EK'
stream.c: #define EK(A,B,k,key) (RK(B,A,k,key,0), RK(B,A,k,key,1), RK(B,A,k,key,2), RK(B,A,k,key,3), RK(B,A,k,key,4), RK(B,A,k,key,5), RK(B,A,k,key,6), \
stream.c: ^
stream.c: ./Speck128128AVX2.h:52:28: note: expanded from macro 'RK'
stream.c: #define RK(X,Y,k,key,i) (SET1(k[i],Y), key[i]=Y, X=RCS(X,8), X+=Y, X^=i, Y=LCS(Y,3), Y^=X)
stream.c: ^
stream.c: ./Intrinsics_AVX2_128block.h:25:22: note: expanded from macro 'SET1'
stream.c: #define SET1(X,c) (X=SET(c,c,c,c))
stream.c: ^
stream.c: ...

Number of similar (compiler,implementation) pairs: 1, namely:
Compiler                                                                             Implementations
clang -mcpu=native -O3 -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE   T:avx2
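
For readers following the RK macro quoted in the diagnostics above, the following is a minimal scalar sketch of what one key-schedule step appears to compute. It is not the benchmarked code. It assumes that RCS and LCS denote right and left circular shifts on 64-bit words, that key[0] holds the low key word, and that the schedule runs for 32 rounds (the round count of Speck128/128; the EK macro is truncated in the log). The names ror64, rol64, speck_round, and speck_expand_key are hypothetical.

```c
#include <stdint.h>

#define SPECK_ROUNDS 32 /* Speck128/128 uses 32 rounds */

/* Rotate a 64-bit word right or left by r bits, with 0 < r < 64. */
uint64_t ror64(uint64_t v, unsigned r) { return (v >> r) | (v << (64 - r)); }
uint64_t rol64(uint64_t v, unsigned r) { return (v << r) | (v >> (64 - r)); }

/* One Speck round on the pair (x, y) with round key k. */
void speck_round(uint64_t *x, uint64_t *y, uint64_t k)
{
    *x = (ror64(*x, 8) + *y) ^ k;
    *y = rol64(*y, 3) ^ *x;
}

/* Scalar counterpart of RK(B,A,k,key,i): record A as round key i,
   then advance (B, A) by one round that uses i as the round key. */
void speck_expand_key(const uint64_t key[2], uint64_t rk[SPECK_ROUNDS])
{
    uint64_t a = key[0]; /* assumption: key[0] is the low key word */
    uint64_t b = key[1];
    for (unsigned i = 0; i < SPECK_ROUNDS; i++) {
        rk[i] = a;
        speck_round(&b, &a, (uint64_t)i);
    }
}
```

The vectorized implementations broadcast each round key into every lane with SET1; the scalar version simply stores one 64-bit word per round.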

Compiler output

Implementation: T:avx512
Security model: timingleaks
Compiler: clang -march=native -O2 -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE
stream.c: stream.c:135:5: warning: implicit declaration of function '_mm512_set_epi64' is invalid in C99 [-Wimplicit-function-declaration]
stream.c: SET1(X[0],nonce[1]); SET8(Y[0],nonce[0]);
stream.c: ^
stream.c: ./Intrinsics_AVX512_128block.h:25:22: note: expanded from macro 'SET1'
stream.c: #define SET1(X,c) (X=SET(c,c,c,c,c,c,c,c))
stream.c: ^
stream.c: ./Intrinsics_AVX512_128block.h:16:13: note: expanded from macro 'SET'
stream.c: #define SET _mm512_set_epi64
stream.c: ^
stream.c: stream.c:135:5: error: assigning to '__m512i' (vector of 8 'long long' values) from incompatible type 'int'
stream.c: SET1(X[0],nonce[1]); SET8(Y[0],nonce[0]);
stream.c: ^~~~~~~~~~~~~~~~~~~
stream.c: ./Intrinsics_AVX512_128block.h:25:21: note: expanded from macro 'SET1'
stream.c: #define SET1(X,c) (X=SET(c,c,c,c,c,c,c,c))
stream.c: ^~~~~~~~~~~~~~~~~~~~~
stream.c: stream.c:135:26: error: assigning to '__m512i' (vector of 8 'long long' values) from incompatible type 'int'
stream.c: SET1(X[0],nonce[1]); SET8(Y[0],nonce[0]);
stream.c: ^~~~~~~~~~~~~~~~~~~
stream.c: ./Intrinsics_AVX512_128block.h:26:21: note: expanded from macro 'SET8'
stream.c: #define SET8(X,c) (X=SET(c,c,c,c,c,c,c,c), X=ADD(X,_q8))
stream.c: ^~~~~~~~~~~~~~~~~~~~~
stream.c: stream.c:135:26: error: passing 'int' to parameter of incompatible type '__m512i' (vector of 8 'long long' values)
stream.c: SET1(X[0],nonce[1]); SET8(Y[0],nonce[0]);
stream.c: ^~~~~~~~~~~~~~~~~~~
stream.c: ./Intrinsics_AVX512_128block.h:26:52: note: expanded from macro 'SET8'
stream.c: ...

Number of similar (compiler,implementation) pairs: 5, namely:
Compiler                                                                             Implementations
clang -march=native -O2 -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE  T:avx512
clang -march=native -O3 -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE  T:avx512
clang -march=native -O -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE   T:avx512
clang -march=native -Os -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE  T:avx512
clang -mcpu=native -O3 -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE   T:avx512

Compiler output

Implementation: T:sse4
Security model: timingleaks
Compiler: clang -march=native -O2 -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE
try.c: fatal error: error in backend: Cannot select: 0x555d94af35f0: v4i64 = X86ISD::VTRUNC 0x555d94af34c0
try.c: 0x555d94af34c0: v16i32 = vselect 0x555d94aedfc0, 0x555d94a99050, 0x555d94af3390
try.c: 0x555d94aedfc0: v4i1 = X86ISD::PCMPGTM 0x555d94aecfb0, 0x555d94ae8b40
try.c: 0x555d94aecfb0: v4i64 = X86ISD::VBROADCAST 0x555d94a8bfd0
try.c: 0x555d94a8bfd0: i64,ch = load<LD8[%lsr.iv6971]> 0x555d949fd950, 0x555d94ae06f0, undef:i64
try.c: 0x555d94ae06f0: i64,ch = CopyFromReg 0x555d949fd950, Register:i64 %vreg50
try.c: 0x555d94ae8da0: i64 = Register %vreg50
try.c: 0x555d94a976c0: i64 = undef
try.c: 0x555d94ae8b40: v4i64,ch = CopyFromReg 0x555d949fd950, Register:v4i64 %vreg13
try.c: 0x555d94aed800: v4i64 = Register %vreg13
try.c: 0x555d94a99050: v16i32 = X86ISD::VBROADCAST 0x555d94aed210
try.c: 0x555d94aed210: i32,ch = load<LD4[ConstantPool]> 0x555d949fd950, 0x555d94a8b5b0, undef:i64
try.c: 0x555d94a8b5b0: i64 = X86ISD::WrapperRIP TargetConstantPool:i64<i32 1> 0
try.c: 0x555d94a4dd90: i64 = TargetConstantPool<i32 1> 0
try.c: 0x555d94a976c0: i64 = undef
try.c: 0x555d94af3390: v16i32 = BUILD_VECTOR Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>
try.c: 0x555d94af3260: i32 = Constant<0>
try.c: 0x555d94af3260: i32 = Constant<0>
try.c: 0x555d94af3260: i32 = Constant<0>
try.c: 0x555d94af3260: i32 = Constant<0>
try.c: 0x555d94af3260: i32 = Constant<0>
try.c: 0x555d94af3260: i32 = Constant<0>
try.c: 0x555d94af3260: i32 = Constant<0>
try.c: 0x555d94af3260: i32 = Constant<0>
try.c: 0x555d94af3260: i32 = Constant<0>
try.c: ...

Number of similar (compiler,implementation) pairs: 1, namely:
Compiler                                                                             Implementations
clang -march=native -O2 -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE  T:sse4

Compiler output

Implementation: T:sse4
Security model: timingleaks
Compiler: clang -march=native -O3 -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE
try.c: fatal error: error in backend: Cannot select: 0x558d93fb7f00: v4i64 = X86ISD::VTRUNC 0x558d93fb7dd0
try.c: 0x558d93fb7dd0: v16i32 = vselect 0x558d93fa7d30, 0x558d93f37240, 0x558d93fb7ca0
try.c: 0x558d93fa7d30: v4i1 = X86ISD::PCMPGTM 0x558d93f9f960, 0x558d93f9b4f0
try.c: 0x558d93f9f960: v4i64 = X86ISD::VBROADCAST 0x558d93f37700
try.c: 0x558d93f37700: i64,ch = load<LD8[%lsr.iv6971]> 0x558d93e98a30, 0x558d93f484d0, undef:i64
try.c: 0x558d93f484d0: i64,ch = CopyFromReg 0x558d93e98a30, Register:i64 %vreg50
try.c: 0x558d93f9b750: i64 = Register %vreg50
try.c: 0x558d93f33090: i64 = undef
try.c: 0x558d93f9b4f0: v4i64,ch = CopyFromReg 0x558d93e98a30, Register:v4i64 %vreg13
try.c: 0x558d93fa01b0: v4i64 = Register %vreg13
try.c: 0x558d93f37240: v16i32 = X86ISD::VBROADCAST 0x558d93f9fbc0
try.c: 0x558d93f9fbc0: i32,ch = load<LD4[ConstantPool]> 0x558d93e98a30, 0x558d93f4d8d0, undef:i64
try.c: 0x558d93f4d8d0: i64 = X86ISD::WrapperRIP TargetConstantPool:i64<i32 1> 0
try.c: 0x558d93f33a10: i64 = TargetConstantPool<i32 1> 0
try.c: 0x558d93f33090: i64 = undef
try.c: 0x558d93fb7ca0: v16i32 = BUILD_VECTOR Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>
try.c: 0x558d93fb7b70: i32 = Constant<0>
try.c: 0x558d93fb7b70: i32 = Constant<0>
try.c: 0x558d93fb7b70: i32 = Constant<0>
try.c: 0x558d93fb7b70: i32 = Constant<0>
try.c: 0x558d93fb7b70: i32 = Constant<0>
try.c: 0x558d93fb7b70: i32 = Constant<0>
try.c: 0x558d93fb7b70: i32 = Constant<0>
try.c: 0x558d93fb7b70: i32 = Constant<0>
try.c: 0x558d93fb7b70: i32 = Constant<0>
try.c: ...

Number of similar (compiler,implementation) pairs: 1, namely:
Compiler                                                                             Implementations
clang -march=native -O3 -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE  T:sse4

Compiler output

Implementation: T:sse4
Security model: timingleaks
Compiler: clang -march=native -O -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE
try.c: fatal error: error in backend: Cannot select: 0x564473177220: v4i64 = X86ISD::VTRUNC 0x5644731770f0
try.c: 0x5644731770f0: v16i32 = vselect 0x5644731629c0, 0x5644730fbb80, 0x564473176fc0
try.c: 0x5644731629c0: v4i1 = X86ISD::PCMPGTM 0x56447315b890, 0x564473157420
try.c: 0x56447315b890: v4i64 = X86ISD::VBROADCAST 0x5644730ff910
try.c: 0x5644730ff910: i64,ch = load<LD8[%lsr.iv6971]> 0x56447306c9d0, 0x56447314e750, undef:i64
try.c: 0x56447314e750: i64,ch = CopyFromReg 0x56447306c9d0, Register:i64 %vreg50
try.c: 0x564473157680: i64 = Register %vreg50
try.c: 0x5644730fa1f0: i64 = undef
try.c: 0x564473157420: v4i64,ch = CopyFromReg 0x56447306c9d0, Register:v4i64 %vreg13
try.c: 0x56447315c0e0: v4i64 = Register %vreg13
try.c: 0x5644730fbb80: v16i32 = X86ISD::VBROADCAST 0x56447315baf0
try.c: 0x56447315baf0: i32,ch = load<LD4[ConstantPool]> 0x56447306c9d0, 0x5644730feef0, undef:i64
try.c: 0x5644730feef0: i64 = X86ISD::WrapperRIP TargetConstantPool:i64<i32 1> 0
try.c: 0x564473145410: i64 = TargetConstantPool<i32 1> 0
try.c: 0x5644730fa1f0: i64 = undef
try.c: 0x564473176fc0: v16i32 = BUILD_VECTOR Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>
try.c: 0x564473176e90: i32 = Constant<0>
try.c: 0x564473176e90: i32 = Constant<0>
try.c: 0x564473176e90: i32 = Constant<0>
try.c: 0x564473176e90: i32 = Constant<0>
try.c: 0x564473176e90: i32 = Constant<0>
try.c: 0x564473176e90: i32 = Constant<0>
try.c: 0x564473176e90: i32 = Constant<0>
try.c: 0x564473176e90: i32 = Constant<0>
try.c: 0x564473176e90: i32 = Constant<0>
try.c: ...

Number of similar (compiler,implementation) pairs: 1, namely:
Compiler                                                                             Implementations
clang -march=native -O -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE   T:sse4

Compiler output

Implementation: T:sse4
Security model: timingleaks
Compiler: clang -mcpu=native -O3 -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE
stream.c: stream.c:114:21: error: always_inline function '_mm_shuffle_epi8' requires target feature 'ssse3', but would be inlined into function 'Encrypt' that is compiled without support for 'ssse3'
stream.c: if (numbytes==32) Enc(X,Y,rk,2);
stream.c: ^
stream.c: ./Speck128128SSE4.h:42:23: note: expanded from macro 'Enc'
stream.c: #define Enc(X,Y,k,n) (Rx##n(X,Y,k[0]), Rx##n(X,Y,k[1]), Rx##n(X,Y,k[2]), Rx##n(X,Y,k[3]), Rx##n(X,Y,k[4]), Rx##n(X,Y,k[5]), Rx##n(X,Y,k[6]), Rx##n(X,Y,k[7]), \
stream.c: ^
stream.c: <scratch space>:72:1: note: expanded from here
stream.c: Rx2
stream.c: ^
stream.c: ./Speck128128SSE4.h:25:21: note: expanded from macro 'Rx2'
stream.c: #define Rx2(X,Y,k) (R(X[0],Y[0],k))
stream.c: ^
stream.c: ./Speck128128SSE4.h:23:29: note: expanded from macro 'R'
stream.c: #define R(X,Y,k) (X=XOR(ADD(ROR8(X),Y),k), Y=XOR(ROL(Y,3),X))
stream.c: ^
stream.c: ./Intrinsics_SSE4_128block.h:41:19: note: expanded from macro 'ROR8'
stream.c: #define ROR8(X) (SHFL(X,R8))
stream.c: ^
stream.c: ./Intrinsics_SSE4_128block.h:37:14: note: expanded from macro 'SHFL'
stream.c: #define SHFL _mm_shuffle_epi8
stream.c: ^
stream.c: stream.c:114:21: error: always_inline function '_mm_shuffle_epi8' requires target feature 'ssse3', but would be inlined into function 'Encrypt' that is compiled without support for 'ssse3'
stream.c: ./Speck128128SSE4.h:42:41: note: expanded from macro 'Enc'
stream.c: #define Enc(X,Y,k,n) (Rx##n(X,Y,k[0]), Rx##n(X,Y,k[1]), Rx##n(X,Y,k[2]), Rx##n(X,Y,k[3]), Rx##n(X,Y,k[4]), Rx##n(X,Y,k[5]), Rx##n(X,Y,k[6]), Rx##n(X,Y,k[7]), \
stream.c: ^
stream.c: ...

Number of similar (compiler,implementation) pairs: 1, namely:
Compiler                                                                             Implementations
clang -mcpu=native -O3 -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE   T:sse4
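
The 'requires target feature' errors above indicate that this clang invocation did not enable SSSE3 for the translation unit. One possible way to make such code independent of the command-line flags, sketched here under assumptions and not taken from the benchmarked sources, is a per-function target attribute, which GCC and clang both accept. The function names ror8_pair and ror8_scalar are hypothetical, and the shuffle mask is a reconstruction of what a byte-wise rotate such as ROR8 needs; the R8 constant itself is not shown in the log.

```c
#include <stdint.h>

/* Scalar reference: rotate a 64-bit word right by 8 bits. */
uint64_t ror8_scalar(uint64_t v) { return (v >> 8) | (v << 56); }

#if defined(__x86_64__) || defined(__i386__)
#include <immintrin.h>

/* The target attribute enables SSSE3 for this one function, so the
   always_inline intrinsic _mm_shuffle_epi8 may be inlined into it
   whatever -march or -mcpu was given on the command line. */
__attribute__((target("ssse3")))
void ror8_pair(uint64_t v[2])
{
    /* Byte i of each 64-bit lane takes byte (i + 1) mod 8 of that lane,
       which rotates the lane right by 8 bits. */
    const __m128i r8 = _mm_set_epi8(8, 15, 14, 13, 12, 11, 10, 9,
                                    0, 7, 6, 5, 4, 3, 2, 1);
    __m128i x = _mm_loadu_si128((const __m128i *)v);
    x = _mm_shuffle_epi8(x, r8);
    _mm_storeu_si128((__m128i *)v, x);
}
#else
/* Portable fallback for targets other than x86. */
void ror8_pair(uint64_t v[2])
{
    v[0] = ror8_scalar(v[0]);
    v[1] = ror8_scalar(v[1]);
}
#endif
```

The attribute only affects code generation; the caller is still responsible for running ror8_pair on a CPU that supports SSSE3, for example after a check with __builtin_cpu_supports("ssse3") on GCC or clang.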