Implementation notes: amd64, cel02, crypto_stream/speck128192ctr

Computer: cel02
Architecture: amd64
CPU ID: GenuineIntel-00050657-bfebfbff
SUPERCOP version: 20201130
Operation: crypto_stream
Primitive: speck128192ctr
Time  Object size  Test size      Implementation  Compiler                                                                             Benchmark date  SUPERCOP version
1278  37969 0 0    54613 824 888  T:avx512        gcc_-march=native_-mtune=native_-O3_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE         20201211        20201130
1360  37157 0 0    50572 816 856  T:avx512        gcc_-march=native_-mtune=native_-O2_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE         20201211        20201130
1650  36738 0 0    49772 816 856  T:avx512        gcc_-march=native_-mtune=native_-O_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE          20201211        20201130
1660  36445 0 0    48352 800 824  T:avx512        gcc_-march=native_-mtune=native_-Os_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE         20201211        20201130
2058  33962 0 0    45864 800 824  T:avx2          gcc_-march=native_-mtune=native_-Os_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE         20201211        20201130
2104  36059 0 0    52661 824 888  T:avx2          gcc_-march=native_-mtune=native_-O3_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE         20201211        20201130
3334  41252 0 0    53292 792 800  T:avx2          clang_-march=native_-Os_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE  20201211        20201130
3518  34153 0 0    47156 816 856  T:avx2          gcc_-march=native_-mtune=native_-O_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE          20201211        20201130
3630  35211 0 0    48588 816 856  T:avx2          gcc_-march=native_-mtune=native_-O2_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE         20201211        20201130
4096  28682 0 0    42084 816 856  T:sse4          gcc_-march=native_-mtune=native_-O2_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE         20201211        20201130
6368  42270 0 0    58893 824 888  T:sse4          gcc_-march=native_-mtune=native_-O3_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE         20201211        20201130
6368  29118 0 0    41032 800 824  T:sse4          gcc_-march=native_-mtune=native_-Os_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE         20201211        20201130
7384  29068 0 0    42084 816 856  T:sse4          gcc_-march=native_-mtune=native_-O_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE          20201211        20201130
8136  35463 0 0    47436 792 800  T:sse4          clang_-march=native_-Os_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE  20201211        20201130

Compiler output

Implementation: T:avx2
Security model: timingleaks
Compiler: clang -march=native -O2 -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE
try.c: fatal error: error in backend: Cannot select: 0x55f38e0f6e90: v4i64 = X86ISD::VTRUNC 0x55f38e0f6d60
try.c: 0x55f38e0f6d60: v16i32 = vselect 0x55f38e1093e0, 0x55f38e08ed90, 0x55f38e0f6c30
try.c: 0x55f38e1093e0: v4i1 = X86ISD::PCMPGTM 0x55f38e0ef850, 0x55f38e0eb3e0
try.c: 0x55f38e0ef850: v4i64 = X86ISD::VBROADCAST 0x55f38e0a90b0
try.c: 0x55f38e0a90b0: i64,ch = load<LD8[%lsr.iv6971]> 0x55f38e000940, 0x55f38e0e6240, undef:i64
try.c: 0x55f38e0e6240: i64,ch = CopyFromReg 0x55f38e000940, Register:i64 %vreg50
try.c: 0x55f38e0eb640: i64 = Register %vreg50
try.c: 0x55f38e0aa580: i64 = undef
try.c: 0x55f38e0eb3e0: v4i64,ch = CopyFromReg 0x55f38e000940, Register:v4i64 %vreg13
try.c: 0x55f38e0f00a0: v4i64 = Register %vreg13
try.c: 0x55f38e08ed90: v16i32 = X86ISD::VBROADCAST 0x55f38e0efab0
try.c: 0x55f38e0efab0: i32,ch = load<LD4[ConstantPool]> 0x55f38e000940, 0x55f38e0a8690, undef:i64
try.c: 0x55f38e0a8690: i64 = X86ISD::WrapperRIP TargetConstantPool:i64<i32 1> 0
try.c: 0x55f38e050de0: i64 = TargetConstantPool<i32 1> 0
try.c: 0x55f38e0aa580: i64 = undef
try.c: 0x55f38e0f6c30: v16i32 = BUILD_VECTOR Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>
try.c: 0x55f38e0f6b00: i32 = Constant<0>
try.c: 0x55f38e0f6b00: i32 = Constant<0>
try.c: 0x55f38e0f6b00: i32 = Constant<0>
try.c: 0x55f38e0f6b00: i32 = Constant<0>
try.c: 0x55f38e0f6b00: i32 = Constant<0>
try.c: 0x55f38e0f6b00: i32 = Constant<0>
try.c: 0x55f38e0f6b00: i32 = Constant<0>
try.c: 0x55f38e0f6b00: i32 = Constant<0>
try.c: 0x55f38e0f6b00: i32 = Constant<0>
try.c: ...

Number of similar (compiler,implementation) pairs: 1, namely:
Compiler  Implementations
clang -march=native -O2 -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE  T:avx2

Compiler output

Implementation: T:avx2
Security model: timingleaks
Compiler: clang -march=native -O3 -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE
try.c: fatal error: error in backend: Cannot select: 0x5577db923340: v4i64 = X86ISD::VTRUNC 0x5577db923210
try.c: 0x5577db923210: v16i32 = vselect 0x5577db9127f0, 0x5577db8b7650, 0x5577db9230e0
try.c: 0x5577db9127f0: v4i1 = X86ISD::PCMPGTM 0x5577db8fe050, 0x5577db8f9be0
try.c: 0x5577db8fe050: v4i64 = X86ISD::VBROADCAST 0x5577db8b7b10
try.c: 0x5577db8b7b10: i64,ch = load<LD8[%lsr.iv6971]> 0x5577db7f7a30, 0x5577db8abbd0, undef:i64
try.c: 0x5577db8abbd0: i64,ch = CopyFromReg 0x5577db7f7a30, Register:i64 %vreg50
try.c: 0x5577db8f9e40: i64 = Register %vreg50
try.c: 0x5577db894e90: i64 = undef
try.c: 0x5577db8f9be0: v4i64,ch = CopyFromReg 0x5577db7f7a30, Register:v4i64 %vreg13
try.c: 0x5577db8fe8a0: v4i64 = Register %vreg13
try.c: 0x5577db8b7650: v16i32 = X86ISD::VBROADCAST 0x5577db8fe2b0
try.c: 0x5577db8fe2b0: i32,ch = load<LD4[ConstantPool]> 0x5577db7f7a30, 0x5577db8a71e0, undef:i64
try.c: 0x5577db8a71e0: i64 = X86ISD::WrapperRIP TargetConstantPool:i64<i32 1> 0
try.c: 0x5577db895810: i64 = TargetConstantPool<i32 1> 0
try.c: 0x5577db894e90: i64 = undef
try.c: 0x5577db9230e0: v16i32 = BUILD_VECTOR Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>
try.c: 0x5577db922fb0: i32 = Constant<0>
try.c: 0x5577db922fb0: i32 = Constant<0>
try.c: 0x5577db922fb0: i32 = Constant<0>
try.c: 0x5577db922fb0: i32 = Constant<0>
try.c: 0x5577db922fb0: i32 = Constant<0>
try.c: 0x5577db922fb0: i32 = Constant<0>
try.c: 0x5577db922fb0: i32 = Constant<0>
try.c: 0x5577db922fb0: i32 = Constant<0>
try.c: 0x5577db922fb0: i32 = Constant<0>
try.c: ...

Number of similar (compiler,implementation) pairs: 1, namely:
Compiler  Implementations
clang -march=native -O3 -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE  T:avx2

Compiler output

Implementation: T:avx2
Security model: timingleaks
Compiler: clang -march=native -O -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE
try.c: fatal error: error in backend: Cannot select: 0x55aae0748df0: v4i64 = X86ISD::VTRUNC 0x55aae0748cc0
try.c: 0x55aae0748cc0: v16i32 = vselect 0x55aae0743d50, 0x55aae06d20f0, 0x55aae0748b90
try.c: 0x55aae0743d50: v4i1 = X86ISD::PCMPGTM 0x55aae07299b0, 0x55aae0725540
try.c: 0x55aae07299b0: v4i64 = X86ISD::VBROADCAST 0x55aae06e4f90
try.c: 0x55aae06e4f90: i64,ch = load<LD8[%lsr.iv6971]> 0x55aae063a950, 0x55aae0713930, undef:i64
try.c: 0x55aae0713930: i64,ch = CopyFromReg 0x55aae063a950, Register:i64 %vreg50
try.c: 0x55aae07257a0: i64 = Register %vreg50
try.c: 0x55aae06d0760: i64 = undef
try.c: 0x55aae0725540: v4i64,ch = CopyFromReg 0x55aae063a950, Register:v4i64 %vreg13
try.c: 0x55aae072a200: v4i64 = Register %vreg13
try.c: 0x55aae06d20f0: v16i32 = X86ISD::VBROADCAST 0x55aae0729c10
try.c: 0x55aae0729c10: i32,ch = load<LD4[ConstantPool]> 0x55aae063a950, 0x55aae06e4570, undef:i64
try.c: 0x55aae06e4570: i64 = X86ISD::WrapperRIP TargetConstantPool:i64<i32 1> 0
try.c: 0x55aae0714b30: i64 = TargetConstantPool<i32 1> 0
try.c: 0x55aae06d0760: i64 = undef
try.c: 0x55aae0748b90: v16i32 = BUILD_VECTOR Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>
try.c: 0x55aae0748a60: i32 = Constant<0>
try.c: 0x55aae0748a60: i32 = Constant<0>
try.c: 0x55aae0748a60: i32 = Constant<0>
try.c: 0x55aae0748a60: i32 = Constant<0>
try.c: 0x55aae0748a60: i32 = Constant<0>
try.c: 0x55aae0748a60: i32 = Constant<0>
try.c: 0x55aae0748a60: i32 = Constant<0>
try.c: 0x55aae0748a60: i32 = Constant<0>
try.c: 0x55aae0748a60: i32 = Constant<0>
try.c: ...

Number of similar (compiler,implementation) pairs: 1, namely:
Compiler  Implementations
clang -march=native -O -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE  T:avx2

Compiler output

Implementation: T:avx2
Security model: timingleaks
Compiler: clang -mcpu=native -O3 -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE
stream.c: stream.c:292:3: error: always_inline function '_mm256_set_epi64x' requires target feature 'sse4.2', but would be inlined into function 'ExpandKey' that is compiled without support for 'sse4.2'
stream.c: EK(A,B,C,rk,key);
stream.c: ^
stream.c: ./Speck128192AVX2.h:56:26: note: expanded from macro 'EK'
stream.c: #define EK(A,B,C,k,key) (RK(B,A,k,key,0), RK(C,A,k,key,1), RK(B,A,k,key,2), RK(C,A,k,key,3), RK(B,A,k,key,4), RK(C,A,k,key,5), RK(B,A,k,key,6), \
stream.c: ^
stream.c: ./Speck128192AVX2.h:54:28: note: expanded from macro 'RK'
stream.c: #define RK(X,Y,k,key,i) (SET1(k[i],Y), key[i]=Y, X=RCS(X,8), X+=Y, X^=i, Y=LCS(Y,3), Y^=X)
stream.c: ^
stream.c: ./Intrinsics_AVX2_128block.h:25:22: note: expanded from macro 'SET1'
stream.c: #define SET1(X,c) (X=SET(c,c,c,c))
stream.c: ^
stream.c: ./Intrinsics_AVX2_128block.h:24:13: note: expanded from macro 'SET'
stream.c: #define SET _mm256_set_epi64x
stream.c: ^
stream.c: stream.c:292:3: error: always_inline function '_mm256_set_epi64x' requires target feature 'sse4.2', but would be inlined into function 'ExpandKey' that is compiled without support for 'sse4.2'
stream.c: ./Speck128192AVX2.h:56:44: note: expanded from macro 'EK'
stream.c: #define EK(A,B,C,k,key) (RK(B,A,k,key,0), RK(C,A,k,key,1), RK(B,A,k,key,2), RK(C,A,k,key,3), RK(B,A,k,key,4), RK(C,A,k,key,5), RK(B,A,k,key,6), \
stream.c: ^
stream.c: ./Speck128192AVX2.h:54:28: note: expanded from macro 'RK'
stream.c: #define RK(X,Y,k,key,i) (SET1(k[i],Y), key[i]=Y, X=RCS(X,8), X+=Y, X^=i, Y=LCS(Y,3), Y^=X)
stream.c: ^
stream.c: ./Intrinsics_AVX2_128block.h:25:22: note: expanded from macro 'SET1'
stream.c: #define SET1(X,c) (X=SET(c,c,c,c))
stream.c: ^
stream.c: ...

Number of similar (compiler,implementation) pairs: 1, namely:
Compiler  Implementations
clang -mcpu=native -O3 -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE  T:avx2
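
The RK macro quoted in the diagnostics above spells out one step of the key schedule: rotate X right by 8, add Y, XOR in the round index i, rotate Y left by 3, then XOR in X. The following is a minimal scalar sketch of that step, assuming RCS and LCS are right and left rotations of each 64-bit lane (their definitions do not appear in this output). It uses no intrinsics, so the target-feature error reported for _mm256_set_epi64x does not apply to it. The function names are hypothetical, and the alternation between B and C is read off the visible part of EK only; how the key bytes are loaded into A, B and C is not shown in the log and is left to the caller.

```c
#include <stddef.h>
#include <stdint.h>

/* Rotations of a 64-bit word; r must be in 1..63. */
static uint64_t rotr64(uint64_t v, unsigned r) { return (v >> r) | (v << (64 - r)); }
static uint64_t rotl64(uint64_t v, unsigned r) { return (v << r) | (v >> (64 - r)); }

/* One key-schedule step, following the visible body of RK(X,Y,k,key,i):
 * X=RCS(X,8), X+=Y, X^=i, Y=LCS(Y,3), Y^=X.
 * Assumption: RCS/LCS are 64-bit right/left rotations. */
static void speck_key_step(uint64_t *x, uint64_t *y, uint64_t i)
{
    *x = rotr64(*x, 8);
    *x += *y;
    *x ^= i;
    *y = rotl64(*y, 3);
    *y ^= *x;
}

/* Hypothetical scalar counterpart of EK(A,B,C,...): a is the running
 * round key, while b and c take turns as the X operand (b on even
 * steps, c on odd steps), as in the visible prefix of the macro.
 * The caller chooses `rounds`; the macro in the log is truncated,
 * so no round count is taken from it. */
static void speck_expand_key(uint64_t a, uint64_t b, uint64_t c,
                             uint64_t *rk, size_t rounds)
{
    for (size_t i = 0; i < rounds; i++) {
        rk[i] = a;                      /* store the current round key */
        if (i + 1 < rounds)             /* no update needed after the last key */
            speck_key_step((i & 1) ? &c : &b, &a, (uint64_t)i);
    }
}
```

A caller would pass the three 64-bit key words and a buffer with one slot per round, for example `uint64_t rk[33]; speck_expand_key(a, b, c, rk, 33);` for the 33 rounds of Speck128/192.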

Compiler output

Implementation: T:avx512
Security model: timingleaks
Compiler: clang -march=native -O2 -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE
stream.c: stream.c:137:5: warning: implicit declaration of function '_mm512_set_epi64' is invalid in C99 [-Wimplicit-function-declaration]
stream.c: SET1(X[0],nonce[1]); SET8(Y[0],nonce[0]);
stream.c: ^
stream.c: ./Intrinsics_AVX512_128block.h:25:22: note: expanded from macro 'SET1'
stream.c: #define SET1(X,c) (X=SET(c,c,c,c,c,c,c,c))
stream.c: ^
stream.c: ./Intrinsics_AVX512_128block.h:16:13: note: expanded from macro 'SET'
stream.c: #define SET _mm512_set_epi64
stream.c: ^
stream.c: stream.c:137:5: error: assigning to '__m512i' (vector of 8 'long long' values) from incompatible type 'int'
stream.c: SET1(X[0],nonce[1]); SET8(Y[0],nonce[0]);
stream.c: ^~~~~~~~~~~~~~~~~~~
stream.c: ./Intrinsics_AVX512_128block.h:25:21: note: expanded from macro 'SET1'
stream.c: #define SET1(X,c) (X=SET(c,c,c,c,c,c,c,c))
stream.c: ^~~~~~~~~~~~~~~~~~~~~
stream.c: stream.c:137:26: error: assigning to '__m512i' (vector of 8 'long long' values) from incompatible type 'int'
stream.c: SET1(X[0],nonce[1]); SET8(Y[0],nonce[0]);
stream.c: ^~~~~~~~~~~~~~~~~~~
stream.c: ./Intrinsics_AVX512_128block.h:26:21: note: expanded from macro 'SET8'
stream.c: #define SET8(X,c) (X=SET(c,c,c,c,c,c,c,c), X=ADD(X,_q8))
stream.c: ^~~~~~~~~~~~~~~~~~~~~
stream.c: stream.c:137:26: error: passing 'int' to parameter of incompatible type '__m512i' (vector of 8 'long long' values)
stream.c: SET1(X[0],nonce[1]); SET8(Y[0],nonce[0]);
stream.c: ^~~~~~~~~~~~~~~~~~~
stream.c: ./Intrinsics_AVX512_128block.h:26:52: note: expanded from macro 'SET8'
stream.c: ...

Number of similar (compiler,implementation) pairs: 5, namely:
Compiler  Implementations
clang -march=native -O2 -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE  T:avx512
clang -march=native -O3 -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE  T:avx512
clang -march=native -O -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE  T:avx512
clang -march=native -Os -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE  T:avx512
clang -mcpu=native -O3 -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE  T:avx512

Compiler output

Implementation: T:sse4
Security model: timingleaks
Compiler: clang -march=native -O2 -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE
try.c: fatal error: error in backend: Cannot select: 0x55883b1cd390: v4i64 = X86ISD::VTRUNC 0x55883b1cd260
try.c: 0x55883b1cd260: v16i32 = vselect 0x55883b1b9590, 0x55883b15a350, 0x55883b1cd130
try.c: 0x55883b1b9590: v4i1 = X86ISD::PCMPGTM 0x55883b1b2a10, 0x55883b1ae5a0
try.c: 0x55883b1b2a10: v4i64 = X86ISD::VBROADCAST 0x55883b15cbd0
try.c: 0x55883b15cbd0: i64,ch = load<LD8[%lsr.iv6971]> 0x55883b0c3950, 0x55883b197e40, undef:i64
try.c: 0x55883b197e40: i64,ch = CopyFromReg 0x55883b0c3950, Register:i64 %vreg50
try.c: 0x55883b1ae800: i64 = Register %vreg50
try.c: 0x55883b1589c0: i64 = undef
try.c: 0x55883b1ae5a0: v4i64,ch = CopyFromReg 0x55883b0c3950, Register:v4i64 %vreg13
try.c: 0x55883b1b3260: v4i64 = Register %vreg13
try.c: 0x55883b15a350: v16i32 = X86ISD::VBROADCAST 0x55883b1b2c70
try.c: 0x55883b1b2c70: i32,ch = load<LD4[ConstantPool]> 0x55883b0c3950, 0x55883b15c1b0, undef:i64
try.c: 0x55883b15c1b0: i64 = X86ISD::WrapperRIP TargetConstantPool:i64<i32 1> 0
try.c: 0x55883b19c990: i64 = TargetConstantPool<i32 1> 0
try.c: 0x55883b1589c0: i64 = undef
try.c: 0x55883b1cd130: v16i32 = BUILD_VECTOR Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>
try.c: 0x55883b1cd000: i32 = Constant<0>
try.c: 0x55883b1cd000: i32 = Constant<0>
try.c: 0x55883b1cd000: i32 = Constant<0>
try.c: 0x55883b1cd000: i32 = Constant<0>
try.c: 0x55883b1cd000: i32 = Constant<0>
try.c: 0x55883b1cd000: i32 = Constant<0>
try.c: 0x55883b1cd000: i32 = Constant<0>
try.c: 0x55883b1cd000: i32 = Constant<0>
try.c: 0x55883b1cd000: i32 = Constant<0>
try.c: ...

Number of similar (compiler,implementation) pairs: 1, namely:
Compiler  Implementations
clang -march=native -O2 -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE  T:sse4

Compiler output

Implementation: T:sse4
Security model: timingleaks
Compiler: clang -march=native -O3 -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE
try.c: fatal error: error in backend: Cannot select: 0x55b4b7d58ec0: v4i64 = X86ISD::VTRUNC 0x55b4b7d58d90
try.c: 0x55b4b7d58d90: v16i32 = vselect 0x55b4b7d53890, 0x55b4b7cef940, 0x55b4b7d58c60
try.c: 0x55b4b7d53890: v4i1 = X86ISD::PCMPGTM 0x55b4b7d4e850, 0x55b4b7d4a3e0
try.c: 0x55b4b7d4e850: v4i64 = X86ISD::VBROADCAST 0x55b4b7cefe00
try.c: 0x55b4b7cefe00: i64,ch = load<LD8[%lsr.iv6971]> 0x55b4b7c48a30, 0x55b4b7d0ccb0, undef:i64
try.c: 0x55b4b7d0ccb0: i64,ch = CopyFromReg 0x55b4b7c48a30, Register:i64 %vreg50
try.c: 0x55b4b7d4a640: i64 = Register %vreg50
try.c: 0x55b4b7ce38e0: i64 = undef
try.c: 0x55b4b7d4a3e0: v4i64,ch = CopyFromReg 0x55b4b7c48a30, Register:v4i64 %vreg13
try.c: 0x55b4b7d4f0a0: v4i64 = Register %vreg13
try.c: 0x55b4b7cef940: v16i32 = X86ISD::VBROADCAST 0x55b4b7d4eab0
try.c: 0x55b4b7d4eab0: i32,ch = load<LD4[ConstantPool]> 0x55b4b7c48a30, 0x55b4b7cf22e0, undef:i64
try.c: 0x55b4b7cf22e0: i64 = X86ISD::WrapperRIP TargetConstantPool:i64<i32 1> 0
try.c: 0x55b4b7ce4260: i64 = TargetConstantPool<i32 1> 0
try.c: 0x55b4b7ce38e0: i64 = undef
try.c: 0x55b4b7d58c60: v16i32 = BUILD_VECTOR Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>
try.c: 0x55b4b7d58b30: i32 = Constant<0>
try.c: 0x55b4b7d58b30: i32 = Constant<0>
try.c: 0x55b4b7d58b30: i32 = Constant<0>
try.c: 0x55b4b7d58b30: i32 = Constant<0>
try.c: 0x55b4b7d58b30: i32 = Constant<0>
try.c: 0x55b4b7d58b30: i32 = Constant<0>
try.c: 0x55b4b7d58b30: i32 = Constant<0>
try.c: 0x55b4b7d58b30: i32 = Constant<0>
try.c: 0x55b4b7d58b30: i32 = Constant<0>
try.c: ...

Number of similar (compiler,implementation) pairs: 1, namely:
Compiler  Implementations
clang -march=native -O3 -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE  T:sse4

Compiler output

Implementation: T:sse4
Security model: timingleaks
Compiler: clang -march=native -O -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE
try.c: fatal error: error in backend: Cannot select: 0x55bb01da6340: v4i64 = X86ISD::VTRUNC 0x55bb01da6210
try.c: 0x55bb01da6210: v16i32 = vselect 0x55bb01da2d20, 0x55bb01d354e0, 0x55bb01da60e0
try.c: 0x55bb01da2d20: v4i1 = X86ISD::PCMPGTM 0x55bb01d8b9b0, 0x55bb01d87540
try.c: 0x55bb01d8b9b0: v4i64 = X86ISD::VBROADCAST 0x55bb01d32680
try.c: 0x55bb01d32680: i64,ch = load<LD8[%lsr.iv6971]> 0x55bb01c9c960, 0x55bb01d75750, undef:i64
try.c: 0x55bb01d75750: i64,ch = CopyFromReg 0x55bb01c9c960, Register:i64 %vreg50
try.c: 0x55bb01d877a0: i64 = Register %vreg50
try.c: 0x55bb01d33b50: i64 = undef
try.c: 0x55bb01d87540: v4i64,ch = CopyFromReg 0x55bb01c9c960, Register:v4i64 %vreg13
try.c: 0x55bb01d8c200: v4i64 = Register %vreg13
try.c: 0x55bb01d354e0: v16i32 = X86ISD::VBROADCAST 0x55bb01d8bc10
try.c: 0x55bb01d8bc10: i32,ch = load<LD4[ConstantPool]> 0x55bb01c9c960, 0x55bb01d525c0, undef:i64
try.c: 0x55bb01d525c0: i64 = X86ISD::WrapperRIP TargetConstantPool:i64<i32 1> 0
try.c: 0x55bb01d722b0: i64 = TargetConstantPool<i32 1> 0
try.c: 0x55bb01d33b50: i64 = undef
try.c: 0x55bb01da60e0: v16i32 = BUILD_VECTOR Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>
try.c: 0x55bb01da5fb0: i32 = Constant<0>
try.c: 0x55bb01da5fb0: i32 = Constant<0>
try.c: 0x55bb01da5fb0: i32 = Constant<0>
try.c: 0x55bb01da5fb0: i32 = Constant<0>
try.c: 0x55bb01da5fb0: i32 = Constant<0>
try.c: 0x55bb01da5fb0: i32 = Constant<0>
try.c: 0x55bb01da5fb0: i32 = Constant<0>
try.c: 0x55bb01da5fb0: i32 = Constant<0>
try.c: 0x55bb01da5fb0: i32 = Constant<0>
try.c: ...

Number of similar (compiler,implementation) pairs: 1, namely:
Compiler  Implementations
clang -march=native -O -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE  T:sse4

Compiler output

Implementation: T:sse4
Security model: timingleaks
Compiler: clang -mcpu=native -O3 -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE
stream.c: stream.c:116:21: error: always_inline function '_mm_shuffle_epi8' requires target feature 'ssse3', but would be inlined into function 'Encrypt' that is compiled without support for 'ssse3'
stream.c: if (numbytes==32) Enc(X,Y,rk,2);
stream.c: ^
stream.c: ./Speck128192SSE4.h:42:23: note: expanded from macro 'Enc'
stream.c: #define Enc(X,Y,k,n) (Rx##n(X,Y,k[0]), Rx##n(X,Y,k[1]), Rx##n(X,Y,k[2]), Rx##n(X,Y,k[3]), Rx##n(X,Y,k[4]), Rx##n(X,Y,k[5]), Rx##n(X,Y,k[6]), Rx##n(X,Y,k[7]), \
stream.c: ^
stream.c: <scratch space>:73:1: note: expanded from here
stream.c: Rx2
stream.c: ^
stream.c: ./Speck128192SSE4.h:25:21: note: expanded from macro 'Rx2'
stream.c: #define Rx2(X,Y,k) (R(X[0],Y[0],k))
stream.c: ^
stream.c: ./Speck128192SSE4.h:23:29: note: expanded from macro 'R'
stream.c: #define R(X,Y,k) (X=XOR(ADD(ROR8(X),Y),k), Y=XOR(ROL(Y,3),X))
stream.c: ^
stream.c: ./Intrinsics_SSE4_128block.h:41:19: note: expanded from macro 'ROR8'
stream.c: #define ROR8(X) (SHFL(X,R8))
stream.c: ^
stream.c: ./Intrinsics_SSE4_128block.h:37:14: note: expanded from macro 'SHFL'
stream.c: #define SHFL _mm_shuffle_epi8
stream.c: ^
stream.c: stream.c:116:21: error: always_inline function '_mm_shuffle_epi8' requires target feature 'ssse3', but would be inlined into function 'Encrypt' that is compiled without support for 'ssse3'
stream.c: ./Speck128192SSE4.h:42:41: note: expanded from macro 'Enc'
stream.c: #define Enc(X,Y,k,n) (Rx##n(X,Y,k[0]), Rx##n(X,Y,k[1]), Rx##n(X,Y,k[2]), Rx##n(X,Y,k[3]), Rx##n(X,Y,k[4]), Rx##n(X,Y,k[5]), Rx##n(X,Y,k[6]), Rx##n(X,Y,k[7]), \
stream.c: ^
stream.c: ...

Number of similar (compiler,implementation) pairs: 1, namely:
Compiler  Implementations
clang -mcpu=native -O3 -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE  T:sse4
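
The R macro in the last diagnostic gives the encryption round as X=XOR(ADD(ROR8(X),Y),k), Y=XOR(ROL(Y,3),X). The SSE4 code implements ROR8 with the byte shuffle _mm_shuffle_epi8, which is what triggers the 'ssse3' target-feature error in this build. As a rough illustration, and not the code in stream.c, the same round can be written on plain 64-bit words. This assumes each vector lane holds one 64-bit word and that ADD is lane-wise 64-bit addition; the function names are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/* Rotations of a 64-bit word; r must be in 1..63. */
static uint64_t rotr64(uint64_t v, unsigned r) { return (v >> r) | (v << (64 - r)); }
static uint64_t rotl64(uint64_t v, unsigned r) { return (v << r) | (v >> (64 - r)); }

/* One round, following R(X,Y,k): X=XOR(ADD(ROR8(X),Y),k), Y=XOR(ROL(Y,3),X).
 * Assumption: ROR8 rotates a 64-bit word right by 8 bits. */
static void speck_round(uint64_t *x, uint64_t *y, uint64_t k)
{
    *x = (rotr64(*x, 8) + *y) ^ k;
    *y = rotl64(*y, 3) ^ *x;
}

/* Apply `rounds` rounds to one (x, y) block using round keys rk[0..rounds-1]. */
static void speck_encrypt_block(uint64_t *x, uint64_t *y,
                                const uint64_t *rk, size_t rounds)
{
    for (size_t i = 0; i < rounds; i++)
        speck_round(x, y, rk[i]);
}
```

Because the rotation by 8 is a whole-byte rotation, a vector implementation can express it as a byte permutation, which is presumably why the SSE4 header maps ROR8 to a shuffle; the scalar form above trades that for portability.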