Implementation notes: amd64, cel02, crypto_aead/hs1sivlov2

Computer: cel02
Architecture: amd64
CPU ID: GenuineIntel-00050657-bfebfbff
SUPERCOP version: 20201130
Operation: crypto_aead
Primitive: hs1sivlov2
Time   Object size  Test size      Implementation          Compiler  Benchmark date  SUPERCOP version
4876   24983 0 0    45972 832 896  T:dolbeau/amd64-avx512  gcc_-march=native_-mtune=native_-O2_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE  20201211  20201130
5090   8170 0 0     28820 832 896  T:faster                gcc_-march=native_-mtune=native_-O_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE  20201211  20201130
6326   21583 0 0    42092 832 896  T:dolbeau/amd64-avx2    gcc_-march=native_-mtune=native_-O_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE  20201211  20201130
6332   24313 0 0    48533 840 960  T:dolbeau/amd64-avx2    gcc_-march=native_-mtune=native_-O3_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE  20201211  20201130
6506   26830 0 0    51045 840 960  T:dolbeau/amd64-avx512  gcc_-march=native_-mtune=native_-O3_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE  20201211  20201130
6594   11551 0 0    35925 840 960  T:faster                gcc_-march=native_-mtune=native_-O3_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE  20201211  20201130
6842   11969 0 0    31096 816 896  T:dolbeau/amd64-avx2    gcc_-march=native_-mtune=native_-Os_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE  20201211  20201130
6936   22413 0 0    43404 832 896  T:dolbeau/amd64-avx2    gcc_-march=native_-mtune=native_-O2_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE  20201211  20201130
7596   8441 0 0     29572 832 896  T:faster                gcc_-march=native_-mtune=native_-O2_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE  20201211  20201130
7626   24050 0 0    44556 832 896  T:dolbeau/amd64-avx512  gcc_-march=native_-mtune=native_-O_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE  20201211  20201130
7832   14041 0 0    33192 816 896  T:dolbeau/amd64-avx512  gcc_-march=native_-mtune=native_-Os_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE  20201211  20201130
8008   7916 0 0     27064 816 896  T:faster                gcc_-march=native_-mtune=native_-Os_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE  20201211  20201130
8328   7627 0 0     26332 808 856  T:faster                clang_-march=native_-Os_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE  20201211  20201130
10114  9655 0 0     28164 808 856  T:dolbeau/amd64-sse     clang_-march=native_-Os_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE  20201211  20201130
10462  21165 0 0    45333 840 960  T:dolbeau/amd64-sse     gcc_-march=native_-mtune=native_-O3_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE  20201211  20201130
10502  19061 0 0    39988 832 896  T:dolbeau/amd64-sse     gcc_-march=native_-mtune=native_-O2_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE  20201211  20201130
11950  8004 0 0     27056 816 896  T:dolbeau/amd64-sse     gcc_-march=native_-mtune=native_-Os_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE  20201211  20201130
12680  17815 0 0    38260 832 896  T:dolbeau/amd64-sse     gcc_-march=native_-mtune=native_-O_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE  20201211  20201130
20128  6467 0 0     26818 816 856  T:ref                   clang_-mcpu=native_-O3_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE  20201211  20201130
34102  4302 0 0     22964 808 856  T:ref                   clang_-march=native_-Os_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE  20201211  20201130
39956  5931 0 0     30293 840 960  T:ref                   gcc_-march=native_-mtune=native_-O3_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE  20201211  20201130
50080  5147 0 0     26220 832 896  T:ref                   gcc_-march=native_-mtune=native_-O2_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE  20201211  20201130
51386  3992 0 0     23080 816 896  T:ref                   gcc_-march=native_-mtune=native_-Os_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE  20201211  20201130
59422  5443 0 0     26117 840 896  T:ref                   gcc_-march=native_-mtune=native_-O_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE  20201211  20201130

Test failure

Implementation: T:faster
Security model: timingleaks
Compiler: clang -mcpu=native -O3 -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE
error 111

Number of similar (compiler,implementation) pairs: 1, namely:
Compiler  Implementations
clang -mcpu=native -O3 -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE T:faster

Compiler output

Implementation: T:dolbeau/amd64-avx2
Security model: timingleaks
Compiler: clang -march=native -O2 -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE
encrypt.c: In file included from encrypt.c:234:
encrypt.c: ./u16.h:179:28: warning: implicit declaration of function '_mm512_set_epi32' is invalid in C99 [-Wimplicit-function-declaration]
encrypt.c: const __m512i addv12 = _mm512_set_epi32(15,14,13,12,11,10,9,8,7,6,5,4,3,2,1,0);
encrypt.c: ^
encrypt.c: ./u16.h:179:19: error: initializing 'const __m512i' (vector of 8 'long long' values) with an expression of incompatible type 'int'
encrypt.c: const __m512i addv12 = _mm512_set_epi32(15,14,13,12,11,10,9,8,7,6,5,4,3,2,1,0);
encrypt.c: ^ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
encrypt.c: ./u16.h:181:12: warning: implicit declaration of function '_mm512_broadcastd_epi32' is invalid in C99 [-Wimplicit-function-declaration]
encrypt.c: t_12 = _mm512_broadcastd_epi32(_mm_cvtsi32_si128(in12));
encrypt.c: ^
encrypt.c: ./u16.h:181:10: error: assigning to '__m512i' (vector of 8 'long long' values) from incompatible type 'int'
encrypt.c: t_12 = _mm512_broadcastd_epi32(_mm_cvtsi32_si128(in12));
encrypt.c: ^ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
encrypt.c: ./u16.h:187:7: warning: implicit declaration of function '_mm512_rol_epi32' is invalid in C99 [-Wimplicit-function-declaration]
encrypt.c: VEC16_ROUND( 0, 4, 8,12, 1, 5, 9,13, 2, 6,10,14, 3, 7,11,15);
encrypt.c: ^
encrypt.c: ./u16.h:105:70: note: expanded from macro 'VEC16_ROUND'
encrypt.c: #define VEC16_ROUND(a1,b1,c1,d1,a2,b2,c2,d2,a3,b3,c3,d3,a4,b4,c4,d4) VEC16_ROUND_SEQ(a1,b1,c1,d1,a2,b2,c2,d2,a3,b3,c3,d3,a4,b4,c4,d4)
encrypt.c: ^
encrypt.c: ./u16.h:81:3: note: expanded from macro 'VEC16_ROUND_SEQ'
encrypt.c: VEC16_LINE1(a1,b1,c1,d1); \
encrypt.c: ^
encrypt.c: ./u16.h:11:51: note: expanded from macro 'VEC16_LINE1'
encrypt.c: x_##a = _mm512_add_epi32(x_##a, x_##b); x_##d = VEC16_ROT(_mm512_xor_si512(x_##d, x_##a), 16)
encrypt.c: ^
encrypt.c: ...

Number of similar (compiler,implementation) pairs: 8, namely:
Compiler  Implementations
clang -march=native -O2 -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE T:dolbeau/amd64-avx2
clang -march=native -O3 -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE T:dolbeau/amd64-avx2
clang -march=native -O -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE T:dolbeau/amd64-avx2
clang -march=native -Os -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE T:dolbeau/amd64-avx2
clang -march=native -O2 -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE T:dolbeau/amd64-avx512
clang -march=native -O3 -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE T:dolbeau/amd64-avx512
clang -march=native -O -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE T:dolbeau/amd64-avx512
clang -march=native -Os -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE T:dolbeau/amd64-avx512

Compiler output

Implementation: T:dolbeau/amd64-avx2
Security model: timingleaks
Compiler: clang -mcpu=native -O3 -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE
encrypt.c: encrypt.c:90:2: error: "This code requires AVX2 to work"
encrypt.c: #error "This code requires AVX2 to work"
encrypt.c: ^
encrypt.c: 1 error generated.

Number of similar (compiler,implementation) pairs: 1, namely:
Compiler  Implementations
clang -mcpu=native -O3 -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE T:dolbeau/amd64-avx2

Compiler output

Implementation: T:dolbeau/amd64-avx512
Security model: timingleaks
Compiler: clang -mcpu=native -O3 -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE
encrypt.c: encrypt.c:90:2: error: "This code requires AVX512F to work"
encrypt.c: #error "This code requires AVX512F to work"
encrypt.c: ^
encrypt.c: encrypt.c:330:15: error: invalid input constraint 'Yz' in asm
encrypt.c: : [a] "Yz" (a)
encrypt.c: ^
encrypt.c: encrypt.c:461:26: warning: implicit declaration of function '_mm512_loadu_si512' is invalid in C99 [-Wimplicit-function-declaration]
encrypt.c: __m512i kv0 = _mm512_loadu_si512((const __m512i*)(nhkey+ 0)); // 1
encrypt.c: ^
encrypt.c: encrypt.c:461:19: error: initializing '__m512i' (vector of 8 'long long' values) with an expression of incompatible type 'int'
encrypt.c: __m512i kv0 = _mm512_loadu_si512((const __m512i*)(nhkey+ 0)); // 1
encrypt.c: ^ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
encrypt.c: encrypt.c:462:19: error: initializing '__m512i' (vector of 8 'long long' values) with an expression of incompatible type 'int'
encrypt.c: __m512i kv4 = _mm512_loadu_si512((const __m512i*)(nhkey+ 4)); // 1
encrypt.c: ^ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
encrypt.c: encrypt.c:463:19: error: initializing '__m512i' (vector of 8 'long long' values) with an expression of incompatible type 'int'
encrypt.c: __m512i kv8 = _mm512_loadu_si512((const __m512i*)(nhkey+ 8)); // 1
encrypt.c: ^ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
encrypt.c: encrypt.c:464:19: error: initializing '__m512i' (vector of 8 'long long' values) with an expression of incompatible type 'int'
encrypt.c: __m512i kv12 = _mm512_loadu_si512((const __m512i*)(nhkey+12)); // 1
encrypt.c: ^ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
encrypt.c: encrypt.c:466:19: error: initializing '__m512i' (vector of 8 'long long' values) with an expression of incompatible type 'int'
encrypt.c: __m512i inv = _mm512_loadu_si512((const __m512i*)(in+ 0));
encrypt.c: ^ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
encrypt.c: encrypt.c:468:11: warning: implicit declaration of function '_mm512_unpacklo_epi32' is invalid in C99 [-Wimplicit-function-declaration]
encrypt.c: ...

Number of similar (compiler,implementation) pairs: 1, namely:
Compiler  Implementations
clang -mcpu=native -O3 -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE T:dolbeau/amd64-avx512

Compiler output

Implementation: T:dolbeau/amd64-sse
Security model: timingleaks
Compiler: clang -march=native -O2 -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE
try.c: fatal error: error in backend: Cannot select: 0x55ba4e25d8e0: v4i64 = X86ISD::VTRUNC 0x55ba4e25d7b0
try.c: 0x55ba4e25d7b0: v16i32 = vselect 0x55ba4e247b30, 0x55ba4e1f5770, 0x55ba4e25d680
try.c: 0x55ba4e247b30: v4i1 = X86ISD::PCMPGTM 0x55ba4e23ce90, 0x55ba4e238a20
try.c: 0x55ba4e23ce90: v4i64 = X86ISD::VBROADCAST 0x55ba4e1dbdc0
try.c: 0x55ba4e1dbdc0: i64,ch = load<LD8[%lsr.iv6971]> 0x55ba4e14d960, 0x55ba4e223640, undef:i64
try.c: 0x55ba4e223640: i64,ch = CopyFromReg 0x55ba4e14d960, Register:i64 %vreg50
try.c: 0x55ba4e238c80: i64 = Register %vreg50
try.c: 0x55ba4e1dd290: i64 = undef
try.c: 0x55ba4e238a20: v4i64,ch = CopyFromReg 0x55ba4e14d960, Register:v4i64 %vreg13
try.c: 0x55ba4e23d6e0: v4i64 = Register %vreg13
try.c: 0x55ba4e1f5770: v16i32 = X86ISD::VBROADCAST 0x55ba4e23d0f0
try.c: 0x55ba4e23d0f0: i32,ch = load<LD4[ConstantPool]> 0x55ba4e14d960, 0x55ba4e1db3a0, undef:i64
try.c: 0x55ba4e1db3a0: i64 = X86ISD::WrapperRIP TargetConstantPool:i64<i32 1> 0
try.c: 0x55ba4e2349d0: i64 = TargetConstantPool<i32 1> 0
try.c: 0x55ba4e1dd290: i64 = undef
try.c: 0x55ba4e25d680: v16i32 = BUILD_VECTOR Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>
try.c: 0x55ba4e25d550: i32 = Constant<0>
try.c: 0x55ba4e25d550: i32 = Constant<0>
try.c: 0x55ba4e25d550: i32 = Constant<0>
try.c: 0x55ba4e25d550: i32 = Constant<0>
try.c: 0x55ba4e25d550: i32 = Constant<0>
try.c: 0x55ba4e25d550: i32 = Constant<0>
try.c: 0x55ba4e25d550: i32 = Constant<0>
try.c: 0x55ba4e25d550: i32 = Constant<0>
try.c: 0x55ba4e25d550: i32 = Constant<0>
try.c: ...

Number of similar (compiler,implementation) pairs: 1, namely:
Compiler  Implementations
clang -march=native -O2 -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE T:dolbeau/amd64-sse

Compiler output

Implementation: T:dolbeau/amd64-sse
Security model: timingleaks
Compiler: clang -march=native -O3 -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE
try.c: fatal error: error in backend: Cannot select: 0x55d53d2d6870: v4i64 = X86ISD::VTRUNC 0x55d53d2d6740
try.c: 0x55d53d2d6740: v16i32 = vselect 0x55d53d2be2f0, 0x55d53d24db10, 0x55d53d2d6610
try.c: 0x55d53d2be2f0: v4i1 = X86ISD::PCMPGTM 0x55d53d2b1e70, 0x55d53d2ae400
try.c: 0x55d53d2b1e70: v4i64 = X86ISD::VBROADCAST 0x55d53d24dfd0
try.c: 0x55d53d24dfd0: i64,ch = load<LD8[%lsr.iv6971]> 0x55d53d1aba30, 0x55d53d246820, undef:i64
try.c: 0x55d53d246820: i64,ch = CopyFromReg 0x55d53d1aba30, Register:i64 %vreg50
try.c: 0x55d53d2ae660: i64 = Register %vreg50
try.c: 0x55d53d262290: i64 = undef
try.c: 0x55d53d2ae400: v4i64,ch = CopyFromReg 0x55d53d1aba30, Register:v4i64 %vreg13
try.c: 0x55d53d2b26c0: v4i64 = Register %vreg13
try.c: 0x55d53d24db10: v16i32 = X86ISD::VBROADCAST 0x55d53d2b20d0
try.c: 0x55d53d2b20d0: i32,ch = load<LD4[ConstantPool]> 0x55d53d1aba30, 0x55d53d244df0, undef:i64
try.c: 0x55d53d244df0: i64 = X86ISD::WrapperRIP TargetConstantPool:i64<i32 1> 0
try.c: 0x55d53d262c10: i64 = TargetConstantPool<i32 1> 0
try.c: 0x55d53d262290: i64 = undef
try.c: 0x55d53d2d6610: v16i32 = BUILD_VECTOR Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>
try.c: 0x55d53d2d64e0: i32 = Constant<0>
try.c: 0x55d53d2d64e0: i32 = Constant<0>
try.c: 0x55d53d2d64e0: i32 = Constant<0>
try.c: 0x55d53d2d64e0: i32 = Constant<0>
try.c: 0x55d53d2d64e0: i32 = Constant<0>
try.c: 0x55d53d2d64e0: i32 = Constant<0>
try.c: 0x55d53d2d64e0: i32 = Constant<0>
try.c: 0x55d53d2d64e0: i32 = Constant<0>
try.c: 0x55d53d2d64e0: i32 = Constant<0>
try.c: ...

Number of similar (compiler,implementation) pairs: 1, namely:
Compiler  Implementations
clang -march=native -O3 -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE T:dolbeau/amd64-sse

Compiler output

Implementation: T:dolbeau/amd64-sse
Security model: timingleaks
Compiler: clang -march=native -O -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE
try.c: fatal error: error in backend: Cannot select: 0x55aaa4ae5930: v4i64 = X86ISD::VTRUNC 0x55aaa4ae5800
try.c: 0x55aaa4ae5800: v16i32 = vselect 0x55aaa4a960e0, 0x55aaa4a71510, 0x55aaa4ae56d0
try.c: 0x55aaa4a960e0: v4i1 = X86ISD::PCMPGTM 0x55aaa4acbfc0, 0x55aaa4ac7b50
try.c: 0x55aaa4acbfc0: v4i64 = X86ISD::VBROADCAST 0x55aaa4a6e6b0
try.c: 0x55aaa4a6e6b0: i64,ch = load<LD8[%lsr.iv6971]> 0x55aaa49dc9a0, 0x55aaa4abf320, undef:i64
try.c: 0x55aaa4abf320: i64,ch = CopyFromReg 0x55aaa49dc9a0, Register:i64 %vreg50
try.c: 0x55aaa4ac7db0: i64 = Register %vreg50
try.c: 0x55aaa4a6fb80: i64 = undef
try.c: 0x55aaa4ac7b50: v4i64,ch = CopyFromReg 0x55aaa49dc9a0, Register:v4i64 %vreg13
try.c: 0x55aaa4acc810: v4i64 = Register %vreg13
try.c: 0x55aaa4a71510: v16i32 = X86ISD::VBROADCAST 0x55aaa4acc220
try.c: 0x55aaa4acc220: i32,ch = load<LD4[ConstantPool]> 0x55aaa49dc9a0, 0x55aaa4a92dc0, undef:i64
try.c: 0x55aaa4a92dc0: i64 = X86ISD::WrapperRIP TargetConstantPool:i64<i32 1> 0
try.c: 0x55aaa4ab6fb0: i64 = TargetConstantPool<i32 1> 0
try.c: 0x55aaa4a6fb80: i64 = undef
try.c: 0x55aaa4ae56d0: v16i32 = BUILD_VECTOR Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>
try.c: 0x55aaa4ae55a0: i32 = Constant<0>
try.c: 0x55aaa4ae55a0: i32 = Constant<0>
try.c: 0x55aaa4ae55a0: i32 = Constant<0>
try.c: 0x55aaa4ae55a0: i32 = Constant<0>
try.c: 0x55aaa4ae55a0: i32 = Constant<0>
try.c: 0x55aaa4ae55a0: i32 = Constant<0>
try.c: 0x55aaa4ae55a0: i32 = Constant<0>
try.c: 0x55aaa4ae55a0: i32 = Constant<0>
try.c: 0x55aaa4ae55a0: i32 = Constant<0>
try.c: ...

Number of similar (compiler,implementation) pairs: 1, namely:
Compiler  Implementations
clang -march=native -O -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE T:dolbeau/amd64-sse

Compiler output

Implementation: T:dolbeau/amd64-sse
Security model: timingleaks
Compiler: clang -mcpu=native -O3 -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE
encrypt.c: In file included from encrypt.c:190:
encrypt.c: ./c128.h:99:7: error: always_inline function '_mm_shuffle_epi8' requires target feature 'ssse3', but would be inlined into function 'chacha_noxor128' that is compiled without support for 'ssse3'
encrypt.c: VEC4_QUARTERROUND( 0, 4, 8,12);
encrypt.c: ^
encrypt.c: ./c128.h:17:36: note: expanded from macro 'VEC4_QUARTERROUND'
encrypt.c: #define VEC4_QUARTERROUND(a,b,c,d) VEC4_QUARTERROUND_SHUFFLE(a,b,c,d)
encrypt.c: ^
encrypt.c: ./c128.h:12:86: note: expanded from macro 'VEC4_QUARTERROUND_SHUFFLE'
encrypt.c: x_##a = _mm_add_epi32(x_##a, x_##b); t_##a = _mm_xor_si128(x_##d, x_##a); x_##d = _mm_shuffle_epi8(t_##a, rot16); \
encrypt.c: ^
encrypt.c: ./c128.h:99:7: error: always_inline function '_mm_shuffle_epi8' requires target feature 'ssse3', but would be inlined into function 'chacha_noxor128' that is compiled without support for 'ssse3'
encrypt.c: ./c128.h:17:36: note: expanded from macro 'VEC4_QUARTERROUND'
encrypt.c: #define VEC4_QUARTERROUND(a,b,c,d) VEC4_QUARTERROUND_SHUFFLE(a,b,c,d)
encrypt.c: ^
encrypt.c: ./c128.h:14:86: note: expanded from macro 'VEC4_QUARTERROUND_SHUFFLE'
encrypt.c: x_##a = _mm_add_epi32(x_##a, x_##b); t_##a = _mm_xor_si128(x_##d, x_##a); x_##d = _mm_shuffle_epi8(t_##a, rot8); \
encrypt.c: ^
encrypt.c: ./c128.h:100:7: error: always_inline function '_mm_shuffle_epi8' requires target feature 'ssse3', but would be inlined into function 'chacha_noxor128' that is compiled without support for 'ssse3'
encrypt.c: VEC4_QUARTERROUND( 1, 5, 9,13);
encrypt.c: ^
encrypt.c: ./c128.h:17:36: note: expanded from macro 'VEC4_QUARTERROUND'
encrypt.c: #define VEC4_QUARTERROUND(a,b,c,d) VEC4_QUARTERROUND_SHUFFLE(a,b,c,d)
encrypt.c: ^
encrypt.c: ./c128.h:12:86: note: expanded from macro 'VEC4_QUARTERROUND_SHUFFLE'
encrypt.c: x_##a = _mm_add_epi32(x_##a, x_##b); t_##a = _mm_xor_si128(x_##d, x_##a); x_##d = _mm_shuffle_epi8(t_##a, rot16); \
encrypt.c: ...

Number of similar (compiler,implementation) pairs: 1, namely:
Compiler  Implementations
clang -mcpu=native -O3 -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE T:dolbeau/amd64-sse

Compiler output

Implementation: T:faster
Security model: timingleaks
Compiler: clang -march=native -O2 -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE
try.c: fatal error: error in backend: Cannot select: 0x561519c409b0: v4i64 = X86ISD::VTRUNC 0x561519c40880
try.c: 0x561519c40880: v16i32 = vselect 0x561519c3d390, 0x561519bf0410, 0x561519c40750
try.c: 0x561519c3d390: v4i1 = X86ISD::PCMPGTM 0x561519c38910, 0x561519c344a0
try.c: 0x561519c38910: v4i64 = X86ISD::VBROADCAST 0x561519be05c0
try.c: 0x561519be05c0: i64,ch = load<LD8[%lsr.iv6971]> 0x561519b49950, 0x561519c2f300, undef:i64
try.c: 0x561519c2f300: i64,ch = CopyFromReg 0x561519b49950, Register:i64 %vreg50
try.c: 0x561519c34700: i64 = Register %vreg50
try.c: 0x561519be1a90: i64 = undef
try.c: 0x561519c344a0: v4i64,ch = CopyFromReg 0x561519b49950, Register:v4i64 %vreg13
try.c: 0x561519c39160: v4i64 = Register %vreg13
try.c: 0x561519bf0410: v16i32 = X86ISD::VBROADCAST 0x561519c38b70
try.c: 0x561519c38b70: i32,ch = load<LD4[ConstantPool]> 0x561519b49950, 0x561519bc7330, undef:i64
try.c: 0x561519bc7330: i64 = X86ISD::WrapperRIP TargetConstantPool:i64<i32 1> 0
try.c: 0x561519c1cbf0: i64 = TargetConstantPool<i32 1> 0
try.c: 0x561519be1a90: i64 = undef
try.c: 0x561519c40750: v16i32 = BUILD_VECTOR Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>
try.c: 0x561519c40620: i32 = Constant<0>
try.c: 0x561519c40620: i32 = Constant<0>
try.c: 0x561519c40620: i32 = Constant<0>
try.c: 0x561519c40620: i32 = Constant<0>
try.c: 0x561519c40620: i32 = Constant<0>
try.c: 0x561519c40620: i32 = Constant<0>
try.c: 0x561519c40620: i32 = Constant<0>
try.c: 0x561519c40620: i32 = Constant<0>
try.c: 0x561519c40620: i32 = Constant<0>
try.c: ...

Number of similar (compiler,implementation) pairs: 1, namely:
Compiler  Implementations
clang -march=native -O2 -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE T:faster

Compiler output

Implementation: T:faster
Security model: timingleaks
Compiler: clang -march=native -O3 -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE
try.c: fatal error: error in backend: Cannot select: 0x55bf49b649a0: v4i64 = X86ISD::VTRUNC 0x55bf49b64870
try.c: 0x55bf49b64870: v16i32 = vselect 0x55bf49b313b0, 0x55bf49ae6f60, 0x55bf49b64740
try.c: 0x55bf49b313b0: v4i1 = X86ISD::PCMPGTM 0x55bf49b4bbb0, 0x55bf49b46f80
try.c: 0x55bf49b4bbb0: v4i64 = X86ISD::VBROADCAST 0x55bf49ae7420
try.c: 0x55bf49ae7420: i64,ch = load<LD8[%lsr.iv6971]> 0x55bf49a45a30, 0x55bf49aecde0, undef:i64
try.c: 0x55bf49aecde0: i64,ch = CopyFromReg 0x55bf49a45a30, Register:i64 %vreg50
try.c: 0x55bf49b471e0: i64 = Register %vreg50
try.c: 0x55bf49ae47a0: i64 = undef
try.c: 0x55bf49b46f80: v4i64,ch = CopyFromReg 0x55bf49a45a30, Register:v4i64 %vreg13
try.c: 0x55bf49b4c400: v4i64 = Register %vreg13
try.c: 0x55bf49ae6f60: v16i32 = X86ISD::VBROADCAST 0x55bf49b4be10
try.c: 0x55bf49b4be10: i32,ch = load<LD4[ConstantPool]> 0x55bf49a45a30, 0x55bf49aeb3b0, undef:i64
try.c: 0x55bf49aeb3b0: i64 = X86ISD::WrapperRIP TargetConstantPool:i64<i32 1> 0
try.c: 0x55bf49ae5120: i64 = TargetConstantPool<i32 1> 0
try.c: 0x55bf49ae47a0: i64 = undef
try.c: 0x55bf49b64740: v16i32 = BUILD_VECTOR Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>
try.c: 0x55bf49b64610: i32 = Constant<0>
try.c: 0x55bf49b64610: i32 = Constant<0>
try.c: 0x55bf49b64610: i32 = Constant<0>
try.c: 0x55bf49b64610: i32 = Constant<0>
try.c: 0x55bf49b64610: i32 = Constant<0>
try.c: 0x55bf49b64610: i32 = Constant<0>
try.c: 0x55bf49b64610: i32 = Constant<0>
try.c: 0x55bf49b64610: i32 = Constant<0>
try.c: 0x55bf49b64610: i32 = Constant<0>
try.c: ...

Number of similar (compiler,implementation) pairs: 1, namely:
Compiler  Implementations
clang -march=native -O3 -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE T:faster

Compiler output

Implementation: T:faster
Security model: timingleaks
Compiler: clang -march=native -O -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE
try.c: fatal error: error in backend: Cannot select: 0x55601ee906b0: v4i64 = X86ISD::VTRUNC 0x55601ee90580
try.c: 0x55601ee90580: v16i32 = vselect 0x55601ee81800, 0x55601ee1d6b0, 0x55601ee90450
try.c: 0x55601ee81800: v4i1 = X86ISD::PCMPGTM 0x55601ee75d30, 0x55601ee718c0
try.c: 0x55601ee75d30: v4i64 = X86ISD::VBROADCAST 0x55601ee156a0
try.c: 0x55601ee156a0: i64,ch = load<LD8[%lsr.iv6971]> 0x55601ed86950, 0x55601ee567f0, undef:i64
try.c: 0x55601ee567f0: i64,ch = CopyFromReg 0x55601ed86950, Register:i64 %vreg50
try.c: 0x55601ee71b20: i64 = Register %vreg50
try.c: 0x55601ee1bd20: i64 = undef
try.c: 0x55601ee718c0: v4i64,ch = CopyFromReg 0x55601ed86950, Register:v4i64 %vreg13
try.c: 0x55601ee76580: v4i64 = Register %vreg13
try.c: 0x55601ee1d6b0: v16i32 = X86ISD::VBROADCAST 0x55601ee75f90
try.c: 0x55601ee75f90: i32,ch = load<LD4[ConstantPool]> 0x55601ed86950, 0x55601ee14c80, undef:i64
try.c: 0x55601ee14c80: i64 = X86ISD::WrapperRIP TargetConstantPool:i64<i32 1> 0
try.c: 0x55601ee60db0: i64 = TargetConstantPool<i32 1> 0
try.c: 0x55601ee1bd20: i64 = undef
try.c: 0x55601ee90450: v16i32 = BUILD_VECTOR Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>
try.c: 0x55601ee90320: i32 = Constant<0>
try.c: 0x55601ee90320: i32 = Constant<0>
try.c: 0x55601ee90320: i32 = Constant<0>
try.c: 0x55601ee90320: i32 = Constant<0>
try.c: 0x55601ee90320: i32 = Constant<0>
try.c: 0x55601ee90320: i32 = Constant<0>
try.c: 0x55601ee90320: i32 = Constant<0>
try.c: 0x55601ee90320: i32 = Constant<0>
try.c: 0x55601ee90320: i32 = Constant<0>
try.c: ...

Number of similar (compiler,implementation) pairs: 1, namely:
Compiler  Implementations
clang -march=native -O -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE T:faster

Compiler output

Implementation: T:ref
Security model: timingleaks
Compiler: clang -march=native -O2 -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE
try.c: fatal error: error in backend: Cannot select: 0x5594063b9250: v4i64 = X86ISD::VTRUNC 0x5594063b9120
try.c: 0x5594063b9120: v16i32 = vselect 0x559406395550, 0x559406317f90, 0x5594063b8ff0
try.c: 0x559406395550: v4i1 = X86ISD::PCMPGTM 0x559406394540, 0x559406392060
try.c: 0x559406394540: v4i64 = X86ISD::VBROADCAST 0x559406351980
try.c: 0x559406351980: i64,ch = load<LD8[%lsr.iv6971]> 0x5594062a6950, 0x559406383e50, undef:i64
try.c: 0x559406383e50: i64,ch = CopyFromReg 0x5594062a6950, Register:i64 %vreg50
try.c: 0x5594063922c0: i64 = Register %vreg50
try.c: 0x559406316600: i64 = undef
try.c: 0x559406392060: v4i64,ch = CopyFromReg 0x5594062a6950, Register:v4i64 %vreg13
try.c: 0x559406394d90: v4i64 = Register %vreg13
try.c: 0x559406317f90: v16i32 = X86ISD::VBROADCAST 0x5594063947a0
try.c: 0x5594063947a0: i32,ch = load<LD4[ConstantPool]> 0x5594062a6950, 0x559406350f60, undef:i64
try.c: 0x559406350f60: i64 = X86ISD::WrapperRIP TargetConstantPool:i64<i32 1> 0
try.c: 0x559406337140: i64 = TargetConstantPool<i32 1> 0
try.c: 0x559406316600: i64 = undef
try.c: 0x5594063b8ff0: v16i32 = BUILD_VECTOR Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>
try.c: 0x5594063b8ec0: i32 = Constant<0>
try.c: 0x5594063b8ec0: i32 = Constant<0>
try.c: 0x5594063b8ec0: i32 = Constant<0>
try.c: 0x5594063b8ec0: i32 = Constant<0>
try.c: 0x5594063b8ec0: i32 = Constant<0>
try.c: 0x5594063b8ec0: i32 = Constant<0>
try.c: 0x5594063b8ec0: i32 = Constant<0>
try.c: 0x5594063b8ec0: i32 = Constant<0>
try.c: 0x5594063b8ec0: i32 = Constant<0>
try.c: ...

Number of similar (compiler,implementation) pairs: 1, namely:
Compiler  Implementations
clang -march=native -O2 -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE T:ref

Compiler output

Implementation: T:ref
Security model: timingleaks
Compiler: clang -march=native -O3 -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE
try.c: fatal error: error in backend: Cannot select: 0x5586381094d0: v4i64 = X86ISD::VTRUNC 0x5586381093a0
try.c: 0x5586381093a0: v16i32 = vselect 0x55863810ccd0, 0x5586380b0e60, 0x558638109270
try.c: 0x55863810ccd0: v4i1 = X86ISD::PCMPGTM 0x558638101e90, 0x5586380fda20
try.c: 0x558638101e90: v4i64 = X86ISD::VBROADCAST 0x5586380b1320
try.c: 0x5586380b1320: i64,ch = load<LD8[%lsr.iv6971]> 0x558637ffaa30, 0x55863809e060, undef:i64
try.c: 0x55863809e060: i64,ch = CopyFromReg 0x558637ffaa30, Register:i64 %vreg50
try.c: 0x5586380fdc80: i64 = Register %vreg50
try.c: 0x558638078050: i64 = undef
try.c: 0x5586380fda20: v4i64,ch = CopyFromReg 0x558637ffaa30, Register:v4i64 %vreg13
try.c: 0x5586381026e0: v4i64 = Register %vreg13
try.c: 0x5586380b0e60: v16i32 = X86ISD::VBROADCAST 0x5586381020f0
try.c: 0x5586381020f0: i32,ch = load<LD4[ConstantPool]> 0x558637ffaa30, 0x558638096860, undef:i64
try.c: 0x558638096860: i64 = X86ISD::WrapperRIP TargetConstantPool:i64<i32 1> 0
try.c: 0x5586380789d0: i64 = TargetConstantPool<i32 1> 0
try.c: 0x558638078050: i64 = undef
try.c: 0x558638109270: v16i32 = BUILD_VECTOR Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>
try.c: 0x558638109140: i32 = Constant<0>
try.c: 0x558638109140: i32 = Constant<0>
try.c: 0x558638109140: i32 = Constant<0>
try.c: 0x558638109140: i32 = Constant<0>
try.c: 0x558638109140: i32 = Constant<0>
try.c: 0x558638109140: i32 = Constant<0>
try.c: 0x558638109140: i32 = Constant<0>
try.c: 0x558638109140: i32 = Constant<0>
try.c: 0x558638109140: i32 = Constant<0>
try.c: ...

Number of similar (compiler,implementation) pairs: 1, namely:
Compiler  Implementations
clang -march=native -O3 -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE T:ref

Compiler output

Implementation: T:ref
Security model: timingleaks
Compiler: clang -march=native -O -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE
try.c: fatal error: error in backend: Cannot select: 0x55b6ef27e160: v4i64 = X86ISD::VTRUNC 0x55b6ef27e030
try.c: 0x55b6ef27e030: v16i32 = vselect 0x55b6ef26aec0, 0x55b6ef2099e0, 0x55b6ef27df00
try.c: 0x55b6ef26aec0: v4i1 = X86ISD::PCMPGTM 0x55b6ef2647f0, 0x55b6ef260380
try.c: 0x55b6ef2647f0: v4i64 = X86ISD::VBROADCAST 0x55b6ef20e870
try.c: 0x55b6ef20e870: i64,ch = load<LD8[%lsr.iv6971]> 0x55b6ef175950, 0x55b6ef24e180, undef:i64
try.c: 0x55b6ef24e180: i64,ch = CopyFromReg 0x55b6ef175950, Register:i64 %vreg50
try.c: 0x55b6ef2605e0: i64 = Register %vreg50
try.c: 0x55b6ef208050: i64 = undef
try.c: 0x55b6ef260380: v4i64,ch = CopyFromReg 0x55b6ef175950, Register:v4i64 %vreg13
try.c: 0x55b6ef265040: v4i64 = Register %vreg13
try.c: 0x55b6ef2099e0: v16i32 = X86ISD::VBROADCAST 0x55b6ef264a50
try.c: 0x55b6ef264a50: i32,ch = load<LD4[ConstantPool]> 0x55b6ef175950, 0x55b6ef20de50, undef:i64
try.c: 0x55b6ef20de50: i64 = X86ISD::WrapperRIP TargetConstantPool:i64<i32 1> 0
try.c: 0x55b6ef24f380: i64 = TargetConstantPool<i32 1> 0
try.c: 0x55b6ef208050: i64 = undef
try.c: 0x55b6ef27df00: v16i32 = BUILD_VECTOR Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>
try.c: 0x55b6ef27ddd0: i32 = Constant<0>
try.c: 0x55b6ef27ddd0: i32 = Constant<0>
try.c: 0x55b6ef27ddd0: i32 = Constant<0>
try.c: 0x55b6ef27ddd0: i32 = Constant<0>
try.c: 0x55b6ef27ddd0: i32 = Constant<0>
try.c: 0x55b6ef27ddd0: i32 = Constant<0>
try.c: 0x55b6ef27ddd0: i32 = Constant<0>
try.c: 0x55b6ef27ddd0: i32 = Constant<0>
try.c: 0x55b6ef27ddd0: i32 = Constant<0>
try.c: ...

Number of similar (compiler,implementation) pairs: 1, namely:
Compiler  Implementations
clang -march=native -O -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE T:ref