Implementation notes: amd64, cel02, crypto_decode/857x3

Computer: cel02
Architecture: amd64
CPU ID: GenuineIntel-00050657-bfebfbff
SUPERCOP version: 20201130
Operation: crypto_decode
Primitive: 857x3
Time | Object size | Test size | Implementation | Compiler | Benchmark date | SUPERCOP version
120 | 342 0 0 | 9692 792 728 | avx | clang_-march=native_-Os_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE | 20201211 | 20201130
126 | 411 0 0 | 10580 816 768 | avx | gcc_-march=native_-mtune=native_-O_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE | 20201211 | 20201130
144 | 403 0 0 | 9656 800 768 | avx | gcc_-march=native_-mtune=native_-Os_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE | 20201211 | 20201130
154 | 412 0 0 | 10828 816 768 | avx | gcc_-march=native_-mtune=native_-O2_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE | 20201211 | 20201130
166 | 412 0 0 | 14085 824 800 | avx | gcc_-march=native_-mtune=native_-O3_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE | 20201211 | 20201130
372 | 2356 0 0 | 16069 824 800 | ref | gcc_-march=native_-mtune=native_-O3_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE | 20201211 | 20201130
1330 | 138 0 0 | 10276 816 768 | ref | gcc_-march=native_-mtune=native_-O_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE | 20201211 | 20201130
2402 | 131 0 0 | 9476 792 728 | ref | clang_-march=native_-Os_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE | 20201211 | 20201130
2402 | 129 0 0 | 9336 800 768 | ref | gcc_-march=native_-mtune=native_-Os_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE | 20201211 | 20201130
2416 | 126 0 0 | 11322 800 728 | ref | clang_-mcpu=native_-O3_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE | 20201211 | 20201130
2800 | 145 0 0 | 10540 816 768 | ref | gcc_-march=native_-mtune=native_-O2_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE | 20201211 | 20201130

Compiler output

Implementation: avx
Security model: constbranchindex
Compiler: clang -march=native -O2 -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE
try.c: fatal error: error in backend: Cannot select: 0x558f08163250: v4i64 = X86ISD::VTRUNC 0x558f08163120
try.c: 0x558f08163120: v16i32 = vselect 0x558f0817fc70, 0x558f08100960, 0x558f08162ff0
try.c: 0x558f0817fc70: v4i1 = X86ISD::PCMPGTM 0x558f0815c800, 0x558f08158390
try.c: 0x558f0815c800: v4i64 = X86ISD::VBROADCAST 0x558f08102c70
try.c: 0x558f08102c70: i64,ch = load<LD8[%lsr.iv6971]> 0x558f0806d950, 0x558f08141880, undef:i64
try.c: 0x558f08141880: i64,ch = CopyFromReg 0x558f0806d950, Register:i64 %vreg50
try.c: 0x558f081585f0: i64 = Register %vreg50
try.c: 0x558f08104140: i64 = undef
try.c: 0x558f08158390: v4i64,ch = CopyFromReg 0x558f0806d950, Register:v4i64 %vreg13
try.c: 0x558f0815d050: v4i64 = Register %vreg13
try.c: 0x558f08100960: v16i32 = X86ISD::VBROADCAST 0x558f0815ca60
try.c: 0x558f0815ca60: i32,ch = load<LD4[ConstantPool]> 0x558f0806d950, 0x558f080fd120, undef:i64
try.c: 0x558f080fd120: i64 = X86ISD::WrapperRIP TargetConstantPool:i64<i32 1> 0
try.c: 0x558f081461e0: i64 = TargetConstantPool<i32 1> 0
try.c: 0x558f08104140: i64 = undef
try.c: 0x558f08162ff0: v16i32 = BUILD_VECTOR Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>
try.c: 0x558f08162ec0: i32 = Constant<0>
try.c: 0x558f08162ec0: i32 = Constant<0>
try.c: 0x558f08162ec0: i32 = Constant<0>
try.c: 0x558f08162ec0: i32 = Constant<0>
try.c: 0x558f08162ec0: i32 = Constant<0>
try.c: 0x558f08162ec0: i32 = Constant<0>
try.c: 0x558f08162ec0: i32 = Constant<0>
try.c: 0x558f08162ec0: i32 = Constant<0>
try.c: 0x558f08162ec0: i32 = Constant<0>
try.c: ...

Number of similar (compiler,implementation) pairs: 1, namely:
Compiler | Implementations
clang -march=native -O2 -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE | avx

Compiler output

Implementation: avx
Security model: constbranchindex
Compiler: clang -march=native -O3 -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE
try.c: fatal error: error in backend: Cannot select: 0x563147063bb0: v4i64 = X86ISD::VTRUNC 0x563147063a80
try.c: 0x563147063a80: v16i32 = vselect 0x5631470673b0, 0x563146fef4d0, 0x563147063950
try.c: 0x5631470673b0: v4i1 = X86ISD::PCMPGTM 0x56314705c570, 0x563147058100
try.c: 0x56314705c570: v4i64 = X86ISD::VBROADCAST 0x563146fef990
try.c: 0x563146fef990: i64,ch = load<LD8[%lsr.iv6971]> 0x563146f55a30, 0x563146ff38b0, undef:i64
try.c: 0x563146ff38b0: i64,ch = CopyFromReg 0x563146f55a30, Register:i64 %vreg50
try.c: 0x563147058360: i64 = Register %vreg50
try.c: 0x563146feaa60: i64 = undef
try.c: 0x563147058100: v4i64,ch = CopyFromReg 0x563146f55a30, Register:v4i64 %vreg13
try.c: 0x56314705cdc0: v4i64 = Register %vreg13
try.c: 0x563146fef4d0: v16i32 = X86ISD::VBROADCAST 0x56314705c7d0
try.c: 0x56314705c7d0: i32,ch = load<LD4[ConstantPool]> 0x563146f55a30, 0x563146ff8730, undef:i64
try.c: 0x563146ff8730: i64 = X86ISD::WrapperRIP TargetConstantPool:i64<i32 1> 0
try.c: 0x563146feb3e0: i64 = TargetConstantPool<i32 1> 0
try.c: 0x563146feaa60: i64 = undef
try.c: 0x563147063950: v16i32 = BUILD_VECTOR Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>
try.c: 0x563147063820: i32 = Constant<0>
try.c: 0x563147063820: i32 = Constant<0>
try.c: 0x563147063820: i32 = Constant<0>
try.c: 0x563147063820: i32 = Constant<0>
try.c: 0x563147063820: i32 = Constant<0>
try.c: 0x563147063820: i32 = Constant<0>
try.c: 0x563147063820: i32 = Constant<0>
try.c: 0x563147063820: i32 = Constant<0>
try.c: 0x563147063820: i32 = Constant<0>
try.c: ...

Number of similar (compiler,implementation) pairs: 1, namely:
Compiler | Implementations
clang -march=native -O3 -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE | avx

Compiler output

Implementation: avx
Security model: constbranchindex
Compiler: clang -march=native -O -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE
try.c: fatal error: error in backend: Cannot select: 0x55ad33c2c420: v4i64 = X86ISD::VTRUNC 0x55ad33c2c2f0
try.c: 0x55ad33c2c2f0: v16i32 = vselect 0x55ad33c28e00, 0x55ad33bd8fd0, 0x55ad33c2c1c0
try.c: 0x55ad33c28e00: v4i1 = X86ISD::PCMPGTM 0x55ad33c11a90, 0x55ad33c0d620
try.c: 0x55ad33c11a90: v4i64 = X86ISD::VBROADCAST 0x55ad33bb2b10
try.c: 0x55ad33bb2b10: i64,ch = load<LD8[%lsr.iv6971]> 0x55ad33b22930, 0x55ad33bfd400, undef:i64
try.c: 0x55ad33bfd400: i64,ch = CopyFromReg 0x55ad33b22930, Register:i64 %vreg50
try.c: 0x55ad33c0d880: i64 = Register %vreg50
try.c: 0x55ad33bd7640: i64 = undef
try.c: 0x55ad33c0d620: v4i64,ch = CopyFromReg 0x55ad33b22930, Register:v4i64 %vreg13
try.c: 0x55ad33c122e0: v4i64 = Register %vreg13
try.c: 0x55ad33bd8fd0: v16i32 = X86ISD::VBROADCAST 0x55ad33c11cf0
try.c: 0x55ad33c11cf0: i32,ch = load<LD4[ConstantPool]> 0x55ad33b22930, 0x55ad33bb20f0, undef:i64
try.c: 0x55ad33bb20f0: i64 = X86ISD::WrapperRIP TargetConstantPool:i64<i32 1> 0
try.c: 0x55ad33bf66c0: i64 = TargetConstantPool<i32 1> 0
try.c: 0x55ad33bd7640: i64 = undef
try.c: 0x55ad33c2c1c0: v16i32 = BUILD_VECTOR Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>
try.c: 0x55ad33c2c090: i32 = Constant<0>
try.c: 0x55ad33c2c090: i32 = Constant<0>
try.c: 0x55ad33c2c090: i32 = Constant<0>
try.c: 0x55ad33c2c090: i32 = Constant<0>
try.c: 0x55ad33c2c090: i32 = Constant<0>
try.c: 0x55ad33c2c090: i32 = Constant<0>
try.c: 0x55ad33c2c090: i32 = Constant<0>
try.c: 0x55ad33c2c090: i32 = Constant<0>
try.c: 0x55ad33c2c090: i32 = Constant<0>
try.c: ...

Number of similar (compiler,implementation) pairs: 1, namely:
Compiler | Implementations
clang -march=native -O -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE | avx

Compiler output

Implementation: avx
Security model: constbranchindex
Compiler: clang -mcpu=native -O3 -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE
decode.c: decode.c:18:18: error: always_inline function '_mm256_loadu_si256' requires target feature 'sse4.2', but would be inlined into function 'crypto_decode_857x3_avx_constbranchindex' that is compiled without support for 'sse4.2'
decode.c: __m256i s0 = _mm256_loadu_si256((const __m256i *) s);
decode.c: ^
decode.c: decode.c:22:18: error: always_inline function '_mm256_srli_epi16' requires target feature 'sse4.2', but would be inlined into function 'crypto_decode_857x3_avx_constbranchindex' that is compiled without support for 'sse4.2'
decode.c: __m256i s1 = _mm256_srli_epi16(s0&_mm256_set1_epi8(-16),4);
decode.c: ^
decode.c: decode.c:22:39: error: always_inline function '_mm256_set1_epi8' requires target feature 'sse4.2', but would be inlined into function 'crypto_decode_857x3_avx_constbranchindex' that is compiled without support for 'sse4.2'
decode.c: __m256i s1 = _mm256_srli_epi16(s0&_mm256_set1_epi8(-16),4);
decode.c: ^
decode.c: decode.c:23:11: error: always_inline function '_mm256_set1_epi8' requires target feature 'sse4.2', but would be inlined into function 'crypto_decode_857x3_avx_constbranchindex' that is compiled without support for 'sse4.2'
decode.c: s0 &= _mm256_set1_epi8(15);
decode.c: ^
decode.c: decode.c:25:18: error: always_inline function '_mm256_unpacklo_epi8' requires target feature 'sse4.2', but would be inlined into function 'crypto_decode_857x3_avx_constbranchindex' that is compiled without support for 'sse4.2'
decode.c: __m256i a0 = _mm256_unpacklo_epi8(s0,s1);
decode.c: ^
decode.c: decode.c:28:18: error: always_inline function '_mm256_unpackhi_epi8' requires target feature 'sse4.2', but would be inlined into function 'crypto_decode_857x3_avx_constbranchindex' that is compiled without support for 'sse4.2'
decode.c: __m256i a1 = _mm256_unpackhi_epi8(s0,s1);
decode.c: ^
decode.c: decode.c:32:18: error: always_inline function '_mm256_srli_epi16' requires target feature 'sse4.2', but would be inlined into function 'crypto_decode_857x3_avx_constbranchindex' that is compiled without support for 'sse4.2'
decode.c: __m256i a2 = _mm256_srli_epi16(a0&_mm256_set1_epi8(12),2);
decode.c: ^
decode.c: decode.c:32:39: error: always_inline function '_mm256_set1_epi8' requires target feature 'sse4.2', but would be inlined into function 'crypto_decode_857x3_avx_constbranchindex' that is compiled without support for 'sse4.2'
decode.c: __m256i a2 = _mm256_srli_epi16(a0&_mm256_set1_epi8(12),2);
decode.c: ^
decode.c: decode.c:33:18: error: always_inline function '_mm256_srli_epi16' requires target feature 'sse4.2', but would be inlined into function 'crypto_decode_857x3_avx_constbranchindex' that is compiled without support for 'sse4.2'
decode.c: ...

Number of similar (compiler,implementation) pairs: 1, namely:
Compiler | Implementations
clang -mcpu=native -O3 -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE | avx

Compiler output

Implementation: ref
Security model: constbranchindex
Compiler: clang -march=native -O2 -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE
try.c: fatal error: error in backend: Cannot select: 0x55560bb63c40: v4i64 = X86ISD::VTRUNC 0x55560bb63b10
try.c: 0x55560bb63b10: v16i32 = vselect 0x55560bb60620, 0x55560bb03920, 0x55560bb639e0
try.c: 0x55560bb60620: v4i1 = X86ISD::PCMPGTM 0x55560bb5b9d0, 0x55560bb57560
try.c: 0x55560bb5b9d0: v4i64 = X86ISD::VBROADCAST 0x55560baffb10
try.c: 0x55560baffb10: i64,ch = load<LD8[%lsr.iv6971]> 0x55560ba6c950, 0x55560bb45d50, undef:i64
try.c: 0x55560bb45d50: i64,ch = CopyFromReg 0x55560ba6c950, Register:i64 %vreg50
try.c: 0x55560bb577c0: i64 = Register %vreg50
try.c: 0x55560bb01f90: i64 = undef
try.c: 0x55560bb57560: v4i64,ch = CopyFromReg 0x55560ba6c950, Register:v4i64 %vreg13
try.c: 0x55560bb5c220: v4i64 = Register %vreg13
try.c: 0x55560bb03920: v16i32 = X86ISD::VBROADCAST 0x55560bb5bc30
try.c: 0x55560bb5bc30: i32,ch = load<LD4[ConstantPool]> 0x55560ba6c950, 0x55560baff0f0, undef:i64
try.c: 0x55560baff0f0: i64 = X86ISD::WrapperRIP TargetConstantPool:i64<i32 1> 0
try.c: 0x55560bae4e40: i64 = TargetConstantPool<i32 1> 0
try.c: 0x55560bb01f90: i64 = undef
try.c: 0x55560bb639e0: v16i32 = BUILD_VECTOR Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>
try.c: 0x55560bb638b0: i32 = Constant<0>
try.c: 0x55560bb638b0: i32 = Constant<0>
try.c: 0x55560bb638b0: i32 = Constant<0>
try.c: 0x55560bb638b0: i32 = Constant<0>
try.c: 0x55560bb638b0: i32 = Constant<0>
try.c: 0x55560bb638b0: i32 = Constant<0>
try.c: 0x55560bb638b0: i32 = Constant<0>
try.c: 0x55560bb638b0: i32 = Constant<0>
try.c: 0x55560bb638b0: i32 = Constant<0>
try.c: ...

Number of similar (compiler,implementation) pairs: 1, namely:
Compiler | Implementations
clang -march=native -O2 -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE | ref

Compiler output

Implementation: ref
Security model: constbranchindex
Compiler: clang -march=native -O3 -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE
try.c: fatal error: error in backend: Cannot select: 0x55e53ba7aeb0: v4i64 = X86ISD::VTRUNC 0x55e53ba7ad80
try.c: 0x55e53ba7ad80: v16i32 = vselect 0x55e53ba5dd20, 0x55e53b9eece0, 0x55e53ba7ac50
try.c: 0x55e53ba5dd20: v4i1 = X86ISD::PCMPGTM 0x55e53ba570c0, 0x55e53ba52c50
try.c: 0x55e53ba570c0: v4i64 = X86ISD::VBROADCAST 0x55e53b9ef1a0
try.c: 0x55e53b9ef1a0: i64,ch = load<LD8[%lsr.iv6971]> 0x55e53b950a40, 0x55e53b9f4250, undef:i64
try.c: 0x55e53b9f4250: i64,ch = CopyFromReg 0x55e53b950a40, Register:i64 %vreg50
try.c: 0x55e53ba52eb0: i64 = Register %vreg50
try.c: 0x55e53b9eb000: i64 = undef
try.c: 0x55e53ba52c50: v4i64,ch = CopyFromReg 0x55e53b950a40, Register:v4i64 %vreg13
try.c: 0x55e53ba57910: v4i64 = Register %vreg13
try.c: 0x55e53b9eece0: v16i32 = X86ISD::VBROADCAST 0x55e53ba57320
try.c: 0x55e53ba57320: i32,ch = load<LD4[ConstantPool]> 0x55e53b950a40, 0x55e53b9f2820, undef:i64
try.c: 0x55e53b9f2820: i64 = X86ISD::WrapperRIP TargetConstantPool:i64<i32 1> 0
try.c: 0x55e53b9eb980: i64 = TargetConstantPool<i32 1> 0
try.c: 0x55e53b9eb000: i64 = undef
try.c: 0x55e53ba7ac50: v16i32 = BUILD_VECTOR Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>
try.c: 0x55e53ba7ab20: i32 = Constant<0>
try.c: 0x55e53ba7ab20: i32 = Constant<0>
try.c: 0x55e53ba7ab20: i32 = Constant<0>
try.c: 0x55e53ba7ab20: i32 = Constant<0>
try.c: 0x55e53ba7ab20: i32 = Constant<0>
try.c: 0x55e53ba7ab20: i32 = Constant<0>
try.c: 0x55e53ba7ab20: i32 = Constant<0>
try.c: 0x55e53ba7ab20: i32 = Constant<0>
try.c: 0x55e53ba7ab20: i32 = Constant<0>
try.c: ...

Number of similar (compiler,implementation) pairs: 1, namely:
Compiler | Implementations
clang -march=native -O3 -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE | ref

Compiler output

Implementation: ref
Security model: constbranchindex
Compiler: clang -march=native -O -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE
try.c: fatal error: error in backend: Cannot select: 0x562436628320: v4i64 = X86ISD::VTRUNC 0x5624366281f0
try.c: 0x5624366281f0: v16i32 = vselect 0x562436614350, 0x5624365c9330, 0x5624366280c0
try.c: 0x562436614350: v4i1 = X86ISD::PCMPGTM 0x56243660d9a0, 0x562436609530
try.c: 0x56243660d9a0: v4i64 = X86ISD::VBROADCAST 0x5624365c64d0
try.c: 0x5624365c64d0: i64,ch = load<LD8[%lsr.iv6971]> 0x56243651e9d0, 0x5624365f79e0, undef:i64
try.c: 0x5624365f79e0: i64,ch = CopyFromReg 0x56243651e9d0, Register:i64 %vreg50
try.c: 0x562436609790: i64 = Register %vreg50
try.c: 0x5624365c79a0: i64 = undef
try.c: 0x562436609530: v4i64,ch = CopyFromReg 0x56243651e9d0, Register:v4i64 %vreg13
try.c: 0x56243660e1f0: v4i64 = Register %vreg13
try.c: 0x5624365c9330: v16i32 = X86ISD::VBROADCAST 0x56243660dc00
try.c: 0x56243660dc00: i32,ch = load<LD4[ConstantPool]> 0x56243651e9d0, 0x5624365b21a0, undef:i64
try.c: 0x5624365b21a0: i64 = X86ISD::WrapperRIP TargetConstantPool:i64<i32 1> 0
try.c: 0x5624365f8be0: i64 = TargetConstantPool<i32 1> 0
try.c: 0x5624365c79a0: i64 = undef
try.c: 0x5624366280c0: v16i32 = BUILD_VECTOR Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>
try.c: 0x562436627f90: i32 = Constant<0>
try.c: 0x562436627f90: i32 = Constant<0>
try.c: 0x562436627f90: i32 = Constant<0>
try.c: 0x562436627f90: i32 = Constant<0>
try.c: 0x562436627f90: i32 = Constant<0>
try.c: 0x562436627f90: i32 = Constant<0>
try.c: 0x562436627f90: i32 = Constant<0>
try.c: 0x562436627f90: i32 = Constant<0>
try.c: 0x562436627f90: i32 = Constant<0>
try.c: ...

Number of similar (compiler,implementation) pairs: 1, namely:
Compiler | Implementations
clang -march=native -O -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE | ref