Implementation notes: amd64, cel02, crypto_decode/653x3

Computer: cel02
Architecture: amd64
CPU ID: GenuineIntel-00050657-bfebfbff
SUPERCOP version: 20201130
Operation: crypto_decode
Primitive: 653x3
Time  Object size  Test size      Implementation  Compiler                                                                            Benchmark date  SUPERCOP version
80    412 0 0      14085 824 800  avx             gcc_-march=native_-mtune=native_-O3_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE        20201211        20201130
82    412 0 0      10828 816 768  avx             gcc_-march=native_-mtune=native_-O2_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE        20201211        20201130
102   403 0 0      9656 800 768   avx             gcc_-march=native_-mtune=native_-Os_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE        20201211        20201130
104   342 0 0      9692 792 728   avx             clang_-march=native_-Os_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE  20201211        20201130
108   411 0 0      10580 816 768  avx             gcc_-march=native_-mtune=native_-O_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE         20201211        20201130
438   1820 0 0     15525 824 800  ref             gcc_-march=native_-mtune=native_-O3_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE        20201211        20201130
1508  145 0 0      10540 816 768  ref             gcc_-march=native_-mtune=native_-O2_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE        20201211        20201130
1830  131 0 0      9476 792 728   ref             clang_-march=native_-Os_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE  20201211        20201130
1838  126 0 0      11322 800 728  ref             clang_-mcpu=native_-O3_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE   20201211        20201130
1858  129 0 0      9336 800 768   ref             gcc_-march=native_-mtune=native_-Os_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE        20201211        20201130
2120  138 0 0      10276 816 768  ref             gcc_-march=native_-mtune=native_-O_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE         20201211        20201130
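
Reading aid (not part of the SUPERCOP output): a minimal Python sketch that compares the Time column across the two implementations. It assumes the Time value is the leading figure of each row above (80 for the fastest avx build, 438 for the fastest ref build). The names times, best_time, and speedup are hypothetical and chosen only for illustration.

```python
# Time values copied from the table above, keyed by implementation.
# Assumption: Time is the first column of each row.
times = {
    "avx": [80, 82, 102, 104, 108],
    "ref": [438, 1508, 1830, 1838, 1858, 2120],
}


def best_time(impl):
    """Return the smallest Time reported for the given implementation."""
    return min(times[impl])


def speedup(slow_impl, fast_impl):
    """Ratio of best times: how many times faster fast_impl is than slow_impl."""
    return best_time(slow_impl) / best_time(fast_impl)


if __name__ == "__main__":
    print(f"best avx: {best_time('avx')}")
    print(f"best ref: {best_time('ref')}")
    print(f"ref/avx ratio: {speedup('ref', 'avx'):.2f}")
```

Under that reading, the fastest ref build takes roughly five and a half times as long as the fastest avx build. The ratio compares only the best row for each implementation and says nothing about the builds that failed to compile.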

Compiler output

Implementation: avx
Security model: constbranchindex
Compiler: clang -march=native -O2 -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE
try.c: fatal error: error in backend: Cannot select: 0x565062156080: v4i64 = X86ISD::VTRUNC 0x565062155f50
try.c: 0x565062155f50: v16i32 = vselect 0x565062148210, 0x5650620e3e30, 0x565062155e20
try.c: 0x565062148210: v4i1 = X86ISD::PCMPGTM 0x56506213c710, 0x5650621382a0
try.c: 0x56506213c710: v4i64 = X86ISD::VBROADCAST 0x5650620f7330
try.c: 0x5650620f7330: i64,ch = load<LD8[%lsr.iv6971]> 0x56506204d950, 0x565062133100, undef:i64
try.c: 0x565062133100: i64,ch = CopyFromReg 0x56506204d950, Register:i64 %vreg50
try.c: 0x565062138500: i64 = Register %vreg50
try.c: 0x5650620f8800: i64 = undef
try.c: 0x5650621382a0: v4i64,ch = CopyFromReg 0x56506204d950, Register:v4i64 %vreg13
try.c: 0x56506213cf60: v4i64 = Register %vreg13
try.c: 0x5650620e3e30: v16i32 = X86ISD::VBROADCAST 0x56506213c970
try.c: 0x56506213c970: i32,ch = load<LD4[ConstantPool]> 0x56506204d950, 0x5650620f6910, undef:i64
try.c: 0x5650620f6910: i64 = X86ISD::WrapperRIP TargetConstantPool:i64<i32 1> 0
try.c: 0x56506211cfb0: i64 = TargetConstantPool<i32 1> 0
try.c: 0x5650620f8800: i64 = undef
try.c: 0x565062155e20: v16i32 = BUILD_VECTOR Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>
try.c: 0x565062155cf0: i32 = Constant<0>
try.c: 0x565062155cf0: i32 = Constant<0>
try.c: 0x565062155cf0: i32 = Constant<0>
try.c: 0x565062155cf0: i32 = Constant<0>
try.c: 0x565062155cf0: i32 = Constant<0>
try.c: 0x565062155cf0: i32 = Constant<0>
try.c: 0x565062155cf0: i32 = Constant<0>
try.c: 0x565062155cf0: i32 = Constant<0>
try.c: 0x565062155cf0: i32 = Constant<0>
try.c: ...

Number of similar (compiler,implementation) pairs: 1, namely:
Compiler: clang -march=native -O2 -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE
Implementations: avx

Compiler output

Implementation: avx
Security model: constbranchindex
Compiler: clang -march=native -O3 -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE
try.c: fatal error: error in backend: Cannot select: 0x55814dc558d0: v4i64 = X86ISD::VTRUNC 0x55814dc557a0
try.c: 0x55814dc557a0: v16i32 = vselect 0x55814dc41730, 0x55814dbd9180, 0x55814dc55670
try.c: 0x55814dc41730: v4i1 = X86ISD::PCMPGTM 0x55814dc3b110, 0x55814dc376a0
try.c: 0x55814dc3b110: v4i64 = X86ISD::VBROADCAST 0x55814dbd9640
try.c: 0x55814dbd9640: i64,ch = load<LD8[%lsr.iv6971]> 0x55814db35a20, 0x55814dbfd4e0, undef:i64
try.c: 0x55814dbfd4e0: i64,ch = CopyFromReg 0x55814db35a20, Register:i64 %vreg50
try.c: 0x55814dc37900: i64 = Register %vreg50
try.c: 0x55814dbc6dc0: i64 = undef
try.c: 0x55814dc376a0: v4i64,ch = CopyFromReg 0x55814db35a20, Register:v4i64 %vreg13
try.c: 0x55814dc3b960: v4i64 = Register %vreg13
try.c: 0x55814dbd9180: v16i32 = X86ISD::VBROADCAST 0x55814dc3b370
try.c: 0x55814dc3b370: i32,ch = load<LD4[ConstantPool]> 0x55814db35a20, 0x55814dbfb2a0, undef:i64
try.c: 0x55814dbfb2a0: i64 = X86ISD::WrapperRIP TargetConstantPool:i64<i32 1> 0
try.c: 0x55814dbc7740: i64 = TargetConstantPool<i32 1> 0
try.c: 0x55814dbc6dc0: i64 = undef
try.c: 0x55814dc55670: v16i32 = BUILD_VECTOR Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>
try.c: 0x55814dc55540: i32 = Constant<0>
try.c: 0x55814dc55540: i32 = Constant<0>
try.c: 0x55814dc55540: i32 = Constant<0>
try.c: 0x55814dc55540: i32 = Constant<0>
try.c: 0x55814dc55540: i32 = Constant<0>
try.c: 0x55814dc55540: i32 = Constant<0>
try.c: 0x55814dc55540: i32 = Constant<0>
try.c: 0x55814dc55540: i32 = Constant<0>
try.c: 0x55814dc55540: i32 = Constant<0>
try.c: ...

Number of similar (compiler,implementation) pairs: 1, namely:
Compiler: clang -march=native -O3 -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE
Implementations: avx

Compiler output

Implementation: avx
Security model: constbranchindex
Compiler: clang -march=native -O -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE
try.c: fatal error: error in backend: Cannot select: 0x55e1b4732af0: v4i64 = X86ISD::VTRUNC 0x55e1b47329c0
try.c: 0x55e1b47329c0: v16i32 = vselect 0x55e1b473a5b0, 0x55e1b46c4980, 0x55e1b4732890
try.c: 0x55e1b473a5b0: v4i1 = X86ISD::PCMPGTM 0x55e1b4718170, 0x55e1b4713d00
try.c: 0x55e1b4718170: v4i64 = X86ISD::VBROADCAST 0x55e1b46b54d0
try.c: 0x55e1b46b54d0: i64,ch = load<LD8[%lsr.iv6971]> 0x55e1b4628950, 0x55e1b470a090, undef:i64
try.c: 0x55e1b470a090: i64,ch = CopyFromReg 0x55e1b4628950, Register:i64 %vreg50
try.c: 0x55e1b4713f60: i64 = Register %vreg50
try.c: 0x55e1b46c2ff0: i64 = undef
try.c: 0x55e1b4713d00: v4i64,ch = CopyFromReg 0x55e1b4628950, Register:v4i64 %vreg13
try.c: 0x55e1b47189c0: v4i64 = Register %vreg13
try.c: 0x55e1b46c4980: v16i32 = X86ISD::VBROADCAST 0x55e1b47183d0
try.c: 0x55e1b47183d0: i32,ch = load<LD4[ConstantPool]> 0x55e1b4628950, 0x55e1b46b4ab0, undef:i64
try.c: 0x55e1b46b4ab0: i64 = X86ISD::WrapperRIP TargetConstantPool:i64<i32 1> 0
try.c: 0x55e1b46b7800: i64 = TargetConstantPool<i32 1> 0
try.c: 0x55e1b46c2ff0: i64 = undef
try.c: 0x55e1b4732890: v16i32 = BUILD_VECTOR Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>
try.c: 0x55e1b4732760: i32 = Constant<0>
try.c: 0x55e1b4732760: i32 = Constant<0>
try.c: 0x55e1b4732760: i32 = Constant<0>
try.c: 0x55e1b4732760: i32 = Constant<0>
try.c: 0x55e1b4732760: i32 = Constant<0>
try.c: 0x55e1b4732760: i32 = Constant<0>
try.c: 0x55e1b4732760: i32 = Constant<0>
try.c: 0x55e1b4732760: i32 = Constant<0>
try.c: 0x55e1b4732760: i32 = Constant<0>
try.c: ...

Number of similar (compiler,implementation) pairs: 1, namely:
Compiler: clang -march=native -O -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE
Implementations: avx

Compiler output

Implementation: avx
Security model: constbranchindex
Compiler: clang -mcpu=native -O3 -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE
decode.c: decode.c:18:18: error: always_inline function '_mm256_loadu_si256' requires target feature 'sse4.2', but would be inlined into function 'crypto_decode_653x3_avx_constbranchindex' that is compiled without support for 'sse4.2'
decode.c: __m256i s0 = _mm256_loadu_si256((const __m256i *) s);
decode.c: ^
decode.c: decode.c:22:18: error: always_inline function '_mm256_srli_epi16' requires target feature 'sse4.2', but would be inlined into function 'crypto_decode_653x3_avx_constbranchindex' that is compiled without support for 'sse4.2'
decode.c: __m256i s1 = _mm256_srli_epi16(s0&_mm256_set1_epi8(-16),4);
decode.c: ^
decode.c: decode.c:22:39: error: always_inline function '_mm256_set1_epi8' requires target feature 'sse4.2', but would be inlined into function 'crypto_decode_653x3_avx_constbranchindex' that is compiled without support for 'sse4.2'
decode.c: __m256i s1 = _mm256_srli_epi16(s0&_mm256_set1_epi8(-16),4);
decode.c: ^
decode.c: decode.c:23:11: error: always_inline function '_mm256_set1_epi8' requires target feature 'sse4.2', but would be inlined into function 'crypto_decode_653x3_avx_constbranchindex' that is compiled without support for 'sse4.2'
decode.c: s0 &= _mm256_set1_epi8(15);
decode.c: ^
decode.c: decode.c:25:18: error: always_inline function '_mm256_unpacklo_epi8' requires target feature 'sse4.2', but would be inlined into function 'crypto_decode_653x3_avx_constbranchindex' that is compiled without support for 'sse4.2'
decode.c: __m256i a0 = _mm256_unpacklo_epi8(s0,s1);
decode.c: ^
decode.c: decode.c:28:18: error: always_inline function '_mm256_unpackhi_epi8' requires target feature 'sse4.2', but would be inlined into function 'crypto_decode_653x3_avx_constbranchindex' that is compiled without support for 'sse4.2'
decode.c: __m256i a1 = _mm256_unpackhi_epi8(s0,s1);
decode.c: ^
decode.c: decode.c:32:18: error: always_inline function '_mm256_srli_epi16' requires target feature 'sse4.2', but would be inlined into function 'crypto_decode_653x3_avx_constbranchindex' that is compiled without support for 'sse4.2'
decode.c: __m256i a2 = _mm256_srli_epi16(a0&_mm256_set1_epi8(12),2);
decode.c: ^
decode.c: decode.c:32:39: error: always_inline function '_mm256_set1_epi8' requires target feature 'sse4.2', but would be inlined into function 'crypto_decode_653x3_avx_constbranchindex' that is compiled without support for 'sse4.2'
decode.c: __m256i a2 = _mm256_srli_epi16(a0&_mm256_set1_epi8(12),2);
decode.c: ^
decode.c: decode.c:33:18: error: always_inline function '_mm256_srli_epi16' requires target feature 'sse4.2', but would be inlined into function 'crypto_decode_653x3_avx_constbranchindex' that is compiled without support for 'sse4.2'
decode.c: ...

Number of similar (compiler,implementation) pairs: 1, namely:
Compiler: clang -mcpu=native -O3 -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE
Implementations: avx

Compiler output

Implementation: ref
Security model: constbranchindex
Compiler: clang -march=native -O2 -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE
try.c: fatal error: error in backend: Cannot select: 0x555c4e1d7350: v4i64 = X86ISD::VTRUNC 0x555c4e1d7220
try.c: 0x555c4e1d7220: v16i32 = vselect 0x555c4e1cb4b0, 0x555c4e15fab0, 0x555c4e1d70f0
try.c: 0x555c4e1cb4b0: v4i1 = X86ISD::PCMPGTM 0x555c4e1bf9f0, 0x555c4e1bb580
try.c: 0x555c4e1bf9f0: v4i64 = X86ISD::VBROADCAST 0x555c4e1638d0
try.c: 0x555c4e1638d0: i64,ch = load<LD8[%lsr.iv6971]> 0x555c4e0d0950, 0x555c4e1b63e0, undef:i64
try.c: 0x555c4e1b63e0: i64,ch = CopyFromReg 0x555c4e0d0950, Register:i64 %vreg50
try.c: 0x555c4e1bb7e0: i64 = Register %vreg50
try.c: 0x555c4e15e120: i64 = undef
try.c: 0x555c4e1bb580: v4i64,ch = CopyFromReg 0x555c4e0d0950, Register:v4i64 %vreg13
try.c: 0x555c4e1c0240: v4i64 = Register %vreg13
try.c: 0x555c4e15fab0: v16i32 = X86ISD::VBROADCAST 0x555c4e1bfc50
try.c: 0x555c4e1bfc50: i32,ch = load<LD4[ConstantPool]> 0x555c4e0d0950, 0x555c4e162eb0, undef:i64
try.c: 0x555c4e162eb0: i64 = X86ISD::WrapperRIP TargetConstantPool:i64<i32 1> 0
try.c: 0x555c4e1b3640: i64 = TargetConstantPool<i32 1> 0
try.c: 0x555c4e15e120: i64 = undef
try.c: 0x555c4e1d70f0: v16i32 = BUILD_VECTOR Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>
try.c: 0x555c4e1d6fc0: i32 = Constant<0>
try.c: 0x555c4e1d6fc0: i32 = Constant<0>
try.c: 0x555c4e1d6fc0: i32 = Constant<0>
try.c: 0x555c4e1d6fc0: i32 = Constant<0>
try.c: 0x555c4e1d6fc0: i32 = Constant<0>
try.c: 0x555c4e1d6fc0: i32 = Constant<0>
try.c: 0x555c4e1d6fc0: i32 = Constant<0>
try.c: 0x555c4e1d6fc0: i32 = Constant<0>
try.c: 0x555c4e1d6fc0: i32 = Constant<0>
try.c: ...

Number of similar (compiler,implementation) pairs: 1, namely:
Compiler: clang -march=native -O2 -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE
Implementations: ref

Compiler output

Implementation: ref
Security model: constbranchindex
Compiler: clang -march=native -O3 -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE
try.c: fatal error: error in backend: Cannot select: 0x55cbc35e66d0: v4i64 = X86ISD::VTRUNC 0x55cbc35e65a0
try.c: 0x55cbc35e65a0: v16i32 = vselect 0x55cbc35d8bc0, 0x55cbc3563bf0, 0x55cbc35e6470
try.c: 0x55cbc35d8bc0: v4i1 = X86ISD::PCMPGTM 0x55cbc35ced80, 0x55cbc35ca910
try.c: 0x55cbc35ced80: v4i64 = X86ISD::VBROADCAST 0x55cbc35640b0
try.c: 0x55cbc35640b0: i64,ch = load<LD8[%lsr.iv6971]> 0x55cbc34c8a20, 0x55cbc356ba10, undef:i64
try.c: 0x55cbc356ba10: i64,ch = CopyFromReg 0x55cbc34c8a20, Register:i64 %vreg50
try.c: 0x55cbc35cab70: i64 = Register %vreg50
try.c: 0x55cbc357d4d0: i64 = undef
try.c: 0x55cbc35ca910: v4i64,ch = CopyFromReg 0x55cbc34c8a20, Register:v4i64 %vreg13
try.c: 0x55cbc35cf5d0: v4i64 = Register %vreg13
try.c: 0x55cbc3563bf0: v16i32 = X86ISD::VBROADCAST 0x55cbc35cefe0
try.c: 0x55cbc35cefe0: i32,ch = load<LD4[ConstantPool]> 0x55cbc34c8a20, 0x55cbc3569fe0, undef:i64
try.c: 0x55cbc3569fe0: i64 = X86ISD::WrapperRIP TargetConstantPool:i64<i32 1> 0
try.c: 0x55cbc357de50: i64 = TargetConstantPool<i32 1> 0
try.c: 0x55cbc357d4d0: i64 = undef
try.c: 0x55cbc35e6470: v16i32 = BUILD_VECTOR Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>
try.c: 0x55cbc35e6340: i32 = Constant<0>
try.c: 0x55cbc35e6340: i32 = Constant<0>
try.c: 0x55cbc35e6340: i32 = Constant<0>
try.c: 0x55cbc35e6340: i32 = Constant<0>
try.c: 0x55cbc35e6340: i32 = Constant<0>
try.c: 0x55cbc35e6340: i32 = Constant<0>
try.c: 0x55cbc35e6340: i32 = Constant<0>
try.c: 0x55cbc35e6340: i32 = Constant<0>
try.c: 0x55cbc35e6340: i32 = Constant<0>
try.c: ...

Number of similar (compiler,implementation) pairs: 1, namely:
Compiler: clang -march=native -O3 -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE
Implementations: ref

Compiler output

Implementation: ref
Security model: constbranchindex
Compiler: clang -march=native -O -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE
try.c: fatal error: error in backend: Cannot select: 0x558985676040: v4i64 = X86ISD::VTRUNC 0x558985675f10
try.c: 0x558985675f10: v16i32 = vselect 0x5589856698f0, 0x5589856171d0, 0x558985675de0
try.c: 0x5589856698f0: v4i1 = X86ISD::PCMPGTM 0x55898566cea0, 0x558985668420
try.c: 0x55898566cea0: v4i64 = X86ISD::VBROADCAST 0x558985614370
try.c: 0x558985614370: i64,ch = load<LD8[%lsr.iv6971]> 0x55898557d950, 0x558985656a80, undef:i64
try.c: 0x558985656a80: i64,ch = CopyFromReg 0x55898557d950, Register:i64 %vreg50
try.c: 0x558985668680: i64 = Register %vreg50
try.c: 0x558985615840: i64 = undef
try.c: 0x558985668420: v4i64,ch = CopyFromReg 0x55898557d950, Register:v4i64 %vreg13
try.c: 0x55898566d6f0: v4i64 = Register %vreg13
try.c: 0x5589856171d0: v16i32 = X86ISD::VBROADCAST 0x55898566d100
try.c: 0x55898566d100: i32,ch = load<LD4[ConstantPool]> 0x55898557d950, 0x558985613950, undef:i64
try.c: 0x558985613950: i64 = X86ISD::WrapperRIP TargetConstantPool:i64<i32 1> 0
try.c: 0x5589855cddf0: i64 = TargetConstantPool<i32 1> 0
try.c: 0x558985615840: i64 = undef
try.c: 0x558985675de0: v16i32 = BUILD_VECTOR Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>
try.c: 0x558985675cb0: i32 = Constant<0>
try.c: 0x558985675cb0: i32 = Constant<0>
try.c: 0x558985675cb0: i32 = Constant<0>
try.c: 0x558985675cb0: i32 = Constant<0>
try.c: 0x558985675cb0: i32 = Constant<0>
try.c: 0x558985675cb0: i32 = Constant<0>
try.c: 0x558985675cb0: i32 = Constant<0>
try.c: 0x558985675cb0: i32 = Constant<0>
try.c: 0x558985675cb0: i32 = Constant<0>
try.c: ...

Number of similar (compiler,implementation) pairs: 1, namely:
Compiler: clang -march=native -O -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE
Implementations: ref