Implementation notes: amd64, cel02, crypto_decode/761x3

Computer: cel02
Architecture: amd64
CPU ID: GenuineIntel-00050657-bfebfbff
SUPERCOP version: 20201130
Operation: crypto_decode
Primitive: 761x3
Time | Object size | Test size | Implementation | Compiler | Benchmark date | SUPERCOP version
104 | 342 0 0 | 9692 792 728 | avx | clang_-march=native_-Os_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE | 20201211 | 20201130
106 | 412 0 0 | 14085 824 800 | avx | gcc_-march=native_-mtune=native_-O3_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE | 20201211 | 20201130
128 | 411 0 0 | 10580 816 768 | avx | gcc_-march=native_-mtune=native_-O_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE | 20201211 | 20201130
148 | 403 0 0 | 9656 800 768 | avx | gcc_-march=native_-mtune=native_-Os_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE | 20201211 | 20201130
156 | 412 0 0 | 10828 816 768 | avx | gcc_-march=native_-mtune=native_-O2_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE | 20201211 | 20201130
612 | 1820 0 0 | 15525 824 800 | ref | gcc_-march=native_-mtune=native_-O3_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE | 20201211 | 20201130
1746 | 145 0 0 | 10540 816 768 | ref | gcc_-march=native_-mtune=native_-O2_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE | 20201211 | 20201130
2126 | 129 0 0 | 9336 800 768 | ref | gcc_-march=native_-mtune=native_-Os_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE | 20201211 | 20201130
2136 | 126 0 0 | 11322 800 728 | ref | clang_-mcpu=native_-O3_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE | 20201211 | 20201130
2178 | 131 0 0 | 9476 792 728 | ref | clang_-march=native_-Os_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE | 20201211 | 20201130
2518 | 138 0 0 | 10276 816 768 | ref | gcc_-march=native_-mtune=native_-O_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE | 20201211 | 20201130
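
The Time column can be compared across implementations. The following is a minimal sketch, not part of the SUPERCOP output, that picks the smallest Time value per implementation and computes their ratio. The Time values are taken as the leading field of each row (104 through 2518, in the row order shown); the shortened compiler labels and the function name best_per_implementation are illustrative assumptions.

```python
# (time, implementation, compiler label) for each row of the table.
# The compiler labels are abbreviations chosen for this sketch only.
rows = [
    (104, "avx", "clang -Os"),
    (106, "avx", "gcc -O3"),
    (128, "avx", "gcc -O"),
    (148, "avx", "gcc -Os"),
    (156, "avx", "gcc -O2"),
    (612, "ref", "gcc -O3"),
    (1746, "ref", "gcc -O2"),
    (2126, "ref", "gcc -Os"),
    (2136, "ref", "clang -mcpu=native -O3"),
    (2178, "ref", "clang -Os"),
    (2518, "ref", "gcc -O"),
]


def best_per_implementation(rows):
    """Return {implementation: (smallest time, compiler label)}."""
    best = {}
    for time, impl, compiler in rows:
        if impl not in best or time < best[impl][0]:
            best[impl] = (time, compiler)
    return best


best = best_per_implementation(rows)
# Ratio of the smallest ref time to the smallest avx time.
ratio = best["ref"][0] / best["avx"][0]
print(best, round(ratio, 2))
```

Only the compilers that produced a working binary appear in the table, so the comparison excludes the failed clang builds listed under "Compiler output".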

Compiler output

Implementation: avx
Security model: constbranchindex
Compiler: clang -march=native -O2 -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE
try.c: fatal error: error in backend: Cannot select: 0x55894f1b82f0: v4i64 = X86ISD::VTRUNC 0x55894f1b81c0
try.c: 0x55894f1b81c0: v16i32 = vselect 0x55894f198dc0, 0x55894f14d670, 0x55894f1b8090
try.c: 0x55894f198dc0: v4i1 = X86ISD::PCMPGTM 0x55894f1a0990, 0x55894f19c520
try.c: 0x55894f1a0990: v4i64 = X86ISD::VBROADCAST 0x55894f1478d0
try.c: 0x55894f1478d0: i64,ch = load<LD8[%lsr.iv6971]> 0x55894f0b1950, 0x55894f14aaa0, undef:i64
try.c: 0x55894f14aaa0: i64,ch = CopyFromReg 0x55894f0b1950, Register:i64 %vreg50
try.c: 0x55894f19c780: i64 = Register %vreg50
try.c: 0x55894f148da0: i64 = undef
try.c: 0x55894f19c520: v4i64,ch = CopyFromReg 0x55894f0b1950, Register:v4i64 %vreg13
try.c: 0x55894f1a11e0: v4i64 = Register %vreg13
try.c: 0x55894f14d670: v16i32 = X86ISD::VBROADCAST 0x55894f1a0bf0
try.c: 0x55894f1a0bf0: i32,ch = load<LD4[ConstantPool]> 0x55894f0b1950, 0x55894f146eb0, undef:i64
try.c: 0x55894f146eb0: i64 = X86ISD::WrapperRIP TargetConstantPool:i64<i32 1> 0
try.c: 0x55894f18bd00: i64 = TargetConstantPool<i32 1> 0
try.c: 0x55894f148da0: i64 = undef
try.c: 0x55894f1b8090: v16i32 = BUILD_VECTOR Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>
try.c: 0x55894f1b7f60: i32 = Constant<0>
try.c: 0x55894f1b7f60: i32 = Constant<0>
try.c: 0x55894f1b7f60: i32 = Constant<0>
try.c: 0x55894f1b7f60: i32 = Constant<0>
try.c: 0x55894f1b7f60: i32 = Constant<0>
try.c: 0x55894f1b7f60: i32 = Constant<0>
try.c: 0x55894f1b7f60: i32 = Constant<0>
try.c: 0x55894f1b7f60: i32 = Constant<0>
try.c: 0x55894f1b7f60: i32 = Constant<0>
try.c: ...

Number of similar (compiler,implementation) pairs: 1, namely:
Compiler | Implementations
clang -march=native -O2 -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE | avx

Compiler output

Implementation: avx
Security model: constbranchindex
Compiler: clang -march=native -O3 -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE
try.c: fatal error: error in backend: Cannot select: 0x5618111c2dd0: v4i64 = X86ISD::VTRUNC 0x5618111c2ca0
try.c: 0x5618111c2ca0: v16i32 = vselect 0x5618111bf7b0, 0x561811145010, 0x5618111c2b70
try.c: 0x5618111bf7b0: v4i1 = X86ISD::PCMPGTM 0x5618111a8fd0, 0x5618111a5770
try.c: 0x5618111a8fd0: v4i64 = X86ISD::VBROADCAST 0x5618111454d0
try.c: 0x5618111454d0: i64,ch = load<LD8[%lsr.iv6971]> 0x5618110a2a30, 0x561811154ac0, undef:i64
try.c: 0x561811154ac0: i64,ch = CopyFromReg 0x5618110a2a30, Register:i64 %vreg50
try.c: 0x5618111a59d0: i64 = Register %vreg50
try.c: 0x56181113d040: i64 = undef
try.c: 0x5618111a5770: v4i64,ch = CopyFromReg 0x5618110a2a30, Register:v4i64 %vreg13
try.c: 0x5618111a9820: v4i64 = Register %vreg13
try.c: 0x561811145010: v16i32 = X86ISD::VBROADCAST 0x5618111a9230
try.c: 0x5618111a9230: i32,ch = load<LD4[ConstantPool]> 0x5618110a2a30, 0x561811153090, undef:i64
try.c: 0x561811153090: i64 = X86ISD::WrapperRIP TargetConstantPool:i64<i32 1> 0
try.c: 0x56181113d9c0: i64 = TargetConstantPool<i32 1> 0
try.c: 0x56181113d040: i64 = undef
try.c: 0x5618111c2b70: v16i32 = BUILD_VECTOR Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>
try.c: 0x5618111c2a40: i32 = Constant<0>
try.c: 0x5618111c2a40: i32 = Constant<0>
try.c: 0x5618111c2a40: i32 = Constant<0>
try.c: 0x5618111c2a40: i32 = Constant<0>
try.c: 0x5618111c2a40: i32 = Constant<0>
try.c: 0x5618111c2a40: i32 = Constant<0>
try.c: 0x5618111c2a40: i32 = Constant<0>
try.c: 0x5618111c2a40: i32 = Constant<0>
try.c: 0x5618111c2a40: i32 = Constant<0>
try.c: ...

Number of similar (compiler,implementation) pairs: 1, namely:
Compiler | Implementations
clang -march=native -O3 -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE | avx

Compiler output

Implementation: avx
Security model: constbranchindex
Compiler: clang -march=native -O -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE
try.c: fatal error: error in backend: Cannot select: 0x559802c84c10: v4i64 = X86ISD::VTRUNC 0x559802c84ae0
try.c: 0x559802c84ae0: v16i32 = vselect 0x559802c7f5e0, 0x559802c1ebf0, 0x559802c849b0
try.c: 0x559802c7f5e0: v4i1 = X86ISD::PCMPGTM 0x559802c79980, 0x559802c75510
try.c: 0x559802c79980: v4i64 = X86ISD::VBROADCAST 0x559802c472c0
try.c: 0x559802c472c0: i64,ch = load<LD8[%lsr.iv6971]> 0x559802b8a930, 0x559802c63640, undef:i64
try.c: 0x559802c63640: i64,ch = CopyFromReg 0x559802b8a930, Register:i64 %vreg50
try.c: 0x559802c75770: i64 = Register %vreg50
try.c: 0x559802c1d260: i64 = undef
try.c: 0x559802c75510: v4i64,ch = CopyFromReg 0x559802b8a930, Register:v4i64 %vreg13
try.c: 0x559802c7a1d0: v4i64 = Register %vreg13
try.c: 0x559802c1ebf0: v16i32 = X86ISD::VBROADCAST 0x559802c79be0
try.c: 0x559802c79be0: i32,ch = load<LD4[ConstantPool]> 0x559802b8a930, 0x559802c468a0, undef:i64
try.c: 0x559802c468a0: i64 = X86ISD::WrapperRIP TargetConstantPool:i64<i32 1> 0
try.c: 0x559802c406a0: i64 = TargetConstantPool<i32 1> 0
try.c: 0x559802c1d260: i64 = undef
try.c: 0x559802c849b0: v16i32 = BUILD_VECTOR Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>
try.c: 0x559802c84880: i32 = Constant<0>
try.c: 0x559802c84880: i32 = Constant<0>
try.c: 0x559802c84880: i32 = Constant<0>
try.c: 0x559802c84880: i32 = Constant<0>
try.c: 0x559802c84880: i32 = Constant<0>
try.c: 0x559802c84880: i32 = Constant<0>
try.c: 0x559802c84880: i32 = Constant<0>
try.c: 0x559802c84880: i32 = Constant<0>
try.c: 0x559802c84880: i32 = Constant<0>
try.c: ...

Number of similar (compiler,implementation) pairs: 1, namely:
Compiler | Implementations
clang -march=native -O -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE | avx

Compiler output

Implementation: avx
Security model: constbranchindex
Compiler: clang -mcpu=native -O3 -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE
decode.c: decode.c:18:18: error: always_inline function '_mm256_loadu_si256' requires target feature 'sse4.2', but would be inlined into function 'crypto_decode_761x3_avx_constbranchindex' that is compiled without support for 'sse4.2'
decode.c: __m256i s0 = _mm256_loadu_si256((const __m256i *) s);
decode.c: ^
decode.c: decode.c:22:18: error: always_inline function '_mm256_srli_epi16' requires target feature 'sse4.2', but would be inlined into function 'crypto_decode_761x3_avx_constbranchindex' that is compiled without support for 'sse4.2'
decode.c: __m256i s1 = _mm256_srli_epi16(s0&_mm256_set1_epi8(-16),4);
decode.c: ^
decode.c: decode.c:22:39: error: always_inline function '_mm256_set1_epi8' requires target feature 'sse4.2', but would be inlined into function 'crypto_decode_761x3_avx_constbranchindex' that is compiled without support for 'sse4.2'
decode.c: __m256i s1 = _mm256_srli_epi16(s0&_mm256_set1_epi8(-16),4);
decode.c: ^
decode.c: decode.c:23:11: error: always_inline function '_mm256_set1_epi8' requires target feature 'sse4.2', but would be inlined into function 'crypto_decode_761x3_avx_constbranchindex' that is compiled without support for 'sse4.2'
decode.c: s0 &= _mm256_set1_epi8(15);
decode.c: ^
decode.c: decode.c:25:18: error: always_inline function '_mm256_unpacklo_epi8' requires target feature 'sse4.2', but would be inlined into function 'crypto_decode_761x3_avx_constbranchindex' that is compiled without support for 'sse4.2'
decode.c: __m256i a0 = _mm256_unpacklo_epi8(s0,s1);
decode.c: ^
decode.c: decode.c:28:18: error: always_inline function '_mm256_unpackhi_epi8' requires target feature 'sse4.2', but would be inlined into function 'crypto_decode_761x3_avx_constbranchindex' that is compiled without support for 'sse4.2'
decode.c: __m256i a1 = _mm256_unpackhi_epi8(s0,s1);
decode.c: ^
decode.c: decode.c:32:18: error: always_inline function '_mm256_srli_epi16' requires target feature 'sse4.2', but would be inlined into function 'crypto_decode_761x3_avx_constbranchindex' that is compiled without support for 'sse4.2'
decode.c: __m256i a2 = _mm256_srli_epi16(a0&_mm256_set1_epi8(12),2);
decode.c: ^
decode.c: decode.c:32:39: error: always_inline function '_mm256_set1_epi8' requires target feature 'sse4.2', but would be inlined into function 'crypto_decode_761x3_avx_constbranchindex' that is compiled without support for 'sse4.2'
decode.c: __m256i a2 = _mm256_srli_epi16(a0&_mm256_set1_epi8(12),2);
decode.c: ^
decode.c: decode.c:33:18: error: always_inline function '_mm256_srli_epi16' requires target feature 'sse4.2', but would be inlined into function 'crypto_decode_761x3_avx_constbranchindex' that is compiled without support for 'sse4.2'
decode.c: ...

Number of similar (compiler,implementation) pairs: 1, namely:
Compiler | Implementations
clang -mcpu=native -O3 -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE | avx
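
The decode.c lines quoted in the diagnostics above show the first steps of the avx implementation: each input byte is split into a low and a high nibble (masks 15 and -16, shift by 4), and bits 2 and 3 of each nibble are then extracted (mask 12, shift by 2). The following is a scalar sketch of those visible steps for a single byte, not the actual implementation. The log is truncated after line 33 of decode.c, so the extraction of the low two bits (mask 3) and the least-significant-first output order are assumptions, and all function names are hypothetical.

```python
def split_byte(b):
    """Split one byte into (low nibble, high nibble)."""
    lo = b & 15           # mirrors: s0 &= _mm256_set1_epi8(15)
    hi = (b & 0xF0) >> 4  # mirrors: _mm256_srli_epi16(s0 & _mm256_set1_epi8(-16), 4)
    return lo, hi


def upper_two_bits(nibble):
    """Bits 2..3 of a nibble."""
    return (nibble & 12) >> 2  # mirrors: _mm256_srli_epi16(a0 & _mm256_set1_epi8(12), 2)


def lower_two_bits(nibble):
    """Bits 0..1 of a nibble. ASSUMPTION: this step is not visible in the truncated log."""
    return nibble & 3


def fields(b):
    """Four 2-bit fields of one byte, least significant first (ordering is an assumption)."""
    lo, hi = split_byte(b)
    return [lower_two_bits(lo), upper_two_bits(lo),
            lower_two_bits(hi), upper_two_bits(hi)]


print(fields(0b11100100))
```

Masking with -16 (0xF0 in every byte) before the 16-bit shift means no bits cross a byte boundary, which is why a per-byte model matches the vector operations shown.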

Compiler output

Implementation: ref
Security model: constbranchindex
Compiler: clang -march=native -O2 -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE
try.c: fatal error: error in backend: Cannot select: 0x55fdfa5cb3d0: v4i64 = X86ISD::VTRUNC 0x55fdfa5cb2a0
try.c: 0x55fdfa5cb2a0: v16i32 = vselect 0x55fdfa5b7df0, 0x55fdfa54fea0, 0x55fdfa5cb170
try.c: 0x55fdfa5b7df0: v4i1 = X86ISD::PCMPGTM 0x55fdfa5afa40, 0x55fdfa5ab5d0
try.c: 0x55fdfa5afa40: v4i64 = X86ISD::VBROADCAST 0x55fdfa558260
try.c: 0x55fdfa558260: i64,ch = load<LD8[%lsr.iv6971]> 0x55fdfa4c0950, 0x55fdfa59a270, undef:i64
try.c: 0x55fdfa59a270: i64,ch = CopyFromReg 0x55fdfa4c0950, Register:i64 %vreg50
try.c: 0x55fdfa5ab830: i64 = Register %vreg50
try.c: 0x55fdfa54e510: i64 = undef
try.c: 0x55fdfa5ab5d0: v4i64,ch = CopyFromReg 0x55fdfa4c0950, Register:v4i64 %vreg13
try.c: 0x55fdfa5b0290: v4i64 = Register %vreg13
try.c: 0x55fdfa54fea0: v16i32 = X86ISD::VBROADCAST 0x55fdfa5afca0
try.c: 0x55fdfa5afca0: i32,ch = load<LD4[ConstantPool]> 0x55fdfa4c0950, 0x55fdfa557840, undef:i64
try.c: 0x55fdfa557840: i64 = X86ISD::WrapperRIP TargetConstantPool:i64<i32 1> 0
try.c: 0x55fdfa59b470: i64 = TargetConstantPool<i32 1> 0
try.c: 0x55fdfa54e510: i64 = undef
try.c: 0x55fdfa5cb170: v16i32 = BUILD_VECTOR Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>
try.c: 0x55fdfa5cb040: i32 = Constant<0>
try.c: 0x55fdfa5cb040: i32 = Constant<0>
try.c: 0x55fdfa5cb040: i32 = Constant<0>
try.c: 0x55fdfa5cb040: i32 = Constant<0>
try.c: 0x55fdfa5cb040: i32 = Constant<0>
try.c: 0x55fdfa5cb040: i32 = Constant<0>
try.c: 0x55fdfa5cb040: i32 = Constant<0>
try.c: 0x55fdfa5cb040: i32 = Constant<0>
try.c: 0x55fdfa5cb040: i32 = Constant<0>
try.c: ...

Number of similar (compiler,implementation) pairs: 1, namely:
Compiler | Implementations
clang -march=native -O2 -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE | ref

Compiler output

Implementation: ref
Security model: constbranchindex
Compiler: clang -march=native -O3 -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE
try.c: fatal error: error in backend: Cannot select: 0x560a19c28800: v4i64 = X86ISD::VTRUNC 0x560a19c286d0
try.c: 0x560a19c286d0: v16i32 = vselect 0x560a19c29dc0, 0x560a19ba6b10, 0x560a19c285a0
try.c: 0x560a19c29dc0: v4i1 = X86ISD::PCMPGTM 0x560a19c0ea40, 0x560a19c0b5e0
try.c: 0x560a19c0ea40: v4i64 = X86ISD::VBROADCAST 0x560a19ba6fd0
try.c: 0x560a19ba6fd0: i64,ch = load<LD8[%lsr.iv6971]> 0x560a19b08a40, 0x560a19ba94d0, undef:i64
try.c: 0x560a19ba94d0: i64,ch = CopyFromReg 0x560a19b08a40, Register:i64 %vreg50
try.c: 0x560a19c0b840: i64 = Register %vreg50
try.c: 0x560a19ba2020: i64 = undef
try.c: 0x560a19c0b5e0: v4i64,ch = CopyFromReg 0x560a19b08a40, Register:v4i64 %vreg13
try.c: 0x560a19c0f290: v4i64 = Register %vreg13
try.c: 0x560a19ba6b10: v16i32 = X86ISD::VBROADCAST 0x560a19c0eca0
try.c: 0x560a19c0eca0: i32,ch = load<LD4[ConstantPool]> 0x560a19b08a40, 0x560a19bb94a0, undef:i64
try.c: 0x560a19bb94a0: i64 = X86ISD::WrapperRIP TargetConstantPool:i64<i32 1> 0
try.c: 0x560a19ba29a0: i64 = TargetConstantPool<i32 1> 0
try.c: 0x560a19ba2020: i64 = undef
try.c: 0x560a19c285a0: v16i32 = BUILD_VECTOR Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>
try.c: 0x560a19c28470: i32 = Constant<0>
try.c: 0x560a19c28470: i32 = Constant<0>
try.c: 0x560a19c28470: i32 = Constant<0>
try.c: 0x560a19c28470: i32 = Constant<0>
try.c: 0x560a19c28470: i32 = Constant<0>
try.c: 0x560a19c28470: i32 = Constant<0>
try.c: 0x560a19c28470: i32 = Constant<0>
try.c: 0x560a19c28470: i32 = Constant<0>
try.c: 0x560a19c28470: i32 = Constant<0>
try.c: ...

Number of similar (compiler,implementation) pairs: 1, namely:
Compiler | Implementations
clang -march=native -O3 -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE | ref

Compiler output

Implementation: ref
Security model: constbranchindex
Compiler: clang -march=native -O -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE
try.c: fatal error: error in backend: Cannot select: 0x563a91cf0b10: v4i64 = X86ISD::VTRUNC 0x563a91cf09e0
try.c: 0x563a91cf09e0: v16i32 = vselect 0x563a91d00ce0, 0x563a91ca4050, 0x563a91cf08b0
try.c: 0x563a91d00ce0: v4i1 = X86ISD::PCMPGTM 0x563a91ce9970, 0x563a91ce5500
try.c: 0x563a91ce9970: v4i64 = X86ISD::VBROADCAST 0x563a91c89970
try.c: 0x563a91c89970: i64,ch = load<LD8[%lsr.iv6971]> 0x563a91bfa950, 0x563a91cdc950, undef:i64
try.c: 0x563a91cdc950: i64,ch = CopyFromReg 0x563a91bfa950, Register:i64 %vreg50
try.c: 0x563a91ce5760: i64 = Register %vreg50
try.c: 0x563a91c8ae40: i64 = undef
try.c: 0x563a91ce5500: v4i64,ch = CopyFromReg 0x563a91bfa950, Register:v4i64 %vreg13
try.c: 0x563a91cea1c0: v4i64 = Register %vreg13
try.c: 0x563a91ca4050: v16i32 = X86ISD::VBROADCAST 0x563a91ce9bd0
try.c: 0x563a91ce9bd0: i32,ch = load<LD4[ConstantPool]> 0x563a91bfa950, 0x563a91c911e0, undef:i64
try.c: 0x563a91c911e0: i64 = X86ISD::WrapperRIP TargetConstantPool:i64<i32 1> 0
try.c: 0x563a91ccebb0: i64 = TargetConstantPool<i32 1> 0
try.c: 0x563a91c8ae40: i64 = undef
try.c: 0x563a91cf08b0: v16i32 = BUILD_VECTOR Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>
try.c: 0x563a91cf0780: i32 = Constant<0>
try.c: 0x563a91cf0780: i32 = Constant<0>
try.c: 0x563a91cf0780: i32 = Constant<0>
try.c: 0x563a91cf0780: i32 = Constant<0>
try.c: 0x563a91cf0780: i32 = Constant<0>
try.c: 0x563a91cf0780: i32 = Constant<0>
try.c: 0x563a91cf0780: i32 = Constant<0>
try.c: 0x563a91cf0780: i32 = Constant<0>
try.c: 0x563a91cf0780: i32 = Constant<0>
try.c: ...

Number of similar (compiler,implementation) pairs: 1, namely:
Compiler | Implementations
clang -march=native -O -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE | ref