Implementation notes: amd64, cel02, crypto_decode/256x2

Computer: cel02
Architecture: amd64
CPU ID: GenuineIntel-00050657-bfebfbff
SUPERCOP version: 20201130
Operation: crypto_decode
Primitive: 256x2
Time | Object size | Test size | Implementation | Compiler | Benchmark date | SUPERCOP version
50 | 218 0 0 | 10636 816 768 | avx | gcc_-march=native_-mtune=native_-O2_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE | 20201211 | 20201130
60 | 179 0 0 | 9540 792 728 | avx | clang_-march=native_-Os_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE | 20201211 | 20201130
64 | 217 0 0 | 10372 816 768 | avx | gcc_-march=native_-mtune=native_-O_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE | 20201211 | 20201130
66 | 407 0 0 | 14085 824 800 | avx | gcc_-march=native_-mtune=native_-O3_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE | 20201211 | 20201130
86 | 217 0 0 | 9464 800 768 | avx | gcc_-march=native_-mtune=native_-Os_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE | 20201211 | 20201130
814 | 130 0 0 | 11786 800 728 | ref | clang_-mcpu=native_-O3_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE | 20201211 | 20201130
820 | 95 0 0 | 10228 816 768 | ref | gcc_-march=native_-mtune=native_-O_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE | 20201211 | 20201130
974 | 98 0 0 | 13781 824 800 | ref | gcc_-march=native_-mtune=native_-O3_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE | 20201211 | 20201130
1224 | 90 0 0 | 9304 800 768 | ref | gcc_-march=native_-mtune=native_-Os_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE | 20201211 | 20201130
1432 | 91 0 0 | 9428 792 728 | ref | clang_-march=native_-Os_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE | 20201211 | 20201130
2004 | 98 0 0 | 10492 816 768 | ref | gcc_-march=native_-mtune=native_-O2_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE | 20201211 | 20201130

Compiler output

Implementation: avx
Security model: constbranchindex
Compiler: clang -march=native -O2 -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE
try.c: fatal error: error in backend: Cannot select: 0x55d91ee89d40: v4i64 = X86ISD::VTRUNC 0x55d91ee89c10
try.c: 0x55d91ee89c10: v16i32 = vselect 0x55d91ee86720, 0x55d91ee223d0, 0x55d91ee89ae0
try.c: 0x55d91ee86720: v4i1 = X86ISD::PCMPGTM 0x55d91ee81ad0, 0x55d91ee7d660
try.c: 0x55d91ee81ad0: v4i64 = X86ISD::VBROADCAST 0x55d91ee3ba90
try.c: 0x55d91ee3ba90: i64,ch = load<LD8[%lsr.iv6971]> 0x55d91ed92950, 0x55d91ee6ccd0, undef:i64
try.c: 0x55d91ee6ccd0: i64,ch = CopyFromReg 0x55d91ed92950, Register:i64 %vreg50
try.c: 0x55d91ee7d8c0: i64 = Register %vreg50
try.c: 0x55d91ee20a40: i64 = undef
try.c: 0x55d91ee7d660: v4i64,ch = CopyFromReg 0x55d91ed92950, Register:v4i64 %vreg13
try.c: 0x55d91ee82320: v4i64 = Register %vreg13
try.c: 0x55d91ee223d0: v16i32 = X86ISD::VBROADCAST 0x55d91ee81d30
try.c: 0x55d91ee81d30: i32,ch = load<LD4[ConstantPool]> 0x55d91ed92950, 0x55d91ee3b070, undef:i64
try.c: 0x55d91ee3b070: i64 = X86ISD::WrapperRIP TargetConstantPool:i64<i32 1> 0
try.c: 0x55d91ee1cb30: i64 = TargetConstantPool<i32 1> 0
try.c: 0x55d91ee20a40: i64 = undef
try.c: 0x55d91ee89ae0: v16i32 = BUILD_VECTOR Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>
try.c: 0x55d91ee899b0: i32 = Constant<0>
try.c: 0x55d91ee899b0: i32 = Constant<0>
try.c: 0x55d91ee899b0: i32 = Constant<0>
try.c: 0x55d91ee899b0: i32 = Constant<0>
try.c: 0x55d91ee899b0: i32 = Constant<0>
try.c: 0x55d91ee899b0: i32 = Constant<0>
try.c: 0x55d91ee899b0: i32 = Constant<0>
try.c: 0x55d91ee899b0: i32 = Constant<0>
try.c: 0x55d91ee899b0: i32 = Constant<0>
try.c: ...

Number of similar (compiler,implementation) pairs: 1, namely:
Compiler | Implementations
clang -march=native -O2 -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE | avx

Compiler output

Implementation: avx
Security model: constbranchindex
Compiler: clang -march=native -O3 -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE
try.c: fatal error: error in backend: Cannot select: 0x560a3af6f520: v4i64 = X86ISD::VTRUNC 0x560a3af6f3f0
try.c: 0x560a3af6f3f0: v16i32 = vselect 0x560a3af6bf00, 0x560a3aeeae70, 0x560a3af6f2c0
try.c: 0x560a3af6bf00: v4i1 = X86ISD::PCMPGTM 0x560a3af567b0, 0x560a3af52340
try.c: 0x560a3af567b0: v4i64 = X86ISD::VBROADCAST 0x560a3aeeb330
try.c: 0x560a3aeeb330: i64,ch = load<LD8[%lsr.iv6971]> 0x560a3ae4fa10, 0x560a3aef7c10, undef:i64
try.c: 0x560a3aef7c10: i64,ch = CopyFromReg 0x560a3ae4fa10, Register:i64 %vreg50
try.c: 0x560a3af525a0: i64 = Register %vreg50
try.c: 0x560a3aed4030: i64 = undef
try.c: 0x560a3af52340: v4i64,ch = CopyFromReg 0x560a3ae4fa10, Register:v4i64 %vreg13
try.c: 0x560a3af57000: v4i64 = Register %vreg13
try.c: 0x560a3aeeae70: v16i32 = X86ISD::VBROADCAST 0x560a3af56a10
try.c: 0x560a3af56a10: i32,ch = load<LD4[ConstantPool]> 0x560a3ae4fa10, 0x560a3aef61e0, undef:i64
try.c: 0x560a3aef61e0: i64 = X86ISD::WrapperRIP TargetConstantPool:i64<i32 1> 0
try.c: 0x560a3aed49b0: i64 = TargetConstantPool<i32 1> 0
try.c: 0x560a3aed4030: i64 = undef
try.c: 0x560a3af6f2c0: v16i32 = BUILD_VECTOR Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>
try.c: 0x560a3af6f190: i32 = Constant<0>
try.c: 0x560a3af6f190: i32 = Constant<0>
try.c: 0x560a3af6f190: i32 = Constant<0>
try.c: 0x560a3af6f190: i32 = Constant<0>
try.c: 0x560a3af6f190: i32 = Constant<0>
try.c: 0x560a3af6f190: i32 = Constant<0>
try.c: 0x560a3af6f190: i32 = Constant<0>
try.c: 0x560a3af6f190: i32 = Constant<0>
try.c: 0x560a3af6f190: i32 = Constant<0>
try.c: ...

Number of similar (compiler,implementation) pairs: 1, namely:
Compiler | Implementations
clang -march=native -O3 -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE | avx

Compiler output

Implementation: avx
Security model: constbranchindex
Compiler: clang -march=native -O -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE
try.c: fatal error: error in backend: Cannot select: 0x55e6f5f5d2c0: v4i64 = X86ISD::VTRUNC 0x55e6f5f5d190
try.c: 0x55e6f5f5d190: v16i32 = vselect 0x55e6f5f64930, 0x55e6f5ee5840, 0x55e6f5f5d060
try.c: 0x55e6f5f64930: v4i1 = X86ISD::PCMPGTM 0x55e6f5f45960, 0x55e6f5f414f0
try.c: 0x55e6f5f45960: v4i64 = X86ISD::VBROADCAST 0x55e6f5eec7e0
try.c: 0x55e6f5eec7e0: i64,ch = load<LD8[%lsr.iv6971]> 0x55e6f5e56950, 0x55e6f5f384b0, undef:i64
try.c: 0x55e6f5f384b0: i64,ch = CopyFromReg 0x55e6f5e56950, Register:i64 %vreg50
try.c: 0x55e6f5f41750: i64 = Register %vreg50
try.c: 0x55e6f5eedcb0: i64 = undef
try.c: 0x55e6f5f414f0: v4i64,ch = CopyFromReg 0x55e6f5e56950, Register:v4i64 %vreg13
try.c: 0x55e6f5f461b0: v4i64 = Register %vreg13
try.c: 0x55e6f5ee5840: v16i32 = X86ISD::VBROADCAST 0x55e6f5f45bc0
try.c: 0x55e6f5f45bc0: i32,ch = load<LD4[ConstantPool]> 0x55e6f5e56950, 0x55e6f5eebdc0, undef:i64
try.c: 0x55e6f5eebdc0: i64 = X86ISD::WrapperRIP TargetConstantPool:i64<i32 1> 0
try.c: 0x55e6f5f25da0: i64 = TargetConstantPool<i32 1> 0
try.c: 0x55e6f5eedcb0: i64 = undef
try.c: 0x55e6f5f5d060: v16i32 = BUILD_VECTOR Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>
try.c: 0x55e6f5f5cf30: i32 = Constant<0>
try.c: 0x55e6f5f5cf30: i32 = Constant<0>
try.c: 0x55e6f5f5cf30: i32 = Constant<0>
try.c: 0x55e6f5f5cf30: i32 = Constant<0>
try.c: 0x55e6f5f5cf30: i32 = Constant<0>
try.c: 0x55e6f5f5cf30: i32 = Constant<0>
try.c: 0x55e6f5f5cf30: i32 = Constant<0>
try.c: 0x55e6f5f5cf30: i32 = Constant<0>
try.c: 0x55e6f5f5cf30: i32 = Constant<0>
try.c: ...

Number of similar (compiler,implementation) pairs: 1, namely:
Compiler | Implementations
clang -march=native -O -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE | avx

Compiler output

Implementation: avx
Security model: constbranchindex
Compiler: clang -mcpu=native -O3 -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE
decode.c: decode.c:16:17: error: always_inline function '_mm256_set1_epi32' requires target feature 'sse4.2', but would be inlined into function 'crypto_decode_256x2_avx_constbranchindex' that is compiled without support for 'sse4.2'
decode.c: __m256i x = _mm256_set1_epi32(*(int32_t *) s);
decode.c: ^
decode.c: decode.c:18:9: error: always_inline function '_mm256_shuffle_epi8' requires target feature 'sse4.2', but would be inlined into function 'crypto_decode_256x2_avx_constbranchindex' that is compiled without support for 'sse4.2'
decode.c: x = _mm256_shuffle_epi8(x,COPY);
decode.c: ^
decode.c: decode.c:18:31: error: always_inline function '_mm256_set_epi64x' requires target feature 'sse4.2', but would be inlined into function 'crypto_decode_256x2_avx_constbranchindex' that is compiled without support for 'sse4.2'
decode.c: x = _mm256_shuffle_epi8(x,COPY);
decode.c: ^
decode.c: decode.c:5:14: note: expanded from macro 'COPY'
decode.c: #define COPY _mm256_set_epi64x(0x0303030303030303,0x0202020202020202,0x0101010101010101,0x0000000000000000)
decode.c: ^
decode.c: decode.c:20:9: error: always_inline function '_mm256_andnot_si256' requires target feature 'sse4.2', but would be inlined into function 'crypto_decode_256x2_avx_constbranchindex' that is compiled without support for 'sse4.2'
decode.c: x = _mm256_andnot_si256(x,MASK);
decode.c: ^
decode.c: decode.c:20:31: error: always_inline function '_mm256_set1_epi64x' requires target feature 'sse4.2', but would be inlined into function 'crypto_decode_256x2_avx_constbranchindex' that is compiled without support for 'sse4.2'
decode.c: x = _mm256_andnot_si256(x,MASK);
decode.c: ^
decode.c: decode.c:6:14: note: expanded from macro 'MASK'
decode.c: #define MASK _mm256_set1_epi64x(0x8040201008040201)
decode.c: ^
decode.c: decode.c:21:9: error: always_inline function '_mm256_cmpeq_epi8' requires target feature 'sse4.2', but would be inlined into function 'crypto_decode_256x2_avx_constbranchindex' that is compiled without support for 'sse4.2'
decode.c: x = _mm256_cmpeq_epi8(x,_mm256_setzero_si256());
decode.c: ^
decode.c: decode.c:21:29: error: always_inline function '_mm256_setzero_si256' requires target feature 'sse4.2', but would be inlined into function 'crypto_decode_256x2_avx_constbranchindex' that is compiled without support for 'sse4.2'
decode.c: ...

Number of similar (compiler,implementation) pairs: 1, namely:
Compiler | Implementations
clang -mcpu=native -O3 -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE | avx
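
Note on the decode.c lines quoted in the errors above: the visible statements (broadcast, shuffle with COPY, andnot with MASK, compare with zero) appear to expand each input bit into one output byte. The following scalar C sketch mirrors those steps for illustration only. It is not the SUPERCOP source: the function names unpack_bits_32, decode_256x2_sketch and selfcheck are hypothetical, the final "& 1" step is an assumption because the log is truncated before that point, and the 32-byte-in, 256-byte-out layout is assumed from the primitive name 256x2. No claim is made that this sketch meets the constbranchindex security model.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical scalar model of one 256-bit vector step:
   4 input bytes become 32 output bytes, one per input bit. */
void unpack_bits_32(uint8_t out[32], const uint8_t in[4])
{
  for (size_t i = 0; i < 32; ++i) {
    uint8_t byte = in[i >> 3];               /* COPY shuffle: input byte i/8 fills 8 output slots */
    uint8_t bit  = (uint8_t)(1u << (i & 7)); /* MASK bytes are 01 02 04 08 10 20 40 80 */
    uint8_t t    = (uint8_t)(~byte & bit);   /* andnot: nonzero exactly when the bit is clear */
    uint8_t m    = (uint8_t)-(t == 0);       /* compare with zero: 0xFF exactly when the bit is set */
    out[i] = m & 1;                          /* assumed final step: reduce 0xFF/0x00 to 1/0 */
  }
}

/* Assumed layout: 32 input bytes, 256 output bytes, each 0 or 1. */
void decode_256x2_sketch(uint8_t r[256], const uint8_t s[32])
{
  for (size_t k = 0; k < 8; ++k)
    unpack_bits_32(r + 32 * k, s + 4 * k);
}

/* Cross-check the sketch against plain bit extraction. */
void selfcheck(const uint8_t s[32])
{
  uint8_t r[256];
  decode_256x2_sketch(r, s);
  for (size_t i = 0; i < 256; ++i)
    assert(r[i] == (1 & (s[i >> 3] >> (i & 7))));
}
```

Under these assumptions, output byte i is bit (i mod 8) of input byte (i div 8), least significant bit first, which is what selfcheck verifies.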

Compiler output

Implementation: ref
Security model: constbranchindex
Compiler: clang -march=native -O2 -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE
try.c: fatal error: error in backend: Cannot select: 0x561971208740: v4i64 = X86ISD::VTRUNC 0x561971208610
try.c: 0x561971208610: v16i32 = vselect 0x5619711f9d00, 0x5619711ab2d0, 0x5619712084e0
try.c: 0x5619711f9d00: v4i1 = X86ISD::PCMPGTM 0x5619712038f0, 0x5619711ff480
try.c: 0x5619712038f0: v4i64 = X86ISD::VBROADCAST 0x5619711a46b0
try.c: 0x5619711a46b0: i64,ch = load<LD8[%lsr.iv6971]> 0x5619711149a0, 0x5619711f6910, undef:i64
try.c: 0x5619711f6910: i64,ch = CopyFromReg 0x5619711149a0, Register:i64 %vreg50
try.c: 0x5619711ff6e0: i64 = Register %vreg50
try.c: 0x5619711a9940: i64 = undef
try.c: 0x5619711ff480: v4i64,ch = CopyFromReg 0x5619711149a0, Register:v4i64 %vreg13
try.c: 0x561971204140: v4i64 = Register %vreg13
try.c: 0x5619711ab2d0: v16i32 = X86ISD::VBROADCAST 0x561971203b50
try.c: 0x561971203b50: i32,ch = load<LD4[ConstantPool]> 0x5619711149a0, 0x5619711a3c90, undef:i64
try.c: 0x5619711a3c90: i64 = X86ISD::WrapperRIP TargetConstantPool:i64<i32 1> 0
try.c: 0x5619711ed710: i64 = TargetConstantPool<i32 1> 0
try.c: 0x5619711a9940: i64 = undef
try.c: 0x5619712084e0: v16i32 = BUILD_VECTOR Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>
try.c: 0x5619712083b0: i32 = Constant<0>
try.c: 0x5619712083b0: i32 = Constant<0>
try.c: 0x5619712083b0: i32 = Constant<0>
try.c: 0x5619712083b0: i32 = Constant<0>
try.c: 0x5619712083b0: i32 = Constant<0>
try.c: 0x5619712083b0: i32 = Constant<0>
try.c: 0x5619712083b0: i32 = Constant<0>
try.c: 0x5619712083b0: i32 = Constant<0>
try.c: 0x5619712083b0: i32 = Constant<0>
try.c: ...

Number of similar (compiler,implementation) pairs: 1, namely:
Compiler | Implementations
clang -march=native -O2 -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE | ref

Compiler output

Implementation: ref
Security model: constbranchindex
Compiler: clang -march=native -O3 -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE
try.c: fatal error: error in backend: Cannot select: 0x55ea2dd77de0: v4i64 = X86ISD::VTRUNC 0x55ea2dd77cb0
try.c: 0x55ea2dd77cb0: v16i32 = vselect 0x55ea2dd747c0, 0x55ea2dd083b0, 0x55ea2dd77b80
try.c: 0x55ea2dd747c0: v4i1 = X86ISD::PCMPGTM 0x55ea2dd6ff90, 0x55ea2dd6bb20
try.c: 0x55ea2dd6ff90: v4i64 = X86ISD::VBROADCAST 0x55ea2dd08870
try.c: 0x55ea2dd08870: i64,ch = load<LD8[%lsr.iv6971]> 0x55ea2dc69a30, 0x55ea2dd1e310, undef:i64
try.c: 0x55ea2dd1e310: i64,ch = CopyFromReg 0x55ea2dc69a30, Register:i64 %vreg50
try.c: 0x55ea2dd6bd80: i64 = Register %vreg50
try.c: 0x55ea2dd10360: i64 = undef
try.c: 0x55ea2dd6bb20: v4i64,ch = CopyFromReg 0x55ea2dc69a30, Register:v4i64 %vreg13
try.c: 0x55ea2dd707e0: v4i64 = Register %vreg13
try.c: 0x55ea2dd083b0: v16i32 = X86ISD::VBROADCAST 0x55ea2dd701f0
try.c: 0x55ea2dd701f0: i32,ch = load<LD4[ConstantPool]> 0x55ea2dc69a30, 0x55ea2dd0bec0, undef:i64
try.c: 0x55ea2dd0bec0: i64 = X86ISD::WrapperRIP TargetConstantPool:i64<i32 1> 0
try.c: 0x55ea2dd10ce0: i64 = TargetConstantPool<i32 1> 0
try.c: 0x55ea2dd10360: i64 = undef
try.c: 0x55ea2dd77b80: v16i32 = BUILD_VECTOR Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>
try.c: 0x55ea2dd77a50: i32 = Constant<0>
try.c: 0x55ea2dd77a50: i32 = Constant<0>
try.c: 0x55ea2dd77a50: i32 = Constant<0>
try.c: 0x55ea2dd77a50: i32 = Constant<0>
try.c: 0x55ea2dd77a50: i32 = Constant<0>
try.c: 0x55ea2dd77a50: i32 = Constant<0>
try.c: 0x55ea2dd77a50: i32 = Constant<0>
try.c: 0x55ea2dd77a50: i32 = Constant<0>
try.c: 0x55ea2dd77a50: i32 = Constant<0>
try.c: ...

Number of similar (compiler,implementation) pairs: 1, namely:
Compiler | Implementations
clang -march=native -O3 -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE | ref

Compiler output

Implementation: ref
Security model: constbranchindex
Compiler: clang -march=native -O -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE
try.c: fatal error: error in backend: Cannot select: 0x557d644e07f0: v4i64 = X86ISD::VTRUNC 0x557d644e06c0
try.c: 0x557d644e06c0: v16i32 = vselect 0x557d644cfdd0, 0x557d6447f9b0, 0x557d644e0590
try.c: 0x557d644cfdd0: v4i1 = X86ISD::PCMPGTM 0x557d644d79a0, 0x557d644d3530
try.c: 0x557d644d79a0: v4i64 = X86ISD::VBROADCAST 0x557d6447c5b0
try.c: 0x557d6447c5b0: i64,ch = load<LD8[%lsr.iv6971]> 0x557d643e8950, 0x557d644c9fc0, undef:i64
try.c: 0x557d644c9fc0: i64,ch = CopyFromReg 0x557d643e8950, Register:i64 %vreg50
try.c: 0x557d644d3790: i64 = Register %vreg50
try.c: 0x557d6447e020: i64 = undef
try.c: 0x557d644d3530: v4i64,ch = CopyFromReg 0x557d643e8950, Register:v4i64 %vreg13
try.c: 0x557d644d81f0: v4i64 = Register %vreg13
try.c: 0x557d6447f9b0: v16i32 = X86ISD::VBROADCAST 0x557d644d7c00
try.c: 0x557d644d7c00: i32,ch = load<LD4[ConstantPool]> 0x557d643e8950, 0x557d6447bb90, undef:i64
try.c: 0x557d6447bb90: i64 = X86ISD::WrapperRIP TargetConstantPool:i64<i32 1> 0
try.c: 0x557d6449bbe0: i64 = TargetConstantPool<i32 1> 0
try.c: 0x557d6447e020: i64 = undef
try.c: 0x557d644e0590: v16i32 = BUILD_VECTOR Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>
try.c: 0x557d644e0460: i32 = Constant<0>
try.c: 0x557d644e0460: i32 = Constant<0>
try.c: 0x557d644e0460: i32 = Constant<0>
try.c: 0x557d644e0460: i32 = Constant<0>
try.c: 0x557d644e0460: i32 = Constant<0>
try.c: 0x557d644e0460: i32 = Constant<0>
try.c: 0x557d644e0460: i32 = Constant<0>
try.c: 0x557d644e0460: i32 = Constant<0>
try.c: 0x557d644e0460: i32 = Constant<0>
try.c: ...

Number of similar (compiler,implementation) pairs: 1, namely:
Compiler | Implementations
clang -march=native -O -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE | ref