Implementation notes: amd64, cel02, crypto_encode/761x3

Computer: cel02
Architecture: amd64
CPU ID: GenuineIntel-00050657-bfebfbff
SUPERCOP version: 20201130
Operation: crypto_encode
Primitive: 761x3
Time  Object size  Test size      Implementation  Compiler                                                                             Benchmark date  SUPERCOP version
112   465 0 0      9704 800 768   avx             gcc_-march=native_-mtune=native_-Os_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE         20201211        20201130
116   470 0 0      10892 816 768  avx             gcc_-march=native_-mtune=native_-O2_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE         20201211        20201130
118   470 0 0      14149 824 800  avx             gcc_-march=native_-mtune=native_-O3_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE         20201211        20201130
150   473 0 0      10644 816 768  avx             gcc_-march=native_-mtune=native_-O_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE          20201211        20201130
152   437 0 0      9804 792 728   avx             clang_-march=native_-Os_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE  20201211        20201130
816   1819 0 0     15525 824 800  ref             gcc_-march=native_-mtune=native_-O3_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE         20201211        20201130
868   120 0 0      9336 800 768   ref             gcc_-march=native_-mtune=native_-Os_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE         20201211        20201130
930   126 0 0      10260 816 768  ref             gcc_-march=native_-mtune=native_-O_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE          20201211        20201130
1182  126 0 0      11322 800 728  ref             clang_-mcpu=native_-O3_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE   20201211        20201130
1256  133 0 0      10524 816 768  ref             gcc_-march=native_-mtune=native_-O2_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE         20201211        20201130
1686  119 0 0      9460 792 728   ref             clang_-march=native_-Os_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE  20201211        20201130

Compiler output

Implementation: avx
Security model: constbranchindex
Compiler: clang -march=native -O2 -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE
try.c: fatal error: error in backend: Cannot select: 0x558cb761a4e0: v4i64 = X86ISD::VTRUNC 0x558cb761a3b0
try.c: 0x558cb761a3b0: v16i32 = vselect 0x558cb762f740, 0x558cb75b8dd0, 0x558cb761a280
try.c: 0x558cb762f740: v4i1 = X86ISD::PCMPGTM 0x558cb76117a0, 0x558cb760d330
try.c: 0x558cb76117a0: v4i64 = X86ISD::VBROADCAST 0x558cb75b1b20
try.c: 0x558cb75b1b20: i64,ch = load<LD8[%lsr.iv6971]> 0x558cb7522940, 0x558cb75f7fa0, undef:i64
try.c: 0x558cb75f7fa0: i64,ch = CopyFromReg 0x558cb7522940, Register:i64 %vreg50
try.c: 0x558cb760d590: i64 = Register %vreg50
try.c: 0x558cb75b7440: i64 = undef
try.c: 0x558cb760d330: v4i64,ch = CopyFromReg 0x558cb7522940, Register:v4i64 %vreg13
try.c: 0x558cb7611ff0: v4i64 = Register %vreg13
try.c: 0x558cb75b8dd0: v16i32 = X86ISD::VBROADCAST 0x558cb7611a00
try.c: 0x558cb7611a00: i32,ch = load<LD4[ConstantPool]> 0x558cb7522940, 0x558cb75b1100, undef:i64
try.c: 0x558cb75b1100: i64 = X86ISD::WrapperRIP TargetConstantPool:i64<i32 1> 0
try.c: 0x558cb75a10a0: i64 = TargetConstantPool<i32 1> 0
try.c: 0x558cb75b7440: i64 = undef
try.c: 0x558cb761a280: v16i32 = BUILD_VECTOR Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>
try.c: 0x558cb761a150: i32 = Constant<0>
try.c: 0x558cb761a150: i32 = Constant<0>
try.c: 0x558cb761a150: i32 = Constant<0>
try.c: 0x558cb761a150: i32 = Constant<0>
try.c: 0x558cb761a150: i32 = Constant<0>
try.c: 0x558cb761a150: i32 = Constant<0>
try.c: 0x558cb761a150: i32 = Constant<0>
try.c: 0x558cb761a150: i32 = Constant<0>
try.c: 0x558cb761a150: i32 = Constant<0>
try.c: ...

Number of similar (compiler,implementation) pairs: 1, namely:
Compiler  Implementations
clang -march=native -O2 -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE  avx

Compiler output

Implementation: avx
Security model: constbranchindex
Compiler: clang -march=native -O3 -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE
try.c: fatal error: error in backend: Cannot select: 0x563c642178e0: v4i64 = X86ISD::VTRUNC 0x563c642177b0
try.c: 0x563c642177b0: v16i32 = vselect 0x563c642080a0, 0x563c641aee40, 0x563c64217680
try.c: 0x563c642080a0: v4i1 = X86ISD::PCMPGTM 0x563c641f3150, 0x563c641f0c70
try.c: 0x563c641f3150: v4i64 = X86ISD::VBROADCAST 0x563c641af300
try.c: 0x563c641af300: i64,ch = load<LD8[%lsr.iv6971]> 0x563c640eda30, 0x563c6418c300, undef:i64
try.c: 0x563c6418c300: i64,ch = CopyFromReg 0x563c640eda30, Register:i64 %vreg50
try.c: 0x563c641f0ed0: i64 = Register %vreg50
try.c: 0x563c64183040: i64 = undef
try.c: 0x563c641f0c70: v4i64,ch = CopyFromReg 0x563c640eda30, Register:v4i64 %vreg13
try.c: 0x563c641f39a0: v4i64 = Register %vreg13
try.c: 0x563c641aee40: v16i32 = X86ISD::VBROADCAST 0x563c641f33b0
try.c: 0x563c641f33b0: i32,ch = load<LD4[ConstantPool]> 0x563c640eda30, 0x563c641896b0, undef:i64
try.c: 0x563c641896b0: i64 = X86ISD::WrapperRIP TargetConstantPool:i64<i32 1> 0
try.c: 0x563c641839c0: i64 = TargetConstantPool<i32 1> 0
try.c: 0x563c64183040: i64 = undef
try.c: 0x563c64217680: v16i32 = BUILD_VECTOR Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>
try.c: 0x563c64217550: i32 = Constant<0>
try.c: 0x563c64217550: i32 = Constant<0>
try.c: 0x563c64217550: i32 = Constant<0>
try.c: 0x563c64217550: i32 = Constant<0>
try.c: 0x563c64217550: i32 = Constant<0>
try.c: 0x563c64217550: i32 = Constant<0>
try.c: 0x563c64217550: i32 = Constant<0>
try.c: 0x563c64217550: i32 = Constant<0>
try.c: 0x563c64217550: i32 = Constant<0>
try.c: ...

Number of similar (compiler,implementation) pairs: 1, namely:
Compiler  Implementations
clang -march=native -O3 -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE  avx

Compiler output

Implementation: avx
Security model: constbranchindex
Compiler: clang -march=native -O -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE
try.c: fatal error: error in backend: Cannot select: 0x5635b76a0dd0: v4i64 = X86ISD::VTRUNC 0x5635b76a0ca0
try.c: 0x5635b76a0ca0: v16i32 = vselect 0x5635b768d700, 0x5635b7648c90, 0x5635b76a0b70
try.c: 0x5635b768d700: v4i1 = X86ISD::PCMPGTM 0x5635b769c7b0, 0x5635b7698340
try.c: 0x5635b769c7b0: v4i64 = X86ISD::VBROADCAST 0x5635b76431d0
try.c: 0x5635b76431d0: i64,ch = load<LD8[%lsr.iv6971]> 0x5635b75ad950, 0x5635b7686af0, undef:i64
try.c: 0x5635b7686af0: i64,ch = CopyFromReg 0x5635b75ad950, Register:i64 %vreg50
try.c: 0x5635b76985a0: i64 = Register %vreg50
try.c: 0x5635b76446a0: i64 = undef
try.c: 0x5635b7698340: v4i64,ch = CopyFromReg 0x5635b75ad950, Register:v4i64 %vreg13
try.c: 0x5635b769d000: v4i64 = Register %vreg13
try.c: 0x5635b7648c90: v16i32 = X86ISD::VBROADCAST 0x5635b769ca10
try.c: 0x5635b769ca10: i32,ch = load<LD4[ConstantPool]> 0x5635b75ad950, 0x5635b762c370, undef:i64
try.c: 0x5635b762c370: i64 = X86ISD::WrapperRIP TargetConstantPool:i64<i32 1> 0
try.c: 0x5635b7687cf0: i64 = TargetConstantPool<i32 1> 0
try.c: 0x5635b76446a0: i64 = undef
try.c: 0x5635b76a0b70: v16i32 = BUILD_VECTOR Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>
try.c: 0x5635b76a0a40: i32 = Constant<0>
try.c: 0x5635b76a0a40: i32 = Constant<0>
try.c: 0x5635b76a0a40: i32 = Constant<0>
try.c: 0x5635b76a0a40: i32 = Constant<0>
try.c: 0x5635b76a0a40: i32 = Constant<0>
try.c: 0x5635b76a0a40: i32 = Constant<0>
try.c: 0x5635b76a0a40: i32 = Constant<0>
try.c: 0x5635b76a0a40: i32 = Constant<0>
try.c: 0x5635b76a0a40: i32 = Constant<0>
try.c: ...

Number of similar (compiler,implementation) pairs: 1, namely:
Compiler  Implementations
clang -march=native -O -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE  avx

Compiler output

Implementation: avx
Security model: constbranchindex
Compiler: clang -mcpu=native -O3 -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE
encode.c: encode.c:26:18: error: always_inline function '_mm256_loadu_si256' requires target feature 'sse4.2', but would be inlined into function 'crypto_encode_761x3_avx_constbranchindex' that is compiled without support for 'sse4.2'
encode.c: __m256i f0 = _mm256_loadu_si256((const __m256i *) (f+0));
encode.c: ^
encode.c: encode.c:27:18: error: always_inline function '_mm256_loadu_si256' requires target feature 'sse4.2', but would be inlined into function 'crypto_encode_761x3_avx_constbranchindex' that is compiled without support for 'sse4.2'
encode.c: __m256i f1 = _mm256_loadu_si256((const __m256i *) (f+32));
encode.c: ^
encode.c: encode.c:28:18: error: always_inline function '_mm256_loadu_si256' requires target feature 'sse4.2', but would be inlined into function 'crypto_encode_761x3_avx_constbranchindex' that is compiled without support for 'sse4.2'
encode.c: __m256i f2 = _mm256_loadu_si256((const __m256i *) (f+64));
encode.c: ^
encode.c: encode.c:29:18: error: always_inline function '_mm256_loadu_si256' requires target feature 'sse4.2', but would be inlined into function 'crypto_encode_761x3_avx_constbranchindex' that is compiled without support for 'sse4.2'
encode.c: __m256i f3 = _mm256_loadu_si256((const __m256i *) (f+96));
encode.c: ^
encode.c: encode.c:33:18: error: always_inline function '_mm256_packus_epi16' requires target feature 'sse4.2', but would be inlined into function 'crypto_encode_761x3_avx_constbranchindex' that is compiled without support for 'sse4.2'
encode.c: __m256i a0 = _mm256_packus_epi16(f0&lobytes,f1&lobytes);
encode.c: ^
encode.c: encode.c:36:18: error: always_inline function '_mm256_packus_epi16' requires target feature 'sse4.2', but would be inlined into function 'crypto_encode_761x3_avx_constbranchindex' that is compiled without support for 'sse4.2'
encode.c: __m256i a1 = _mm256_packus_epi16(_mm256_srli_epi16(f0,8),_mm256_srli_epi16(f1,8));
encode.c: ^
encode.c: encode.c:36:38: error: always_inline function '_mm256_srli_epi16' requires target feature 'sse4.2', but would be inlined into function 'crypto_encode_761x3_avx_constbranchindex' that is compiled without support for 'sse4.2'
encode.c: __m256i a1 = _mm256_packus_epi16(_mm256_srli_epi16(f0,8),_mm256_srli_epi16(f1,8));
encode.c: ^
encode.c: encode.c:36:62: error: always_inline function '_mm256_srli_epi16' requires target feature 'sse4.2', but would be inlined into function 'crypto_encode_761x3_avx_constbranchindex' that is compiled without support for 'sse4.2'
encode.c: __m256i a1 = _mm256_packus_epi16(_mm256_srli_epi16(f0,8),_mm256_srli_epi16(f1,8));
encode.c: ^
encode.c: encode.c:38:18: error: always_inline function '_mm256_packus_epi16' requires target feature 'sse4.2', but would be inlined into function 'crypto_encode_761x3_avx_constbranchindex' that is compiled without support for 'sse4.2'
encode.c: ...

Number of similar (compiler,implementation) pairs: 1, namely:
Compiler  Implementations
clang -mcpu=native -O3 -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE  avx

Compiler output

Implementation: ref
Security model: constbranchindex
Compiler: clang -march=native -O2 -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE
try.c: fatal error: error in backend: Cannot select: 0x55c9bfbb8aa0: v4i64 = X86ISD::VTRUNC 0x55c9bfbb8970
try.c: 0x55c9bfbb8970: v16i32 = vselect 0x55c9bfbb4100, 0x55c9bfb57580, 0x55c9bfbb8840
try.c: 0x55c9bfbb4100: v4i1 = X86ISD::PCMPGTM 0x55c9bfbb0910, 0x55c9bfbac4a0
try.c: 0x55c9bfbb0910: v4i64 = X86ISD::VBROADCAST 0x55c9bfb50ab0
try.c: 0x55c9bfb50ab0: i64,ch = load<LD8[%lsr.iv6971]> 0x55c9bfac1950, 0x55c9bfb95d90, undef:i64
try.c: 0x55c9bfb95d90: i64,ch = CopyFromReg 0x55c9bfac1950, Register:i64 %vreg50
try.c: 0x55c9bfbac700: i64 = Register %vreg50
try.c: 0x55c9bfb51f80: i64 = undef
try.c: 0x55c9bfbac4a0: v4i64,ch = CopyFromReg 0x55c9bfac1950, Register:v4i64 %vreg13
try.c: 0x55c9bfbb1160: v4i64 = Register %vreg13
try.c: 0x55c9bfb57580: v16i32 = X86ISD::VBROADCAST 0x55c9bfbb0b70
try.c: 0x55c9bfbb0b70: i32,ch = load<LD4[ConstantPool]> 0x55c9bfac1950, 0x55c9bfb50090, undef:i64
try.c: 0x55c9bfb50090: i64 = X86ISD::WrapperRIP TargetConstantPool:i64<i32 1> 0
try.c: 0x55c9bfb9ba70: i64 = TargetConstantPool<i32 1> 0
try.c: 0x55c9bfb51f80: i64 = undef
try.c: 0x55c9bfbb8840: v16i32 = BUILD_VECTOR Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>
try.c: 0x55c9bfbb8710: i32 = Constant<0>
try.c: 0x55c9bfbb8710: i32 = Constant<0>
try.c: 0x55c9bfbb8710: i32 = Constant<0>
try.c: 0x55c9bfbb8710: i32 = Constant<0>
try.c: 0x55c9bfbb8710: i32 = Constant<0>
try.c: 0x55c9bfbb8710: i32 = Constant<0>
try.c: 0x55c9bfbb8710: i32 = Constant<0>
try.c: 0x55c9bfbb8710: i32 = Constant<0>
try.c: 0x55c9bfbb8710: i32 = Constant<0>
try.c: ...

Number of similar (compiler,implementation) pairs: 1, namely:
Compiler  Implementations
clang -march=native -O2 -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE  ref

Compiler output

Implementation: ref
Security model: constbranchindex
Compiler: clang -march=native -O3 -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE
try.c: fatal error: error in backend: Cannot select: 0x561d44396ee0: v4i64 = X86ISD::VTRUNC 0x561d44396db0
try.c: 0x561d44396db0: v16i32 = vselect 0x561d44360040, 0x561d4431c7b0, 0x561d44396c80
try.c: 0x561d44360040: v4i1 = X86ISD::PCMPGTM 0x561d4437df00, 0x561d44379480
try.c: 0x561d4437df00: v4i64 = X86ISD::VBROADCAST 0x561d4431cc70
try.c: 0x561d4431cc70: i64,ch = load<LD8[%lsr.iv6971]> 0x561d44277a40, 0x561d443164e0, undef:i64
try.c: 0x561d443164e0: i64,ch = CopyFromReg 0x561d44277a40, Register:i64 %vreg50
try.c: 0x561d443796e0: i64 = Register %vreg50
try.c: 0x561d44326e20: i64 = undef
try.c: 0x561d44379480: v4i64,ch = CopyFromReg 0x561d44277a40, Register:v4i64 %vreg13
try.c: 0x561d4437e750: v4i64 = Register %vreg13
try.c: 0x561d4431c7b0: v16i32 = X86ISD::VBROADCAST 0x561d4437e160
try.c: 0x561d4437e160: i32,ch = load<LD4[ConstantPool]> 0x561d44277a40, 0x561d4433f610, undef:i64
try.c: 0x561d4433f610: i64 = X86ISD::WrapperRIP TargetConstantPool:i64<i32 1> 0
try.c: 0x561d443277a0: i64 = TargetConstantPool<i32 1> 0
try.c: 0x561d44326e20: i64 = undef
try.c: 0x561d44396c80: v16i32 = BUILD_VECTOR Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>
try.c: 0x561d44396b50: i32 = Constant<0>
try.c: 0x561d44396b50: i32 = Constant<0>
try.c: 0x561d44396b50: i32 = Constant<0>
try.c: 0x561d44396b50: i32 = Constant<0>
try.c: 0x561d44396b50: i32 = Constant<0>
try.c: 0x561d44396b50: i32 = Constant<0>
try.c: 0x561d44396b50: i32 = Constant<0>
try.c: 0x561d44396b50: i32 = Constant<0>
try.c: 0x561d44396b50: i32 = Constant<0>
try.c: ...

Number of similar (compiler,implementation) pairs: 1, namely:
Compiler  Implementations
clang -march=native -O3 -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE  ref

Compiler output

Implementation: ref
Security model: constbranchindex
Compiler: clang -march=native -O -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE
try.c: fatal error: error in backend: Cannot select: 0x561542ce5320: v4i64 = X86ISD::VTRUNC 0x561542ce51f0
try.c: 0x561542ce51f0: v16i32 = vselect 0x561542ce1d00, 0x561542c91510, 0x561542ce50c0
try.c: 0x561542ce1d00: v4i1 = X86ISD::PCMPGTM 0x561542cca990, 0x561542cc6520
try.c: 0x561542cca990: v4i64 = X86ISD::VBROADCAST 0x561542c6da70
try.c: 0x561542c6da70: i64,ch = load<LD8[%lsr.iv6971]> 0x561542bdb950, 0x561542cc1380, undef:i64
try.c: 0x561542cc1380: i64,ch = CopyFromReg 0x561542bdb950, Register:i64 %vreg50
try.c: 0x561542cc6780: i64 = Register %vreg50
try.c: 0x561542c6ef40: i64 = undef
try.c: 0x561542cc6520: v4i64,ch = CopyFromReg 0x561542bdb950, Register:v4i64 %vreg13
try.c: 0x561542ccb1e0: v4i64 = Register %vreg13
try.c: 0x561542c91510: v16i32 = X86ISD::VBROADCAST 0x561542ccabf0
try.c: 0x561542ccabf0: i32,ch = load<LD4[ConstantPool]> 0x561542bdb950, 0x561542c659e0, undef:i64
try.c: 0x561542c659e0: i64 = X86ISD::WrapperRIP TargetConstantPool:i64<i32 1> 0
try.c: 0x561542c79ff0: i64 = TargetConstantPool<i32 1> 0
try.c: 0x561542c6ef40: i64 = undef
try.c: 0x561542ce50c0: v16i32 = BUILD_VECTOR Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>
try.c: 0x561542ce4f90: i32 = Constant<0>
try.c: 0x561542ce4f90: i32 = Constant<0>
try.c: 0x561542ce4f90: i32 = Constant<0>
try.c: 0x561542ce4f90: i32 = Constant<0>
try.c: 0x561542ce4f90: i32 = Constant<0>
try.c: 0x561542ce4f90: i32 = Constant<0>
try.c: 0x561542ce4f90: i32 = Constant<0>
try.c: 0x561542ce4f90: i32 = Constant<0>
try.c: 0x561542ce4f90: i32 = Constant<0>
try.c: ...

Number of similar (compiler,implementation) pairs: 1, namely:
Compiler  Implementations
clang -march=native -O -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE  ref