Implementation notes: amd64, cel02, crypto_encode/653x1541round

Computer: cel02
Architecture: amd64
CPU ID: GenuineIntel-00050657-bfebfbff
SUPERCOP version: 20201130
Operation: crypto_encode
Primitive: 653x1541round
Time | Object size | Test size     | Implementation | Compiler | Benchmark date | SUPERCOP version
748  | 2214 0 0    | 15885 824 800 | avx | gcc_-march=native_-mtune=native_-O3_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE | 20201211 | 20201130
776  | 1661 0 0    | 11820 816 768 | avx | gcc_-march=native_-mtune=native_-O_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE | 20201211 | 20201130
790  | 1641 0 0    | 12068 816 768 | avx | gcc_-march=native_-mtune=native_-O2_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE | 20201211 | 20201130
930  | 1518 0 0    | 10800 800 768 | avx | gcc_-march=native_-mtune=native_-Os_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE | 20201211 | 20201130
1130 | 930 0 0     | 16773 824 800 | ref | gcc_-march=native_-mtune=native_-O3_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE | 20201211 | 20201130
1314 | 1571 0 0    | 10964 792 728 | avx | clang_-march=native_-Os_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE | 20201211 | 20201130
2176 | 117 0 0     | 11536 800 768 | ref | gcc_-march=native_-mtune=native_-Os_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE | 20201211 | 20201130
3412 | 112 0 0     | 11644 792 728 | ref | clang_-march=native_-Os_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE | 20201211 | 20201130
3618 | 129 0 0     | 12684 816 768 | ref | gcc_-march=native_-mtune=native_-O2_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE | 20201211 | 20201130
3764 | 116 0 0     | 12412 816 768 | ref | gcc_-march=native_-mtune=native_-O_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE | 20201211 | 20201130
4320 | 521 0 0     | 13890 800 728 | ref | clang_-mcpu=native_-O3_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE | 20201211 | 20201130

Compiler output

Implementation: avx
Security model: constbranchindex
Compiler: clang -march=native -O2 -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE
try.c: fatal error: error in backend: Cannot select: 0x556b175f0270: v4i64 = X86ISD::VTRUNC 0x556b175f0140
try.c: 0x556b175f0140: v16i32 = vselect 0x556b175d1350, 0x556b17580d40, 0x556b175f0010
try.c: 0x556b175d1350: v4i1 = X86ISD::PCMPGTM 0x556b175d8910, 0x556b175d44a0
try.c: 0x556b175d8910: v4i64 = X86ISD::VBROADCAST 0x556b1757cbc0
try.c: 0x556b1757cbc0: i64,ch = load<LD8[%lsr.iv6971]> 0x556b174e9930, 0x556b175c7600, undef:i64
try.c: 0x556b175c7600: i64,ch = CopyFromReg 0x556b174e9930, Register:i64 %vreg50
try.c: 0x556b175d4700: i64 = Register %vreg50
try.c: 0x556b1757f3b0: i64 = undef
try.c: 0x556b175d44a0: v4i64,ch = CopyFromReg 0x556b174e9930, Register:v4i64 %vreg13
try.c: 0x556b175d9160: v4i64 = Register %vreg13
try.c: 0x556b17580d40: v16i32 = X86ISD::VBROADCAST 0x556b175d8b70
try.c: 0x556b175d8b70: i32,ch = load<LD4[ConstantPool]> 0x556b174e9930, 0x556b1757c1a0, undef:i64
try.c: 0x556b1757c1a0: i64 = X86ISD::WrapperRIP TargetConstantPool:i64<i32 1> 0
try.c: 0x556b175bdd50: i64 = TargetConstantPool<i32 1> 0
try.c: 0x556b1757f3b0: i64 = undef
try.c: 0x556b175f0010: v16i32 = BUILD_VECTOR Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>
try.c: 0x556b175efee0: i32 = Constant<0>
try.c: 0x556b175efee0: i32 = Constant<0>
try.c: 0x556b175efee0: i32 = Constant<0>
try.c: 0x556b175efee0: i32 = Constant<0>
try.c: 0x556b175efee0: i32 = Constant<0>
try.c: 0x556b175efee0: i32 = Constant<0>
try.c: 0x556b175efee0: i32 = Constant<0>
try.c: 0x556b175efee0: i32 = Constant<0>
try.c: 0x556b175efee0: i32 = Constant<0>
try.c: ...

Number of similar (compiler,implementation) pairs: 1, namely:
Compiler | Implementations
clang -march=native -O2 -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE | avx

Compiler output

Implementation: avx
Security model: constbranchindex
Compiler: clang -march=native -O3 -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE
try.c: fatal error: error in backend: Cannot select: 0x563446286d50: v4i64 = X86ISD::VTRUNC 0x563446286c20
try.c: 0x563446286c20: v16i32 = vselect 0x563446273df0, 0x563446203ed0, 0x563446286af0
try.c: 0x563446273df0: v4i1 = X86ISD::PCMPGTM 0x5634462705c0, 0x56344626c150
try.c: 0x5634462705c0: v4i64 = X86ISD::VBROADCAST 0x563446204390
try.c: 0x563446204390: i64,ch = load<LD8[%lsr.iv6971]> 0x563446169a20, 0x56344620ac10, undef:i64
try.c: 0x56344620ac10: i64,ch = CopyFromReg 0x563446169a20, Register:i64 %vreg50
try.c: 0x56344626c3b0: i64 = Register %vreg50
try.c: 0x56344620d9a0: i64 = undef
try.c: 0x56344626c150: v4i64,ch = CopyFromReg 0x563446169a20, Register:v4i64 %vreg13
try.c: 0x563446270e10: v4i64 = Register %vreg13
try.c: 0x563446203ed0: v16i32 = X86ISD::VBROADCAST 0x563446270820
try.c: 0x563446270820: i32,ch = load<LD4[ConstantPool]> 0x563446169a20, 0x563446207a70, undef:i64
try.c: 0x563446207a70: i64 = X86ISD::WrapperRIP TargetConstantPool:i64<i32 1> 0
try.c: 0x56344620e320: i64 = TargetConstantPool<i32 1> 0
try.c: 0x56344620d9a0: i64 = undef
try.c: 0x563446286af0: v16i32 = BUILD_VECTOR Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>
try.c: 0x5634462869c0: i32 = Constant<0>
try.c: 0x5634462869c0: i32 = Constant<0>
try.c: 0x5634462869c0: i32 = Constant<0>
try.c: 0x5634462869c0: i32 = Constant<0>
try.c: 0x5634462869c0: i32 = Constant<0>
try.c: 0x5634462869c0: i32 = Constant<0>
try.c: 0x5634462869c0: i32 = Constant<0>
try.c: 0x5634462869c0: i32 = Constant<0>
try.c: 0x5634462869c0: i32 = Constant<0>
try.c: ...

Number of similar (compiler,implementation) pairs: 1, namely:
Compiler | Implementations
clang -march=native -O3 -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE | avx

Compiler output

Implementation: avx
Security model: constbranchindex
Compiler: clang -march=native -O -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE
try.c: fatal error: error in backend: Cannot select: 0x55f977b26810: v4i64 = X86ISD::VTRUNC 0x55f977b266e0
try.c: 0x55f977b266e0: v16i32 = vselect 0x55f977b180b0, 0x55f977acb1a0, 0x55f977b265b0
try.c: 0x55f977b180b0: v4i1 = X86ISD::PCMPGTM 0x55f977b0cea0, 0x55f977b08a30
try.c: 0x55f977b0cea0: v4i64 = X86ISD::VBROADCAST 0x55f977ab4050
try.c: 0x55f977ab4050: i64,ch = load<LD8[%lsr.iv6971]> 0x55f977a1d970, 0x55f977af2a40, undef:i64
try.c: 0x55f977af2a40: i64,ch = CopyFromReg 0x55f977a1d970, Register:i64 %vreg50
try.c: 0x55f977b08c90: i64 = Register %vreg50
try.c: 0x55f977ab5520: i64 = undef
try.c: 0x55f977b08a30: v4i64,ch = CopyFromReg 0x55f977a1d970, Register:v4i64 %vreg13
try.c: 0x55f977b0d6f0: v4i64 = Register %vreg13
try.c: 0x55f977acb1a0: v16i32 = X86ISD::VBROADCAST 0x55f977b0d100
try.c: 0x55f977b0d100: i32,ch = load<LD4[ConstantPool]> 0x55f977a1d970, 0x55f977ab3630, undef:i64
try.c: 0x55f977ab3630: i64 = X86ISD::WrapperRIP TargetConstantPool:i64<i32 1> 0
try.c: 0x55f977affbc0: i64 = TargetConstantPool<i32 1> 0
try.c: 0x55f977ab5520: i64 = undef
try.c: 0x55f977b265b0: v16i32 = BUILD_VECTOR Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>
try.c: 0x55f977b26480: i32 = Constant<0>
try.c: 0x55f977b26480: i32 = Constant<0>
try.c: 0x55f977b26480: i32 = Constant<0>
try.c: 0x55f977b26480: i32 = Constant<0>
try.c: 0x55f977b26480: i32 = Constant<0>
try.c: 0x55f977b26480: i32 = Constant<0>
try.c: 0x55f977b26480: i32 = Constant<0>
try.c: 0x55f977b26480: i32 = Constant<0>
try.c: 0x55f977b26480: i32 = Constant<0>
try.c: ...

Number of similar (compiler,implementation) pairs: 1, namely:
Compiler | Implementations
clang -march=native -O -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE | avx

Compiler output

Implementation: avx
Security model: constbranchindex
Compiler: clang -mcpu=native -O3 -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE
encode.c: encode.c:34:9: error: always_inline function '_mm256_loadu_si256' requires target feature 'sse4.2', but would be inlined into function 'crypto_encode_653x1541round_avx_constbranchindex' that is compiled without support for 'sse4.2'
encode.c: x = _mm256_loadu_si256((__m256i *) reading);
encode.c: ^
encode.c: encode.c:35:9: error: always_inline function '_mm256_mulhrs_epi16' requires target feature 'sse4.2', but would be inlined into function 'crypto_encode_653x1541round_avx_constbranchindex' that is compiled without support for 'sse4.2'
encode.c: x = _mm256_mulhrs_epi16(x,_mm256_set1_epi16(10923));
encode.c: ^
encode.c: encode.c:35:31: error: always_inline function '_mm256_set1_epi16' requires target feature 'sse4.2', but would be inlined into function 'crypto_encode_653x1541round_avx_constbranchindex' that is compiled without support for 'sse4.2'
encode.c: x = _mm256_mulhrs_epi16(x,_mm256_set1_epi16(10923));
encode.c: ^
encode.c: encode.c:36:9: error: always_inline function '_mm256_add_epi16' requires target feature 'sse4.2', but would be inlined into function 'crypto_encode_653x1541round_avx_constbranchindex' that is compiled without support for 'sse4.2'
encode.c: x = _mm256_add_epi16(x,_mm256_add_epi16(x,x));
encode.c: ^
encode.c: encode.c:36:28: error: always_inline function '_mm256_add_epi16' requires target feature 'sse4.2', but would be inlined into function 'crypto_encode_653x1541round_avx_constbranchindex' that is compiled without support for 'sse4.2'
encode.c: x = _mm256_add_epi16(x,_mm256_add_epi16(x,x));
encode.c: ^
encode.c: encode.c:37:9: error: always_inline function '_mm256_add_epi16' requires target feature 'sse4.2', but would be inlined into function 'crypto_encode_653x1541round_avx_constbranchindex' that is compiled without support for 'sse4.2'
encode.c: x = _mm256_add_epi16(x,_mm256_set1_epi16(2310));
encode.c: ^
encode.c: encode.c:37:28: error: always_inline function '_mm256_set1_epi16' requires target feature 'sse4.2', but would be inlined into function 'crypto_encode_653x1541round_avx_constbranchindex' that is compiled without support for 'sse4.2'
encode.c: x = _mm256_add_epi16(x,_mm256_set1_epi16(2310));
encode.c: ^
encode.c: encode.c:38:10: error: always_inline function '_mm256_set1_epi16' requires target feature 'sse4.2', but would be inlined into function 'crypto_encode_653x1541round_avx_constbranchindex' that is compiled without support for 'sse4.2'
encode.c: x &= _mm256_set1_epi16(16383);
encode.c: ^
encode.c: encode.c:39:9: error: always_inline function '_mm256_mulhi_epi16' requires target feature 'sse4.2', but would be inlined into function 'crypto_encode_653x1541round_avx_constbranchindex' that is compiled without support for 'sse4.2'
encode.c: ...

Number of similar (compiler,implementation) pairs: 1, namely:
Compiler | Implementations
clang -mcpu=native -O3 -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE | avx
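
The diagnostics above quote lines 34 to 38 of encode.c. As a reading aid, the following is a minimal scalar sketch of what those quoted intrinsics compute for a single 16-bit lane. It is not the author's code: the helper names mulhrs16 and round3_offset_lane are hypothetical, the constants 10923, 2310 and 16383 are copied from the quoted lines, and the interpretation (round to the nearest multiple of 3, then add 2310) is an assumption that was only checked for inputs between -2310 and 2310. The part of encode.c elided after line 39 is not modeled.

```c
#include <stdint.h>

/* Scalar model of one lane of _mm256_mulhrs_epi16:
   (a*b + 2^14) >> 15, written with floor division so the result
   does not depend on how the compiler shifts negative values. */
static int16_t mulhrs16(int16_t a, int16_t b)
{
    int32_t t = (int32_t)a * (int32_t)b + 16384;
    int32_t q = t / 32768;          /* C division truncates toward zero */
    if (t % 32768 < 0) q -= 1;      /* adjust truncation to floor */
    return (int16_t)q;
}

/* Hypothetical helper: one lane of the quoted lines 34-38.
   Assumes x lies in [-2310, 2310]. */
static uint16_t round3_offset_lane(int16_t x)
{
    int16_t r = mulhrs16(x, 10923); /* nearest integer to x/3 in this range */
    r = (int16_t)(r + (r + r));     /* x + (x + x): back to a multiple of 3 */
    r = (int16_t)(r + 2310);        /* shift into the non-negative range */
    return (uint16_t)r & 16383;     /* keep the low 14 bits */
}
```

Under these assumptions, round3_offset_lane(-2310) gives 0 and round3_offset_lane(2310) gives 4620, with every output a multiple of 3.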

Compiler output

Implementation: ref
Security model: constbranchindex
Compiler: clang -march=native -O2 -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE
try.c: fatal error: error in backend: Cannot select: 0x55be04f394f0: v4i64 = X86ISD::VTRUNC 0x55be04f393c0
try.c: 0x55be04f393c0: v16i32 = vselect 0x55be04f53460, 0x55be04ecfcd0, 0x55be04f39290
try.c: 0x55be04f53460: v4i1 = X86ISD::PCMPGTM 0x55be04f31290, 0x55be04f2ce20
try.c: 0x55be04f31290: v4i64 = X86ISD::VBROADCAST 0x55be04ed3af0
try.c: 0x55be04ed3af0: i64,ch = load<LD8[%lsr.iv6971]> 0x55be04e41930, 0x55be04f18ba0, undef:i64
try.c: 0x55be04f18ba0: i64,ch = CopyFromReg 0x55be04e41930, Register:i64 %vreg50
try.c: 0x55be04f2d080: i64 = Register %vreg50
try.c: 0x55be04ed4fc0: i64 = undef
try.c: 0x55be04f2ce20: v4i64,ch = CopyFromReg 0x55be04e41930, Register:v4i64 %vreg13
try.c: 0x55be04f31ae0: v4i64 = Register %vreg13
try.c: 0x55be04ecfcd0: v16i32 = X86ISD::VBROADCAST 0x55be04f314f0
try.c: 0x55be04f314f0: i32,ch = load<LD4[ConstantPool]> 0x55be04e41930, 0x55be04ed8240, undef:i64
try.c: 0x55be04ed8240: i64 = X86ISD::WrapperRIP TargetConstantPool:i64<i32 1> 0
try.c: 0x55be04ed96a0: i64 = TargetConstantPool<i32 1> 0
try.c: 0x55be04ed4fc0: i64 = undef
try.c: 0x55be04f39290: v16i32 = BUILD_VECTOR Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>
try.c: 0x55be04f39160: i32 = Constant<0>
try.c: 0x55be04f39160: i32 = Constant<0>
try.c: 0x55be04f39160: i32 = Constant<0>
try.c: 0x55be04f39160: i32 = Constant<0>
try.c: 0x55be04f39160: i32 = Constant<0>
try.c: 0x55be04f39160: i32 = Constant<0>
try.c: 0x55be04f39160: i32 = Constant<0>
try.c: 0x55be04f39160: i32 = Constant<0>
try.c: 0x55be04f39160: i32 = Constant<0>
try.c: ...

Number of similar (compiler,implementation) pairs: 1, namely:
Compiler | Implementations
clang -march=native -O2 -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE | ref

Compiler output

Implementation: ref
Security model: constbranchindex
Compiler: clang -march=native -O3 -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE
try.c: fatal error: error in backend: Cannot select: 0x56438604aa20: v4i64 = X86ISD::VTRUNC 0x56438604a8f0
try.c: 0x56438604a8f0: v16i32 = vselect 0x56438605c7b0, 0x564385fdebf0, 0x56438604a7c0
try.c: 0x56438605c7b0: v4i1 = X86ISD::PCMPGTM 0x5643860433e0, 0x56438603ef70
try.c: 0x5643860433e0: v4i64 = X86ISD::VBROADCAST 0x564385fdf0b0
try.c: 0x564385fdf0b0: i64,ch = load<LD8[%lsr.iv6971]> 0x564385f3ca30, 0x564386003600, undef:i64
try.c: 0x564386003600: i64,ch = CopyFromReg 0x564385f3ca30, Register:i64 %vreg50
try.c: 0x56438603f1d0: i64 = Register %vreg50
try.c: 0x564385fda2f0: i64 = undef
try.c: 0x56438603ef70: v4i64,ch = CopyFromReg 0x564385f3ca30, Register:v4i64 %vreg13
try.c: 0x564386043c30: v4i64 = Register %vreg13
try.c: 0x564385fdebf0: v16i32 = X86ISD::VBROADCAST 0x564386043640
try.c: 0x564386043640: i32,ch = load<LD4[ConstantPool]> 0x564385f3ca30, 0x5643860013c0, undef:i64
try.c: 0x5643860013c0: i64 = X86ISD::WrapperRIP TargetConstantPool:i64<i32 1> 0
try.c: 0x564385fdac70: i64 = TargetConstantPool<i32 1> 0
try.c: 0x564385fda2f0: i64 = undef
try.c: 0x56438604a7c0: v16i32 = BUILD_VECTOR Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>
try.c: 0x56438604a690: i32 = Constant<0>
try.c: 0x56438604a690: i32 = Constant<0>
try.c: 0x56438604a690: i32 = Constant<0>
try.c: 0x56438604a690: i32 = Constant<0>
try.c: 0x56438604a690: i32 = Constant<0>
try.c: 0x56438604a690: i32 = Constant<0>
try.c: 0x56438604a690: i32 = Constant<0>
try.c: 0x56438604a690: i32 = Constant<0>
try.c: 0x56438604a690: i32 = Constant<0>
try.c: ...

Number of similar (compiler,implementation) pairs: 1, namely:
Compiler | Implementations
clang -march=native -O3 -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE | ref

Compiler output

Implementation: ref
Security model: constbranchindex
Compiler: clang -march=native -O -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE
try.c: fatal error: error in backend: Cannot select: 0x564b6e338d80: v4i64 = X86ISD::VTRUNC 0x564b6e338c50
try.c: 0x564b6e338c50: v16i32 = vselect 0x564b6e32ab70, 0x564b6e2db7c0, 0x564b6e338b20
try.c: 0x564b6e32ab70: v4i1 = X86ISD::PCMPGTM 0x564b6e334760, 0x564b6e3302f0
try.c: 0x564b6e334760: v4i64 = X86ISD::VBROADCAST 0x564b6e2d4b20
try.c: 0x564b6e2d4b20: i64,ch = load<LD8[%lsr.iv6971]> 0x564b6e245950, 0x564b6e31e2d0, undef:i64
try.c: 0x564b6e31e2d0: i64,ch = CopyFromReg 0x564b6e245950, Register:i64 %vreg50
try.c: 0x564b6e330550: i64 = Register %vreg50
try.c: 0x564b6e2d5ff0: i64 = undef
try.c: 0x564b6e3302f0: v4i64,ch = CopyFromReg 0x564b6e245950, Register:v4i64 %vreg13
try.c: 0x564b6e334fb0: v4i64 = Register %vreg13
try.c: 0x564b6e2db7c0: v16i32 = X86ISD::VBROADCAST 0x564b6e3349c0
try.c: 0x564b6e3349c0: i32,ch = load<LD4[ConstantPool]> 0x564b6e245950, 0x564b6e2d4100, undef:i64
try.c: 0x564b6e2d4100: i64 = X86ISD::WrapperRIP TargetConstantPool:i64<i32 1> 0
try.c: 0x564b6e31f4d0: i64 = TargetConstantPool<i32 1> 0
try.c: 0x564b6e2d5ff0: i64 = undef
try.c: 0x564b6e338b20: v16i32 = BUILD_VECTOR Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>
try.c: 0x564b6e3389f0: i32 = Constant<0>
try.c: 0x564b6e3389f0: i32 = Constant<0>
try.c: 0x564b6e3389f0: i32 = Constant<0>
try.c: 0x564b6e3389f0: i32 = Constant<0>
try.c: 0x564b6e3389f0: i32 = Constant<0>
try.c: 0x564b6e3389f0: i32 = Constant<0>
try.c: 0x564b6e3389f0: i32 = Constant<0>
try.c: 0x564b6e3389f0: i32 = Constant<0>
try.c: 0x564b6e3389f0: i32 = Constant<0>
try.c: ...

Number of similar (compiler,implementation) pairs: 1, namely:
Compiler | Implementations
clang -march=native -O -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE | ref