Implementation notes: amd64, cel02, crypto_encode/761x1531round

Computer: cel02
Architecture: amd64
CPU ID: GenuineIntel-00050657-bfebfbff
SUPERCOP version: 20201130
Operation: crypto_encode
Primitive: 761x1531round
Time  Object size  Test size      Implementation  Compiler                                                                             Benchmark date  SUPERCOP version
862   2145 0 0     15821 824 800  avx             gcc_-march=native_-mtune=native_-O3_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE         20201211        20201130
960   576 0 0      16341 824 800  ref             gcc_-march=native_-mtune=native_-O3_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE         20201211        20201130
1028  1535 0 0     10816 800 768  avx             gcc_-march=native_-mtune=native_-Os_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE         20201211        20201130
1118  1630 0 0     12052 816 768  avx             gcc_-march=native_-mtune=native_-O2_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE         20201211        20201130
1450  1558 0 0     10948 792 728  avx             clang_-march=native_-Os_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE  20201211        20201130
2360  112 0 0      11564 792 728  ref             clang_-march=native_-Os_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE  20201211        20201130
2442  1700 0 0     11868 816 768  avx             gcc_-march=native_-mtune=native_-O_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE          20201211        20201130
4068  116 0 0      12332 816 768  ref             gcc_-march=native_-mtune=native_-O_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE          20201211        20201130
4214  117 0 0      11456 800 768  ref             gcc_-march=native_-mtune=native_-Os_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE         20201211        20201130
4720  393 0 0      13682 800 728  ref             clang_-mcpu=native_-O3_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE   20201211        20201130
4824  129 0 0      12604 816 768  ref             gcc_-march=native_-mtune=native_-O2_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE         20201211        20201130

Compiler output

Implementation: avx
Security model: constbranchindex
Compiler: clang -march=native -O2 -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE
try.c: fatal error: error in backend: Cannot select: 0x55d2f257a0c0: v4i64 = X86ISD::VTRUNC 0x55d2f2579f90
try.c: 0x55d2f2579f90: v16i32 = vselect 0x55d2f25823f0, 0x55d2f251a5f0, 0x55d2f2579e60
try.c: 0x55d2f25823f0: v4i1 = X86ISD::PCMPGTM 0x55d2f2576ab0, 0x55d2f2572640
try.c: 0x55d2f2576ab0: v4i64 = X86ISD::VBROADCAST 0x55d2f253aec0
try.c: 0x55d2f253aec0: i64,ch = load<LD8[%lsr.iv6971]> 0x55d2f24879a0, 0x55d2f256d4a0, undef:i64
try.c: 0x55d2f256d4a0: i64,ch = CopyFromReg 0x55d2f24879a0, Register:i64 %vreg50
try.c: 0x55d2f25728a0: i64 = Register %vreg50
try.c: 0x55d2f253c390: i64 = undef
try.c: 0x55d2f2572640: v4i64,ch = CopyFromReg 0x55d2f24879a0, Register:v4i64 %vreg13
try.c: 0x55d2f2577300: v4i64 = Register %vreg13
try.c: 0x55d2f251a5f0: v16i32 = X86ISD::VBROADCAST 0x55d2f2576d10
try.c: 0x55d2f2576d10: i32,ch = load<LD4[ConstantPool]> 0x55d2f24879a0, 0x55d2f2525bb0, undef:i64
try.c: 0x55d2f2525bb0: i64 = X86ISD::WrapperRIP TargetConstantPool:i64<i32 1> 0
try.c: 0x55d2f2544430: i64 = TargetConstantPool<i32 1> 0
try.c: 0x55d2f253c390: i64 = undef
try.c: 0x55d2f2579e60: v16i32 = BUILD_VECTOR Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>
try.c: 0x55d2f2579d30: i32 = Constant<0>
try.c: 0x55d2f2579d30: i32 = Constant<0>
try.c: 0x55d2f2579d30: i32 = Constant<0>
try.c: 0x55d2f2579d30: i32 = Constant<0>
try.c: 0x55d2f2579d30: i32 = Constant<0>
try.c: 0x55d2f2579d30: i32 = Constant<0>
try.c: 0x55d2f2579d30: i32 = Constant<0>
try.c: 0x55d2f2579d30: i32 = Constant<0>
try.c: 0x55d2f2579d30: i32 = Constant<0>
try.c: ...

Number of similar (compiler,implementation) pairs: 1, namely:
Compiler                                                                             Implementations
clang -march=native -O2 -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE  avx

Compiler output

Implementation: avx
Security model: constbranchindex
Compiler: clang -march=native -O3 -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE
try.c: fatal error: error in backend: Cannot select: 0x55f7a23ca2d0: v4i64 = X86ISD::VTRUNC 0x55f7a23ca1a0
try.c: 0x55f7a23ca1a0: v16i32 = vselect 0x55f7a23c4ca0, 0x55f7a235d000, 0x55f7a23ca070
try.c: 0x55f7a23c4ca0: v4i1 = X86ISD::PCMPGTM 0x55f7a23c24c0, 0x55f7a23be290
try.c: 0x55f7a23c24c0: v4i64 = X86ISD::VBROADCAST 0x55f7a235d4c0
try.c: 0x55f7a235d4c0: i64,ch = load<LD8[%lsr.iv6971]> 0x55f7a22bca30, 0x55f7a2370250, undef:i64
try.c: 0x55f7a2370250: i64,ch = CopyFromReg 0x55f7a22bca30, Register:i64 %vreg50
try.c: 0x55f7a23be4f0: i64 = Register %vreg50
try.c: 0x55f7a235b1a0: i64 = undef
try.c: 0x55f7a23be290: v4i64,ch = CopyFromReg 0x55f7a22bca30, Register:v4i64 %vreg13
try.c: 0x55f7a23c2d10: v4i64 = Register %vreg13
try.c: 0x55f7a235d000: v16i32 = X86ISD::VBROADCAST 0x55f7a23c2720
try.c: 0x55f7a23c2720: i32,ch = load<LD4[ConstantPool]> 0x55f7a22bca30, 0x55f7a235f9a0, undef:i64
try.c: 0x55f7a235f9a0: i64 = X86ISD::WrapperRIP TargetConstantPool:i64<i32 1> 0
try.c: 0x55f7a235bb20: i64 = TargetConstantPool<i32 1> 0
try.c: 0x55f7a235b1a0: i64 = undef
try.c: 0x55f7a23ca070: v16i32 = BUILD_VECTOR Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>
try.c: 0x55f7a23c9f40: i32 = Constant<0>
try.c: 0x55f7a23c9f40: i32 = Constant<0>
try.c: 0x55f7a23c9f40: i32 = Constant<0>
try.c: 0x55f7a23c9f40: i32 = Constant<0>
try.c: 0x55f7a23c9f40: i32 = Constant<0>
try.c: 0x55f7a23c9f40: i32 = Constant<0>
try.c: 0x55f7a23c9f40: i32 = Constant<0>
try.c: 0x55f7a23c9f40: i32 = Constant<0>
try.c: 0x55f7a23c9f40: i32 = Constant<0>
try.c: ...

Number of similar (compiler,implementation) pairs: 1, namely:
Compiler                                                                             Implementations
clang -march=native -O3 -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE  avx

Compiler output

Implementation: avx
Security model: constbranchindex
Compiler: clang -march=native -O -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE
try.c: fatal error: error in backend: Cannot select: 0x5598fd730ec0: v4i64 = X86ISD::VTRUNC 0x5598fd730d90
try.c: 0x5598fd730d90: v16i32 = vselect 0x5598fd7496e0, 0x5598fd6d2d50, 0x5598fd730c60
try.c: 0x5598fd7496e0: v4i1 = X86ISD::PCMPGTM 0x5598fd72b890, 0x5598fd727420
try.c: 0x5598fd72b890: v4i64 = X86ISD::VBROADCAST 0x5598fd6e15b0
try.c: 0x5598fd6e15b0: i64,ch = load<LD8[%lsr.iv6971]> 0x5598fd63c920, 0x5598fd712980, undef:i64
try.c: 0x5598fd712980: i64,ch = CopyFromReg 0x5598fd63c920, Register:i64 %vreg50
try.c: 0x5598fd727680: i64 = Register %vreg50
try.c: 0x5598fd6e2a80: i64 = undef
try.c: 0x5598fd727420: v4i64,ch = CopyFromReg 0x5598fd63c920, Register:v4i64 %vreg13
try.c: 0x5598fd72c0e0: v4i64 = Register %vreg13
try.c: 0x5598fd6d2d50: v16i32 = X86ISD::VBROADCAST 0x5598fd72baf0
try.c: 0x5598fd72baf0: i32,ch = load<LD4[ConstantPool]> 0x5598fd63c920, 0x5598fd6cc260, undef:i64
try.c: 0x5598fd6cc260: i64 = X86ISD::WrapperRIP TargetConstantPool:i64<i32 1> 0
try.c: 0x5598fd71bbf0: i64 = TargetConstantPool<i32 1> 0
try.c: 0x5598fd6e2a80: i64 = undef
try.c: 0x5598fd730c60: v16i32 = BUILD_VECTOR Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>
try.c: 0x5598fd730b30: i32 = Constant<0>
try.c: 0x5598fd730b30: i32 = Constant<0>
try.c: 0x5598fd730b30: i32 = Constant<0>
try.c: 0x5598fd730b30: i32 = Constant<0>
try.c: 0x5598fd730b30: i32 = Constant<0>
try.c: 0x5598fd730b30: i32 = Constant<0>
try.c: 0x5598fd730b30: i32 = Constant<0>
try.c: 0x5598fd730b30: i32 = Constant<0>
try.c: 0x5598fd730b30: i32 = Constant<0>
try.c: ...

Number of similar (compiler,implementation) pairs: 1, namely:
Compiler                                                                            Implementations
clang -march=native -O -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE  avx

Compiler output

Implementation: avx
Security model: constbranchindex
Compiler: clang -mcpu=native -O3 -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE
encode.c: encode.c:34:9: error: always_inline function '_mm256_loadu_si256' requires target feature 'sse4.2', but would be inlined into function 'crypto_encode_761x1531round_avx_constbranchindex' that is compiled without support for 'sse4.2'
encode.c: x = _mm256_loadu_si256((__m256i *) reading);
encode.c: ^
encode.c: encode.c:35:9: error: always_inline function '_mm256_mulhrs_epi16' requires target feature 'sse4.2', but would be inlined into function 'crypto_encode_761x1531round_avx_constbranchindex' that is compiled without support for 'sse4.2'
encode.c: x = _mm256_mulhrs_epi16(x,_mm256_set1_epi16(10923));
encode.c: ^
encode.c: encode.c:35:31: error: always_inline function '_mm256_set1_epi16' requires target feature 'sse4.2', but would be inlined into function 'crypto_encode_761x1531round_avx_constbranchindex' that is compiled without support for 'sse4.2'
encode.c: x = _mm256_mulhrs_epi16(x,_mm256_set1_epi16(10923));
encode.c: ^
encode.c: encode.c:36:9: error: always_inline function '_mm256_add_epi16' requires target feature 'sse4.2', but would be inlined into function 'crypto_encode_761x1531round_avx_constbranchindex' that is compiled without support for 'sse4.2'
encode.c: x = _mm256_add_epi16(x,_mm256_add_epi16(x,x));
encode.c: ^
encode.c: encode.c:36:28: error: always_inline function '_mm256_add_epi16' requires target feature 'sse4.2', but would be inlined into function 'crypto_encode_761x1531round_avx_constbranchindex' that is compiled without support for 'sse4.2'
encode.c: x = _mm256_add_epi16(x,_mm256_add_epi16(x,x));
encode.c: ^
encode.c: encode.c:37:9: error: always_inline function '_mm256_add_epi16' requires target feature 'sse4.2', but would be inlined into function 'crypto_encode_761x1531round_avx_constbranchindex' that is compiled without support for 'sse4.2'
encode.c: x = _mm256_add_epi16(x,_mm256_set1_epi16(2295));
encode.c: ^
encode.c: encode.c:37:28: error: always_inline function '_mm256_set1_epi16' requires target feature 'sse4.2', but would be inlined into function 'crypto_encode_761x1531round_avx_constbranchindex' that is compiled without support for 'sse4.2'
encode.c: x = _mm256_add_epi16(x,_mm256_set1_epi16(2295));
encode.c: ^
encode.c: encode.c:38:10: error: always_inline function '_mm256_set1_epi16' requires target feature 'sse4.2', but would be inlined into function 'crypto_encode_761x1531round_avx_constbranchindex' that is compiled without support for 'sse4.2'
encode.c: x &= _mm256_set1_epi16(16383);
encode.c: ^
encode.c: encode.c:39:9: error: always_inline function '_mm256_mulhi_epi16' requires target feature 'sse4.2', but would be inlined into function 'crypto_encode_761x1531round_avx_constbranchindex' that is compiled without support for 'sse4.2'
encode.c: ...

Number of similar (compiler,implementation) pairs: 1, namely:
Compiler                                                                            Implementations
clang -mcpu=native -O3 -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE  avx
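
The source lines quoted in the errors above (encode.c lines 35 to 38) suggest that each 16-bit lane is first rounded to a multiple of 3 and then shifted by 2295. The following is a minimal scalar sketch of just those visible steps, not the implementation itself. The helper names mulhrs16 and round3_offset are hypothetical, the step at encode.c line 39 is truncated in the log and is deliberately left out, and the sketch assumes inputs in the range -2295..2295 and a compiler (such as gcc or clang) whose right shift of negative integers is arithmetic.

```c
#include <stdint.h>

/* Scalar model of one 16-bit lane of _mm256_mulhrs_epi16:
   (((a * b) >> 14) + 1) >> 1, truncated to 16 bits.
   Assumes >> on negative values is an arithmetic shift. */
static int16_t mulhrs16(int16_t a, int16_t b)
{
    int32_t t = (int32_t)a * (int32_t)b;
    return (int16_t)(((t >> 14) + 1) >> 1);
}

/* Hypothetical helper mirroring encode.c lines 35-38 for one lane.
   Input range -2295..2295 is an assumption, not stated in the log. */
static int16_t round3_offset(int16_t x)
{
    x = mulhrs16(x, 10923);      /* 10923/32768 is close to 1/3: x/3, rounded */
    x = (int16_t)(x + (x + x));  /* multiply by 3: nearest multiple of 3 */
    x = (int16_t)(x + 2295);     /* shift into a nonnegative range */
    x &= 16383;                  /* keep the low 14 bits */
    return x;
}
```

Under these assumptions, round3_offset(2) rounds 2 up to 3 and returns 2298, while round3_offset(-2295) returns 0 and round3_offset(2295) returns 4590.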

Compiler output

Implementation: ref
Security model: constbranchindex
Compiler: clang -march=native -O2 -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE
try.c: fatal error: error in backend: Cannot select: 0x5587936871d0: v4i64 = X86ISD::VTRUNC 0x5587936870a0
try.c: 0x5587936870a0: v16i32 = vselect 0x558793667ca0, 0x558793628a50, 0x558793686f70
try.c: 0x558793667ca0: v4i1 = X86ISD::PCMPGTM 0x55879366f870, 0x55879366b400
try.c: 0x55879366f870: v4i64 = X86ISD::VBROADCAST 0x558793625bf0
try.c: 0x558793625bf0: i64,ch = load<LD8[%lsr.iv6971]> 0x558793580960, 0x55879365ab30, undef:i64
try.c: 0x55879365ab30: i64,ch = CopyFromReg 0x558793580960, Register:i64 %vreg50
try.c: 0x55879366b660: i64 = Register %vreg50
try.c: 0x5587936270c0: i64 = undef
try.c: 0x55879366b400: v4i64,ch = CopyFromReg 0x558793580960, Register:v4i64 %vreg13
try.c: 0x5587936700c0: v4i64 = Register %vreg13
try.c: 0x558793628a50: v16i32 = X86ISD::VBROADCAST 0x55879366fad0
try.c: 0x55879366fad0: i32,ch = load<LD4[ConstantPool]> 0x558793580960, 0x5587936251d0, undef:i64
try.c: 0x5587936251d0: i64 = X86ISD::WrapperRIP TargetConstantPool:i64<i32 1> 0
try.c: 0x558793654970: i64 = TargetConstantPool<i32 1> 0
try.c: 0x5587936270c0: i64 = undef
try.c: 0x558793686f70: v16i32 = BUILD_VECTOR Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>
try.c: 0x558793686e40: i32 = Constant<0>
try.c: 0x558793686e40: i32 = Constant<0>
try.c: 0x558793686e40: i32 = Constant<0>
try.c: 0x558793686e40: i32 = Constant<0>
try.c: 0x558793686e40: i32 = Constant<0>
try.c: 0x558793686e40: i32 = Constant<0>
try.c: 0x558793686e40: i32 = Constant<0>
try.c: 0x558793686e40: i32 = Constant<0>
try.c: 0x558793686e40: i32 = Constant<0>
try.c: ...

Number of similar (compiler,implementation) pairs: 1, namely:
Compiler                                                                             Implementations
clang -march=native -O2 -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE  ref

Compiler output

Implementation: ref
Security model: constbranchindex
Compiler: clang -march=native -O3 -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE
try.c: fatal error: error in backend: Cannot select: 0x55b079cf52c0: v4i64 = X86ISD::VTRUNC 0x55b079cf5190
try.c: 0x55b079cf5190: v16i32 = vselect 0x55b079cee890, 0x55b079c63fe0, 0x55b079cf5060
try.c: 0x55b079cee890: v4i1 = X86ISD::PCMPGTM 0x55b079ccfb00, 0x55b079ccc6a0
try.c: 0x55b079ccfb00: v4i64 = X86ISD::VBROADCAST 0x55b079c644a0
try.c: 0x55b079c644a0: i64,ch = load<LD8[%lsr.iv6971]> 0x55b079bc9a30, 0x55b079c8f870, undef:i64
try.c: 0x55b079c8f870: i64,ch = CopyFromReg 0x55b079bc9a30, Register:i64 %vreg50
try.c: 0x55b079ccc900: i64 = Register %vreg50
try.c: 0x55b079c66e90: i64 = undef
try.c: 0x55b079ccc6a0: v4i64,ch = CopyFromReg 0x55b079bc9a30, Register:v4i64 %vreg13
try.c: 0x55b079cd0350: v4i64 = Register %vreg13
try.c: 0x55b079c63fe0: v16i32 = X86ISD::VBROADCAST 0x55b079ccfd60
try.c: 0x55b079ccfd60: i32,ch = load<LD4[ConstantPool]> 0x55b079bc9a30, 0x55b079c8d630, undef:i64
try.c: 0x55b079c8d630: i64 = X86ISD::WrapperRIP TargetConstantPool:i64<i32 1> 0
try.c: 0x55b079c67810: i64 = TargetConstantPool<i32 1> 0
try.c: 0x55b079c66e90: i64 = undef
try.c: 0x55b079cf5060: v16i32 = BUILD_VECTOR Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>
try.c: 0x55b079cf4f30: i32 = Constant<0>
try.c: 0x55b079cf4f30: i32 = Constant<0>
try.c: 0x55b079cf4f30: i32 = Constant<0>
try.c: 0x55b079cf4f30: i32 = Constant<0>
try.c: 0x55b079cf4f30: i32 = Constant<0>
try.c: 0x55b079cf4f30: i32 = Constant<0>
try.c: 0x55b079cf4f30: i32 = Constant<0>
try.c: 0x55b079cf4f30: i32 = Constant<0>
try.c: 0x55b079cf4f30: i32 = Constant<0>
try.c: ...

Number of similar (compiler,implementation) pairs: 1, namely:
Compiler                                                                             Implementations
clang -march=native -O3 -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE  ref

Compiler output

Implementation: ref
Security model: constbranchindex
Compiler: clang -march=native -O -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE
try.c: fatal error: error in backend: Cannot select: 0x5574f854d7f0: v4i64 = X86ISD::VTRUNC 0x5574f854d6c0
try.c: 0x5574f854d6c0: v16i32 = vselect 0x5574f8567ea0, 0x5574f84eea10, 0x5574f854d590
try.c: 0x5574f8567ea0: v4i1 = X86ISD::PCMPGTM 0x5574f8546b40, 0x5574f85426d0
try.c: 0x5574f8546b40: v4i64 = X86ISD::VBROADCAST 0x5574f84f1090
try.c: 0x5574f84f1090: i64,ch = load<LD8[%lsr.iv6971]> 0x5574f8457920, 0x5574f8531e30, undef:i64
try.c: 0x5574f8531e30: i64,ch = CopyFromReg 0x5574f8457920, Register:i64 %vreg50
try.c: 0x5574f8542930: i64 = Register %vreg50
try.c: 0x5574f84ed080: i64 = undef
try.c: 0x5574f85426d0: v4i64,ch = CopyFromReg 0x5574f8457920, Register:v4i64 %vreg13
try.c: 0x5574f8547390: v4i64 = Register %vreg13
try.c: 0x5574f84eea10: v16i32 = X86ISD::VBROADCAST 0x5574f8546da0
try.c: 0x5574f8546da0: i32,ch = load<LD4[ConstantPool]> 0x5574f8457920, 0x5574f84f0670, undef:i64
try.c: 0x5574f84f0670: i64 = X86ISD::WrapperRIP TargetConstantPool:i64<i32 1> 0
try.c: 0x5574f84d23f0: i64 = TargetConstantPool<i32 1> 0
try.c: 0x5574f84ed080: i64 = undef
try.c: 0x5574f854d590: v16i32 = BUILD_VECTOR Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>
try.c: 0x5574f854d460: i32 = Constant<0>
try.c: 0x5574f854d460: i32 = Constant<0>
try.c: 0x5574f854d460: i32 = Constant<0>
try.c: 0x5574f854d460: i32 = Constant<0>
try.c: 0x5574f854d460: i32 = Constant<0>
try.c: 0x5574f854d460: i32 = Constant<0>
try.c: 0x5574f854d460: i32 = Constant<0>
try.c: 0x5574f854d460: i32 = Constant<0>
try.c: 0x5574f854d460: i32 = Constant<0>
try.c: ...

Number of similar (compiler,implementation) pairs: 1, namely:
Compiler                                                                            Implementations
clang -march=native -O -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE  ref