Implementation notes: amd64, cel02, crypto_core/scale3sntrup857

Computer: cel02
Architecture: amd64
CPU ID: GenuineIntel-00050657-bfebfbff
SUPERCOP version: 20201130
Operation: crypto_core
Primitive: scale3sntrup857
Time  Object size  Test size      Implementation  Compiler                                                                             Benchmark date  SUPERCOP version
270   306 0 0      15581 824 864  avx             gcc_-march=native_-mtune=native_-O3_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE         20201211        20201130
282   272 0 0      10912 800 800  avx             gcc_-march=native_-mtune=native_-Os_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE         20201211        20201130
308   227 0 0      11052 792 760  avx             clang_-march=native_-Os_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE  20201211        20201130
456   306 0 0      12356 816 800  avx             gcc_-march=native_-mtune=native_-O2_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE         20201211        20201130
462   305 0 0      12092 816 800  avx             gcc_-march=native_-mtune=native_-O_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE          20201211        20201130
578   607 0 0      15997 824 864  ref             gcc_-march=native_-mtune=native_-O3_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE         20201211        20201130
2220  424 0 0      13210 800 760  ref             clang_-mcpu=native_-O3_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE   20201211        20201130
4160  180 0 0      11092 792 760  ref             clang_-march=native_-Os_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE  20201211        20201130
6386  185 0 0      12036 816 800  ref             gcc_-march=native_-mtune=native_-O_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE          20201211        20201130
8760  181 0 0      10988 808 800  ref             gcc_-march=native_-mtune=native_-Os_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE         20201211        20201130
9890  192 0 0      12300 816 800  ref             gcc_-march=native_-mtune=native_-O2_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE         20201211        20201130

Compiler output

Implementation: avx
Security model: constbranchindex
Compiler: clang -march=native -O2 -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE
try.c: fatal error: error in backend: Cannot select: 0x55c05aa60f60: v4i64 = X86ISD::VTRUNC 0x55c05aa60e30
try.c: 0x55c05aa60e30: v16i32 = vselect 0x55c05aa673f0, 0x55c05aa0aeb0, 0x55c05aa60d00
try.c: 0x55c05aa673f0: v4i1 = X86ISD::PCMPGTM 0x55c05aa5b930, 0x55c05aa574c0
try.c: 0x55c05aa5b930: v4i64 = X86ISD::VBROADCAST 0x55c05a9fc810
try.c: 0x55c05a9fc810: i64,ch = load<LD8[%lsr.iv6971]> 0x55c05a96c960, 0x55c05aa4e960, undef:i64
try.c: 0x55c05aa4e960: i64,ch = CopyFromReg 0x55c05a96c960, Register:i64 %vreg50
try.c: 0x55c05aa57720: i64 = Register %vreg50
try.c: 0x55c05aa09520: i64 = undef
try.c: 0x55c05aa574c0: v4i64,ch = CopyFromReg 0x55c05a96c960, Register:v4i64 %vreg13
try.c: 0x55c05aa5c180: v4i64 = Register %vreg13
try.c: 0x55c05aa0aeb0: v16i32 = X86ISD::VBROADCAST 0x55c05aa5bb90
try.c: 0x55c05aa5bb90: i32,ch = load<LD4[ConstantPool]> 0x55c05a96c960, 0x55c05a9fbdf0, undef:i64
try.c: 0x55c05a9fbdf0: i64 = X86ISD::WrapperRIP TargetConstantPool:i64<i32 1> 0
try.c: 0x55c05aa44f30: i64 = TargetConstantPool<i32 1> 0
try.c: 0x55c05aa09520: i64 = undef
try.c: 0x55c05aa60d00: v16i32 = BUILD_VECTOR Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>
try.c: 0x55c05aa60bd0: i32 = Constant<0>
try.c: 0x55c05aa60bd0: i32 = Constant<0>
try.c: 0x55c05aa60bd0: i32 = Constant<0>
try.c: 0x55c05aa60bd0: i32 = Constant<0>
try.c: 0x55c05aa60bd0: i32 = Constant<0>
try.c: 0x55c05aa60bd0: i32 = Constant<0>
try.c: 0x55c05aa60bd0: i32 = Constant<0>
try.c: 0x55c05aa60bd0: i32 = Constant<0>
try.c: 0x55c05aa60bd0: i32 = Constant<0>
try.c: ...

Number of similar (compiler,implementation) pairs: 1, namely:
Compiler                                                                             Implementations
clang -march=native -O2 -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE  avx

Compiler output

Implementation: avx
Security model: constbranchindex
Compiler: clang -march=native -O3 -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE
try.c: fatal error: error in backend: Cannot select: 0x563cd9c52060: v4i64 = X86ISD::VTRUNC 0x563cd9c51f30
try.c: 0x563cd9c51f30: v16i32 = vselect 0x563cd9c41330, 0x563cd9bd4230, 0x563cd9c51e00
try.c: 0x563cd9c41330: v4i1 = X86ISD::PCMPGTM 0x563cd9c38700, 0x563cd9c35a10
try.c: 0x563cd9c38700: v4i64 = X86ISD::VBROADCAST 0x563cd9bd46f0
try.c: 0x563cd9bd46f0: i64,ch = load<LD8[%lsr.iv6971]> 0x563cd9b32a20, 0x563cd9bcd740, undef:i64
try.c: 0x563cd9bcd740: i64,ch = CopyFromReg 0x563cd9b32a20, Register:i64 %vreg50
try.c: 0x563cd9c35c70: i64 = Register %vreg50
try.c: 0x563cd9bd1a90: i64 = undef
try.c: 0x563cd9c35a10: v4i64,ch = CopyFromReg 0x563cd9b32a20, Register:v4i64 %vreg13
try.c: 0x563cd9c38f50: v4i64 = Register %vreg13
try.c: 0x563cd9bd4230: v16i32 = X86ISD::VBROADCAST 0x563cd9c38960
try.c: 0x563cd9c38960: i32,ch = load<LD4[ConstantPool]> 0x563cd9b32a20, 0x563cd9bcbd10, undef:i64
try.c: 0x563cd9bcbd10: i64 = X86ISD::WrapperRIP TargetConstantPool:i64<i32 1> 0
try.c: 0x563cd9bd2410: i64 = TargetConstantPool<i32 1> 0
try.c: 0x563cd9bd1a90: i64 = undef
try.c: 0x563cd9c51e00: v16i32 = BUILD_VECTOR Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>
try.c: 0x563cd9c51cd0: i32 = Constant<0>
try.c: 0x563cd9c51cd0: i32 = Constant<0>
try.c: 0x563cd9c51cd0: i32 = Constant<0>
try.c: 0x563cd9c51cd0: i32 = Constant<0>
try.c: 0x563cd9c51cd0: i32 = Constant<0>
try.c: 0x563cd9c51cd0: i32 = Constant<0>
try.c: 0x563cd9c51cd0: i32 = Constant<0>
try.c: 0x563cd9c51cd0: i32 = Constant<0>
try.c: 0x563cd9c51cd0: i32 = Constant<0>
try.c: ...

Number of similar (compiler,implementation) pairs: 1, namely:
Compiler                                                                             Implementations
clang -march=native -O3 -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE  avx

Compiler output

Implementation: avx
Security model: constbranchindex
Compiler: clang -march=native -O -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE
try.c: fatal error: error in backend: Cannot select: 0x55c73620ee80: v4i64 = X86ISD::VTRUNC 0x55c73620ed50
try.c: 0x55c73620ed50: v16i32 = vselect 0x55c7362243e0, 0x55c7361c1060, 0x55c73620ec20
try.c: 0x55c7362243e0: v4i1 = X86ISD::PCMPGTM 0x55c736209850, 0x55c7362053e0
try.c: 0x55c736209850: v4i64 = X86ISD::VBROADCAST 0x55c7361be200
try.c: 0x55c7361be200: i64,ch = load<LD8[%lsr.iv6971]> 0x55c73611a930, 0x55c7361f3ae0, undef:i64
try.c: 0x55c7361f3ae0: i64,ch = CopyFromReg 0x55c73611a930, Register:i64 %vreg50
try.c: 0x55c736205640: i64 = Register %vreg50
try.c: 0x55c7361bf6d0: i64 = undef
try.c: 0x55c7362053e0: v4i64,ch = CopyFromReg 0x55c73611a930, Register:v4i64 %vreg13
try.c: 0x55c73620a0a0: v4i64 = Register %vreg13
try.c: 0x55c7361c1060: v16i32 = X86ISD::VBROADCAST 0x55c736209ab0
try.c: 0x55c736209ab0: i32,ch = load<LD4[ConstantPool]> 0x55c73611a930, 0x55c7361b2500, undef:i64
try.c: 0x55c7361b2500: i64 = X86ISD::WrapperRIP TargetConstantPool:i64<i32 1> 0
try.c: 0x55c7361f4ce0: i64 = TargetConstantPool<i32 1> 0
try.c: 0x55c7361bf6d0: i64 = undef
try.c: 0x55c73620ec20: v16i32 = BUILD_VECTOR Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>
try.c: 0x55c73620eaf0: i32 = Constant<0>
try.c: 0x55c73620eaf0: i32 = Constant<0>
try.c: 0x55c73620eaf0: i32 = Constant<0>
try.c: 0x55c73620eaf0: i32 = Constant<0>
try.c: 0x55c73620eaf0: i32 = Constant<0>
try.c: 0x55c73620eaf0: i32 = Constant<0>
try.c: 0x55c73620eaf0: i32 = Constant<0>
try.c: 0x55c73620eaf0: i32 = Constant<0>
try.c: 0x55c73620eaf0: i32 = Constant<0>
try.c: ...

Number of similar (compiler,implementation) pairs: 1, namely:
Compiler                                                                             Implementations
clang -march=native -O -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE   avx

Compiler output

Implementation: avx
Security model: constbranchindex
Compiler: clang -mcpu=native -O3 -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE
core.c: core.c:20:18: error: always_inline function '_mm256_loadu_si256' requires target feature 'sse4.2', but would be inlined into function 'crypto_core_scale3sntrup857_avx_constbranchindex' that is compiled without support for 'sse4.2'
core.c: __m256i save = _mm256_loadu_si256((__m256i *) (inbytes+2*i));
core.c: ^
core.c: core.c:25:19: error: always_inline function '_mm256_loadu_si256' requires target feature 'sse4.2', but would be inlined into function 'crypto_core_scale3sntrup857_avx_constbranchindex' that is compiled without support for 'sse4.2'
core.c: __m256i x = _mm256_loadu_si256((__m256i *) inbytes);
core.c: ^
core.c: core.c:27:11: error: always_inline function '_mm256_mullo_epi16' requires target feature 'sse4.2', but would be inlined into function 'crypto_core_scale3sntrup857_avx_constbranchindex' that is compiled without support for 'sse4.2'
core.c: x = _mm256_mullo_epi16(x,_mm256_set1_epi16(3));
core.c: ^
core.c: core.c:27:32: error: always_inline function '_mm256_set1_epi16' requires target feature 'sse4.2', but would be inlined into function 'crypto_core_scale3sntrup857_avx_constbranchindex' that is compiled without support for 'sse4.2'
core.c: x = _mm256_mullo_epi16(x,_mm256_set1_epi16(3));
core.c: ^
core.c: core.c:28:11: error: always_inline function '_mm256_sub_epi16' requires target feature 'sse4.2', but would be inlined into function 'crypto_core_scale3sntrup857_avx_constbranchindex' that is compiled without support for 'sse4.2'
core.c: x = _mm256_sub_epi16(x,_mm256_set1_epi16((q+1)/2));
core.c: ^
core.c: core.c:28:30: error: always_inline function '_mm256_set1_epi16' requires target feature 'sse4.2', but would be inlined into function 'crypto_core_scale3sntrup857_avx_constbranchindex' that is compiled without support for 'sse4.2'
core.c: x = _mm256_sub_epi16(x,_mm256_set1_epi16((q+1)/2));
core.c: ^
core.c: core.c:29:14: error: always_inline function '_mm256_srai_epi16' requires target feature 'sse4.2', but would be inlined into function 'crypto_core_scale3sntrup857_avx_constbranchindex' that is compiled without support for 'sse4.2'
core.c: xneg = _mm256_srai_epi16(x,15);
core.c: ^
core.c: core.c:30:11: error: always_inline function '_mm256_add_epi16' requires target feature 'sse4.2', but would be inlined into function 'crypto_core_scale3sntrup857_avx_constbranchindex' that is compiled without support for 'sse4.2'
core.c: x = _mm256_add_epi16(x,_mm256_set1_epi16(q)&xneg);
core.c: ^
core.c: core.c:30:30: error: always_inline function '_mm256_set1_epi16' requires target feature 'sse4.2', but would be inlined into function 'crypto_core_scale3sntrup857_avx_constbranchindex' that is compiled without support for 'sse4.2'
core.c: ...

Number of similar (compiler,implementation) pairs: 1, namely:
Compiler                                                                             Implementations
clang -mcpu=native -O3 -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE   avx
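
The source lines quoted in the diagnostics above (core.c lines 27 to 30) show the first steps of the per-coefficient arithmetic in the avx implementation. As a rough illustration only, the following scalar C sketch mirrors those four visible steps on a single 16-bit value. The helper name visible_steps and the constant Q = 5167 (the modulus normally associated with sntrup857) are assumptions and do not appear in this report. The steps after line 30 are truncated in the log and are deliberately not reproduced.

```c
#include <stdint.h>

/* Assumption: q for sntrup857; the report itself never states its value. */
#define Q 5167

/* Hypothetical helper: scalar analogue of core.c lines 27-30 only.
   It is NOT the complete scale3 operation; later steps are elided in the log. */
static int16_t visible_steps(int16_t x)
{
  int16_t xneg;

  x = (int16_t)(3 * x);            /* line 27: multiply by 3, keep low 16 bits */
  x = (int16_t)(x - (Q + 1) / 2);  /* line 28: subtract (q+1)/2 */
  xneg = (int16_t)-(x < 0);        /* line 29: 0 or -1, the mask an arithmetic shift by 15 yields */
  x = (int16_t)(x + (Q & xneg));   /* line 30: add q only where x is negative */
  return x;
}
```

Under these assumptions, visible_steps(1000) evaluates to 416 (3000 - 2584, no correction needed), and visible_steps(0) evaluates to 2583 (-2584 + 5167).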

Compiler output

Implementation: ref
Security model: constbranchindex
Compiler: clang -march=native -O2 -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE
try.c: fatal error: error in backend: Cannot select: 0x564422b584b0: v4i64 = X86ISD::VTRUNC 0x564422b58380
try.c: 0x564422b58380: v16i32 = vselect 0x564422b44190, 0x564422b0a4d0, 0x564422b58250
try.c: 0x564422b44190: v4i1 = X86ISD::PCMPGTM 0x564422b3db30, 0x564422b396c0
try.c: 0x564422b3db30: v4i64 = X86ISD::VBROADCAST 0x564422b06c90
try.c: 0x564422b06c90: i64,ch = load<LD8[%lsr.iv6971]> 0x564422a4e9d0, 0x564422b34520, undef:i64
try.c: 0x564422b34520: i64,ch = CopyFromReg 0x564422a4e9d0, Register:i64 %vreg50
try.c: 0x564422b39920: i64 = Register %vreg50
try.c: 0x564422b08b40: i64 = undef
try.c: 0x564422b396c0: v4i64,ch = CopyFromReg 0x564422a4e9d0, Register:v4i64 %vreg13
try.c: 0x564422b3e380: v4i64 = Register %vreg13
try.c: 0x564422b0a4d0: v16i32 = X86ISD::VBROADCAST 0x564422b3dd90
try.c: 0x564422b3dd90: i32,ch = load<LD4[ConstantPool]> 0x564422a4e9d0, 0x564422b06270, undef:i64
try.c: 0x564422b06270: i64 = X86ISD::WrapperRIP TargetConstantPool:i64<i32 1> 0
try.c: 0x564422afb190: i64 = TargetConstantPool<i32 1> 0
try.c: 0x564422b08b40: i64 = undef
try.c: 0x564422b58250: v16i32 = BUILD_VECTOR Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>
try.c: 0x564422b58120: i32 = Constant<0>
try.c: 0x564422b58120: i32 = Constant<0>
try.c: 0x564422b58120: i32 = Constant<0>
try.c: 0x564422b58120: i32 = Constant<0>
try.c: 0x564422b58120: i32 = Constant<0>
try.c: 0x564422b58120: i32 = Constant<0>
try.c: 0x564422b58120: i32 = Constant<0>
try.c: 0x564422b58120: i32 = Constant<0>
try.c: 0x564422b58120: i32 = Constant<0>
try.c: ...

Number of similar (compiler,implementation) pairs: 1, namely:
Compiler                                                                             Implementations
clang -march=native -O2 -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE  ref

Compiler output

Implementation: ref
Security model: constbranchindex
Compiler: clang -march=native -O3 -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE
try.c: fatal error: error in backend: Cannot select: 0x55d018a93920: v4i64 = X86ISD::VTRUNC 0x55d018a937f0
try.c: 0x55d018a937f0: v16i32 = vselect 0x55d018a8e2f0, 0x55d018a0ee00, 0x55d018a936c0
try.c: 0x55d018a8e2f0: v4i1 = X86ISD::PCMPGTM 0x55d018a79f70, 0x55d018a75b00
try.c: 0x55d018a79f70: v4i64 = X86ISD::VBROADCAST 0x55d018a0f2c0
try.c: 0x55d018a0f2c0: i64,ch = load<LD8[%lsr.iv6971]> 0x55d018973a30, 0x55d018a16f10, undef:i64
try.c: 0x55d018a16f10: i64,ch = CopyFromReg 0x55d018973a30, Register:i64 %vreg50
try.c: 0x55d018a75d60: i64 = Register %vreg50
try.c: 0x55d018a237d0: i64 = undef
try.c: 0x55d018a75b00: v4i64,ch = CopyFromReg 0x55d018973a30, Register:v4i64 %vreg13
try.c: 0x55d018a7a7c0: v4i64 = Register %vreg13
try.c: 0x55d018a0ee00: v16i32 = X86ISD::VBROADCAST 0x55d018a7a1d0
try.c: 0x55d018a7a1d0: i32,ch = load<LD4[ConstantPool]> 0x55d018973a30, 0x55d018a154e0, undef:i64
try.c: 0x55d018a154e0: i64 = X86ISD::WrapperRIP TargetConstantPool:i64<i32 1> 0
try.c: 0x55d018a24150: i64 = TargetConstantPool<i32 1> 0
try.c: 0x55d018a237d0: i64 = undef
try.c: 0x55d018a936c0: v16i32 = BUILD_VECTOR Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>
try.c: 0x55d018a93590: i32 = Constant<0>
try.c: 0x55d018a93590: i32 = Constant<0>
try.c: 0x55d018a93590: i32 = Constant<0>
try.c: 0x55d018a93590: i32 = Constant<0>
try.c: 0x55d018a93590: i32 = Constant<0>
try.c: 0x55d018a93590: i32 = Constant<0>
try.c: 0x55d018a93590: i32 = Constant<0>
try.c: 0x55d018a93590: i32 = Constant<0>
try.c: 0x55d018a93590: i32 = Constant<0>
try.c: ...

Number of similar (compiler,implementation) pairs: 1, namely:
Compiler                                                                             Implementations
clang -march=native -O3 -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE  ref

Compiler output

Implementation: ref
Security model: constbranchindex
Compiler: clang -march=native -O -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE
try.c: fatal error: error in backend: Cannot select: 0x55569eac0c70: v4i64 = X86ISD::VTRUNC 0x55569eac0b40
try.c: 0x55569eac0b40: v16i32 = vselect 0x55569eabd650, 0x55569ea449c0, 0x55569eac0a10
try.c: 0x55569eabd650: v4i1 = X86ISD::PCMPGTM 0x55569ea9d090, 0x55569ea98c20
try.c: 0x55569ea9d090: v4i64 = X86ISD::VBROADCAST 0x55569ea3d9d0
try.c: 0x55569ea3d9d0: i64,ch = load<LD8[%lsr.iv6971]> 0x55569e9ad930, 0x55569ea8f6d0, undef:i64
try.c: 0x55569ea8f6d0: i64,ch = CopyFromReg 0x55569e9ad930, Register:i64 %vreg50
try.c: 0x55569ea98e80: i64 = Register %vreg50
try.c: 0x55569ea43030: i64 = undef
try.c: 0x55569ea98c20: v4i64,ch = CopyFromReg 0x55569e9ad930, Register:v4i64 %vreg13
try.c: 0x55569ea9d8e0: v4i64 = Register %vreg13
try.c: 0x55569ea449c0: v16i32 = X86ISD::VBROADCAST 0x55569ea9d2f0
try.c: 0x55569ea9d2f0: i32,ch = load<LD4[ConstantPool]> 0x55569e9ad930, 0x55569ea3cfb0, undef:i64
try.c: 0x55569ea3cfb0: i64 = X86ISD::WrapperRIP TargetConstantPool:i64<i32 1> 0
try.c: 0x55569ea5b380: i64 = TargetConstantPool<i32 1> 0
try.c: 0x55569ea43030: i64 = undef
try.c: 0x55569eac0a10: v16i32 = BUILD_VECTOR Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>
try.c: 0x55569eac08e0: i32 = Constant<0>
try.c: 0x55569eac08e0: i32 = Constant<0>
try.c: 0x55569eac08e0: i32 = Constant<0>
try.c: 0x55569eac08e0: i32 = Constant<0>
try.c: 0x55569eac08e0: i32 = Constant<0>
try.c: 0x55569eac08e0: i32 = Constant<0>
try.c: 0x55569eac08e0: i32 = Constant<0>
try.c: 0x55569eac08e0: i32 = Constant<0>
try.c: 0x55569eac08e0: i32 = Constant<0>
try.c: ...

Number of similar (compiler,implementation) pairs: 1, namely:
Compiler                                                                             Implementations
clang -march=native -O -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE   ref