Implementation notes: amd64, cel02, crypto_core/scale3sntrup761

Computer: cel02
Architecture: amd64
CPU ID: GenuineIntel-00050657-bfebfbff
SUPERCOP version: 20201130
Operation: crypto_core
Primitive: scale3sntrup761
Time | Object size | Test size | Implementation | Compiler | Benchmark date | SUPERCOP version
244 | 306 0 0 | 12356 816 800 | avx | gcc_-march=native_-mtune=native_-O2_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE | 20201211 | 20201130
256 | 272 0 0 | 10912 800 800 | avx | gcc_-march=native_-mtune=native_-Os_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE | 20201211 | 20201130
372 | 607 0 0 | 15997 824 864 | ref | gcc_-march=native_-mtune=native_-O3_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE | 20201211 | 20201130
410 | 305 0 0 | 12092 816 800 | avx | gcc_-march=native_-mtune=native_-O_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE | 20201211 | 20201130
418 | 306 0 0 | 15581 824 864 | avx | gcc_-march=native_-mtune=native_-O3_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE | 20201211 | 20201130
452 | 227 0 0 | 11052 792 760 | avx | clang_-march=native_-Os_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE | 20201211 | 20201130
2104 | 424 0 0 | 13210 800 760 | ref | clang_-mcpu=native_-O3_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE | 20201211 | 20201130
5754 | 185 0 0 | 12036 816 800 | ref | gcc_-march=native_-mtune=native_-O_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE | 20201211 | 20201130
7246 | 180 0 0 | 11092 792 760 | ref | clang_-march=native_-Os_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE | 20201211 | 20201130
8094 | 181 0 0 | 10988 808 800 | ref | gcc_-march=native_-mtune=native_-Os_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE | 20201211 | 20201130
8540 | 192 0 0 | 12300 816 800 | ref | gcc_-march=native_-mtune=native_-O2_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE | 20201211 | 20201130

Compiler output

Implementation: avx
Security model: constbranchindex
Compiler: clang -march=native -O2 -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE
try.c: fatal error: error in backend: Cannot select: 0x56480fdbbdf0: v4i64 = X86ISD::VTRUNC 0x56480fdbbcc0
try.c: 0x56480fdbbcc0: v16i32 = vselect 0x56480fdad0a0, 0x56480fd5aa90, 0x56480fdbbb90
try.c: 0x56480fdad0a0: v4i1 = X86ISD::PCMPGTM 0x56480fdb4c70, 0x56480fdb0800
try.c: 0x56480fdb4c70: v4i64 = X86ISD::VBROADCAST 0x56480fd56f10
try.c: 0x56480fd56f10: i64,ch = load<LD8[%lsr.iv6971]> 0x56480fcc5950, 0x56480fd9ec70, undef:i64
try.c: 0x56480fd9ec70: i64,ch = CopyFromReg 0x56480fcc5950, Register:i64 %vreg50
try.c: 0x56480fdb0a60: i64 = Register %vreg50
try.c: 0x56480fd59100: i64 = undef
try.c: 0x56480fdb0800: v4i64,ch = CopyFromReg 0x56480fcc5950, Register:v4i64 %vreg13
try.c: 0x56480fdb54c0: v4i64 = Register %vreg13
try.c: 0x56480fd5aa90: v16i32 = X86ISD::VBROADCAST 0x56480fdb4ed0
try.c: 0x56480fdb4ed0: i32,ch = load<LD4[ConstantPool]> 0x56480fcc5950, 0x56480fd564f0, undef:i64
try.c: 0x56480fd564f0: i64 = X86ISD::WrapperRIP TargetConstantPool:i64<i32 1> 0
try.c: 0x56480fd9fe70: i64 = TargetConstantPool<i32 1> 0
try.c: 0x56480fd59100: i64 = undef
try.c: 0x56480fdbbb90: v16i32 = BUILD_VECTOR Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>
try.c: 0x56480fdbba60: i32 = Constant<0>
try.c: 0x56480fdbba60: i32 = Constant<0>
try.c: 0x56480fdbba60: i32 = Constant<0>
try.c: 0x56480fdbba60: i32 = Constant<0>
try.c: 0x56480fdbba60: i32 = Constant<0>
try.c: 0x56480fdbba60: i32 = Constant<0>
try.c: 0x56480fdbba60: i32 = Constant<0>
try.c: 0x56480fdbba60: i32 = Constant<0>
try.c: 0x56480fdbba60: i32 = Constant<0>
try.c: ...

Number of similar (compiler,implementation) pairs: 1, namely:
Compiler | Implementations
clang -march=native -O2 -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE | avx

Compiler output

Implementation: avx
Security model: constbranchindex
Compiler: clang -march=native -O3 -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE
try.c: fatal error: error in backend: Cannot select: 0x5643bdf640d0: v4i64 = X86ISD::VTRUNC 0x5643bdf63fa0
try.c: 0x5643bdf63fa0: v16i32 = vselect 0x5643bdf60ab0, 0x5643bdee7620, 0x5643bdf63e70
try.c: 0x5643bdf60ab0: v4i1 = X86ISD::PCMPGTM 0x5643bdf4af10, 0x5643bdf47ab0
try.c: 0x5643bdf4af10: v4i64 = X86ISD::VBROADCAST 0x5643bdee7ae0
try.c: 0x5643bdee7ae0: i64,ch = load<LD8[%lsr.iv6971]> 0x5643bde45a30, 0x5643bdeed210, undef:i64
try.c: 0x5643bdeed210: i64,ch = CopyFromReg 0x5643bde45a30, Register:i64 %vreg50
try.c: 0x5643bdf47d10: i64 = Register %vreg50
try.c: 0x5643bdee32c0: i64 = undef
try.c: 0x5643bdf47ab0: v4i64,ch = CopyFromReg 0x5643bde45a30, Register:v4i64 %vreg13
try.c: 0x5643bdf4b760: v4i64 = Register %vreg13
try.c: 0x5643bdee7620: v16i32 = X86ISD::VBROADCAST 0x5643bdf4b170
try.c: 0x5643bdf4b170: i32,ch = load<LD4[ConstantPool]> 0x5643bde45a30, 0x5643bdee9fc0, undef:i64
try.c: 0x5643bdee9fc0: i64 = X86ISD::WrapperRIP TargetConstantPool:i64<i32 1> 0
try.c: 0x5643bdee3c40: i64 = TargetConstantPool<i32 1> 0
try.c: 0x5643bdee32c0: i64 = undef
try.c: 0x5643bdf63e70: v16i32 = BUILD_VECTOR Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>
try.c: 0x5643bdf63d40: i32 = Constant<0>
try.c: 0x5643bdf63d40: i32 = Constant<0>
try.c: 0x5643bdf63d40: i32 = Constant<0>
try.c: 0x5643bdf63d40: i32 = Constant<0>
try.c: 0x5643bdf63d40: i32 = Constant<0>
try.c: 0x5643bdf63d40: i32 = Constant<0>
try.c: 0x5643bdf63d40: i32 = Constant<0>
try.c: 0x5643bdf63d40: i32 = Constant<0>
try.c: 0x5643bdf63d40: i32 = Constant<0>
try.c: ...

Number of similar (compiler,implementation) pairs: 1, namely:
Compiler | Implementations
clang -march=native -O3 -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE | avx

Compiler output

Implementation: avx
Security model: constbranchindex
Compiler: clang -march=native -O -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE
try.c: fatal error: error in backend: Cannot select: 0x5635dfbfdb00: v4i64 = X86ISD::VTRUNC 0x5635dfbfd9d0
try.c: 0x5635dfbfd9d0: v16i32 = vselect 0x5635dfbf01e0, 0x5635dfb882c0, 0x5635dfbfd8a0
try.c: 0x5635dfbf01e0: v4i1 = X86ISD::PCMPGTM 0x5635dfbe3d30, 0x5635dfbe04d0
try.c: 0x5635dfbe3d30: v4i64 = X86ISD::VBROADCAST 0x5635dfb8a6d0
try.c: 0x5635dfb8a6d0: i64,ch = load<LD8[%lsr.iv6971]> 0x5635dfaf4940, 0x5635dfbda3f0, undef:i64
try.c: 0x5635dfbda3f0: i64,ch = CopyFromReg 0x5635dfaf4940, Register:i64 %vreg50
try.c: 0x5635dfbe0730: i64 = Register %vreg50
try.c: 0x5635dfb86930: i64 = undef
try.c: 0x5635dfbe04d0: v4i64,ch = CopyFromReg 0x5635dfaf4940, Register:v4i64 %vreg13
try.c: 0x5635dfbe4580: v4i64 = Register %vreg13
try.c: 0x5635dfb882c0: v16i32 = X86ISD::VBROADCAST 0x5635dfbe3f90
try.c: 0x5635dfbe3f90: i32,ch = load<LD4[ConstantPool]> 0x5635dfaf4940, 0x5635dfb89cb0, undef:i64
try.c: 0x5635dfb89cb0: i64 = X86ISD::WrapperRIP TargetConstantPool:i64<i32 1> 0
try.c: 0x5635dfba5970: i64 = TargetConstantPool<i32 1> 0
try.c: 0x5635dfb86930: i64 = undef
try.c: 0x5635dfbfd8a0: v16i32 = BUILD_VECTOR Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>
try.c: 0x5635dfbfd770: i32 = Constant<0>
try.c: 0x5635dfbfd770: i32 = Constant<0>
try.c: 0x5635dfbfd770: i32 = Constant<0>
try.c: 0x5635dfbfd770: i32 = Constant<0>
try.c: 0x5635dfbfd770: i32 = Constant<0>
try.c: 0x5635dfbfd770: i32 = Constant<0>
try.c: 0x5635dfbfd770: i32 = Constant<0>
try.c: 0x5635dfbfd770: i32 = Constant<0>
try.c: 0x5635dfbfd770: i32 = Constant<0>
try.c: ...

Number of similar (compiler,implementation) pairs: 1, namely:
Compiler | Implementations
clang -march=native -O -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE | avx

Compiler output

Implementation: avx
Security model: constbranchindex
Compiler: clang -mcpu=native -O3 -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE
core.c: core.c:20:18: error: always_inline function '_mm256_loadu_si256' requires target feature 'sse4.2', but would be inlined into function 'crypto_core_scale3sntrup761_avx_constbranchindex' that is compiled without support for 'sse4.2'
core.c: __m256i save = _mm256_loadu_si256((__m256i *) (inbytes+2*i));
core.c: ^
core.c: core.c:25:19: error: always_inline function '_mm256_loadu_si256' requires target feature 'sse4.2', but would be inlined into function 'crypto_core_scale3sntrup761_avx_constbranchindex' that is compiled without support for 'sse4.2'
core.c: __m256i x = _mm256_loadu_si256((__m256i *) inbytes);
core.c: ^
core.c: core.c:27:11: error: always_inline function '_mm256_mullo_epi16' requires target feature 'sse4.2', but would be inlined into function 'crypto_core_scale3sntrup761_avx_constbranchindex' that is compiled without support for 'sse4.2'
core.c: x = _mm256_mullo_epi16(x,_mm256_set1_epi16(3));
core.c: ^
core.c: core.c:27:32: error: always_inline function '_mm256_set1_epi16' requires target feature 'sse4.2', but would be inlined into function 'crypto_core_scale3sntrup761_avx_constbranchindex' that is compiled without support for 'sse4.2'
core.c: x = _mm256_mullo_epi16(x,_mm256_set1_epi16(3));
core.c: ^
core.c: core.c:28:11: error: always_inline function '_mm256_sub_epi16' requires target feature 'sse4.2', but would be inlined into function 'crypto_core_scale3sntrup761_avx_constbranchindex' that is compiled without support for 'sse4.2'
core.c: x = _mm256_sub_epi16(x,_mm256_set1_epi16((q+1)/2));
core.c: ^
core.c: core.c:28:30: error: always_inline function '_mm256_set1_epi16' requires target feature 'sse4.2', but would be inlined into function 'crypto_core_scale3sntrup761_avx_constbranchindex' that is compiled without support for 'sse4.2'
core.c: x = _mm256_sub_epi16(x,_mm256_set1_epi16((q+1)/2));
core.c: ^
core.c: core.c:29:14: error: always_inline function '_mm256_srai_epi16' requires target feature 'sse4.2', but would be inlined into function 'crypto_core_scale3sntrup761_avx_constbranchindex' that is compiled without support for 'sse4.2'
core.c: xneg = _mm256_srai_epi16(x,15);
core.c: ^
core.c: core.c:30:11: error: always_inline function '_mm256_add_epi16' requires target feature 'sse4.2', but would be inlined into function 'crypto_core_scale3sntrup761_avx_constbranchindex' that is compiled without support for 'sse4.2'
core.c: x = _mm256_add_epi16(x,_mm256_set1_epi16(q)&xneg);
core.c: ^
core.c: core.c:30:30: error: always_inline function '_mm256_set1_epi16' requires target feature 'sse4.2', but would be inlined into function 'crypto_core_scale3sntrup761_avx_constbranchindex' that is compiled without support for 'sse4.2'
core.c: ...

Number of similar (compiler,implementation) pairs: 1, namely:
Compiler | Implementations
clang -mcpu=native -O3 -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE | avx
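
The core.c lines quoted in the diagnostics above (multiply by 3, subtract (q+1)/2, add q under a sign mask) suggest a branch-free reduction of 3x modulo q. A minimal scalar sketch of that idea follows, assuming q = 4591 (the sntrup761 modulus) and inputs in the centered range -(q-1)/2..(q-1)/2. The function name scale3_one, the second masked add, and the final recentering step are assumptions made for illustration; the log is truncated before the end of the routine, so this is not the SUPERCOP source.

```c
#include <stdint.h>

/* Assumption: q = 4591, the modulus used by sntrup761. */
#define Q 4591

/* Hypothetical helper (not part of SUPERCOP): returns 3*x reduced modulo Q
   into the centered range -(Q-1)/2 .. (Q-1)/2, for x already in that range.
   No branch or table index depends on x. */
int16_t scale3_one(int16_t x)
{
  int16_t neg;

  x = (int16_t)(x * 3);            /* |3x| <= 6885, fits in int16_t */
  x = (int16_t)(x - (Q + 1) / 2);  /* now in -9181 .. 4589 */

  /* neg is -1 if x is negative, else 0. The unsigned shift avoids relying
     on implementation-defined right shifts of negative values. */
  neg = (int16_t)-(int)((uint16_t)x >> 15);
  x = (int16_t)(x + (Q & neg));    /* now in -4590 .. 4590 */

  neg = (int16_t)-(int)((uint16_t)x >> 15);
  x = (int16_t)(x + (Q & neg));    /* now in 0 .. 4590 */

  /* Assumed final step: shift back to the centered range. The two
     subtractions total Q, so the result stays congruent to 3x mod Q. */
  x = (int16_t)(x - (Q - 1) / 2);
  return x;
}
```

The vector routine in the log appears to apply the same arithmetic to 16 coefficients at a time through the _mm256_*_epi16 intrinsics; the scalar form is only meant to make the range argument easy to follow.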

Compiler output

Implementation: ref
Security model: constbranchindex
Compiler: clang -march=native -O2 -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE
try.c: fatal error: error in backend: Cannot select: 0x55afed69df10: v4i64 = X86ISD::VTRUNC 0x55afed69dde0
try.c: 0x55afed69dde0: v16i32 = vselect 0x55afed6a53f0, 0x55afed63d210, 0x55afed69dcb0
try.c: 0x55afed6a53f0: v4i1 = X86ISD::PCMPGTM 0x55afed6998f0, 0x55afed695480
try.c: 0x55afed6998f0: v4i64 = X86ISD::VBROADCAST 0x55afed65eda0
try.c: 0x55afed65eda0: i64,ch = load<LD8[%lsr.iv6971]> 0x55afed5aa950, 0x55afed683380, undef:i64
try.c: 0x55afed683380: i64,ch = CopyFromReg 0x55afed5aa950, Register:i64 %vreg50
try.c: 0x55afed6956e0: i64 = Register %vreg50
try.c: 0x55afed660270: i64 = undef
try.c: 0x55afed695480: v4i64,ch = CopyFromReg 0x55afed5aa950, Register:v4i64 %vreg13
try.c: 0x55afed69a140: v4i64 = Register %vreg13
try.c: 0x55afed63d210: v16i32 = X86ISD::VBROADCAST 0x55afed699b50
try.c: 0x55afed699b50: i32,ch = load<LD4[ConstantPool]> 0x55afed5aa950, 0x55afed640b70, undef:i64
try.c: 0x55afed640b70: i64 = X86ISD::WrapperRIP TargetConstantPool:i64<i32 1> 0
try.c: 0x55afed639c80: i64 = TargetConstantPool<i32 1> 0
try.c: 0x55afed660270: i64 = undef
try.c: 0x55afed69dcb0: v16i32 = BUILD_VECTOR Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>
try.c: 0x55afed69db80: i32 = Constant<0>
try.c: 0x55afed69db80: i32 = Constant<0>
try.c: 0x55afed69db80: i32 = Constant<0>
try.c: 0x55afed69db80: i32 = Constant<0>
try.c: 0x55afed69db80: i32 = Constant<0>
try.c: 0x55afed69db80: i32 = Constant<0>
try.c: 0x55afed69db80: i32 = Constant<0>
try.c: 0x55afed69db80: i32 = Constant<0>
try.c: 0x55afed69db80: i32 = Constant<0>
try.c: ...

Number of similar (compiler,implementation) pairs: 1, namely:
Compiler | Implementations
clang -march=native -O2 -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE | ref

Compiler output

Implementation: ref
Security model: constbranchindex
Compiler: clang -march=native -O3 -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE
try.c: fatal error: error in backend: Cannot select: 0x55a3ec1600a0: v4i64 = X86ISD::VTRUNC 0x55a3ec15ff70
try.c: 0x55a3ec15ff70: v16i32 = vselect 0x55a3ec15aa70, 0x55a3ec0e34f0, 0x55a3ec15fe40
try.c: 0x55a3ec15aa70: v4i1 = X86ISD::PCMPGTM 0x55a3ec145ed0, 0x55a3ec142a70
try.c: 0x55a3ec145ed0: v4i64 = X86ISD::VBROADCAST 0x55a3ec0e39b0
try.c: 0x55a3ec0e39b0: i64,ch = load<LD8[%lsr.iv6971]> 0x55a3ec040a00, 0x55a3ec0da5e0, undef:i64
try.c: 0x55a3ec0da5e0: i64,ch = CopyFromReg 0x55a3ec040a00, Register:i64 %vreg50
try.c: 0x55a3ec142cd0: i64 = Register %vreg50
try.c: 0x55a3ec0d3920: i64 = undef
try.c: 0x55a3ec142a70: v4i64,ch = CopyFromReg 0x55a3ec040a00, Register:v4i64 %vreg13
try.c: 0x55a3ec146720: v4i64 = Register %vreg13
try.c: 0x55a3ec0e34f0: v16i32 = X86ISD::VBROADCAST 0x55a3ec146130
try.c: 0x55a3ec146130: i32,ch = load<LD4[ConstantPool]> 0x55a3ec040a00, 0x55a3ec0f1460, undef:i64
try.c: 0x55a3ec0f1460: i64 = X86ISD::WrapperRIP TargetConstantPool:i64<i32 1> 0
try.c: 0x55a3ec0d42a0: i64 = TargetConstantPool<i32 1> 0
try.c: 0x55a3ec0d3920: i64 = undef
try.c: 0x55a3ec15fe40: v16i32 = BUILD_VECTOR Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>
try.c: 0x55a3ec15fd10: i32 = Constant<0>
try.c: 0x55a3ec15fd10: i32 = Constant<0>
try.c: 0x55a3ec15fd10: i32 = Constant<0>
try.c: 0x55a3ec15fd10: i32 = Constant<0>
try.c: 0x55a3ec15fd10: i32 = Constant<0>
try.c: 0x55a3ec15fd10: i32 = Constant<0>
try.c: 0x55a3ec15fd10: i32 = Constant<0>
try.c: 0x55a3ec15fd10: i32 = Constant<0>
try.c: 0x55a3ec15fd10: i32 = Constant<0>
try.c: ...

Number of similar (compiler,implementation) pairs: 1, namely:
Compiler | Implementations
clang -march=native -O3 -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE | ref

Compiler output

Implementation: ref
Security model: constbranchindex
Compiler: clang -march=native -O -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE
try.c: fatal error: error in backend: Cannot select: 0x55e3ff39e290: v4i64 = X86ISD::VTRUNC 0x55e3ff39e160
try.c: 0x55e3ff39e160: v16i32 = vselect 0x55e3ff3ae3f0, 0x55e3ff33a5f0, 0x55e3ff39e030
try.c: 0x55e3ff3ae3f0: v4i1 = X86ISD::PCMPGTM 0x55e3ff397080, 0x55e3ff392c10
try.c: 0x55e3ff397080: v4i64 = X86ISD::VBROADCAST 0x55e3ff35aec0
try.c: 0x55e3ff35aec0: i64,ch = load<LD8[%lsr.iv6971]> 0x55e3ff2a79a0, 0x55e3ff389800, undef:i64
try.c: 0x55e3ff389800: i64,ch = CopyFromReg 0x55e3ff2a79a0, Register:i64 %vreg50
try.c: 0x55e3ff392e70: i64 = Register %vreg50
try.c: 0x55e3ff35c390: i64 = undef
try.c: 0x55e3ff392c10: v4i64,ch = CopyFromReg 0x55e3ff2a79a0, Register:v4i64 %vreg13
try.c: 0x55e3ff3978d0: v4i64 = Register %vreg13
try.c: 0x55e3ff33a5f0: v16i32 = X86ISD::VBROADCAST 0x55e3ff3972e0
try.c: 0x55e3ff3972e0: i32,ch = load<LD4[ConstantPool]> 0x55e3ff2a79a0, 0x55e3ff345bb0, undef:i64
try.c: 0x55e3ff345bb0: i64 = X86ISD::WrapperRIP TargetConstantPool:i64<i32 1> 0
try.c: 0x55e3ff336280: i64 = TargetConstantPool<i32 1> 0
try.c: 0x55e3ff35c390: i64 = undef
try.c: 0x55e3ff39e030: v16i32 = BUILD_VECTOR Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>
try.c: 0x55e3ff39df00: i32 = Constant<0>
try.c: 0x55e3ff39df00: i32 = Constant<0>
try.c: 0x55e3ff39df00: i32 = Constant<0>
try.c: 0x55e3ff39df00: i32 = Constant<0>
try.c: 0x55e3ff39df00: i32 = Constant<0>
try.c: 0x55e3ff39df00: i32 = Constant<0>
try.c: 0x55e3ff39df00: i32 = Constant<0>
try.c: 0x55e3ff39df00: i32 = Constant<0>
try.c: 0x55e3ff39df00: i32 = Constant<0>
try.c: ...

Number of similar (compiler,implementation) pairs: 1, namely:
Compiler | Implementations
clang -march=native -O -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE | ref