Implementation notes: amd64, cel02, crypto_core/scale3sntrup653

Computer: cel02
Architecture: amd64
CPU ID: GenuineIntel-00050657-bfebfbff
SUPERCOP version: 20201130
Operation: crypto_core
Primitive: scale3sntrup653
Time  Object size  Test size      Implementation  Compiler                                                                             Benchmark date  SUPERCOP version
226   305 0 0      12092 816 800  avx             gcc_-march=native_-mtune=native_-O_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE          20201211        20201130
356   306 0 0      12356 816 800  avx             gcc_-march=native_-mtune=native_-O2_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE         20201211        20201130
386   306 0 0      15581 824 864  avx             gcc_-march=native_-mtune=native_-O3_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE         20201211        20201130
398   227 0 0      11052 792 760  avx             clang_-march=native_-Os_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE  20201211        20201130
398   272 0 0      10912 800 800  avx             gcc_-march=native_-mtune=native_-Os_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE         20201211        20201130
400   1171 0 0     16573 824 864  ref             gcc_-march=native_-mtune=native_-O3_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE         20201211        20201130
1178  644 0 0      13434 800 760  ref             clang_-mcpu=native_-O3_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE   20201211        20201130
4830  185 0 0      12036 816 800  ref             gcc_-march=native_-mtune=native_-O_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE          20201211        20201130
6158  180 0 0      11092 792 760  ref             clang_-march=native_-Os_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE  20201211        20201130
6716  181 0 0      10988 808 800  ref             gcc_-march=native_-mtune=native_-Os_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE         20201211        20201130
8788  192 0 0      12300 816 800  ref             gcc_-march=native_-mtune=native_-O2_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE         20201211        20201130

Compiler output

Implementation: avx
Security model: constbranchindex
Compiler: clang -march=native -O2 -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE
try.c: fatal error: error in backend: Cannot select: 0x5570c22024f0: v4i64 = X86ISD::VTRUNC 0x5570c22023c0
try.c: 0x5570c22023c0: v16i32 = vselect 0x5570c21ee7d0, 0x5570c218c310, 0x5570c2202290
try.c: 0x5570c21ee7d0: v4i1 = X86ISD::PCMPGTM 0x5570c21e7b70, 0x5570c21e3700
try.c: 0x5570c21e7b70: v4i64 = X86ISD::VBROADCAST 0x5570c2191b30
try.c: 0x5570c2191b30: i64,ch = load<LD8[%lsr.iv6971]> 0x5570c20f8950, 0x5570c21de560, undef:i64
try.c: 0x5570c21de560: i64,ch = CopyFromReg 0x5570c20f8950, Register:i64 %vreg50
try.c: 0x5570c21e3960: i64 = Register %vreg50
try.c: 0x5570c218a980: i64 = undef
try.c: 0x5570c21e3700: v4i64,ch = CopyFromReg 0x5570c20f8950, Register:v4i64 %vreg13
try.c: 0x5570c21e83c0: v4i64 = Register %vreg13
try.c: 0x5570c218c310: v16i32 = X86ISD::VBROADCAST 0x5570c21e7dd0
try.c: 0x5570c21e7dd0: i32,ch = load<LD4[ConstantPool]> 0x5570c20f8950, 0x5570c2191110, undef:i64
try.c: 0x5570c2191110: i64 = X86ISD::WrapperRIP TargetConstantPool:i64<i32 1> 0
try.c: 0x5570c21cd070: i64 = TargetConstantPool<i32 1> 0
try.c: 0x5570c218a980: i64 = undef
try.c: 0x5570c2202290: v16i32 = BUILD_VECTOR Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>
try.c: 0x5570c2202160: i32 = Constant<0>
try.c: 0x5570c2202160: i32 = Constant<0>
try.c: 0x5570c2202160: i32 = Constant<0>
try.c: 0x5570c2202160: i32 = Constant<0>
try.c: 0x5570c2202160: i32 = Constant<0>
try.c: 0x5570c2202160: i32 = Constant<0>
try.c: 0x5570c2202160: i32 = Constant<0>
try.c: 0x5570c2202160: i32 = Constant<0>
try.c: 0x5570c2202160: i32 = Constant<0>
try.c: ...

Number of similar (compiler,implementation) pairs: 1, namely:
Compiler Implementations
clang -march=native -O2 -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE avx

Compiler output

Implementation: avx
Security model: constbranchindex
Compiler: clang -march=native -O3 -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE
try.c: fatal error: error in backend: Cannot select: 0x55b79f86ee20: v4i64 = X86ISD::VTRUNC 0x55b79f86ecf0
try.c: 0x55b79f86ecf0: v16i32 = vselect 0x55b79f864260, 0x55b79f7e64f0, 0x55b79f86ebc0
try.c: 0x55b79f864260: v4i1 = X86ISD::PCMPGTM 0x55b79f84b0e0, 0x55b79f846660
try.c: 0x55b79f84b0e0: v4i64 = X86ISD::VBROADCAST 0x55b79f7e69b0
try.c: 0x55b79f7e69b0: i64,ch = load<LD8[%lsr.iv6971]> 0x55b79f744a40, 0x55b79f80aff0, undef:i64
try.c: 0x55b79f80aff0: i64,ch = CopyFromReg 0x55b79f744a40, Register:i64 %vreg50
try.c: 0x55b79f8468c0: i64 = Register %vreg50
try.c: 0x55b79f7c3cd0: i64 = undef
try.c: 0x55b79f846660: v4i64,ch = CopyFromReg 0x55b79f744a40, Register:v4i64 %vreg13
try.c: 0x55b79f84b930: v4i64 = Register %vreg13
try.c: 0x55b79f7e64f0: v16i32 = X86ISD::VBROADCAST 0x55b79f84b340
try.c: 0x55b79f84b340: i32,ch = load<LD4[ConstantPool]> 0x55b79f744a40, 0x55b79f8095c0, undef:i64
try.c: 0x55b79f8095c0: i64 = X86ISD::WrapperRIP TargetConstantPool:i64<i32 1> 0
try.c: 0x55b79f7c4650: i64 = TargetConstantPool<i32 1> 0
try.c: 0x55b79f7c3cd0: i64 = undef
try.c: 0x55b79f86ebc0: v16i32 = BUILD_VECTOR Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>
try.c: 0x55b79f86ea90: i32 = Constant<0>
try.c: 0x55b79f86ea90: i32 = Constant<0>
try.c: 0x55b79f86ea90: i32 = Constant<0>
try.c: 0x55b79f86ea90: i32 = Constant<0>
try.c: 0x55b79f86ea90: i32 = Constant<0>
try.c: 0x55b79f86ea90: i32 = Constant<0>
try.c: 0x55b79f86ea90: i32 = Constant<0>
try.c: 0x55b79f86ea90: i32 = Constant<0>
try.c: 0x55b79f86ea90: i32 = Constant<0>
try.c: ...

Number of similar (compiler,implementation) pairs: 1, namely:
Compiler Implementations
clang -march=native -O3 -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE avx

Compiler output

Implementation: avx
Security model: constbranchindex
Compiler: clang -march=native -O -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE
try.c: fatal error: error in backend: Cannot select: 0x56056cc45010: v4i64 = X86ISD::VTRUNC 0x56056cc44ee0
try.c: 0x56056cc44ee0: v16i32 = vselect 0x56056cc42f60, 0x56056cbeba10, 0x56056cc44db0
try.c: 0x56056cc42f60: v4i1 = X86ISD::PCMPGTM 0x56056cc3ff40, 0x56056cc3bad0
try.c: 0x56056cc3ff40: v4i64 = X86ISD::VBROADCAST 0x56056cbe2d60
try.c: 0x56056cbe2d60: i64,ch = load<LD8[%lsr.iv6971]> 0x56056cb50900, 0x56056cc32380, undef:i64
try.c: 0x56056cc32380: i64,ch = CopyFromReg 0x56056cb50900, Register:i64 %vreg50
try.c: 0x56056cc3bd30: i64 = Register %vreg50
try.c: 0x56056cbe4230: i64 = undef
try.c: 0x56056cc3bad0: v4i64,ch = CopyFromReg 0x56056cb50900, Register:v4i64 %vreg13
try.c: 0x56056cc40790: v4i64 = Register %vreg13
try.c: 0x56056cbeba10: v16i32 = X86ISD::VBROADCAST 0x56056cc401a0
try.c: 0x56056cc401a0: i32,ch = load<LD4[ConstantPool]> 0x56056cb50900, 0x56056cbe1700, undef:i64
try.c: 0x56056cbe1700: i64 = X86ISD::WrapperRIP TargetConstantPool:i64<i32 1> 0
try.c: 0x56056cc25b50: i64 = TargetConstantPool<i32 1> 0
try.c: 0x56056cbe4230: i64 = undef
try.c: 0x56056cc44db0: v16i32 = BUILD_VECTOR Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>
try.c: 0x56056cc44c80: i32 = Constant<0>
try.c: 0x56056cc44c80: i32 = Constant<0>
try.c: 0x56056cc44c80: i32 = Constant<0>
try.c: 0x56056cc44c80: i32 = Constant<0>
try.c: 0x56056cc44c80: i32 = Constant<0>
try.c: 0x56056cc44c80: i32 = Constant<0>
try.c: 0x56056cc44c80: i32 = Constant<0>
try.c: 0x56056cc44c80: i32 = Constant<0>
try.c: 0x56056cc44c80: i32 = Constant<0>
try.c: ...

Number of similar (compiler,implementation) pairs: 1, namely:
Compiler Implementations
clang -march=native -O -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE avx

Compiler output

Implementation: avx
Security model: constbranchindex
Compiler: clang -mcpu=native -O3 -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE
core.c: core.c:20:18: error: always_inline function '_mm256_loadu_si256' requires target feature 'sse4.2', but would be inlined into function 'crypto_core_scale3sntrup653_avx_constbranchindex' that is compiled without support for 'sse4.2'
core.c: __m256i save = _mm256_loadu_si256((__m256i *) (inbytes+2*i));
core.c: ^
core.c: core.c:25:19: error: always_inline function '_mm256_loadu_si256' requires target feature 'sse4.2', but would be inlined into function 'crypto_core_scale3sntrup653_avx_constbranchindex' that is compiled without support for 'sse4.2'
core.c: __m256i x = _mm256_loadu_si256((__m256i *) inbytes);
core.c: ^
core.c: core.c:27:11: error: always_inline function '_mm256_mullo_epi16' requires target feature 'sse4.2', but would be inlined into function 'crypto_core_scale3sntrup653_avx_constbranchindex' that is compiled without support for 'sse4.2'
core.c: x = _mm256_mullo_epi16(x,_mm256_set1_epi16(3));
core.c: ^
core.c: core.c:27:32: error: always_inline function '_mm256_set1_epi16' requires target feature 'sse4.2', but would be inlined into function 'crypto_core_scale3sntrup653_avx_constbranchindex' that is compiled without support for 'sse4.2'
core.c: x = _mm256_mullo_epi16(x,_mm256_set1_epi16(3));
core.c: ^
core.c: core.c:28:11: error: always_inline function '_mm256_sub_epi16' requires target feature 'sse4.2', but would be inlined into function 'crypto_core_scale3sntrup653_avx_constbranchindex' that is compiled without support for 'sse4.2'
core.c: x = _mm256_sub_epi16(x,_mm256_set1_epi16((q+1)/2));
core.c: ^
core.c: core.c:28:30: error: always_inline function '_mm256_set1_epi16' requires target feature 'sse4.2', but would be inlined into function 'crypto_core_scale3sntrup653_avx_constbranchindex' that is compiled without support for 'sse4.2'
core.c: x = _mm256_sub_epi16(x,_mm256_set1_epi16((q+1)/2));
core.c: ^
core.c: core.c:29:14: error: always_inline function '_mm256_srai_epi16' requires target feature 'sse4.2', but would be inlined into function 'crypto_core_scale3sntrup653_avx_constbranchindex' that is compiled without support for 'sse4.2'
core.c: xneg = _mm256_srai_epi16(x,15);
core.c: ^
core.c: core.c:30:11: error: always_inline function '_mm256_add_epi16' requires target feature 'sse4.2', but would be inlined into function 'crypto_core_scale3sntrup653_avx_constbranchindex' that is compiled without support for 'sse4.2'
core.c: x = _mm256_add_epi16(x,_mm256_set1_epi16(q)&xneg);
core.c: ^
core.c: core.c:30:30: error: always_inline function '_mm256_set1_epi16' requires target feature 'sse4.2', but would be inlined into function 'crypto_core_scale3sntrup653_avx_constbranchindex' that is compiled without support for 'sse4.2'
core.c: ...

Number of similar (compiler,implementation) pairs: 1, namely:
Compiler Implementations
clang -mcpu=native -O3 -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE avx

Compiler output

Implementation: ref
Security model: constbranchindex
Compiler: clang -march=native -O2 -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE
try.c: fatal error: error in backend: Cannot select: 0x563ebb4fd570: v4i64 = X86ISD::VTRUNC 0x563ebb4fd440
try.c: 0x563ebb4fd440: v16i32 = vselect 0x563ebb519cc0, 0x563ebb4aef90, 0x563ebb4fd310
try.c: 0x563ebb519cc0: v4i1 = X86ISD::PCMPGTM 0x563ebb4f58a0, 0x563ebb4f1430
try.c: 0x563ebb4f58a0: v4i64 = X86ISD::VBROADCAST 0x563ebb4989e0
try.c: 0x563ebb4989e0: i64,ch = load<LD8[%lsr.iv6971]> 0x563ebb406970, 0x563ebb4e3150, undef:i64
try.c: 0x563ebb4e3150: i64,ch = CopyFromReg 0x563ebb406970, Register:i64 %vreg50
try.c: 0x563ebb4f1690: i64 = Register %vreg50
try.c: 0x563ebb499eb0: i64 = undef
try.c: 0x563ebb4f1430: v4i64,ch = CopyFromReg 0x563ebb406970, Register:v4i64 %vreg13
try.c: 0x563ebb4f60f0: v4i64 = Register %vreg13
try.c: 0x563ebb4aef90: v16i32 = X86ISD::VBROADCAST 0x563ebb4f5b00
try.c: 0x563ebb4f5b00: i32,ch = load<LD4[ConstantPool]> 0x563ebb406970, 0x563ebb4a4e30, undef:i64
try.c: 0x563ebb4a4e30: i64 = X86ISD::WrapperRIP TargetConstantPool:i64<i32 1> 0
try.c: 0x563ebb4dd270: i64 = TargetConstantPool<i32 1> 0
try.c: 0x563ebb499eb0: i64 = undef
try.c: 0x563ebb4fd310: v16i32 = BUILD_VECTOR Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>
try.c: 0x563ebb4fd1e0: i32 = Constant<0>
try.c: 0x563ebb4fd1e0: i32 = Constant<0>
try.c: 0x563ebb4fd1e0: i32 = Constant<0>
try.c: 0x563ebb4fd1e0: i32 = Constant<0>
try.c: 0x563ebb4fd1e0: i32 = Constant<0>
try.c: 0x563ebb4fd1e0: i32 = Constant<0>
try.c: 0x563ebb4fd1e0: i32 = Constant<0>
try.c: 0x563ebb4fd1e0: i32 = Constant<0>
try.c: 0x563ebb4fd1e0: i32 = Constant<0>
try.c: ...

Number of similar (compiler,implementation) pairs: 1, namely:
Compiler Implementations
clang -march=native -O2 -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE ref

Compiler output

Implementation: ref
Security model: constbranchindex
Compiler: clang -march=native -O3 -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE
try.c: fatal error: error in backend: Cannot select: 0x55933ae86a20: v4i64 = X86ISD::VTRUNC 0x55933ae868f0
try.c: 0x55933ae868f0: v16i32 = vselect 0x55933ae83400, 0x55933ae02ee0, 0x55933ae867c0
try.c: 0x55933ae83400: v4i1 = X86ISD::PCMPGTM 0x55933ae6dcb0, 0x55933ae6b7d0
try.c: 0x55933ae6dcb0: v4i64 = X86ISD::VBROADCAST 0x55933ae033a0
try.c: 0x55933ae033a0: i64,ch = load<LD8[%lsr.iv6971]> 0x55933ad68a40, 0x55933ae19510, undef:i64
try.c: 0x55933ae19510: i64,ch = CopyFromReg 0x55933ad68a40, Register:i64 %vreg50
try.c: 0x55933ae6ba30: i64 = Register %vreg50
try.c: 0x55933ae062c0: i64 = undef
try.c: 0x55933ae6b7d0: v4i64,ch = CopyFromReg 0x55933ad68a40, Register:v4i64 %vreg13
try.c: 0x55933ae6e500: v4i64 = Register %vreg13
try.c: 0x55933ae02ee0: v16i32 = X86ISD::VBROADCAST 0x55933ae6df10
try.c: 0x55933ae6df10: i32,ch = load<LD4[ConstantPool]> 0x55933ad68a40, 0x55933ae17ae0, undef:i64
try.c: 0x55933ae17ae0: i64 = X86ISD::WrapperRIP TargetConstantPool:i64<i32 1> 0
try.c: 0x55933ae06c40: i64 = TargetConstantPool<i32 1> 0
try.c: 0x55933ae062c0: i64 = undef
try.c: 0x55933ae867c0: v16i32 = BUILD_VECTOR Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>
try.c: 0x55933ae86690: i32 = Constant<0>
try.c: 0x55933ae86690: i32 = Constant<0>
try.c: 0x55933ae86690: i32 = Constant<0>
try.c: 0x55933ae86690: i32 = Constant<0>
try.c: 0x55933ae86690: i32 = Constant<0>
try.c: 0x55933ae86690: i32 = Constant<0>
try.c: 0x55933ae86690: i32 = Constant<0>
try.c: 0x55933ae86690: i32 = Constant<0>
try.c: 0x55933ae86690: i32 = Constant<0>
try.c: ...

Number of similar (compiler,implementation) pairs: 1, namely:
Compiler Implementations
clang -march=native -O3 -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE ref

Compiler output

Implementation: ref
Security model: constbranchindex
Compiler: clang -march=native -O -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE
try.c: fatal error: error in backend: Cannot select: 0x55c53ca7e370: v4i64 = X86ISD::VTRUNC 0x55c53ca7e240
try.c: 0x55c53ca7e240: v16i32 = vselect 0x55c53ca8a200, 0x55c53ca0cbc0, 0x55c53ca7e110
try.c: 0x55c53ca8a200: v4i1 = X86ISD::PCMPGTM 0x55c53ca66a10, 0x55c53ca625a0
try.c: 0x55c53ca66a10: v4i64 = X86ISD::VBROADCAST 0x55c53ca09d60
try.c: 0x55c53ca09d60: i64,ch = load<LD8[%lsr.iv6971]> 0x55c53c977940, 0x55c53ca52140, undef:i64
try.c: 0x55c53ca52140: i64,ch = CopyFromReg 0x55c53c977940, Register:i64 %vreg50
try.c: 0x55c53ca62800: i64 = Register %vreg50
try.c: 0x55c53ca0b230: i64 = undef
try.c: 0x55c53ca625a0: v4i64,ch = CopyFromReg 0x55c53c977940, Register:v4i64 %vreg13
try.c: 0x55c53ca67260: v4i64 = Register %vreg13
try.c: 0x55c53ca0cbc0: v16i32 = X86ISD::VBROADCAST 0x55c53ca66c70
try.c: 0x55c53ca66c70: i32,ch = load<LD4[ConstantPool]> 0x55c53c977940, 0x55c53ca0f3b0, undef:i64
try.c: 0x55c53ca0f3b0: i64 = X86ISD::WrapperRIP TargetConstantPool:i64<i32 1> 0
try.c: 0x55c53ca56300: i64 = TargetConstantPool<i32 1> 0
try.c: 0x55c53ca0b230: i64 = undef
try.c: 0x55c53ca7e110: v16i32 = BUILD_VECTOR Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>
try.c: 0x55c53ca7dfe0: i32 = Constant<0>
try.c: 0x55c53ca7dfe0: i32 = Constant<0>
try.c: 0x55c53ca7dfe0: i32 = Constant<0>
try.c: 0x55c53ca7dfe0: i32 = Constant<0>
try.c: 0x55c53ca7dfe0: i32 = Constant<0>
try.c: 0x55c53ca7dfe0: i32 = Constant<0>
try.c: 0x55c53ca7dfe0: i32 = Constant<0>
try.c: 0x55c53ca7dfe0: i32 = Constant<0>
try.c: 0x55c53ca7dfe0: i32 = Constant<0>
try.c: ...

Number of similar (compiler,implementation) pairs: 1, namely:
Compiler Implementations
clang -march=native -O -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE ref