Implementation notes: amd64, cel02, crypto_core/weightsntrup653

Computer: cel02
Architecture: amd64
CPU ID: GenuineIntel-00050657-bfebfbff
SUPERCOP version: 20201130
Operation: crypto_core
Primitive: weightsntrup653
Time  Object size  Test size      Implementation  Compiler                                                                             Benchmark date  SUPERCOP version
72    254 0 0      11092 792 760  avx             clang_-march=native_-Os_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE  20201211        20201130
72    321 0 0      11016 800 800  avx             gcc_-march=native_-mtune=native_-Os_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE         20201211        20201130
104   331 0 0      15637 824 864  avx             gcc_-march=native_-mtune=native_-O3_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE         20201211        20201130
110   331 0 0      12412 816 800  avx             gcc_-march=native_-mtune=native_-O2_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE         20201211        20201130
126   323 0 0      12148 816 800  avx             gcc_-march=native_-mtune=native_-O_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE          20201211        20201130
150   1186 0 0     16533 824 864  ref             gcc_-march=native_-mtune=native_-O3_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE         20201211        20201130
256   275 0 0      12962 800 760  ref             clang_-mcpu=native_-O3_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE   20201211        20201130
1116  100 0 0      10924 792 760  ref             clang_-march=native_-Os_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE  20201211        20201130
1560  97 0 0       10752 800 800  ref             gcc_-march=native_-mtune=native_-Os_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE         20201211        20201130
1840  106 0 0      11900 816 800  ref             gcc_-march=native_-mtune=native_-O_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE          20201211        20201130
2000  103 0 0      12164 816 800  ref             gcc_-march=native_-mtune=native_-O2_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE         20201211        20201130

Compiler output

Implementation: avx
Security model: constbranchindex
Compiler: clang -march=native -O2 -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE
try.c: fatal error: error in backend: Cannot select: 0x563e90763c40: v4i64 = X86ISD::VTRUNC 0x563e90763b10
try.c: 0x563e90763b10: v16i32 = vselect 0x563e90760620, 0x563e90703920, 0x563e907639e0
try.c: 0x563e90760620: v4i1 = X86ISD::PCMPGTM 0x563e9075b9d0, 0x563e90757560
try.c: 0x563e9075b9d0: v4i64 = X86ISD::VBROADCAST 0x563e906ffb10
try.c: 0x563e906ffb10: i64,ch = load<LD8[%lsr.iv6971]> 0x563e9066c950, 0x563e90745d50, undef:i64
try.c: 0x563e90745d50: i64,ch = CopyFromReg 0x563e9066c950, Register:i64 %vreg50
try.c: 0x563e907577c0: i64 = Register %vreg50
try.c: 0x563e90701f90: i64 = undef
try.c: 0x563e90757560: v4i64,ch = CopyFromReg 0x563e9066c950, Register:v4i64 %vreg13
try.c: 0x563e9075c220: v4i64 = Register %vreg13
try.c: 0x563e90703920: v16i32 = X86ISD::VBROADCAST 0x563e9075bc30
try.c: 0x563e9075bc30: i32,ch = load<LD4[ConstantPool]> 0x563e9066c950, 0x563e906ff0f0, undef:i64
try.c: 0x563e906ff0f0: i64 = X86ISD::WrapperRIP TargetConstantPool:i64<i32 1> 0
try.c: 0x563e906e4e40: i64 = TargetConstantPool<i32 1> 0
try.c: 0x563e90701f90: i64 = undef
try.c: 0x563e907639e0: v16i32 = BUILD_VECTOR Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>
try.c: 0x563e907638b0: i32 = Constant<0>
try.c: 0x563e907638b0: i32 = Constant<0>
try.c: 0x563e907638b0: i32 = Constant<0>
try.c: 0x563e907638b0: i32 = Constant<0>
try.c: 0x563e907638b0: i32 = Constant<0>
try.c: 0x563e907638b0: i32 = Constant<0>
try.c: 0x563e907638b0: i32 = Constant<0>
try.c: 0x563e907638b0: i32 = Constant<0>
try.c: 0x563e907638b0: i32 = Constant<0>
try.c: ...

Number of similar (compiler,implementation) pairs: 1, namely:
Compiler                                                                             Implementations
clang -march=native -O2 -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE  avx

Compiler output

Implementation: avx
Security model: constbranchindex
Compiler: clang -march=native -O3 -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE
try.c: fatal error: error in backend: Cannot select: 0x555a14b37120: v4i64 = X86ISD::VTRUNC 0x555a14b36ff0
try.c: 0x555a14b36ff0: v16i32 = vselect 0x555a14b3dde0, 0x555a14ad3230, 0x555a14b36ec0
try.c: 0x555a14b3dde0: v4i1 = X86ISD::PCMPGTM 0x555a14b1ee00, 0x555a14b1c920
try.c: 0x555a14b1ee00: v4i64 = X86ISD::VBROADCAST 0x555a14ad36f0
try.c: 0x555a14ad36f0: i64,ch = load<LD8[%lsr.iv6971]> 0x555a14a19a30, 0x555a14ad7600, undef:i64
try.c: 0x555a14ad7600: i64,ch = CopyFromReg 0x555a14a19a30, Register:i64 %vreg50
try.c: 0x555a14b1cb80: i64 = Register %vreg50
try.c: 0x555a14ab7f40: i64 = undef
try.c: 0x555a14b1c920: v4i64,ch = CopyFromReg 0x555a14a19a30, Register:v4i64 %vreg13
try.c: 0x555a14b1f650: v4i64 = Register %vreg13
try.c: 0x555a14ad3230: v16i32 = X86ISD::VBROADCAST 0x555a14b1f060
try.c: 0x555a14b1f060: i32,ch = load<LD4[ConstantPool]> 0x555a14a19a30, 0x555a14ad5bd0, undef:i64
try.c: 0x555a14ad5bd0: i64 = X86ISD::WrapperRIP TargetConstantPool:i64<i32 1> 0
try.c: 0x555a14ab88c0: i64 = TargetConstantPool<i32 1> 0
try.c: 0x555a14ab7f40: i64 = undef
try.c: 0x555a14b36ec0: v16i32 = BUILD_VECTOR Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>
try.c: 0x555a14b36d90: i32 = Constant<0>
try.c: 0x555a14b36d90: i32 = Constant<0>
try.c: 0x555a14b36d90: i32 = Constant<0>
try.c: 0x555a14b36d90: i32 = Constant<0>
try.c: 0x555a14b36d90: i32 = Constant<0>
try.c: 0x555a14b36d90: i32 = Constant<0>
try.c: 0x555a14b36d90: i32 = Constant<0>
try.c: 0x555a14b36d90: i32 = Constant<0>
try.c: 0x555a14b36d90: i32 = Constant<0>
try.c: ...

Number of similar (compiler,implementation) pairs: 1, namely:
Compiler                                                                             Implementations
clang -march=native -O3 -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE  avx

Compiler output

Implementation: avx
Security model: constbranchindex
Compiler: clang -march=native -O -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE
try.c: fatal error: error in backend: Cannot select: 0x56097d803bb0: v4i64 = X86ISD::VTRUNC 0x56097d803a80
try.c: 0x56097d803a80: v16i32 = vselect 0x56097d800590, 0x56097d7b6360, 0x56097d803950
try.c: 0x56097d800590: v4i1 = X86ISD::PCMPGTM 0x56097d7fb940, 0x56097d7f74d0
try.c: 0x56097d7fb940: v4i64 = X86ISD::VBROADCAST 0x56097d7a3100
try.c: 0x56097d7a3100: i64,ch = load<LD8[%lsr.iv6971]> 0x56097d70c950, 0x56097d7e25c0, undef:i64
try.c: 0x56097d7e25c0: i64,ch = CopyFromReg 0x56097d70c950, Register:i64 %vreg50
try.c: 0x56097d7f7730: i64 = Register %vreg50
try.c: 0x56097d7b49d0: i64 = undef
try.c: 0x56097d7f74d0: v4i64,ch = CopyFromReg 0x56097d70c950, Register:v4i64 %vreg13
try.c: 0x56097d7fc190: v4i64 = Register %vreg13
try.c: 0x56097d7b6360: v16i32 = X86ISD::VBROADCAST 0x56097d7fbba0
try.c: 0x56097d7fbba0: i32,ch = load<LD4[ConstantPool]> 0x56097d70c950, 0x56097d7a26e0, undef:i64
try.c: 0x56097d7a26e0: i64 = X86ISD::WrapperRIP TargetConstantPool:i64<i32 1> 0
try.c: 0x56097d7e68b0: i64 = TargetConstantPool<i32 1> 0
try.c: 0x56097d7b49d0: i64 = undef
try.c: 0x56097d803950: v16i32 = BUILD_VECTOR Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>
try.c: 0x56097d803820: i32 = Constant<0>
try.c: 0x56097d803820: i32 = Constant<0>
try.c: 0x56097d803820: i32 = Constant<0>
try.c: 0x56097d803820: i32 = Constant<0>
try.c: 0x56097d803820: i32 = Constant<0>
try.c: 0x56097d803820: i32 = Constant<0>
try.c: 0x56097d803820: i32 = Constant<0>
try.c: 0x56097d803820: i32 = Constant<0>
try.c: 0x56097d803820: i32 = Constant<0>
try.c: ...

Number of similar (compiler,implementation) pairs: 1, namely:
Compiler                                                                             Implementations
clang -march=native -O -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE   avx

Compiler output

Implementation: avx
Security model: constbranchindex
Compiler: clang -mcpu=native -O3 -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE
weight.c: weight.c:20:9: error: always_inline function '_mm256_loadu_si256' requires target feature 'sse4.2', but would be inlined into function 'crypto_core_weightsntrup653_avx_constbranchindex' that is compiled without support for 'sse4.2'
weight.c: sum = _mm256_loadu_si256((__m256i *) (in+p-32));
weight.c: ^
weight.c: weight.c:21:10: error: always_inline function '_mm256_set_epi8' requires target feature 'sse4.2', but would be inlined into function 'crypto_core_weightsntrup653_avx_constbranchindex' that is compiled without support for 'sse4.2'
weight.c: sum &= endingmask;
weight.c: ^
weight.c: ./params.h:2:20: note: expanded from macro 'endingmask'
weight.c: #define endingmask _mm256_set_epi8(1,1,1,1,1,1,1,1,1,1,1,1,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0)
weight.c: ^
weight.c: weight.c:24:20: error: always_inline function '_mm256_loadu_si256' requires target feature 'sse4.2', but would be inlined into function 'crypto_core_weightsntrup653_avx_constbranchindex' that is compiled without support for 'sse4.2'
weight.c: __m256i bits = _mm256_loadu_si256((__m256i *) in);
weight.c: ^
weight.c: weight.c:25:13: error: always_inline function '_mm256_set1_epi8' requires target feature 'sse4.2', but would be inlined into function 'crypto_core_weightsntrup653_avx_constbranchindex' that is compiled without support for 'sse4.2'
weight.c: bits &= _mm256_set1_epi8(1);
weight.c: ^
weight.c: weight.c:26:11: error: always_inline function '_mm256_add_epi8' requires target feature 'sse4.2', but would be inlined into function 'crypto_core_weightsntrup653_avx_constbranchindex' that is compiled without support for 'sse4.2'
weight.c: sum = _mm256_add_epi8(sum,bits);
weight.c: ^
weight.c: weight.c:31:11: error: always_inline function '_mm256_srli_epi16' requires target feature 'sse4.2', but would be inlined into function 'crypto_core_weightsntrup653_avx_constbranchindex' that is compiled without support for 'sse4.2'
weight.c: sumhi = _mm256_srli_epi16(sum,8);
weight.c: ^
weight.c: weight.c:32:10: error: always_inline function '_mm256_set1_epi16' requires target feature 'sse4.2', but would be inlined into function 'crypto_core_weightsntrup653_avx_constbranchindex' that is compiled without support for 'sse4.2'
weight.c: sum &= _mm256_set1_epi16(0xff);
weight.c: ^
weight.c: weight.c:33:9: error: always_inline function '_mm256_add_epi16' requires target feature 'sse4.2', but would be inlined into function 'crypto_core_weightsntrup653_avx_constbranchindex' that is compiled without support for 'sse4.2'
weight.c: ...

Number of similar (compiler,implementation) pairs: 1, namely:
Compiler                                                                             Implementations
clang -mcpu=native -O3 -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE   avx
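
Note: the intrinsics named in the errors above (a byte mask with 1, byte-wise adds, then widening to 16-bit sums) suggest that weight.c adds up the low bit of each input byte. The errors themselves indicate that this clang invocation compiled the function without the vector target features that the intrinsics require. As a point of comparison, here is a minimal scalar sketch of the same computation. It is not the SUPERCOP source: the name weight_sketch, the constant P = 653 (inferred from the primitive name) and the self-check helper are all assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Assumption: p = 653, inferred from the primitive name weightsntrup653. */
#define P 653

/* Hypothetical scalar sketch: count the input bytes whose low bit is set.
   Only the bottom bit of each byte contributes, and no branch or array
   index depends on the input data. */
static int weight_sketch(const unsigned char *in)
{
  int weight = 0;
  size_t i;
  for (i = 0; i < P; ++i) weight += in[i] & 1;
  return weight;
}

/* Small self-check for the sketch; returns 1 if every assertion passes. */
static int weight_selfcheck(void)
{
  unsigned char buf[P];

  memset(buf, 0, sizeof buf);
  assert(weight_sketch(buf) == 0);

  memset(buf, 1, sizeof buf);
  assert(weight_sketch(buf) == P);

  memset(buf, 0xfe, sizeof buf);  /* low bit clear in every byte */
  buf[0] = 3;                     /* low bit set */
  buf[P - 1] = 0xff;              /* low bit set */
  assert(weight_sketch(buf) == 2);

  return 1;
}
```

A scalar loop like this needs no vector target features, so it would not be expected to trigger the always_inline errors shown above, whatever flags select the CPU.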

Compiler output

Implementation: ref
Security model: constbranchindex
Compiler: clang -march=native -O2 -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE
try.c: fatal error: error in backend: Cannot select: 0x55a0e32d95b0: v4i64 = X86ISD::VTRUNC 0x55a0e32d9480
try.c: 0x55a0e32d9480: v16i32 = vselect 0x55a0e32d3f80, 0x55a0e328dd70, 0x55a0e32d9350
try.c: 0x55a0e32d3f80: v4i1 = X86ISD::PCMPGTM 0x55a0e32d2f70, 0x55a0e32ceb00
try.c: 0x55a0e32d2f70: v4i64 = X86ISD::VBROADCAST 0x55a0e3281a40
try.c: 0x55a0e3281a40: i64,ch = load<LD8[%lsr.iv6971]> 0x55a0e31e39d0, 0x55a0e32c61f0, undef:i64
try.c: 0x55a0e32c61f0: i64,ch = CopyFromReg 0x55a0e31e39d0, Register:i64 %vreg50
try.c: 0x55a0e32ced60: i64 = Register %vreg50
try.c: 0x55a0e328c3e0: i64 = undef
try.c: 0x55a0e32ceb00: v4i64,ch = CopyFromReg 0x55a0e31e39d0, Register:v4i64 %vreg13
try.c: 0x55a0e32d37c0: v4i64 = Register %vreg13
try.c: 0x55a0e328dd70: v16i32 = X86ISD::VBROADCAST 0x55a0e32d31d0
try.c: 0x55a0e32d31d0: i32,ch = load<LD4[ConstantPool]> 0x55a0e31e39d0, 0x55a0e3281020, undef:i64
try.c: 0x55a0e3281020: i64 = X86ISD::WrapperRIP TargetConstantPool:i64<i32 1> 0
try.c: 0x55a0e32401f0: i64 = TargetConstantPool<i32 1> 0
try.c: 0x55a0e328c3e0: i64 = undef
try.c: 0x55a0e32d9350: v16i32 = BUILD_VECTOR Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>
try.c: 0x55a0e32d9220: i32 = Constant<0>
try.c: 0x55a0e32d9220: i32 = Constant<0>
try.c: 0x55a0e32d9220: i32 = Constant<0>
try.c: 0x55a0e32d9220: i32 = Constant<0>
try.c: 0x55a0e32d9220: i32 = Constant<0>
try.c: 0x55a0e32d9220: i32 = Constant<0>
try.c: 0x55a0e32d9220: i32 = Constant<0>
try.c: 0x55a0e32d9220: i32 = Constant<0>
try.c: 0x55a0e32d9220: i32 = Constant<0>
try.c: ...

Number of similar (compiler,implementation) pairs: 1, namely:
Compiler                                                                             Implementations
clang -march=native -O2 -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE  ref

Compiler output

Implementation: ref
Security model: constbranchindex
Compiler: clang -march=native -O3 -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE
try.c: fatal error: error in backend: Cannot select: 0x56165bac2270: v4i64 = X86ISD::VTRUNC 0x56165bac2140
try.c: 0x56165bac2140: v16i32 = vselect 0x56165baa1370, 0x56165ba391c0, 0x56165bac2010
try.c: 0x56165baa1370: v4i1 = X86ISD::PCMPGTM 0x56165ba9e340, 0x56165ba99710
try.c: 0x56165ba9e340: v4i64 = X86ISD::VBROADCAST 0x56165ba39680
try.c: 0x56165ba39680: i64,ch = load<LD8[%lsr.iv6971]> 0x56165b997a30, 0x56165ba327e0, undef:i64
try.c: 0x56165ba327e0: i64,ch = CopyFromReg 0x56165b997a30, Register:i64 %vreg50
try.c: 0x56165ba99970: i64 = Register %vreg50
try.c: 0x56165ba352c0: i64 = undef
try.c: 0x56165ba99710: v4i64,ch = CopyFromReg 0x56165b997a30, Register:v4i64 %vreg13
try.c: 0x56165ba9eb90: v4i64 = Register %vreg13
try.c: 0x56165ba391c0: v16i32 = X86ISD::VBROADCAST 0x56165ba9e5a0
try.c: 0x56165ba9e5a0: i32,ch = load<LD4[ConstantPool]> 0x56165b997a30, 0x56165ba30db0, undef:i64
try.c: 0x56165ba30db0: i64 = X86ISD::WrapperRIP TargetConstantPool:i64<i32 1> 0
try.c: 0x56165ba35c40: i64 = TargetConstantPool<i32 1> 0
try.c: 0x56165ba352c0: i64 = undef
try.c: 0x56165bac2010: v16i32 = BUILD_VECTOR Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>
try.c: 0x56165bac1ee0: i32 = Constant<0>
try.c: 0x56165bac1ee0: i32 = Constant<0>
try.c: 0x56165bac1ee0: i32 = Constant<0>
try.c: 0x56165bac1ee0: i32 = Constant<0>
try.c: 0x56165bac1ee0: i32 = Constant<0>
try.c: 0x56165bac1ee0: i32 = Constant<0>
try.c: 0x56165bac1ee0: i32 = Constant<0>
try.c: 0x56165bac1ee0: i32 = Constant<0>
try.c: 0x56165bac1ee0: i32 = Constant<0>
try.c: ...

Number of similar (compiler,implementation) pairs: 1, namely:
Compiler                                                                             Implementations
clang -march=native -O3 -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE  ref

Compiler output

Implementation: ref
Security model: constbranchindex
Compiler: clang -march=native -O -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE
try.c: fatal error: error in backend: Cannot select: 0x55f846176280: v4i64 = X86ISD::VTRUNC 0x55f846176150
try.c: 0x55f846176150: v16i32 = vselect 0x55f846172c60, 0x55f846116220, 0x55f846176020
try.c: 0x55f846172c60: v4i1 = X86ISD::PCMPGTM 0x55f84615b8f0, 0x55f846157480
try.c: 0x55f84615b8f0: v4i64 = X86ISD::VBROADCAST 0x55f846102980
try.c: 0x55f846102980: i64,ch = load<LD8[%lsr.iv6971]> 0x55f84606c950, 0x55f84614e460, undef:i64
try.c: 0x55f84614e460: i64,ch = CopyFromReg 0x55f84606c950, Register:i64 %vreg50
try.c: 0x55f8461576e0: i64 = Register %vreg50
try.c: 0x55f846103e50: i64 = undef
try.c: 0x55f846157480: v4i64,ch = CopyFromReg 0x55f84606c950, Register:v4i64 %vreg13
try.c: 0x55f84615c140: v4i64 = Register %vreg13
try.c: 0x55f846116220: v16i32 = X86ISD::VBROADCAST 0x55f84615bb50
try.c: 0x55f84615bb50: i32,ch = load<LD4[ConstantPool]> 0x55f84606c950, 0x55f8461060b0, undef:i64
try.c: 0x55f8461060b0: i64 = X86ISD::WrapperRIP TargetConstantPool:i64<i32 1> 0
try.c: 0x55f8461455d0: i64 = TargetConstantPool<i32 1> 0
try.c: 0x55f846103e50: i64 = undef
try.c: 0x55f846176020: v16i32 = BUILD_VECTOR Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>
try.c: 0x55f846175ef0: i32 = Constant<0>
try.c: 0x55f846175ef0: i32 = Constant<0>
try.c: 0x55f846175ef0: i32 = Constant<0>
try.c: 0x55f846175ef0: i32 = Constant<0>
try.c: 0x55f846175ef0: i32 = Constant<0>
try.c: 0x55f846175ef0: i32 = Constant<0>
try.c: 0x55f846175ef0: i32 = Constant<0>
try.c: 0x55f846175ef0: i32 = Constant<0>
try.c: 0x55f846175ef0: i32 = Constant<0>
try.c: ...

Number of similar (compiler,implementation) pairs: 1, namely:
Compiler                                                                             Implementations
clang -march=native -O -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE   ref