Implementation notes: amd64, comet, crypto_core/multsntrup857

Computer: comet
Microarchitecture: amd64; Comet Lake (806ec)
Architecture: amd64
CPU ID: GenuineIntel-000806ec-bfebfbff
SUPERCOP version: 20240107
Operation: crypto_core
Primitive: multsntrup857
TimeObject sizeTest sizeImplementationCompilerBenchmark dateSUPERCOP version
1946723758 0 039473 852 960avx800clang_-march=native_-O2_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2024010520231222
2007124214 0 040225 852 1024avxclang_-march=native_-O3_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2024010520231222
2008623822 0 039833 852 1024avx800clang_-march=native_-O3_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2024010520231222
2010324150 0 039865 852 960avxclang_-march=native_-O2_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2024010520231222
2127321012 0 035388 780 992avx800gcc_-march=native_-mtune=native_-O3_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE2024010520231222
2143421094 0 036849 852 1024round2clang_-march=native_-O3_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2024010520231222
2145521030 0 036489 852 960round2clang_-march=native_-O2_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2024010520231222
2175221532 0 033545 852 928avxclang_-march=native_-O_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2024010520231222
2177221460 0 035836 780 992avxgcc_-march=native_-mtune=native_-O3_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE2024010520231222
2186321657 0 033673 852 928avx800clang_-march=native_-O_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2024010520231222
2186320403 0 033311 844 1024avx800clang_-march=native_-Os_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2024010520231222
2213520658 0 033559 844 1024avxclang_-march=native_-Os_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2024010520231222
2295318717 0 032708 780 992round2gcc_-march=native_-mtune=native_-O3_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE2024010520231222
2306918020 0 029713 852 928round2clang_-march=native_-O_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2024010520231222
2361116035 0 028583 844 1024round2clang_-march=native_-Os_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2024010520231222
2508820610 0 033068 780 992avx800gcc_-march=native_-mtune=native_-O2_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE2024010520231222
2514121058 0 033516 780 992avxgcc_-march=native_-mtune=native_-O2_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE2024010520231222
2630620013 0 031043 764 960avx800gcc_-march=native_-mtune=native_-Os_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE2024010520231222
2660917316 0 029388 780 992round2gcc_-march=native_-mtune=native_-O2_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE2024010520231222
2708420594 0 031627 764 960avxgcc_-march=native_-mtune=native_-Os_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE2024010520231222
2835520592 0 032603 772 992avx800gcc_-march=native_-mtune=native_-O_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE2024010520231222
2889121090 0 033099 772 992avxgcc_-march=native_-mtune=native_-O_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE2024010520231222
2980416252 0 026907 764 960round2gcc_-march=native_-mtune=native_-Os_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE2024010520231222
3031317298 0 028963 772 992round2gcc_-march=native_-mtune=native_-O_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE2024010520231222
1998084313 0 018236 780 992refgcc_-march=native_-mtune=native_-O3_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE2024010520231222
3574212413 0 017857 852 960refclang_-march=native_-O2_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2024010520231222
3642903117 0 018857 852 1024refclang_-march=native_-O3_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2024010520231222
7962941787 0 015897 852 928refclang_-mcpu=native_-O3_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2024010520231222
1446134591 0 013127 844 1024refclang_-march=native_-Os_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2024010520231222
1514918986 0 013012 780 992refgcc_-march=native_-mtune=native_-O2_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE2024010520231222
1517700634 0 012329 852 928refclang_-march=native_-O_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2024010520231222
1542026537 0 012139 772 992refgcc_-march=native_-mtune=native_-O_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE2024010520231222
1728107470 0 011027 764 960refgcc_-march=native_-mtune=native_-Os_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE2024010520231222

Compiler output

Implementation: avx
Security model: constbranchindex
Compiler: clang -mcpu=native -O3 -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE
mult1024.c: mult1024.c:307:7: error: always_inline function '_mm256_set1_epi16' requires target feature 'avx', but would be inlined into function 'crypto_core_multsntrup857_avx_constbranchindex' that is compiled without support for 'avx'
mult1024.c: x = const_x16(0);
mult1024.c: ^
mult1024.c: mult1024.c:10:19: note: expanded from macro 'const_x16'
mult1024.c: #define const_x16 _mm256_set1_epi16
mult1024.c: ^
mult1024.c: mult1024.c:307:7: error: AVX vector return of type '__m256i' (vector of 4 'long long' values) without 'avx' enabled changes the ABI
mult1024.c: mult1024.c:10:19: note: expanded from macro 'const_x16'
mult1024.c: #define const_x16 _mm256_set1_epi16
mult1024.c: ^
mult1024.c: mult1024.c:308:36: error: always_inline function '_mm256_storeu_si256' requires target feature 'avx', but would be inlined into function 'crypto_core_multsntrup857_avx_constbranchindex' that is compiled without support for 'avx'
mult1024.c: for (i = p&~15;i < 1024;i += 16) store_x16(&f[i],x);
mult1024.c: ^
mult1024.c: mult1024.c:9:24: note: expanded from macro 'store_x16'
mult1024.c: #define store_x16(p,v) _mm256_storeu_si256((int16x16 *) (p),(v))
mult1024.c: ^
mult1024.c: mult1024.c:308:36: error: AVX vector argument of type '__m256i' (vector of 4 'long long' values) without 'avx' enabled changes the ABI
mult1024.c: mult1024.c:9:24: note: expanded from macro 'store_x16'
mult1024.c: #define store_x16(p,v) _mm256_storeu_si256((int16x16 *) (p),(v))
mult1024.c: ^
mult1024.c: mult1024.c:309:36: error: always_inline function '_mm256_storeu_si256' requires target feature 'avx', but would be inlined into function 'crypto_core_multsntrup857_avx_constbranchindex' that is compiled without support for 'avx'
mult1024.c: for (i = p&~15;i < 1024;i += 16) store_x16(&g[i],x);
mult1024.c: ^
mult1024.c: mult1024.c:9:24: note: expanded from macro 'store_x16'
mult1024.c: #define store_x16(p,v) _mm256_storeu_si256((int16x16 *) (p),(v))
mult1024.c: ...

Number of similar (compiler,implementation) pairs: 1, namely:
CompilerImplementations
clang -mcpu=native -O3 -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE avx

Compiler output

Implementation: avx800
Security model: constbranchindex
Compiler: clang -mcpu=native -O3 -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE
mult1024.c: mult1024.c:307:7: error: always_inline function '_mm256_set1_epi16' requires target feature 'avx', but would be inlined into function 'crypto_core_multsntrup857_avx800_constbranchindex' that is compiled without support for 'avx'
mult1024.c: x = const_x16(0);
mult1024.c: ^
mult1024.c: mult1024.c:10:19: note: expanded from macro 'const_x16'
mult1024.c: #define const_x16 _mm256_set1_epi16
mult1024.c: ^
mult1024.c: mult1024.c:307:7: error: AVX vector return of type '__m256i' (vector of 4 'long long' values) without 'avx' enabled changes the ABI
mult1024.c: mult1024.c:10:19: note: expanded from macro 'const_x16'
mult1024.c: #define const_x16 _mm256_set1_epi16
mult1024.c: ^
mult1024.c: mult1024.c:308:36: error: always_inline function '_mm256_storeu_si256' requires target feature 'avx', but would be inlined into function 'crypto_core_multsntrup857_avx800_constbranchindex' that is compiled without support for 'avx'
mult1024.c: for (i = p&~15;i < 1024;i += 16) store_x16(&f[i],x);
mult1024.c: ^
mult1024.c: mult1024.c:9:24: note: expanded from macro 'store_x16'
mult1024.c: #define store_x16(p,v) _mm256_storeu_si256((int16x16 *) (p),(v))
mult1024.c: ^
mult1024.c: mult1024.c:308:36: error: AVX vector argument of type '__m256i' (vector of 4 'long long' values) without 'avx' enabled changes the ABI
mult1024.c: mult1024.c:9:24: note: expanded from macro 'store_x16'
mult1024.c: #define store_x16(p,v) _mm256_storeu_si256((int16x16 *) (p),(v))
mult1024.c: ^
mult1024.c: mult1024.c:309:36: error: always_inline function '_mm256_storeu_si256' requires target feature 'avx', but would be inlined into function 'crypto_core_multsntrup857_avx800_constbranchindex' that is compiled without support for 'avx'
mult1024.c: for (i = p&~15;i < 1024;i += 16) store_x16(&g[i],x);
mult1024.c: ^
mult1024.c: mult1024.c:9:24: note: expanded from macro 'store_x16'
mult1024.c: #define store_x16(p,v) _mm256_storeu_si256((int16x16 *) (p),(v))
mult1024.c: ...

Number of similar (compiler,implementation) pairs: 1, namely:
CompilerImplementations
clang -mcpu=native -O3 -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE avx800

Compiler output

Implementation: round2
Security model: constbranchindex
Compiler: clang -mcpu=native -O3 -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE
mult1024.c: mult1024.c:308:7: error: always_inline function '_mm256_set1_epi16' requires target feature 'avx', but would be inlined into function 'crypto_core_multsntrup857_round2_constbranchindex' that is compiled without support for 'avx'
mult1024.c: x = const_x16(0);
mult1024.c: ^
mult1024.c: mult1024.c:10:19: note: expanded from macro 'const_x16'
mult1024.c: #define const_x16 _mm256_set1_epi16
mult1024.c: ^
mult1024.c: mult1024.c:308:7: error: AVX vector return of type '__m256i' (vector of 4 'long long' values) without 'avx' enabled changes the ABI
mult1024.c: mult1024.c:10:19: note: expanded from macro 'const_x16'
mult1024.c: #define const_x16 _mm256_set1_epi16
mult1024.c: ^
mult1024.c: mult1024.c:309:36: error: always_inline function '_mm256_storeu_si256' requires target feature 'avx', but would be inlined into function 'crypto_core_multsntrup857_round2_constbranchindex' that is compiled without support for 'avx'
mult1024.c: for (i = p&~15;i < 1024;i += 16) store_x16(&f[i],x);
mult1024.c: ^
mult1024.c: mult1024.c:9:24: note: expanded from macro 'store_x16'
mult1024.c: #define store_x16(p,v) _mm256_storeu_si256((int16x16 *) (p),(v))
mult1024.c: ^
mult1024.c: mult1024.c:309:36: error: AVX vector argument of type '__m256i' (vector of 4 'long long' values) without 'avx' enabled changes the ABI
mult1024.c: mult1024.c:9:24: note: expanded from macro 'store_x16'
mult1024.c: #define store_x16(p,v) _mm256_storeu_si256((int16x16 *) (p),(v))
mult1024.c: ^
mult1024.c: mult1024.c:310:36: error: always_inline function '_mm256_storeu_si256' requires target feature 'avx', but would be inlined into function 'crypto_core_multsntrup857_round2_constbranchindex' that is compiled without support for 'avx'
mult1024.c: for (i = p&~15;i < 1024;i += 16) store_x16(&g[i],x);
mult1024.c: ^
mult1024.c: mult1024.c:9:24: note: expanded from macro 'store_x16'
mult1024.c: #define store_x16(p,v) _mm256_storeu_si256((int16x16 *) (p),(v))
mult1024.c: ...

Number of similar (compiler,implementation) pairs: 1, namely:
CompilerImplementations
clang -mcpu=native -O3 -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE round2