Implementation notes: amd64, shoe, crypto_core/multsntrup653

Computer: shoe
Microarchitecture: amd64; Broadwell+AES (306d4)
Architecture: amd64
CPU ID: GenuineIntel-000306d4-bfebfbff
SUPERCOP version: 20240425
Operation: crypto_core
Primitive: multsntrup653
TimeObject sizeTest sizeImplementationCompilerBenchmark dateSUPERCOP version
1641119932 0 034504 812 952avx800clang_-march=native_-O2_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2024042720240425
1645220870 0 035528 812 952avx800clang_-march=native_-O3_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2024042720240425
1653220276 0 034848 812 952avxclang_-march=native_-O2_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2024042720240425
1653721214 0 035872 812 952avxclang_-march=native_-O3_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2024042720240425
1710219481 0 033352 780 984avx800gcc_-march=native_-mtune=native_-O3_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE2024042720240425
1733618150 0 032776 812 952round2clang_-march=native_-O3_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2024042720240425
1736517228 0 031768 812 952round2clang_-march=native_-O2_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2024042720240425
1741620009 0 033880 780 984avxgcc_-march=native_-mtune=native_-O3_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE2024042720240425
1840417006 0 030880 780 984round2gcc_-march=native_-mtune=native_-O3_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE2024042720240425
1841917663 0 028950 804 920avx800clang_-march=native_-Os_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2024042720240425
1852917886 0 029174 804 920avxclang_-march=native_-Os_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2024042720240425
1885818530 0 030303 772 984avx800gcc_-march=native_-mtune=native_-O_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE2024042720240425
1907018101 0 028919 764 952avx800gcc_-march=native_-mtune=native_-Os_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE2024042720240425
1940413814 0 025070 804 920round2clang_-march=native_-Os_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2024042720240425
1954119133 0 030911 772 984avxgcc_-march=native_-mtune=native_-O_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE2024042720240425
1966218425 0 030536 780 984avx800gcc_-march=native_-mtune=native_-O2_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE2024042720240425
1980818696 0 029519 764 952avxgcc_-march=native_-mtune=native_-Os_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE2024042720240425
2013418953 0 031064 780 984avxgcc_-march=native_-mtune=native_-O2_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE2024042720240425
2038015075 0 026855 772 984round2gcc_-march=native_-mtune=native_-O_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE2024042720240425
2100114966 0 027080 780 984round2gcc_-march=native_-mtune=native_-O2_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE2024042720240425
2165814242 0 025087 764 952round2gcc_-march=native_-mtune=native_-Os_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE2024042720240425
2201918697 0 030214 804 920avx800clang_-march=native_-O_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2024042720240425
6354228750 0 040302 804 920avxclang_-march=native_-O_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2024042720240425
6412418880 0 030454 804 920round2clang_-march=native_-O_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2024042720240425
1529565227 0 019080 780 984refgcc_-march=native_-mtune=native_-O3_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE2024042720240425
2562653533 0 018248 812 952refclang_-march=native_-O3_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2024042720240425
2569142957 0 017560 812 952refclang_-march=native_-O2_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2024042720240425
4204893579 0 017528 812 920refclang_-mcpu=native_-O3_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2024042720240425
816638591 0 011918 804 920refclang_-march=native_-Os_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2024042720240425
821925573 0 012142 804 920refclang_-march=native_-O_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2024042720240425
866439534 0 012271 772 984refgcc_-march=native_-mtune=native_-O_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE2024042720240425
871991598 0 012672 780 984refgcc_-march=native_-mtune=native_-O2_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE2024042720240425
1017421460 0 011255 764 952refgcc_-march=native_-mtune=native_-Os_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE2024042720240425

Compiler output

Implementation: avx
Security model: constbranchindex
Compiler: clang -mcpu=native -O3 -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE
mult768.c: mult768.c:265:7: error: always_inline function '_mm256_set1_epi16' requires target feature 'avx', but would be inlined into function 'crypto_core_multsntrup653_avx_constbranchindex' that is compiled without support for 'avx'
mult768.c: x = const_x16(0);
mult768.c: ^
mult768.c: mult768.c:10:19: note: expanded from macro 'const_x16'
mult768.c: #define const_x16 _mm256_set1_epi16
mult768.c: ^
mult768.c: mult768.c:265:7: error: AVX vector return of type '__m256i' (vector of 4 'long long' values) without 'avx' enabled changes the ABI
mult768.c: mult768.c:10:19: note: expanded from macro 'const_x16'
mult768.c: #define const_x16 _mm256_set1_epi16
mult768.c: ^
mult768.c: mult768.c:266:35: error: always_inline function '_mm256_storeu_si256' requires target feature 'avx', but would be inlined into function 'crypto_core_multsntrup653_avx_constbranchindex' that is compiled without support for 'avx'
mult768.c: for (i = p&~15;i < 768;i += 16) store_x16(&f[i],x);
mult768.c: ^
mult768.c: mult768.c:9:24: note: expanded from macro 'store_x16'
mult768.c: #define store_x16(p,v) _mm256_storeu_si256((int16x16 *) (p),(v))
mult768.c: ^
mult768.c: mult768.c:266:35: error: AVX vector argument of type '__m256i' (vector of 4 'long long' values) without 'avx' enabled changes the ABI
mult768.c: mult768.c:9:24: note: expanded from macro 'store_x16'
mult768.c: #define store_x16(p,v) _mm256_storeu_si256((int16x16 *) (p),(v))
mult768.c: ^
mult768.c: mult768.c:267:35: error: always_inline function '_mm256_storeu_si256' requires target feature 'avx', but would be inlined into function 'crypto_core_multsntrup653_avx_constbranchindex' that is compiled without support for 'avx'
mult768.c: for (i = p&~15;i < 768;i += 16) store_x16(&g[i],x);
mult768.c: ^
mult768.c: mult768.c:9:24: note: expanded from macro 'store_x16'
mult768.c: #define store_x16(p,v) _mm256_storeu_si256((int16x16 *) (p),(v))
mult768.c: ...

Number of similar (compiler,implementation) pairs: 1, namely:
CompilerImplementations
clang -mcpu=native -O3 -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE avx

Compiler output

Implementation: avx800
Security model: constbranchindex
Compiler: clang -mcpu=native -O3 -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE
mult768.c: mult768.c:265:7: error: always_inline function '_mm256_set1_epi16' requires target feature 'avx', but would be inlined into function 'crypto_core_multsntrup653_avx800_constbranchindex' that is compiled without support for 'avx'
mult768.c: x = const_x16(0);
mult768.c: ^
mult768.c: mult768.c:10:19: note: expanded from macro 'const_x16'
mult768.c: #define const_x16 _mm256_set1_epi16
mult768.c: ^
mult768.c: mult768.c:265:7: error: AVX vector return of type '__m256i' (vector of 4 'long long' values) without 'avx' enabled changes the ABI
mult768.c: mult768.c:10:19: note: expanded from macro 'const_x16'
mult768.c: #define const_x16 _mm256_set1_epi16
mult768.c: ^
mult768.c: mult768.c:266:35: error: always_inline function '_mm256_storeu_si256' requires target feature 'avx', but would be inlined into function 'crypto_core_multsntrup653_avx800_constbranchindex' that is compiled without support for 'avx'
mult768.c: for (i = p&~15;i < 768;i += 16) store_x16(&f[i],x);
mult768.c: ^
mult768.c: mult768.c:9:24: note: expanded from macro 'store_x16'
mult768.c: #define store_x16(p,v) _mm256_storeu_si256((int16x16 *) (p),(v))
mult768.c: ^
mult768.c: mult768.c:266:35: error: AVX vector argument of type '__m256i' (vector of 4 'long long' values) without 'avx' enabled changes the ABI
mult768.c: mult768.c:9:24: note: expanded from macro 'store_x16'
mult768.c: #define store_x16(p,v) _mm256_storeu_si256((int16x16 *) (p),(v))
mult768.c: ^
mult768.c: mult768.c:267:35: error: always_inline function '_mm256_storeu_si256' requires target feature 'avx', but would be inlined into function 'crypto_core_multsntrup653_avx800_constbranchindex' that is compiled without support for 'avx'
mult768.c: for (i = p&~15;i < 768;i += 16) store_x16(&g[i],x);
mult768.c: ^
mult768.c: mult768.c:9:24: note: expanded from macro 'store_x16'
mult768.c: #define store_x16(p,v) _mm256_storeu_si256((int16x16 *) (p),(v))
mult768.c: ...

Number of similar (compiler,implementation) pairs: 1, namely:
CompilerImplementations
clang -mcpu=native -O3 -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE avx800

Compiler output

Implementation: round2
Security model: constbranchindex
Compiler: clang -mcpu=native -O3 -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE
mult768.c: mult768.c:266:7: error: always_inline function '_mm256_set1_epi16' requires target feature 'avx', but would be inlined into function 'crypto_core_multsntrup653_round2_constbranchindex' that is compiled without support for 'avx'
mult768.c: x = const_x16(0);
mult768.c: ^
mult768.c: mult768.c:10:19: note: expanded from macro 'const_x16'
mult768.c: #define const_x16 _mm256_set1_epi16
mult768.c: ^
mult768.c: mult768.c:266:7: error: AVX vector return of type '__m256i' (vector of 4 'long long' values) without 'avx' enabled changes the ABI
mult768.c: mult768.c:10:19: note: expanded from macro 'const_x16'
mult768.c: #define const_x16 _mm256_set1_epi16
mult768.c: ^
mult768.c: mult768.c:267:35: error: always_inline function '_mm256_storeu_si256' requires target feature 'avx', but would be inlined into function 'crypto_core_multsntrup653_round2_constbranchindex' that is compiled without support for 'avx'
mult768.c: for (i = p&~15;i < 768;i += 16) store_x16(&f[i],x);
mult768.c: ^
mult768.c: mult768.c:9:24: note: expanded from macro 'store_x16'
mult768.c: #define store_x16(p,v) _mm256_storeu_si256((int16x16 *) (p),(v))
mult768.c: ^
mult768.c: mult768.c:267:35: error: AVX vector argument of type '__m256i' (vector of 4 'long long' values) without 'avx' enabled changes the ABI
mult768.c: mult768.c:9:24: note: expanded from macro 'store_x16'
mult768.c: #define store_x16(p,v) _mm256_storeu_si256((int16x16 *) (p),(v))
mult768.c: ^
mult768.c: mult768.c:268:35: error: always_inline function '_mm256_storeu_si256' requires target feature 'avx', but would be inlined into function 'crypto_core_multsntrup653_round2_constbranchindex' that is compiled without support for 'avx'
mult768.c: for (i = p&~15;i < 768;i += 16) store_x16(&g[i],x);
mult768.c: ^
mult768.c: mult768.c:9:24: note: expanded from macro 'store_x16'
mult768.c: #define store_x16(p,v) _mm256_storeu_si256((int16x16 *) (p),(v))
mult768.c: ...

Number of similar (compiler,implementation) pairs: 1, namely:
CompilerImplementations
clang -mcpu=native -O3 -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE round2