Implementation notes: amd64, rumba7, crypto_core/mult3sntrup653

Computer: rumba7
Microarchitecture: amd64; Zen (800f11)
Architecture: amd64
CPU ID: AuthenticAMD-00800f11-178bfbff
SUPERCOP version: 20240107
Operation: crypto_core
Primitive: mult3sntrup653
TimeObject sizeTest sizeImplementationCompilerBenchmark dateSUPERCOP version
1477522507 0 035115 844 960avxclang_-march=native_-O3_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2023121220231212
1492419868 0 034835 844 960avx800clang_-march=native_-O3_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2023121220231212
1494915618 0 030203 844 960avx800clang_-march=native_-O2_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2023121220231212
1498018241 0 030467 844 960avxclang_-march=native_-O2_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2023121220231212
1522314971 0 028781 804 992avx800gcc_-march=native_-mtune=native_-O3_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE2023121220231212
1581016564 0 030749 804 992avxgcc_-march=native_-mtune=native_-O3_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE2023121220231212
1684820119 0 033747 844 960round2clang_-march=native_-O3_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2023121220231212
1704415869 0 029115 844 960round2clang_-march=native_-O2_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2023121220231212
1817416444 0 030669 804 992round2gcc_-march=native_-mtune=native_-O3_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE2023121220231212
1856613263 0 025509 836 1024avx800clang_-march=native_-Os_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2023121220231212
1858615176 0 025413 836 1024avxclang_-march=native_-Os_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2023121220231212
1901113929 0 025499 844 928avx800clang_-march=native_-O_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2023121220231212
1902816474 0 025683 844 928avxclang_-march=native_-O_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2023121220231212
1983313844 0 025900 796 992avx800gcc_-march=native_-mtune=native_-O_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE2023121220231212
2043015477 0 027884 796 992avxgcc_-march=native_-mtune=native_-O_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE2023121220231212
2068412340 0 023051 844 928round2clang_-march=native_-O_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2023121220231212
2098811037 0 022741 836 1024round2clang_-march=native_-Os_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2023121220231212
2130113478 0 024440 780 960avx800gcc_-march=native_-mtune=native_-Os_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE2023121220231212
2147313633 0 026213 804 992avx800gcc_-march=native_-mtune=native_-O2_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE2023121220231212
2202115226 0 028181 804 992avxgcc_-march=native_-mtune=native_-O2_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE2023121220231212
2202314267 0 025552 780 960avxgcc_-march=native_-mtune=native_-Os_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE2023121220231212
2257011608 0 024084 796 992round2gcc_-march=native_-mtune=native_-O_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE2023121220231212
2433111359 0 022704 780 960round2gcc_-march=native_-mtune=native_-Os_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE2023121220231212
2441313052 0 026045 804 992round2gcc_-march=native_-mtune=native_-O2_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE2023121220231212
1541714623 0 018837 804 992refgcc_-march=native_-mtune=native_-O3_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE2023121220231212
1673293536 0 018203 844 960refclang_-march=native_-O2_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2023121220231212
1687913536 0 018579 844 960refclang_-march=native_-O3_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2023121220231212
2459194638 0 018611 844 928refclang_-mcpu=native_-O3_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2023121220231212
986855660 0 013661 804 992refgcc_-march=native_-mtune=native_-O2_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE2023121220231212
1028191471 0 012781 836 1024refclang_-march=native_-Os_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2023121220231212
1079784504 0 012051 844 928refclang_-march=native_-O_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2023121220231212
1591745692 0 013164 796 992refgcc_-march=native_-mtune=native_-O_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE2023121220231212
2038109556 0 011904 780 960refgcc_-march=native_-mtune=native_-Os_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE2023121220231212

Compiler output

Implementation: avx
Security model: constbranchindex
Compiler: clang -mcpu=native -O3 -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE
mult768.c: mult768.c:208:7: error: always_inline function '_mm256_set1_epi16' requires target feature 'avx', but would be inlined into function 'crypto_core_mult3sntrup653_avx_constbranchindex' that is compiled without support for 'avx'
mult768.c: x = const_x16(0);
mult768.c: ^
mult768.c: mult768.c:10:19: note: expanded from macro 'const_x16'
mult768.c: #define const_x16 _mm256_set1_epi16
mult768.c: ^
mult768.c: mult768.c:208:7: error: AVX vector return of type '__m256i' (vector of 4 'long long' values) without 'avx' enabled changes the ABI
mult768.c: mult768.c:10:19: note: expanded from macro 'const_x16'
mult768.c: #define const_x16 _mm256_set1_epi16
mult768.c: ^
mult768.c: mult768.c:209:35: error: always_inline function '_mm256_storeu_si256' requires target feature 'avx', but would be inlined into function 'crypto_core_mult3sntrup653_avx_constbranchindex' that is compiled without support for 'avx'
mult768.c: for (i = p&~15;i < 768;i += 16) store_x16(&f[i],x);
mult768.c: ^
mult768.c: mult768.c:9:24: note: expanded from macro 'store_x16'
mult768.c: #define store_x16(p,v) _mm256_storeu_si256((int16x16 *) (p),(v))
mult768.c: ^
mult768.c: mult768.c:209:35: error: AVX vector argument of type '__m256i' (vector of 4 'long long' values) without 'avx' enabled changes the ABI
mult768.c: mult768.c:9:24: note: expanded from macro 'store_x16'
mult768.c: #define store_x16(p,v) _mm256_storeu_si256((int16x16 *) (p),(v))
mult768.c: ^
mult768.c: mult768.c:210:35: error: always_inline function '_mm256_storeu_si256' requires target feature 'avx', but would be inlined into function 'crypto_core_mult3sntrup653_avx_constbranchindex' that is compiled without support for 'avx'
mult768.c: for (i = p&~15;i < 768;i += 16) store_x16(&g[i],x);
mult768.c: ^
mult768.c: mult768.c:9:24: note: expanded from macro 'store_x16'
mult768.c: #define store_x16(p,v) _mm256_storeu_si256((int16x16 *) (p),(v))
mult768.c: ...

Number of similar (compiler,implementation) pairs: 1, namely:
CompilerImplementations
clang -mcpu=native -O3 -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE avx

Compiler output

Implementation: avx800
Security model: constbranchindex
Compiler: clang -mcpu=native -O3 -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE
mult768.c: mult768.c:208:7: error: always_inline function '_mm256_set1_epi16' requires target feature 'avx', but would be inlined into function 'crypto_core_mult3sntrup653_avx800_constbranchindex' that is compiled without support for 'avx'
mult768.c: x = const_x16(0);
mult768.c: ^
mult768.c: mult768.c:10:19: note: expanded from macro 'const_x16'
mult768.c: #define const_x16 _mm256_set1_epi16
mult768.c: ^
mult768.c: mult768.c:208:7: error: AVX vector return of type '__m256i' (vector of 4 'long long' values) without 'avx' enabled changes the ABI
mult768.c: mult768.c:10:19: note: expanded from macro 'const_x16'
mult768.c: #define const_x16 _mm256_set1_epi16
mult768.c: ^
mult768.c: mult768.c:209:35: error: always_inline function '_mm256_storeu_si256' requires target feature 'avx', but would be inlined into function 'crypto_core_mult3sntrup653_avx800_constbranchindex' that is compiled without support for 'avx'
mult768.c: for (i = p&~15;i < 768;i += 16) store_x16(&f[i],x);
mult768.c: ^
mult768.c: mult768.c:9:24: note: expanded from macro 'store_x16'
mult768.c: #define store_x16(p,v) _mm256_storeu_si256((int16x16 *) (p),(v))
mult768.c: ^
mult768.c: mult768.c:209:35: error: AVX vector argument of type '__m256i' (vector of 4 'long long' values) without 'avx' enabled changes the ABI
mult768.c: mult768.c:9:24: note: expanded from macro 'store_x16'
mult768.c: #define store_x16(p,v) _mm256_storeu_si256((int16x16 *) (p),(v))
mult768.c: ^
mult768.c: mult768.c:210:35: error: always_inline function '_mm256_storeu_si256' requires target feature 'avx', but would be inlined into function 'crypto_core_mult3sntrup653_avx800_constbranchindex' that is compiled without support for 'avx'
mult768.c: for (i = p&~15;i < 768;i += 16) store_x16(&g[i],x);
mult768.c: ^
mult768.c: mult768.c:9:24: note: expanded from macro 'store_x16'
mult768.c: #define store_x16(p,v) _mm256_storeu_si256((int16x16 *) (p),(v))
mult768.c: ...

Number of similar (compiler,implementation) pairs: 1, namely:
CompilerImplementations
clang -mcpu=native -O3 -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE avx800

Compiler output

Implementation: round2
Security model: constbranchindex
Compiler: clang -mcpu=native -O3 -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE
mult768.c: mult768.c:208:7: error: always_inline function '_mm256_set1_epi16' requires target feature 'avx', but would be inlined into function 'crypto_core_mult3sntrup653_round2_constbranchindex' that is compiled without support for 'avx'
mult768.c: x = const_x16(0);
mult768.c: ^
mult768.c: mult768.c:10:19: note: expanded from macro 'const_x16'
mult768.c: #define const_x16 _mm256_set1_epi16
mult768.c: ^
mult768.c: mult768.c:208:7: error: AVX vector return of type '__m256i' (vector of 4 'long long' values) without 'avx' enabled changes the ABI
mult768.c: mult768.c:10:19: note: expanded from macro 'const_x16'
mult768.c: #define const_x16 _mm256_set1_epi16
mult768.c: ^
mult768.c: mult768.c:209:35: error: always_inline function '_mm256_storeu_si256' requires target feature 'avx', but would be inlined into function 'crypto_core_mult3sntrup653_round2_constbranchindex' that is compiled without support for 'avx'
mult768.c: for (i = p&~15;i < 768;i += 16) store_x16(&f[i],x);
mult768.c: ^
mult768.c: mult768.c:9:24: note: expanded from macro 'store_x16'
mult768.c: #define store_x16(p,v) _mm256_storeu_si256((int16x16 *) (p),(v))
mult768.c: ^
mult768.c: mult768.c:209:35: error: AVX vector argument of type '__m256i' (vector of 4 'long long' values) without 'avx' enabled changes the ABI
mult768.c: mult768.c:9:24: note: expanded from macro 'store_x16'
mult768.c: #define store_x16(p,v) _mm256_storeu_si256((int16x16 *) (p),(v))
mult768.c: ^
mult768.c: mult768.c:210:35: error: always_inline function '_mm256_storeu_si256' requires target feature 'avx', but would be inlined into function 'crypto_core_mult3sntrup653_round2_constbranchindex' that is compiled without support for 'avx'
mult768.c: for (i = p&~15;i < 768;i += 16) store_x16(&g[i],x);
mult768.c: ^
mult768.c: mult768.c:9:24: note: expanded from macro 'store_x16'
mult768.c: #define store_x16(p,v) _mm256_storeu_si256((int16x16 *) (p),(v))
mult768.c: ...

Number of similar (compiler,implementation) pairs: 1, namely:
CompilerImplementations
clang -mcpu=native -O3 -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE round2