Implementation notes: amd64, panther, crypto_core/multsntrup953

Computer: panther
Microarchitecture: amd64; Tiger Lake (806c1)
Architecture: amd64
CPU ID: GenuineIntel-000806c1-00-bfebfbff
SUPERCOP version: 20240716
Operation: crypto_core
Primitive: multsntrup953
TimeObject sizeTest sizeImplementationCompilerBenchmark dateSUPERCOP version
1661424126 0 039085 828 952avx800clang_-march=native_-O3_-fwrapv_-Qunused-arguments_-fPIC_-fPIE_-gdwarf-4_-Wall2024071620240716
1661523614 0 038557 828 952avx800clang_-march=native_-O2_-fwrapv_-Qunused-arguments_-fPIC_-fPIE_-gdwarf-4_-Wall2024071620240716
1677823942 0 038885 828 952avxclang_-march=native_-O2_-fwrapv_-Qunused-arguments_-fPIC_-fPIE_-gdwarf-4_-Wall2024071620240716
1681624454 0 039413 828 952avxclang_-march=native_-O3_-fwrapv_-Qunused-arguments_-fPIC_-fPIE_-gdwarf-4_-Wall2024071620240716
1691820617 0 032443 820 920avx800clang_-march=native_-Os_-fwrapv_-Qunused-arguments_-fPIC_-fPIE_-gdwarf-4_-Wall2024071620240716
1717520793 0 032611 820 920avxclang_-march=native_-Os_-fwrapv_-Qunused-arguments_-fPIC_-fPIE_-gdwarf-4_-Wall2024071620240716
1797822095 0 036800 780 984avx800gcc_-march=native_-mtune=native_-O3_-fwrapv_-fPIC_-fPIE_-gdwarf-4_-Wall2024071620240716
1845422511 0 037216 780 984avxgcc_-march=native_-mtune=native_-O3_-fwrapv_-fPIC_-fPIE_-gdwarf-4_-Wall2024071620240716
2219420031 0 031295 764 952avx800gcc_-march=native_-mtune=native_-Os_-fwrapv_-fPIC_-fPIE_-gdwarf-4_-Wall2024071620240716
2263820944 0 033632 780 984avx800gcc_-march=native_-mtune=native_-O2_-fwrapv_-fPIC_-fPIE_-gdwarf-4_-Wall2024071620240716
2278520750 0 032015 764 952avxgcc_-march=native_-mtune=native_-Os_-fwrapv_-fPIC_-fPIE_-gdwarf-4_-Wall2024071620240716
2290121321 0 033467 820 920avx800clang_-march=native_-O_-fwrapv_-Qunused-arguments_-fPIC_-fPIE_-gdwarf-4_-Wall2024071620240716
2314621360 0 034048 780 984avxgcc_-march=native_-mtune=native_-O2_-fwrapv_-fPIC_-fPIE_-gdwarf-4_-Wall2024071620240716
2543320712 0 032887 772 984avx800gcc_-march=native_-mtune=native_-O_-fwrapv_-fPIC_-fPIE_-gdwarf-4_-Wall2024071620240716
2599821211 0 033383 772 984avxgcc_-march=native_-mtune=native_-O_-fwrapv_-fPIC_-fPIE_-gdwarf-4_-Wall2024071620240716
7956431374 0 043555 820 920avxclang_-march=native_-O_-fwrapv_-Qunused-arguments_-fPIC_-fPIE_-gdwarf-4_-Wall2024071620240716
1898485066 0 019416 780 984refgcc_-march=native_-mtune=native_-O3_-fwrapv_-fPIC_-fPIE_-gdwarf-4_-Wall2024071620240716
3502302311 0 016925 828 952refclang_-march=native_-O2_-fwrapv_-Qunused-arguments_-fPIC_-fPIE_-gdwarf-4_-Wall2024071620240716
3593113111 0 017741 828 952refclang_-march=native_-O3_-fwrapv_-Qunused-arguments_-fPIC_-fPIE_-gdwarf-4_-Wall2024071620240716
6342952277 0 016269 828 920refclang_-mcpu=native_-O3_-fwrapv_-Qunused-arguments_-fPIC_-fPIE_-gdwarf-4_-Wall2024071620240716
1872075576 0 012195 820 920refclang_-march=native_-O_-fwrapv_-Qunused-arguments_-fPIC_-fPIE_-gdwarf-4_-Wall2024071620240716
1886230612 0 012872 780 984refgcc_-march=native_-mtune=native_-O2_-fwrapv_-fPIC_-fPIE_-gdwarf-4_-Wall2024071620240716
18955261340 0 012787 820 920refclang_-march=native_-Os_-fwrapv_-Qunused-arguments_-fPIC_-fPIE_-gdwarf-4_-Wall2024071620240716
1977322537 0 012311 772 984refgcc_-march=native_-mtune=native_-O_-fwrapv_-fPIC_-fPIE_-gdwarf-4_-Wall2024071620240716
2213291460 0 011295 764 952refgcc_-march=native_-mtune=native_-Os_-fwrapv_-fPIC_-fPIE_-gdwarf-4_-Wall2024071620240716

Compiler output


mult1024.c: mult1024.c:307:7: error: always_inline function '_mm256_set1_epi16' requires target feature 'avx', but would be inlined into function 'crypto_core_multsntrup953_avx_constbranchindex' that is compiled without support for 'avx'
mult1024.c:   x = const_x16(0);
mult1024.c:       ^
mult1024.c: mult1024.c:10:19: note: expanded from macro 'const_x16'
mult1024.c: #define const_x16 _mm256_set1_epi16
mult1024.c:                   ^
mult1024.c: mult1024.c:307:7: error: AVX vector return of type '__m256i' (vector of 4 'long long' values) without 'avx' enabled changes the ABI
mult1024.c: mult1024.c:10:19: note: expanded from macro 'const_x16'
mult1024.c: #define const_x16 _mm256_set1_epi16
mult1024.c:                   ^
mult1024.c: mult1024.c:308:36: error: always_inline function '_mm256_storeu_si256' requires target feature 'avx', but would be inlined into function 'crypto_core_multsntrup953_avx_constbranchindex' that is compiled without support for 'avx'
mult1024.c:   for (i = p&~15;i < 1024;i += 16) store_x16(&f[i],x);
mult1024.c:                                    ^
mult1024.c: mult1024.c:9:24: note: expanded from macro 'store_x16'
mult1024.c: #define store_x16(p,v) _mm256_storeu_si256((int16x16 *) (p),(v))
mult1024.c:                        ^
mult1024.c: mult1024.c:308:36: error: AVX vector argument of type '__m256i' (vector of 4 'long long' values) without 'avx' enabled changes the ABI
mult1024.c: mult1024.c:9:24: note: expanded from macro 'store_x16'
mult1024.c: #define store_x16(p,v) _mm256_storeu_si256((int16x16 *) (p),(v))
mult1024.c:                        ^
mult1024.c: mult1024.c:309:36: error: always_inline function '_mm256_storeu_si256' requires target feature 'avx', but would be inlined into function 'crypto_core_multsntrup953_avx_constbranchindex' that is compiled without support for 'avx'
mult1024.c:   for (i = p&~15;i < 1024;i += 16) store_x16(&g[i],x);
mult1024.c:                                    ^
mult1024.c: mult1024.c:9:24: note: expanded from macro 'store_x16'
mult1024.c: #define store_x16(p,v) _mm256_storeu_si256((int16x16 *) (p),(v))
mult1024.c: ...

Number of similar (implementation,compiler) pairs: 1, namely:
ImplementationCompiler
avxclang -mcpu=native -O3 -fwrapv -Qunused-arguments -fPIC -fPIE -gdwarf-4 -Wall (Debian_Clang_11.0.1)

Compiler output


mult1024.c: mult1024.c:307:7: error: always_inline function '_mm256_set1_epi16' requires target feature 'avx', but would be inlined into function 'crypto_core_multsntrup953_avx800_constbranchindex' that is compiled without support for 'avx'
mult1024.c:   x = const_x16(0);
mult1024.c:       ^
mult1024.c: mult1024.c:10:19: note: expanded from macro 'const_x16'
mult1024.c: #define const_x16 _mm256_set1_epi16
mult1024.c:                   ^
mult1024.c: mult1024.c:307:7: error: AVX vector return of type '__m256i' (vector of 4 'long long' values) without 'avx' enabled changes the ABI
mult1024.c: mult1024.c:10:19: note: expanded from macro 'const_x16'
mult1024.c: #define const_x16 _mm256_set1_epi16
mult1024.c:                   ^
mult1024.c: mult1024.c:308:36: error: always_inline function '_mm256_storeu_si256' requires target feature 'avx', but would be inlined into function 'crypto_core_multsntrup953_avx800_constbranchindex' that is compiled without support for 'avx'
mult1024.c:   for (i = p&~15;i < 1024;i += 16) store_x16(&f[i],x);
mult1024.c:                                    ^
mult1024.c: mult1024.c:9:24: note: expanded from macro 'store_x16'
mult1024.c: #define store_x16(p,v) _mm256_storeu_si256((int16x16 *) (p),(v))
mult1024.c:                        ^
mult1024.c: mult1024.c:308:36: error: AVX vector argument of type '__m256i' (vector of 4 'long long' values) without 'avx' enabled changes the ABI
mult1024.c: mult1024.c:9:24: note: expanded from macro 'store_x16'
mult1024.c: #define store_x16(p,v) _mm256_storeu_si256((int16x16 *) (p),(v))
mult1024.c:                        ^
mult1024.c: mult1024.c:309:36: error: always_inline function '_mm256_storeu_si256' requires target feature 'avx', but would be inlined into function 'crypto_core_multsntrup953_avx800_constbranchindex' that is compiled without support for 'avx'
mult1024.c:   for (i = p&~15;i < 1024;i += 16) store_x16(&g[i],x);
mult1024.c:                                    ^
mult1024.c: mult1024.c:9:24: note: expanded from macro 'store_x16'
mult1024.c: #define store_x16(p,v) _mm256_storeu_si256((int16x16 *) (p),(v))
mult1024.c: ...

Number of similar (implementation,compiler) pairs: 1, namely:
ImplementationCompiler
avx800clang -mcpu=native -O3 -fwrapv -Qunused-arguments -fPIC -fPIE -gdwarf-4 -Wall (Debian_Clang_11.0.1)

TIMECOP error (can be valgrind bug)


error 111

Process terminating with default action of signal 4 (SIGILL)
 Illegal opcode at address 0x402370
   at 0x...: st32 (try-anything.c:47)
   by 0x...: core (try-anything.c:78)
   by 0x...: salsa20 (try-anything.c:89)
   by 0x...: canary (try-anything.c:148)
   by 0x...: output_prepare (try-anything.c:178)
   by 0x...: test (try.c:99)
   by 0x...: main (try-anything.c:345)

Number of similar (implementation,compiler) pairs: 3, namely:
ImplementationCompiler
avxclang -march=native -O2 -fwrapv -Qunused-arguments -fPIC -fPIE -gdwarf-4 -Wall (Debian_Clang_11.0.1)
avx800clang -march=native -O2 -fwrapv -Qunused-arguments -fPIC -fPIE -gdwarf-4 -Wall (Debian_Clang_11.0.1)
refclang -march=native -O2 -fwrapv -Qunused-arguments -fPIC -fPIE -gdwarf-4 -Wall (Debian_Clang_11.0.1)

TIMECOP error (can be valgrind bug)


error 111

Process terminating with default action of signal 4 (SIGILL)
 Illegal opcode at address 0x402EDF
   at 0x...: crypto_core_multsntrup953_avx_constbranchindex (mult1024.c:321)
   by 0x...: test (try.c:106)
   by 0x...: main (try-anything.c:345)

Number of similar (implementation,compiler) pairs: 1, namely:
ImplementationCompiler
avxclang -march=native -O3 -fwrapv -Qunused-arguments -fPIC -fPIE -gdwarf-4 -Wall (Debian_Clang_11.0.1)

TIMECOP error (can be valgrind bug)


error 111

Process terminating with default action of signal 4 (SIGILL)
 Illegal opcode at address 0x401DC2
   at 0x...: core (try-anything.c:73)
   by 0x...: salsa20 (try-anything.c:89)
   by 0x...: canary (try-anything.c:148)
   by 0x...: output_prepare (try-anything.c:178)
   by 0x...: test (try.c:99)
   by 0x...: main (try-anything.c:345)

Number of similar (implementation,compiler) pairs: 3, namely:
ImplementationCompiler
avxclang -march=native -Os -fwrapv -Qunused-arguments -fPIC -fPIE -gdwarf-4 -Wall (Debian_Clang_11.0.1)
avx800clang -march=native -Os -fwrapv -Qunused-arguments -fPIC -fPIE -gdwarf-4 -Wall (Debian_Clang_11.0.1)
refclang -march=native -Os -fwrapv -Qunused-arguments -fPIC -fPIE -gdwarf-4 -Wall (Debian_Clang_11.0.1)

TIMECOP error (can be valgrind bug)


error 111

Process terminating with default action of signal 4 (SIGILL)
 Illegal opcode at address 0x10A009
   at 0x...: salsa20.part.0 (try-anything.c:102)
   by 0x...: salsa20 (try-anything.c:85)
   by 0x...: canary (try-anything.c:148)
   by 0x...: output_prepare (try-anything.c:178)
   by 0x...: test (try.c:99)
   by 0x...: main (try-anything.c:345)

Number of similar (implementation,compiler) pairs: 3, namely:
ImplementationCompiler
avxgcc -march=native -mtune=native -O2 -fwrapv -fPIC -fPIE -gdwarf-4 -Wall (10.2.1_20210110)
avx800gcc -march=native -mtune=native -O2 -fwrapv -fPIC -fPIE -gdwarf-4 -Wall (10.2.1_20210110)
refgcc -march=native -mtune=native -O2 -fwrapv -fPIC -fPIE -gdwarf-4 -Wall (10.2.1_20210110)

TIMECOP error (can be valgrind bug)


error 111

Process terminating with default action of signal 4 (SIGILL)
 Illegal opcode at address 0x109EAA
   at 0x...: st32 (try-anything.c:47)
   by 0x...: core (try-anything.c:78)
   by 0x...: salsa20.part.0 (try-anything.c:89)
   by 0x...: salsa20 (try-anything.c:85)
   by 0x...: canary (try-anything.c:148)
   by 0x...: output_prepare (try-anything.c:178)
   by 0x...: test (try.c:99)
   by 0x...: main (try-anything.c:345)

Number of similar (implementation,compiler) pairs: 3, namely:
ImplementationCompiler
avxgcc -march=native -mtune=native -O3 -fwrapv -fPIC -fPIE -gdwarf-4 -Wall (10.2.1_20210110)
avx800gcc -march=native -mtune=native -O3 -fwrapv -fPIC -fPIE -gdwarf-4 -Wall (10.2.1_20210110)
refgcc -march=native -mtune=native -O3 -fwrapv -fPIC -fPIE -gdwarf-4 -Wall (10.2.1_20210110)

TIMECOP error (can be valgrind bug)


error 111

Process terminating with default action of signal 4 (SIGILL)
 Illegal opcode at address 0x10B1FC
   at 0x...: _mm256_add_epi16 (avx2intrin.h:114)
   by 0x...: add_x16 (ntt.c:446)
   by 0x...: ntt512 (ntt.c:999)
   by 0x...: crypto_core_multsntrup953_avx_constbranchindex_ntt512_7681 (ntt.c:1553)
   by 0x...: mult1024 (mult1024.c:166)
   by 0x...: crypto_core_multsntrup953_avx_constbranchindex (mult1024.c:324)
   by 0x...: test (try.c:106)
   by 0x...: main (try-anything.c:345)

Number of similar (implementation,compiler) pairs: 1, namely:
ImplementationCompiler
avxgcc -march=native -mtune=native -O -fwrapv -fPIC -fPIE -gdwarf-4 -Wall (10.2.1_20210110)

TIMECOP error (can be valgrind bug)


error 111

Process terminating with default action of signal 4 (SIGILL)
 Illegal opcode at address 0x10ABAC
   at 0x...: _mm256_sub_epi16 (avx2intrin.h:810)
   by 0x...: sub_x16 (ntt.c:451)
   by 0x...: ntt512 (ntt.c:985)
   by 0x...: mult1024 (mult1024.c:166)
   by 0x...: crypto_core_multsntrup953_avx_constbranchindex (mult1024.c:324)
   by 0x...: test (try.c:106)
   by 0x...: main (try-anything.c:345)

Number of similar (implementation,compiler) pairs: 1, namely:
ImplementationCompiler
avxgcc -march=native -mtune=native -Os -fwrapv -fPIC -fPIE -gdwarf-4 -Wall (10.2.1_20210110)

TIMECOP error (can be valgrind bug)


error 111

Process terminating with default action of signal 4 (SIGILL)
 Illegal opcode at address 0x402EDF
   at 0x...: crypto_core_multsntrup953_avx800_constbranchindex (mult1024.c:321)
   by 0x...: test (try.c:106)
   by 0x...: main (try-anything.c:345)

Number of similar (implementation,compiler) pairs: 1, namely:
ImplementationCompiler
avx800clang -march=native -O3 -fwrapv -Qunused-arguments -fPIC -fPIE -gdwarf-4 -Wall (Debian_Clang_11.0.1)

TIMECOP error (can be valgrind bug)


error 111

Process terminating with default action of signal 4 (SIGILL)
 Illegal opcode at address 0x10A747
   at 0x...: _mm256_mulhrs_epi16 (avx2intrin.h:533)
   by 0x...: squeeze_7681_x16 (mult1024.c:25)
   by 0x...: mult1024 (mult1024.c:169)
   by 0x...: crypto_core_multsntrup953_avx800_constbranchindex (mult1024.c:324)
   by 0x...: test (try.c:106)
   by 0x...: main (try-anything.c:345)

Number of similar (implementation,compiler) pairs: 1, namely:
ImplementationCompiler
avx800gcc -march=native -mtune=native -O -fwrapv -fPIC -fPIE -gdwarf-4 -Wall (10.2.1_20210110)

TIMECOP error (can be valgrind bug)


error 111

Process terminating with default action of signal 4 (SIGILL)
 Illegal opcode at address 0x10A481
   at 0x...: _mm256_mulhrs_epi16 (avx2intrin.h:533)
   by 0x...: squeeze_7681_x16 (mult1024.c:25)
   by 0x...: mult1024 (mult1024.c:173)
   by 0x...: crypto_core_multsntrup953_avx800_constbranchindex (mult1024.c:324)
   by 0x...: test (try.c:106)
   by 0x...: main (try-anything.c:345)

Number of similar (implementation,compiler) pairs: 1, namely:
ImplementationCompiler
avx800gcc -march=native -mtune=native -Os -fwrapv -fPIC -fPIE -gdwarf-4 -Wall (10.2.1_20210110)

TIMECOP error (can be valgrind bug)


error 111

Process terminating with default action of signal 4 (SIGILL)
 Illegal opcode at address 0x402DC4
   at 0x...: crypto_core_multsntrup953_ref_constbranchindex (mult.c:33)
   by 0x...: test (try.c:106)
   by 0x...: main (try-anything.c:345)

Number of similar (implementation,compiler) pairs: 1, namely:
ImplementationCompiler
refclang -march=native -O3 -fwrapv -Qunused-arguments -fPIC -fPIE -gdwarf-4 -Wall (Debian_Clang_11.0.1)

Passed TIMECOP


TIMECOP iterations: 1

Number of similar (implementation,compiler) pairs: 6, namely:
ImplementationCompiler
avxclang -march=native -O -fwrapv -Qunused-arguments -fPIC -fPIE -gdwarf-4 -Wall (Debian_Clang_11.0.1)
avx800clang -march=native -O -fwrapv -Qunused-arguments -fPIC -fPIE -gdwarf-4 -Wall (Debian_Clang_11.0.1)
refclang -march=native -O -fwrapv -Qunused-arguments -fPIC -fPIE -gdwarf-4 -Wall (Debian_Clang_11.0.1)
refclang -mcpu=native -O3 -fwrapv -Qunused-arguments -fPIC -fPIE -gdwarf-4 -Wall (Debian_Clang_11.0.1)
refgcc -march=native -mtune=native -O -fwrapv -fPIC -fPIE -gdwarf-4 -Wall (10.2.1_20210110)
refgcc -march=native -mtune=native -Os -fwrapv -fPIC -fPIE -gdwarf-4 -Wall (10.2.1_20210110)