Implementation notes: amd64, saber214, crypto_core/invsntrup1013

Computer: saber214
Microarchitecture: amd64; Bulldozer (600f20)
Architecture: amd64
CPU ID: AuthenticAMD-00600f20-1789c3f5
SUPERCOP version: 20240625
Operation: crypto_core
Primitive: invsntrup1013
Time       Object size  Test size      Implementation  Compiler                                                                        Benchmark date  SUPERCOP version
23273044   5222 0 0     18520 784 832  ref             gcc_-march=native_-mtune=native_-O3_-fwrapv_-fPIC_-fPIE_-gdwarf-4_-Wall         20240626        20240625
75139023   2113 0 0     16934 824 776  ref             clang_-march=native_-O3_-fwrapv_-Qunused-arguments_-fPIC_-fPIE_-gdwarf-4_-Wall  20240626        20240625
75164722   2113 0 0     14542 824 760  ref             clang_-march=native_-O2_-fwrapv_-Qunused-arguments_-fPIC_-fPIE_-gdwarf-4_-Wall  20240626        20240625
96501260   3648 0 0     17182 824 760  ref             clang_-mcpu=native_-O3_-fwrapv_-Qunused-arguments_-fPIC_-fPIE_-gdwarf-4_-Wall   20240626        20240625
115925769  1064 0 0     11840 816 760  ref             clang_-march=native_-Os_-fwrapv_-Qunused-arguments_-fPIC_-fPIE_-gdwarf-4_-Wall  20240626        20240625
116820142  1169 0 0     12302 824 760  ref             clang_-march=native_-O_-fwrapv_-Qunused-arguments_-fPIC_-fPIE_-gdwarf-4_-Wall   20240626        20240625
124652293  1055 0 0     12912 784 832  ref             gcc_-march=native_-mtune=native_-O2_-fwrapv_-fPIC_-fPIE_-gdwarf-4_-Wall         20240626        20240625
134439140  1162 0 0     12757 768 832  ref             gcc_-march=native_-mtune=native_-O_-fwrapv_-fPIC_-fPIE_-gdwarf-4_-Wall          20240626        20240625
135876110  947 0 0      11567 768 800  ref             gcc_-march=native_-mtune=native_-Os_-fwrapv_-fPIC_-fPIE_-gdwarf-4_-Wall         20240626        20240625

Compiler output


recip.c: recip.c:96:23: error: always_inline function '_mm256_mullo_epi16' requires target feature 'avx2', but would be inlined into function 'vectormodq_swapeliminate' that is compiled without support for 'avx2'
recip.c:   __m256i f0vecqinv = _mm256_mullo_epi16(f0vec,qinvvec);
recip.c:                       ^
recip.c: recip.c:97:23: error: always_inline function '_mm256_mullo_epi16' requires target feature 'avx2', but would be inlined into function 'vectormodq_swapeliminate' that is compiled without support for 'avx2'
recip.c:   __m256i g0vecqinv = _mm256_mullo_epi16(g0vec,qinvvec);
recip.c:                       ^
recip.c: recip.c:103:21: error: always_inline function '_mm256_blendv_epi8' requires target feature 'avx2', but would be inlined into function 'vectormodq_swapeliminate' that is compiled without support for 'avx2'
recip.c:     __m256i finew = _mm256_blendv_epi8(fi,gi,maskvec);
recip.c:                     ^
recip.c: recip.c:104:21: error: always_inline function '_mm256_blendv_epi8' requires target feature 'avx2', but would be inlined into function 'vectormodq_swapeliminate' that is compiled without support for 'avx2'
recip.c:     __m256i ginew = _mm256_blendv_epi8(gi,fi,maskvec);
recip.c:                     ^
recip.c: recip.c:105:13: error: always_inline function '_mm256_sub_epi16' requires target feature 'avx2', but would be inlined into function 'vectormodq_swapeliminate' that is compiled without support for 'avx2'
recip.c:     ginew = _mm256_sub_epi16(montproduct(ginew,f0vec,f0vecqinv),montproduct(finew,g0vec,g0vecqinv));
recip.c:             ^
recip.c: recip.c:86:7: error: always_inline function '_mm256_mullo_epi16' requires target feature 'avx2', but would be inlined into function 'montproduct' that is compiled without support for 'avx2'
recip.c:   d = _mm256_mullo_epi16(x,yqinv);
recip.c:       ^
recip.c: recip.c:87:8: error: always_inline function '_mm256_mulhi_epi16' requires target feature 'avx2', but would be inlined into function 'montproduct' that is compiled without support for 'avx2'
recip.c:   hi = _mm256_mulhi_epi16(x,y);
recip.c:        ^
recip.c: recip.c:88:7: error: always_inline function '_mm256_mulhi_epi16' requires target feature 'avx2', but would be inlined into function 'montproduct' that is compiled without support for 'avx2'
recip.c:   e = _mm256_mulhi_epi16(d,qvec);
recip.c:       ^
recip.c: recip.c:89:10: error: always_inline function '_mm256_sub_epi16' requires target feature 'avx2', but would be inlined into function 'montproduct' that is compiled without support for 'avx2'
recip.c: ...

Number of similar (implementation,compiler) pairs: 4, namely:
Implementation  Compiler
avx             clang -march=native -O2 -fwrapv -Qunused-arguments -fPIC -fPIE -gdwarf-4 -Wall (Ubuntu_Clang_14.0.0)
avx             clang -march=native -O3 -fwrapv -Qunused-arguments -fPIC -fPIE -gdwarf-4 -Wall (Ubuntu_Clang_14.0.0)
avx             clang -march=native -O -fwrapv -Qunused-arguments -fPIC -fPIE -gdwarf-4 -Wall (Ubuntu_Clang_14.0.0)
avx             clang -march=native -Os -fwrapv -Qunused-arguments -fPIC -fPIE -gdwarf-4 -Wall (Ubuntu_Clang_14.0.0)

Compiler output


recip.c: recip.c:94:19: error: always_inline function '_mm256_set1_epi16' requires target feature 'avx', but would be inlined into function 'vectormodq_swapeliminate' that is compiled without support for 'avx'
recip.c:   __m256i f0vec = _mm256_set1_epi16(f0);
recip.c:                   ^
recip.c: recip.c:94:19: error: AVX vector return of type '__m256i' (vector of 4 'long long' values) without 'avx' enabled changes the ABI
recip.c: recip.c:95:19: error: always_inline function '_mm256_set1_epi16' requires target feature 'avx', but would be inlined into function 'vectormodq_swapeliminate' that is compiled without support for 'avx'
recip.c:   __m256i g0vec = _mm256_set1_epi16(g0);
recip.c:                   ^
recip.c: recip.c:95:19: error: AVX vector return of type '__m256i' (vector of 4 'long long' values) without 'avx' enabled changes the ABI
recip.c: recip.c:96:48: error: always_inline function '_mm256_set1_epi16' requires target feature 'avx', but would be inlined into function 'vectormodq_swapeliminate' that is compiled without support for 'avx'
recip.c:   __m256i f0vecqinv = _mm256_mullo_epi16(f0vec,qinvvec);
recip.c:                                                ^
recip.c: recip.c:80:17: note: expanded from macro 'qinvvec'
recip.c: #define qinvvec _mm256_set1_epi16(qinv)
recip.c:                 ^
recip.c: recip.c:96:48: error: AVX vector return of type '__m256i' (vector of 4 'long long' values) without 'avx' enabled changes the ABI
recip.c: recip.c:80:17: note: expanded from macro 'qinvvec'
recip.c: #define qinvvec _mm256_set1_epi16(qinv)
recip.c:                 ^
recip.c: recip.c:96:23: error: always_inline function '_mm256_mullo_epi16' requires target feature 'avx2', but would be inlined into function 'vectormodq_swapeliminate' that is compiled without support for 'avx2'
recip.c:   __m256i f0vecqinv = _mm256_mullo_epi16(f0vec,qinvvec);
recip.c:                       ^
recip.c: recip.c:96:23: error: AVX vector argument of type '__m256i' (vector of 4 'long long' values) without 'avx' enabled changes the ABI
recip.c: recip.c:97:48: error: always_inline function '_mm256_set1_epi16' requires target feature 'avx', but would be inlined into function 'vectormodq_swapeliminate' that is compiled without support for 'avx'
recip.c:   __m256i g0vecqinv = _mm256_mullo_epi16(g0vec,qinvvec);
recip.c:                                                ^
recip.c: ...

Number of similar (implementation,compiler) pairs: 1, namely:
Implementation  Compiler
avx             clang -mcpu=native -O3 -fwrapv -Qunused-arguments -fPIC -fPIE -gdwarf-4 -Wall (Ubuntu_Clang_14.0.0)

Compiler output


recip.c: In file included from /usr/lib/gcc/x86_64-linux-gnu/11/include/immintrin.h:47,
recip.c:                  from recip.c:1:
recip.c: recip.c: In function 'montproduct':
recip.c: /usr/lib/gcc/x86_64-linux-gnu/11/include/avx2intrin.h:808:1: error: inlining failed in call to 'always_inline' '_mm256_sub_epi16': target specific option mismatch
recip.c:   808 | _mm256_sub_epi16 (__m256i __A, __m256i __B)
recip.c:       | ^~~~~~~~~~~~~~~~
recip.c: recip.c:89:10: note: called from here
recip.c:    89 |   return _mm256_sub_epi16(hi,e);
recip.c:       |          ^~~~~~~~~~~~~~~~~~~~~~
recip.c: In file included from /usr/lib/gcc/x86_64-linux-gnu/11/include/immintrin.h:47,
recip.c:                  from recip.c:1:
recip.c: /usr/lib/gcc/x86_64-linux-gnu/11/include/avx2intrin.h:546:1: error: inlining failed in call to 'always_inline' '_mm256_mulhi_epi16': target specific option mismatch
recip.c:   546 | _mm256_mulhi_epi16 (__m256i __A, __m256i __B)
recip.c:       | ^~~~~~~~~~~~~~~~~~
recip.c: recip.c:88:7: note: called from here
recip.c:    88 |   e = _mm256_mulhi_epi16(d,qvec);
recip.c:       |       ^~~~~~~~~~~~~~~~~~~~~~~~~~
recip.c: In file included from /usr/lib/gcc/x86_64-linux-gnu/11/include/immintrin.h:47,
recip.c:                  from recip.c:1:
recip.c: /usr/lib/gcc/x86_64-linux-gnu/11/include/avx2intrin.h:546:1: error: inlining failed in call to 'always_inline' '_mm256_mulhi_epi16': target specific option mismatch
recip.c:   546 | _mm256_mulhi_epi16 (__m256i __A, __m256i __B)
recip.c:       | ^~~~~~~~~~~~~~~~~~
recip.c: recip.c:87:8: note: called from here
recip.c:    87 |   hi = _mm256_mulhi_epi16(x,y);
recip.c:       |        ^~~~~~~~~~~~~~~~~~~~~~~
recip.c: ...

Number of similar (implementation,compiler) pairs: 4, namely:
Implementation  Compiler
avx             gcc -march=native -mtune=native -O2 -fwrapv -fPIC -fPIE -gdwarf-4 -Wall (11.4.0)
avx             gcc -march=native -mtune=native -O3 -fwrapv -fPIC -fPIE -gdwarf-4 -Wall (11.4.0)
avx             gcc -march=native -mtune=native -O -fwrapv -fPIC -fPIE -gdwarf-4 -Wall (11.4.0)
avx             gcc -march=native -mtune=native -Os -fwrapv -fPIC -fPIE -gdwarf-4 -Wall (11.4.0)

TIMECOP error (can be valgrind bug)


Process terminating with default action of signal 4 (SIGILL)
 Illegal opcode at address 0x10ABD5
   at 0x...: int16_negative_mask (recip.c:35)
   by 0x...: crypto_core_invsntrup1013_ref_constbranchindex (recip.c:95)
   by 0x...: test (try.c:106)
   by 0x...: main (try-anything.c:345)

Number of similar (implementation,compiler) pairs: 1, namely:
Implementation  Compiler
ref             clang -march=native -O2 -fwrapv -Qunused-arguments -fPIC -fPIE -gdwarf-4 -Wall (Ubuntu_Clang_14.0.0)

TIMECOP error (can be valgrind bug)


Process terminating with default action of signal 4 (SIGILL)
 Illegal opcode at address 0x10AC75
   at 0x...: int16_negative_mask (recip.c:35)
   by 0x...: crypto_core_invsntrup1013_ref_constbranchindex (recip.c:95)
   by 0x...: test (try.c:106)
   by 0x...: main (try-anything.c:345)

Number of similar (implementation,compiler) pairs: 1, namely:
Implementation  Compiler
ref             clang -march=native -O3 -fwrapv -Qunused-arguments -fPIC -fPIE -gdwarf-4 -Wall (Ubuntu_Clang_14.0.0)

TIMECOP error (can be valgrind bug)


Process terminating with default action of signal 4 (SIGILL)
 Illegal opcode at address 0x10A603
   at 0x...: int16_negative_mask (recip.c:35)
   by 0x...: crypto_core_invsntrup1013_ref_constbranchindex (recip.c:95)
   by 0x...: test (try.c:106)
   by 0x...: main (try-anything.c:345)

Number of similar (implementation,compiler) pairs: 1, namely:
Implementation  Compiler
ref             clang -march=native -O -fwrapv -Qunused-arguments -fPIC -fPIE -gdwarf-4 -Wall (Ubuntu_Clang_14.0.0)

TIMECOP error (can be valgrind bug)


Process terminating with default action of signal 4 (SIGILL)
 Illegal opcode at address 0x10A34A
   at 0x...: int16_negative_mask (recip.c:35)
   by 0x...: crypto_core_invsntrup1013_ref_constbranchindex (recip.c:95)
   by 0x...: test (try.c:106)
   by 0x...: main (try-anything.c:345)

Number of similar (implementation,compiler) pairs: 1, namely:
Implementation  Compiler
ref             clang -march=native -Os -fwrapv -Qunused-arguments -fPIC -fPIE -gdwarf-4 -Wall (Ubuntu_Clang_14.0.0)

TIMECOP error (can be valgrind bug)


Process terminating with default action of signal 4 (SIGILL)
 Illegal opcode at address 0x109FFD
   at 0x...: salsa20.part.0 (try-anything.c:102)
   by 0x...: salsa20 (try-anything.c:85)
   by 0x...: canary (try-anything.c:148)
   by 0x...: output_prepare (try-anything.c:178)
   by 0x...: test (try.c:99)
   by 0x...: main (try-anything.c:345)

Number of similar (implementation,compiler) pairs: 1, namely:
Implementation  Compiler
ref             gcc -march=native -mtune=native -O2 -fwrapv -fPIC -fPIE -gdwarf-4 -Wall (11.4.0)

TIMECOP error (can be valgrind bug)


Process terminating with default action of signal 4 (SIGILL)
 Illegal opcode at address 0x109C4A
   at 0x...: core (try-anything.c:53)
   by 0x...: salsa20.part.0 (try-anything.c:89)
   by 0x...: salsa20 (try-anything.c:85)
   by 0x...: canary (try-anything.c:148)
   by 0x...: output_prepare (try-anything.c:178)
   by 0x...: test (try.c:99)
   by 0x...: main (try-anything.c:345)

Number of similar (implementation,compiler) pairs: 1, namely:
Implementation  Compiler
ref             gcc -march=native -mtune=native -O3 -fwrapv -fPIC -fPIE -gdwarf-4 -Wall (11.4.0)

TIMECOP error (can be valgrind bug)


Process terminating with default action of signal 4 (SIGILL)
 Illegal opcode at address 0x109804
   at 0x...: memcpy (string_fortified.h:29)
   by 0x...: test (try.c:149)
   by 0x...: main (try-anything.c:345)

Number of similar (implementation,compiler) pairs: 1, namely:
Implementation  Compiler
ref             gcc -march=native -mtune=native -O -fwrapv -fPIC -fPIE -gdwarf-4 -Wall (11.4.0)

Passed TIMECOP


TIMECOP iterations: 1

Number of similar (implementation,compiler) pairs: 2, namely:
Implementation  Compiler
ref             clang -mcpu=native -O3 -fwrapv -Qunused-arguments -fPIC -fPIE -gdwarf-4 -Wall (Ubuntu_Clang_14.0.0)
ref             gcc -march=native -mtune=native -Os -fwrapv -fPIC -fPIE -gdwarf-4 -Wall (11.4.0)