Implementation notes: amd64, hydra8, crypto_kem/firesaber2

Computer: hydra8
Microarchitecture: amd64; Ivy Bridge+AES (306a9)
Architecture: amd64
CPU ID: GenuineIntel-000306a9-bfebfbff
SUPERCOP version: 20240625
Operation: crypto_kem
Primitive: firesaber2
Time     Object size  Test size        Implementation  Compiler                                                                        Benchmark date  SUPERCOP version
637189   97905 0 0    115487 876 1728  T:ref           clang_-march=native_-O2_-fwrapv_-Qunused-arguments_-fPIC_-fPIE_-gdwarf-4_-Wall  20240629        20240625
640818   137777 0 0   157335 876 1728  T:ref           clang_-march=native_-O3_-fwrapv_-Qunused-arguments_-fPIC_-fPIE_-gdwarf-4_-Wall  20240629        20240625
677668   121711 0 0   140351 876 1728  T:ref           clang_-mcpu=native_-O3_-fwrapv_-Qunused-arguments_-fPIC_-fPIE_-gdwarf-4_-Wall   20240629        20240625
711221   59548 0 0    77892 820 1760   T:ref           gcc_-march=native_-mtune=native_-O3_-fwrapv_-fPIC_-fPIE_-gdwarf-4_-Wall         20240629        20240625
2285128  14232 0 0    30433 868 1728   T:ref           clang_-march=native_-Os_-fwrapv_-Qunused-arguments_-fPIC_-fPIE_-gdwarf-4_-Wall  20240629        20240625
2352661  17539 0 0    34231 876 1728   T:ref           clang_-march=native_-O_-fwrapv_-Qunused-arguments_-fPIC_-fPIE_-gdwarf-4_-Wall   20240629        20240625
2496020  14101 0 0    31044 820 1760   T:ref           gcc_-march=native_-mtune=native_-O_-fwrapv_-fPIC_-fPIE_-gdwarf-4_-Wall          20240629        20240625
2706390  15808 0 0    33188 820 1760   T:ref           gcc_-march=native_-mtune=native_-O2_-fwrapv_-fPIC_-fPIE_-gdwarf-4_-Wall         20240629        20240625
2952433  13080 0 0    28892 812 1728   T:ref           gcc_-march=native_-mtune=native_-Os_-fwrapv_-fPIC_-fPIE_-gdwarf-4_-Wall         20240629        20240625

Compiler output


SABER_indcpa.c: In file included from SABER_indcpa.c:9:
SABER_indcpa.c: In file included from ././polymul/toom-cook_4way.c:6:
SABER_indcpa.c: ././polymul/scm_avx.c:43:9: error: always_inline function '_mm256_mullo_epi16' requires target feature 'avx2', but would be inlined into function 'schoolbook_avx_new3_acc' that is compiled without support for 'avx2'
SABER_indcpa.c:         temp = _mm256_mullo_epi16 (a0, b1);
SABER_indcpa.c:                ^
SABER_indcpa.c: ././polymul/scm_avx.c:45:13: error: always_inline function '_mm256_add_epi16' requires target feature 'avx2', but would be inlined into function 'schoolbook_avx_new3_acc' that is compiled without support for 'avx2'
SABER_indcpa.c:         c_avx[1] = _mm256_add_epi16(temp, c_avx[1]);
SABER_indcpa.c:                    ^
SABER_indcpa.c: ././polymul/scm_avx.c:48:9: error: always_inline function '_mm256_mullo_epi16' requires target feature 'avx2', but would be inlined into function 'schoolbook_avx_new3_acc' that is compiled without support for 'avx2'
SABER_indcpa.c:         temp = _mm256_mullo_epi16 (a0, b2);
SABER_indcpa.c:                ^
SABER_indcpa.c: ././polymul/scm_avx.c:51:13: error: always_inline function '_mm256_add_epi16' requires target feature 'avx2', but would be inlined into function 'schoolbook_avx_new3_acc' that is compiled without support for 'avx2'
SABER_indcpa.c:         c_avx[2] = _mm256_add_epi16(temp, c_avx[2]);
SABER_indcpa.c:                    ^
SABER_indcpa.c: ././polymul/scm_avx.c:54:9: error: always_inline function '_mm256_mullo_epi16' requires target feature 'avx2', but would be inlined into function 'schoolbook_avx_new3_acc' that is compiled without support for 'avx2'
SABER_indcpa.c:         temp = _mm256_mullo_epi16 (a0, b3);
SABER_indcpa.c:                ^
SABER_indcpa.c: ././polymul/scm_avx.c:58:13: error: always_inline function '_mm256_add_epi16' requires target feature 'avx2', but would be inlined into function 'schoolbook_avx_new3_acc' that is compiled without support for 'avx2'
SABER_indcpa.c:         c_avx[3] = _mm256_add_epi16(temp, c_avx[3]);
SABER_indcpa.c:                    ^
SABER_indcpa.c: ././polymul/scm_avx.c:60:9: error: always_inline function '_mm256_mullo_epi16' requires target feature 'avx2', but would be inlined into function 'schoolbook_avx_new3_acc' that is compiled without support for 'avx2'
SABER_indcpa.c:         temp = _mm256_mullo_epi16 (a0, b4);
SABER_indcpa.c:                ^
SABER_indcpa.c: ././polymul/scm_avx.c:65:13: error: always_inline function '_mm256_add_epi16' requires target feature 'avx2', but would be inlined into function 'schoolbook_avx_new3_acc' that is compiled without support for 'avx2'
SABER_indcpa.c:         c_avx[4] = _mm256_add_epi16(temp, c_avx[4]);
SABER_indcpa.c: ...

Number of similar (implementation,compiler) pairs: 4, namely:
Implementation  Compiler
T:avx2          clang -march=native -O2 -fwrapv -Qunused-arguments -fPIC -fPIE -gdwarf-4 -Wall (Ubuntu_Clang_14.0.0)
T:avx2          clang -march=native -O3 -fwrapv -Qunused-arguments -fPIC -fPIE -gdwarf-4 -Wall (Ubuntu_Clang_14.0.0)
T:avx2          clang -march=native -O -fwrapv -Qunused-arguments -fPIC -fPIE -gdwarf-4 -Wall (Ubuntu_Clang_14.0.0)
T:avx2          clang -march=native -Os -fwrapv -Qunused-arguments -fPIC -fPIE -gdwarf-4 -Wall (Ubuntu_Clang_14.0.0)
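
Note on the errors above: Ivy Bridge (306a9) does not implement AVX2, so -march=native leaves the 'avx2' target feature disabled and clang refuses to inline the AVX2 intrinsics. The following is a minimal sketch, not taken from the benchmarked sources, of how code can select an AVX2 path only when the compiler flags enable it, using the predefined __AVX2__ macro. The helper name mul_add16 and its array-based interface are hypothetical; it mirrors the multiply-then-add pattern seen in the log (_mm256_mullo_epi16 followed by _mm256_add_epi16).

```c
#include <stdint.h>

#ifdef __AVX2__
#include <immintrin.h>
#endif

/* Hypothetical helper (an assumption for illustration, not part of the
 * benchmarked code): computes out[i] = a[i] * b[i] + c[i] modulo 2^16
 * for 16 lanes of 16 bits each. */
static void mul_add16(uint16_t out[16], const uint16_t a[16],
                      const uint16_t b[16], const uint16_t c[16])
{
#ifdef __AVX2__
    /* This branch is only compiled when the flags enable AVX2,
     * so the always_inline intrinsics can be inlined safely. */
    __m256i va = _mm256_loadu_si256((const __m256i *)a);
    __m256i vb = _mm256_loadu_si256((const __m256i *)b);
    __m256i vc = _mm256_loadu_si256((const __m256i *)c);
    __m256i r = _mm256_add_epi16(_mm256_mullo_epi16(va, vb), vc);
    _mm256_storeu_si256((__m256i *)out, r);
#else
    /* Scalar fallback; the uint32_t cast avoids signed overflow
     * in the promoted 16-bit product. */
    for (int i = 0; i < 16; i++)
        out[i] = (uint16_t)((uint32_t)a[i] * b[i] + c[i]);
#endif
}
```

With this kind of guard, a build on a CPU without AVX2 would compile the scalar branch instead of failing. Whether that is appropriate for an implementation explicitly labelled avx2 is a separate design question.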

Compiler output


SABER_indcpa.c: In file included from SABER_indcpa.c:9:
SABER_indcpa.c: In file included from ././polymul/toom-cook_4way.c:6:
SABER_indcpa.c: ././polymul/scm_avx.c:40:13: warning: AVX vector argument of type '__m256i' (vector of 4 'long long' values) without 'avx' enabled changes the ABI [-Wpsabi]
SABER_indcpa.c:         c_avx[0] = mul_add(a0, b0, c_avx[0]);
SABER_indcpa.c:                    ^
SABER_indcpa.c: ././polymul/scm_avx.c:43:9: error: always_inline function '_mm256_mullo_epi16' requires target feature 'avx2', but would be inlined into function 'schoolbook_avx_new3_acc' that is compiled without support for 'avx2'
SABER_indcpa.c:         temp = _mm256_mullo_epi16 (a0, b1);
SABER_indcpa.c:                ^
SABER_indcpa.c: ././polymul/scm_avx.c:43:9: error: AVX vector argument of type '__m256i' (vector of 4 'long long' values) without 'avx' enabled changes the ABI
SABER_indcpa.c: ././polymul/scm_avx.c:44:7: warning: AVX vector argument of type '__m256i' (vector of 4 'long long' values) without 'avx' enabled changes the ABI [-Wpsabi]
SABER_indcpa.c:         temp=mul_add(a1, b0, temp);
SABER_indcpa.c:              ^
SABER_indcpa.c: ././polymul/scm_avx.c:45:13: error: always_inline function '_mm256_add_epi16' requires target feature 'avx2', but would be inlined into function 'schoolbook_avx_new3_acc' that is compiled without support for 'avx2'
SABER_indcpa.c:         c_avx[1] = _mm256_add_epi16(temp, c_avx[1]);
SABER_indcpa.c:                    ^
SABER_indcpa.c: ././polymul/scm_avx.c:45:13: error: AVX vector argument of type '__m256i' (vector of 4 'long long' values) without 'avx' enabled changes the ABI
SABER_indcpa.c: ././polymul/scm_avx.c:48:9: error: always_inline function '_mm256_mullo_epi16' requires target feature 'avx2', but would be inlined into function 'schoolbook_avx_new3_acc' that is compiled without support for 'avx2'
SABER_indcpa.c:         temp = _mm256_mullo_epi16 (a0, b2);
SABER_indcpa.c:                ^
SABER_indcpa.c: ././polymul/scm_avx.c:48:9: error: AVX vector argument of type '__m256i' (vector of 4 'long long' values) without 'avx' enabled changes the ABI
SABER_indcpa.c: ././polymul/scm_avx.c:49:9: warning: AVX vector argument of type '__m256i' (vector of 4 'long long' values) without 'avx' enabled changes the ABI [-Wpsabi]
SABER_indcpa.c:         temp = mul_add(a1, b1, temp);
SABER_indcpa.c:                ^
SABER_indcpa.c: ././polymul/scm_avx.c:50:7: warning: AVX vector argument of type '__m256i' (vector of 4 'long long' values) without 'avx' enabled changes the ABI [-Wpsabi]
SABER_indcpa.c:         temp=mul_add(a2, b0, temp);
SABER_indcpa.c: ...

Number of similar (implementation,compiler) pairs: 1, namely:
Implementation  Compiler
T:avx2          clang -mcpu=native -O3 -fwrapv -Qunused-arguments -fPIC -fPIE -gdwarf-4 -Wall (Ubuntu_Clang_14.0.0)

Compiler output


SABER_indcpa.c: SABER_indcpa.c: In function 'indcpa_kem_enc':
SABER_indcpa.c: SABER_indcpa.c:269:26: warning: unused variable 'CLOCK2' [-Wunused-variable]
SABER_indcpa.c:   269 |         uint64_t CLOCK1, CLOCK2;
SABER_indcpa.c:       |                          ^~~~~~
SABER_indcpa.c: SABER_indcpa.c:269:18: warning: unused variable 'CLOCK1' [-Wunused-variable]
SABER_indcpa.c:   269 |         uint64_t CLOCK1, CLOCK2;
SABER_indcpa.c:       |                  ^~~~~~
SABER_indcpa.c: SABER_indcpa.c: In function 'indcpa_kem_dec':
SABER_indcpa.c: SABER_indcpa.c:436:26: warning: unused variable 'CLOCK2' [-Wunused-variable]
SABER_indcpa.c:   436 |         uint64_t CLOCK1, CLOCK2;
SABER_indcpa.c:       |                          ^~~~~~
SABER_indcpa.c: SABER_indcpa.c:436:18: warning: unused variable 'CLOCK1' [-Wunused-variable]
SABER_indcpa.c:   436 |         uint64_t CLOCK1, CLOCK2;
SABER_indcpa.c:       |                  ^~~~~~
SABER_indcpa.c: In file included from /usr/lib/gcc/x86_64-linux-gnu/11/include/immintrin.h:47,
SABER_indcpa.c:                  from SABER_indcpa.h:4,
SABER_indcpa.c:                  from SABER_indcpa.c:5:
SABER_indcpa.c: ./polymul/scm_avx.c: In function 'mul_add':
SABER_indcpa.c: /usr/lib/gcc/x86_64-linux-gnu/11/include/avx2intrin.h:112:1: error: inlining failed in call to 'always_inline' '_mm256_add_epi16': target specific option mismatch
SABER_indcpa.c:   112 | _mm256_add_epi16 (__m256i __A, __m256i __B)
SABER_indcpa.c:       | ^~~~~~~~~~~~~~~~
SABER_indcpa.c: In file included from ./polymul/toom-cook_4way.c:6,
SABER_indcpa.c:                  from SABER_indcpa.c:9:
SABER_indcpa.c: ./polymul/scm_avx.c:7:12: note: called from here
SABER_indcpa.c:     7 |     return _mm256_add_epi16(_mm256_mullo_epi16(a, b), c);
SABER_indcpa.c: ...

Number of similar (implementation,compiler) pairs: 4, namely:
Implementation  Compiler
T:avx2          gcc -march=native -mtune=native -O2 -fwrapv -fPIC -fPIE -gdwarf-4 -Wall (11.4.0)
T:avx2          gcc -march=native -mtune=native -O3 -fwrapv -fPIC -fPIE -gdwarf-4 -Wall (11.4.0)
T:avx2          gcc -march=native -mtune=native -O -fwrapv -fPIC -fPIE -gdwarf-4 -Wall (11.4.0)
T:avx2          gcc -march=native -mtune=native -Os -fwrapv -fPIC -fPIE -gdwarf-4 -Wall (11.4.0)
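
Note on the gcc errors above: "target specific option mismatch" means an always_inline AVX2 intrinsic is called from a function compiled without AVX2 enabled. The following is a minimal sketch, assuming GCC or clang on x86, of one common way to combine a per-function target attribute with a runtime CPU check. It is not the approach used by the benchmarked code, and the names sub16, sub16_avx2, sub16_scalar and HAVE_X86_DISPATCH are hypothetical.

```c
#include <stdint.h>

#if defined(__GNUC__) && (defined(__x86_64__) || defined(__i386__))
#include <immintrin.h>
#define HAVE_X86_DISPATCH 1

/* The target attribute enables AVX2 for this one function only, so the
 * intrinsics can be inlined even if the command line does not enable AVX2. */
__attribute__((target("avx2")))
static void sub16_avx2(uint16_t out[16], const uint16_t a[16],
                       const uint16_t b[16])
{
    __m256i va = _mm256_loadu_si256((const __m256i *)a);
    __m256i vb = _mm256_loadu_si256((const __m256i *)b);
    _mm256_storeu_si256((__m256i *)out, _mm256_sub_epi16(va, vb));
}
#else
#define HAVE_X86_DISPATCH 0
#endif

/* Portable fallback: lane-wise subtraction modulo 2^16. */
static void sub16_scalar(uint16_t out[16], const uint16_t a[16],
                         const uint16_t b[16])
{
    for (int i = 0; i < 16; i++)
        out[i] = (uint16_t)(a[i] - b[i]);
}

/* Hypothetical dispatcher: takes the AVX2 path only when the running
 * CPU reports AVX2 support, otherwise uses the scalar fallback. */
static void sub16(uint16_t out[16], const uint16_t a[16],
                  const uint16_t b[16])
{
#if HAVE_X86_DISPATCH
    __builtin_cpu_init();
    if (__builtin_cpu_supports("avx2")) {
        sub16_avx2(out, a, b);
        return;
    }
#endif
    sub16_scalar(out, a, b);
}
```

On a machine like hydra8, where the CPU lacks AVX2, a dispatcher of this kind would be expected to take the scalar path at run time.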

Compiler output


poly.c: poly.c:43:10: error: always_inline function '_mm256_sub_epi16' requires target feature 'avx2', but would be inlined into function 'nttmul_poly_crt' that is compiled without support for 'avx2'
poly.c:     f1 = _mm256_sub_epi16(f1,f0);
poly.c:          ^
poly.c: poly.c:45:10: error: always_inline function '_mm256_mullo_epi16' requires target feature 'avx2', but would be inlined into function 'nttmul_poly_crt' that is compiled without support for 'avx2'
poly.c:     f1 = _mm256_mullo_epi16(f1,p0);
poly.c:          ^
poly.c: poly.c:46:10: error: always_inline function '_mm256_add_epi16' requires target feature 'avx2', but would be inlined into function 'nttmul_poly_crt' that is compiled without support for 'avx2'
poly.c:     f0 = _mm256_add_epi16(f0,f1);
poly.c:          ^
poly.c: poly.c:47:10: error: always_inline function '_mm256_and_si256' requires target feature 'avx2', but would be inlined into function 'nttmul_poly_crt' that is compiled without support for 'avx2'
poly.c:     f0 = _mm256_and_si256(f0,mod);
poly.c:          ^
poly.c: 4 errors generated.

Number of similar (implementation,compiler) pairs: 4, namely:
Implementation  Compiler
T:avx2_nttmul   clang -march=native -O2 -fwrapv -Qunused-arguments -fPIC -fPIE -gdwarf-4 -Wall (Ubuntu_Clang_14.0.0)
T:avx2_nttmul   clang -march=native -O3 -fwrapv -Qunused-arguments -fPIC -fPIE -gdwarf-4 -Wall (Ubuntu_Clang_14.0.0)
T:avx2_nttmul   clang -march=native -O -fwrapv -Qunused-arguments -fPIC -fPIE -gdwarf-4 -Wall (Ubuntu_Clang_14.0.0)
T:avx2_nttmul   clang -march=native -Os -fwrapv -Qunused-arguments -fPIC -fPIE -gdwarf-4 -Wall (Ubuntu_Clang_14.0.0)

Compiler output


poly.c: poly.c:31:26: error: always_inline function '_mm256_set1_epi16' requires target feature 'avx', but would be inlined into function 'nttmul_poly_crt' that is compiled without support for 'avx'
poly.c:   const __m256i u_pinv = _mm256_set1_epi16(CRT_U_PINV);
poly.c:                          ^
poly.c: poly.c:31:26: error: AVX vector return of type '__m256i' (vector of 4 'long long' values) without 'avx' enabled changes the ABI
poly.c: poly.c:32:21: error: always_inline function '_mm256_set1_epi16' requires target feature 'avx', but would be inlined into function 'nttmul_poly_crt' that is compiled without support for 'avx'
poly.c:   const __m256i u = _mm256_set1_epi16(CRT_U);
poly.c:                     ^
poly.c: poly.c:32:21: error: AVX vector return of type '__m256i' (vector of 4 'long long' values) without 'avx' enabled changes the ABI
poly.c: poly.c:33:22: error: always_inline function '_mm256_load_si256' requires target feature 'avx', but would be inlined into function 'nttmul_poly_crt' that is compiled without support for 'avx'
poly.c:   const __m256i p0 = _mm256_load_si256((__m256i *)&PDATA0[_16XP]);
poly.c:                      ^
poly.c: poly.c:33:22: error: AVX vector return of type '__m256i' (vector of 4 'long long' values) without 'avx' enabled changes the ABI
poly.c: poly.c:34:22: error: always_inline function '_mm256_load_si256' requires target feature 'avx', but would be inlined into function 'nttmul_poly_crt' that is compiled without support for 'avx'
poly.c:   const __m256i p1 = _mm256_load_si256((__m256i *)&PDATA1[_16XP]);
poly.c:                      ^
poly.c: poly.c:34:22: error: AVX vector return of type '__m256i' (vector of 4 'long long' values) without 'avx' enabled changes the ABI
poly.c: poly.c:35:23: error: always_inline function '_mm256_set1_epi16' requires target feature 'avx', but would be inlined into function 'nttmul_poly_crt' that is compiled without support for 'avx'
poly.c:   const __m256i mod = _mm256_set1_epi16(KEM_Q-1);
poly.c:                       ^
poly.c: poly.c:35:23: error: AVX vector return of type '__m256i' (vector of 4 'long long' values) without 'avx' enabled changes the ABI
poly.c: poly.c:36:30: error: always_inline function '_mm256_load_si256' requires target feature 'avx', but would be inlined into function 'nttmul_poly_crt' that is compiled without support for 'avx'
poly.c:   const __m256i mont0_pinv = _mm256_load_si256((__m256i *)&PDATA0[_16XMONT_PINV]);
poly.c:                              ^
poly.c: poly.c:36:30: error: AVX vector return of type '__m256i' (vector of 4 'long long' values) without 'avx' enabled changes the ABI
poly.c: poly.c:37:25: error: always_inline function '_mm256_load_si256' requires target feature 'avx', but would be inlined into function 'nttmul_poly_crt' that is compiled without support for 'avx'
poly.c: ...

Number of similar (implementation,compiler) pairs: 1, namely:
Implementation  Compiler
T:avx2_nttmul   clang -mcpu=native -O3 -fwrapv -Qunused-arguments -fPIC -fPIE -gdwarf-4 -Wall (Ubuntu_Clang_14.0.0)

Compiler output


poly.c: In file included from /usr/lib/gcc/x86_64-linux-gnu/11/include/immintrin.h:47,
poly.c:                  from poly.c:3:
poly.c: poly.c: In function 'mulmod':
poly.c: /usr/lib/gcc/x86_64-linux-gnu/11/include/avx2intrin.h:808:1: error: inlining failed in call to 'always_inline' '_mm256_sub_epi16': target specific option mismatch
poly.c:   808 | _mm256_sub_epi16 (__m256i __A, __m256i __B)
poly.c:       | ^~~~~~~~~~~~~~~~
poly.c: poly.c:12:7: note: called from here
poly.c:    12 |   t = _mm256_sub_epi16(u,t);
poly.c:       |       ^~~~~~~~~~~~~~~~~~~~~
poly.c: In file included from /usr/lib/gcc/x86_64-linux-gnu/11/include/immintrin.h:47,
poly.c:                  from poly.c:3:
poly.c: /usr/lib/gcc/x86_64-linux-gnu/11/include/avx2intrin.h:546:1: error: inlining failed in call to 'always_inline' '_mm256_mulhi_epi16': target specific option mismatch
poly.c:   546 | _mm256_mulhi_epi16 (__m256i __A, __m256i __B)
poly.c:       | ^~~~~~~~~~~~~~~~~~
poly.c: poly.c:11:7: note: called from here
poly.c:    11 |   t = _mm256_mulhi_epi16(t,p);
poly.c:       |       ^~~~~~~~~~~~~~~~~~~~~~~
poly.c: In file included from /usr/lib/gcc/x86_64-linux-gnu/11/include/immintrin.h:47,
poly.c:                  from poly.c:3:
poly.c: /usr/lib/gcc/x86_64-linux-gnu/11/include/avx2intrin.h:546:1: error: inlining failed in call to 'always_inline' '_mm256_mulhi_epi16': target specific option mismatch
poly.c:   546 | _mm256_mulhi_epi16 (__m256i __A, __m256i __B)
poly.c:       | ^~~~~~~~~~~~~~~~~~
poly.c: poly.c:10:7: note: called from here
poly.c:    10 |   u = _mm256_mulhi_epi16(a,b);
poly.c:       |       ^~~~~~~~~~~~~~~~~~~~~~~
poly.c: ...

Number of similar (implementation,compiler) pairs: 4, namely:
Implementation  Compiler
T:avx2_nttmul   gcc -march=native -mtune=native -O2 -fwrapv -fPIC -fPIE -gdwarf-4 -Wall (11.4.0)
T:avx2_nttmul   gcc -march=native -mtune=native -O3 -fwrapv -fPIC -fPIE -gdwarf-4 -Wall (11.4.0)
T:avx2_nttmul   gcc -march=native -mtune=native -O -fwrapv -fPIC -fPIE -gdwarf-4 -Wall (11.4.0)
T:avx2_nttmul   gcc -march=native -mtune=native -Os -fwrapv -fPIC -fPIE -gdwarf-4 -Wall (11.4.0)