Implementation notes: amd64, saber214, crypto_encrypt/lotus192

Computer: saber214
Microarchitecture: amd64; Bulldozer (600f20)
Architecture: amd64
CPU ID: AuthenticAMD-00600f20-1789c3f5
SUPERCOP version: 20240808
Operation: crypto_encrypt
Primitive: lotus192
Time      Object size / Test size   Implementation  Compiler                                                                              Benchmark date  SUPERCOP version
1693096   43630 0 862819 912 1632   T:opt           gcc_-march=native_-mtune=native_-O3_-fwrapv_-fPIC_-fPIE_-gdwarf-4_-Wall               20240729        20240716
2529636   29768 0 848915 912 1632   T:ref           gcc_-march=native_-mtune=native_-O3_-fwrapv_-fPIC_-fPIE_-gdwarf-4_-Wall               20240729        20240716
3778183   40913 0 859290 936 1568   T:opt           clang_-march=native_-O2_-fwrapv_-Qunused-arguments_-fPIC_-fPIE_-gdwarf-4_-Wall        20240729        20240716
3822933   47220 0 866698 936 1568   T:opt           clang_-mcpu=native_-O3_-fwrapv_-Qunused-arguments_-fPIC_-fPIE_-gdwarf-4_-Wall         20240729        20240716
3913142   50777 0 871514 936 1600   T:opt           clang_-march=native_-O3_-fwrapv_-Qunused-arguments_-fPIC_-fPIE_-gdwarf-4_-Wall        20240729        20240716
4643507   25127 0 843458 936 1568   T:ref           clang_-march=native_-O2_-fwrapv_-Qunused-arguments_-fPIC_-fPIE_-gdwarf-4_-Wall        20240729        20240716
4715913   29911 0 850634 936 1600   T:ref           clang_-march=native_-O3_-fwrapv_-Qunused-arguments_-fPIC_-fPIE_-gdwarf-4_-Wall        20240729        20240716
4740363   29554 0 848994 936 1568   T:ref           clang_-mcpu=native_-O3_-fwrapv_-Qunused-arguments_-fPIC_-fPIE_-gdwarf-4_-Wall         20240729        20240716
7204965   13085 0 829820 928 1568   T:opt           clang_-march=native_-Os_-fwrapv_-Qunused-arguments_-fPIC_-fPIE_-gdwarf-4_-Wall        20240729        20240716
10221257  13926 0 831731 912 1632   T:opt           gcc_-march=native_-mtune=native_-O2_-fwrapv_-fPIC_-fPIE_-gdwarf-4_-Wall               20240729        20240716
10491347  15230 0 832386 936 1568   T:opt           clang_-march=native_-O_-fwrapv_-Qunused-arguments_-fPIC_-fPIE_-gdwarf-4_-Wall         20240729        20240716
10790270  9739 0 826802 936 1568    T:ref           clang_-march=native_-O_-fwrapv_-Qunused-arguments_-fPIC_-fPIE_-gdwarf-4_-Wall         20240729        20240716
11121391  8752 0 825428 928 1568    T:ref           clang_-march=native_-Os_-fwrapv_-Qunused-arguments_-fPIC_-fPIE_-gdwarf-4_-Wall        20240729        20240716
11536457  10231 0 827947 912 1632   T:ref           gcc_-march=native_-mtune=native_-O2_-fwrapv_-fPIC_-fPIE_-gdwarf-4_-Wall               20240729        20240716
11956323  12103 0 829603 912 1632   T:opt           gcc_-march=native_-mtune=native_-O_-fwrapv_-fPIC_-fPIE_-gdwarf-4_-Wall                20240729        20240716
12612454  9455 0 826915 912 1632    T:ref           gcc_-march=native_-mtune=native_-O_-fwrapv_-fPIC_-fPIE_-gdwarf-4_-Wall                20240729        20240716
14441089  11365 0 827859 904 1632   T:opt           gcc_-march=native_-mtune=native_-Os_-fwrapv_-fPIC_-fPIE_-gdwarf-4_-Wall               20240729        20240716
14852700  8834 0 825275 904 1632    T:ref           gcc_-march=native_-mtune=native_-Os_-fwrapv_-fPIC_-fPIE_-gdwarf-4_-Wall               20240729        20240716

Compiler output


lwe-arithmetics_avx2.c: lwe-arithmetics_avx2.c:173:14: error: always_inline function '_mm256_mullo_epi16' requires target feature 'avx2', but would be inlined into function 'submat_negmul' that is compiled without support for 'avx2'
lwe-arithmetics_avx2.c:       b[0] = _mm256_mullo_epi16(b[0], a);
lwe-arithmetics_avx2.c:              ^
lwe-arithmetics_avx2.c: lwe-arithmetics_avx2.c:174:14: error: always_inline function '_mm256_mullo_epi16' requires target feature 'avx2', but would be inlined into function 'submat_negmul' that is compiled without support for 'avx2'
lwe-arithmetics_avx2.c:       b[1] = _mm256_mullo_epi16(b[1], a);
lwe-arithmetics_avx2.c:              ^
lwe-arithmetics_avx2.c: lwe-arithmetics_avx2.c:175:14: error: always_inline function '_mm256_mullo_epi16' requires target feature 'avx2', but would be inlined into function 'submat_negmul' that is compiled without support for 'avx2'
lwe-arithmetics_avx2.c:       b[2] = _mm256_mullo_epi16(b[2], a);
lwe-arithmetics_avx2.c:              ^
lwe-arithmetics_avx2.c: lwe-arithmetics_avx2.c:176:14: error: always_inline function '_mm256_mullo_epi16' requires target feature 'avx2', but would be inlined into function 'submat_negmul' that is compiled without support for 'avx2'
lwe-arithmetics_avx2.c:       b[3] = _mm256_mullo_epi16(b[3], a);
lwe-arithmetics_avx2.c:              ^
lwe-arithmetics_avx2.c: lwe-arithmetics_avx2.c:177:14: error: always_inline function '_mm256_mullo_epi16' requires target feature 'avx2', but would be inlined into function 'submat_negmul' that is compiled without support for 'avx2'
lwe-arithmetics_avx2.c:       b[4] = _mm256_mullo_epi16(b[4], a);
lwe-arithmetics_avx2.c:              ^
lwe-arithmetics_avx2.c: lwe-arithmetics_avx2.c:178:14: error: always_inline function '_mm256_mullo_epi16' requires target feature 'avx2', but would be inlined into function 'submat_negmul' that is compiled without support for 'avx2'
lwe-arithmetics_avx2.c:       b[5] = _mm256_mullo_epi16(b[5], a);
lwe-arithmetics_avx2.c:              ^
lwe-arithmetics_avx2.c: lwe-arithmetics_avx2.c:179:14: error: always_inline function '_mm256_sub_epi16' requires target feature 'avx2', but would be inlined into function 'submat_negmul' that is compiled without support for 'avx2'
lwe-arithmetics_avx2.c:       c[0] = _mm256_sub_epi16(c[0], b[0]);
lwe-arithmetics_avx2.c:              ^
lwe-arithmetics_avx2.c: lwe-arithmetics_avx2.c:180:14: error: always_inline function '_mm256_sub_epi16' requires target feature 'avx2', but would be inlined into function 'submat_negmul' that is compiled without support for 'avx2'
lwe-arithmetics_avx2.c:       c[1] = _mm256_sub_epi16(c[1], b[1]);
lwe-arithmetics_avx2.c:              ^
lwe-arithmetics_avx2.c: lwe-arithmetics_avx2.c:181:14: error: always_inline function '_mm256_sub_epi16' requires target feature 'avx2', but would be inlined into function 'submat_negmul' that is compiled without support for 'avx2'
lwe-arithmetics_avx2.c: ...

Number of similar (implementation,compiler) pairs: 4, namely:
Implementation  Compiler
T:avx2          clang -march=native -O2 -fwrapv -Qunused-arguments -fPIC -fPIE -gdwarf-4 -Wall (Ubuntu_Clang_14.0.0)
T:avx2          clang -march=native -O3 -fwrapv -Qunused-arguments -fPIC -fPIE -gdwarf-4 -Wall (Ubuntu_Clang_14.0.0)
T:avx2          clang -march=native -O -fwrapv -Qunused-arguments -fPIC -fPIE -gdwarf-4 -Wall (Ubuntu_Clang_14.0.0)
T:avx2          clang -march=native -Os -fwrapv -Qunused-arguments -fPIC -fPIE -gdwarf-4 -Wall (Ubuntu_Clang_14.0.0)
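
The diagnostics above are what one would expect when AVX2 intrinsics are compiled for a CPU without AVX2: the Bulldozer core named in the header does not implement AVX2, so -march=native leaves that feature disabled. As a minimal sketch of one common way to keep such a file compilable everywhere (not taken from the LOTUS sources), the intrinsic path can be guarded by the predefined macro `__AVX2__` with a portable fallback. The function name `mulsub_u16` and its signature are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#if defined(__AVX2__)
#include <immintrin.h>
#endif

/* Hypothetical helper, not from the LOTUS sources:
   c[i] -= a * b[i] on 16-bit lanes, wrapping modulo 2^16. */
void mulsub_u16(uint16_t *c, const uint16_t *b, uint16_t a, size_t n)
{
    size_t i = 0;
    assert(c != NULL && b != NULL);
#if defined(__AVX2__)
    /* Compiled only when the translation unit targets AVX2. */
    const __m256i va = _mm256_set1_epi16((short)a);
    for (; i + 16 <= n; i += 16) {
        __m256i vb = _mm256_loadu_si256((const __m256i *)(b + i));
        __m256i vc = _mm256_loadu_si256((const __m256i *)(c + i));
        vb = _mm256_mullo_epi16(vb, va);   /* low 16 bits of a * b[i] */
        vc = _mm256_sub_epi16(vc, vb);     /* c[i] - a * b[i] */
        _mm256_storeu_si256((__m256i *)(c + i), vc);
    }
#endif
    /* Portable path: the whole array without AVX2, the tail with it. */
    for (; i < n; i++)
        c[i] = (uint16_t)(c[i] - (uint16_t)((uint32_t)a * b[i]));
}
```

With this guard the same source builds with or without AVX2; whether that is desirable for a benchmark that deliberately separates implementations by instruction set is a design choice this report does not settle.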

Compiler output


lwe-arithmetics_avx2.c: lwe-arithmetics_avx2.c:158:12: error: always_inline function '_mm256_setzero_si256' requires target feature 'avx', but would be inlined into function 'submat_negmul' that is compiled without support for 'avx'
lwe-arithmetics_avx2.c:     c[0] = _mm256_setzero_si256();
lwe-arithmetics_avx2.c:            ^
lwe-arithmetics_avx2.c: lwe-arithmetics_avx2.c:158:12: error: AVX vector return of type '__m256i' (vector of 4 'long long' values) without 'avx' enabled changes the ABI
lwe-arithmetics_avx2.c: lwe-arithmetics_avx2.c:159:12: error: always_inline function '_mm256_setzero_si256' requires target feature 'avx', but would be inlined into function 'submat_negmul' that is compiled without support for 'avx'
lwe-arithmetics_avx2.c:     c[1] = _mm256_setzero_si256();
lwe-arithmetics_avx2.c:            ^
lwe-arithmetics_avx2.c: lwe-arithmetics_avx2.c:159:12: error: AVX vector return of type '__m256i' (vector of 4 'long long' values) without 'avx' enabled changes the ABI
lwe-arithmetics_avx2.c: lwe-arithmetics_avx2.c:160:12: error: always_inline function '_mm256_setzero_si256' requires target feature 'avx', but would be inlined into function 'submat_negmul' that is compiled without support for 'avx'
lwe-arithmetics_avx2.c:     c[2] = _mm256_setzero_si256();
lwe-arithmetics_avx2.c:            ^
lwe-arithmetics_avx2.c: lwe-arithmetics_avx2.c:160:12: error: AVX vector return of type '__m256i' (vector of 4 'long long' values) without 'avx' enabled changes the ABI
lwe-arithmetics_avx2.c: lwe-arithmetics_avx2.c:161:12: error: always_inline function '_mm256_setzero_si256' requires target feature 'avx', but would be inlined into function 'submat_negmul' that is compiled without support for 'avx'
lwe-arithmetics_avx2.c:     c[3] = _mm256_setzero_si256();
lwe-arithmetics_avx2.c:            ^
lwe-arithmetics_avx2.c: lwe-arithmetics_avx2.c:161:12: error: AVX vector return of type '__m256i' (vector of 4 'long long' values) without 'avx' enabled changes the ABI
lwe-arithmetics_avx2.c: lwe-arithmetics_avx2.c:162:12: error: always_inline function '_mm256_setzero_si256' requires target feature 'avx', but would be inlined into function 'submat_negmul' that is compiled without support for 'avx'
lwe-arithmetics_avx2.c:     c[4] = _mm256_setzero_si256();
lwe-arithmetics_avx2.c:            ^
lwe-arithmetics_avx2.c: lwe-arithmetics_avx2.c:162:12: error: AVX vector return of type '__m256i' (vector of 4 'long long' values) without 'avx' enabled changes the ABI
lwe-arithmetics_avx2.c: lwe-arithmetics_avx2.c:163:12: error: always_inline function '_mm256_setzero_si256' requires target feature 'avx', but would be inlined into function 'submat_negmul' that is compiled without support for 'avx'
lwe-arithmetics_avx2.c:     c[5] = _mm256_setzero_si256();
lwe-arithmetics_avx2.c:            ^
lwe-arithmetics_avx2.c: lwe-arithmetics_avx2.c:163:12: error: AVX vector return of type '__m256i' (vector of 4 'long long' values) without 'avx' enabled changes the ABI
lwe-arithmetics_avx2.c: lwe-arithmetics_avx2.c:166:11: error: always_inline function '_mm256_set1_epi16' requires target feature 'avx', but would be inlined into function 'submat_negmul' that is compiled without support for 'avx'
lwe-arithmetics_avx2.c: ...

Number of similar (implementation,compiler) pairs: 1, namely:
Implementation  Compiler
T:avx2          clang -mcpu=native -O3 -fwrapv -Qunused-arguments -fPIC -fPIE -gdwarf-4 -Wall (Ubuntu_Clang_14.0.0)
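
In this variant the diagnostics name 'avx' rather than 'avx2', which suggests the file was compiled for a baseline x86-64 target. One plausible reading, which the log itself does not confirm, is that -mcpu=native did not select the host CPU with this clang on amd64. A small sketch for checking which SIMD level a given set of flags actually enables, using the predefined macros `__AVX2__`, `__AVX__` and `__SSE2__`; the enum and function names are hypothetical.

```c
#include <assert.h>

/* Hypothetical names, for illustration only. */
enum simd_level { SIMD_BASELINE = 0, SIMD_SSE2 = 1, SIMD_AVX = 2, SIMD_AVX2 = 3 };

/* Reports the highest level this translation unit was compiled for.
   The answer is fixed at compile time by the compiler flags. */
int compiled_simd_level(void)
{
#if defined(__AVX2__)
    return SIMD_AVX2;
#elif defined(__AVX__)
    return SIMD_AVX;
#elif defined(__SSE2__)
    return SIMD_SSE2;
#else
    return SIMD_BASELINE;
#endif
}

/* Maps a level to a printable name. */
const char *simd_level_name(int level)
{
    static const char *const names[] = { "baseline", "sse2", "avx", "avx2" };
    assert(level >= SIMD_BASELINE && level <= SIMD_AVX2);
    return names[level];
}
```

Building this once per flag set and printing `simd_level_name(compiled_simd_level())` would show whether two flag sets differ in the features they enable.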

Compiler output


lwe-arithmetics_avx2.c: In file included from /usr/lib/gcc/x86_64-linux-gnu/11/include/immintrin.h:47,
lwe-arithmetics_avx2.c:                  from lwe-arithmetics_avx2.c:10:
lwe-arithmetics_avx2.c: lwe-arithmetics_avx2.c: In function 'submat_negmul':
lwe-arithmetics_avx2.c: /usr/lib/gcc/x86_64-linux-gnu/11/include/avx2intrin.h:808:1: error: inlining failed in call to 'always_inline' '_mm256_sub_epi16': target specific option mismatch
lwe-arithmetics_avx2.c:   808 | _mm256_sub_epi16 (__m256i __A, __m256i __B)
lwe-arithmetics_avx2.c:       | ^~~~~~~~~~~~~~~~
lwe-arithmetics_avx2.c: lwe-arithmetics_avx2.c:184:14: note: called from here
lwe-arithmetics_avx2.c:   184 |       c[5] = _mm256_sub_epi16(c[5], b[5]);
lwe-arithmetics_avx2.c:       |              ^~~~~~~~~~~~~~~~~~~~~~~~~~~~
lwe-arithmetics_avx2.c: In file included from /usr/lib/gcc/x86_64-linux-gnu/11/include/immintrin.h:47,
lwe-arithmetics_avx2.c:                  from lwe-arithmetics_avx2.c:10:
lwe-arithmetics_avx2.c: /usr/lib/gcc/x86_64-linux-gnu/11/include/avx2intrin.h:808:1: error: inlining failed in call to 'always_inline' '_mm256_sub_epi16': target specific option mismatch
lwe-arithmetics_avx2.c:   808 | _mm256_sub_epi16 (__m256i __A, __m256i __B)
lwe-arithmetics_avx2.c:       | ^~~~~~~~~~~~~~~~
lwe-arithmetics_avx2.c: lwe-arithmetics_avx2.c:183:14: note: called from here
lwe-arithmetics_avx2.c:   183 |       c[4] = _mm256_sub_epi16(c[4], b[4]);
lwe-arithmetics_avx2.c:       |              ^~~~~~~~~~~~~~~~~~~~~~~~~~~~
lwe-arithmetics_avx2.c: In file included from /usr/lib/gcc/x86_64-linux-gnu/11/include/immintrin.h:47,
lwe-arithmetics_avx2.c:                  from lwe-arithmetics_avx2.c:10:
lwe-arithmetics_avx2.c: /usr/lib/gcc/x86_64-linux-gnu/11/include/avx2intrin.h:808:1: error: inlining failed in call to 'always_inline' '_mm256_sub_epi16': target specific option mismatch
lwe-arithmetics_avx2.c:   808 | _mm256_sub_epi16 (__m256i __A, __m256i __B)
lwe-arithmetics_avx2.c:       | ^~~~~~~~~~~~~~~~
lwe-arithmetics_avx2.c: lwe-arithmetics_avx2.c:182:14: note: called from here
lwe-arithmetics_avx2.c:   182 |       c[3] = _mm256_sub_epi16(c[3], b[3]);
lwe-arithmetics_avx2.c:       |              ^~~~~~~~~~~~~~~~~~~~~~~~~~~~
lwe-arithmetics_avx2.c: ...

Number of similar (implementation,compiler) pairs: 4, namely:
Implementation  Compiler
T:avx2          gcc -march=native -mtune=native -O2 -fwrapv -fPIC -fPIE -gdwarf-4 -Wall (11.4.0)
T:avx2          gcc -march=native -mtune=native -O3 -fwrapv -fPIC -fPIE -gdwarf-4 -Wall (11.4.0)
T:avx2          gcc -march=native -mtune=native -O -fwrapv -fPIC -fPIE -gdwarf-4 -Wall (11.4.0)
T:avx2          gcc -march=native -mtune=native -Os -fwrapv -fPIC -fPIE -gdwarf-4 -Wall (11.4.0)
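
gcc reports the same problem as "target specific option mismatch": an always_inline AVX2 intrinsic cannot be inlined into a function compiled without AVX2. A general technique for this situation, sketched here under the assumption of gcc or clang on x86 and not drawn from the LOTUS sources, is to enable AVX2 for a single function with `__attribute__((target("avx2")))` and to choose the path at run time with `__builtin_cpu_supports`. All function names below are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#if defined(__x86_64__) || defined(__i386__)
#include <immintrin.h>
#define HAVE_X86_DISPATCH 1
#else
#define HAVE_X86_DISPATCH 0
#endif

/* Portable reference path. */
static void sub_u16_scalar(uint16_t *c, const uint16_t *b, size_t n)
{
    for (size_t i = 0; i < n; i++)
        c[i] = (uint16_t)(c[i] - b[i]);
}

#if HAVE_X86_DISPATCH
/* AVX2 is enabled for this one function only, so the intrinsics can be
   inlined here even if the rest of the file is built without AVX2. */
__attribute__((target("avx2")))
static void sub_u16_avx2(uint16_t *c, const uint16_t *b, size_t n)
{
    size_t i = 0;
    for (; i + 16 <= n; i += 16) {
        __m256i vc = _mm256_loadu_si256((const __m256i *)(c + i));
        __m256i vb = _mm256_loadu_si256((const __m256i *)(b + i));
        _mm256_storeu_si256((__m256i *)(c + i), _mm256_sub_epi16(vc, vb));
    }
    for (; i < n; i++)              /* tail shorter than one vector */
        c[i] = (uint16_t)(c[i] - b[i]);
}
#endif

/* c[i] -= b[i], using AVX2 only when the running CPU reports it. */
void sub_u16(uint16_t *c, const uint16_t *b, size_t n)
{
    assert(c != NULL && b != NULL);
#if HAVE_X86_DISPATCH
    if (__builtin_cpu_supports("avx2")) {
        sub_u16_avx2(c, b, n);
        return;
    }
#endif
    sub_u16_scalar(c, b, n);
}
```

On a machine without AVX2, such as the one benchmarked here, this would compile and take the scalar path. A benchmark that wants each implementation to fail cleanly on unsupported hardware may prefer the behaviour recorded above.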

Namespace violations


cpa-pke_opt.o lotus_cpa_pke_dec_packed T
cpa-pke_opt.o lotus_cpa_pke_enc_packed T
cpa-pke_opt.o lotus_cpa_pke_keypair T
crypto.o crypto_symenc_decrypt T
crypto.o crypto_symenc_encrypt T
crypto.o crypto_symenc_keysetup T
crypto.o crypto_symenc_keystream T
crypto.o crypto_symenc_keystream_13block T
crypto.o crypto_symenc_keystream_32block T
encrypt.o util_cmp_const T
lwe-arithmetics_opt.o add_sigma T
lwe-arithmetics_opt.o distribute_2x2_nl T
lwe-arithmetics_opt.o distribute_2x2_nn T
lwe-arithmetics_opt.o merge_2x2_nl T
lwe-arithmetics_opt.o reconstruct T
lwe-arithmetics_opt.o redc T
lwe-arithmetics_opt.o submat_add_nl T
lwe-arithmetics_opt.o submat_add_nn T
lwe-arithmetics_opt.o submat_negmul T
lwe-arithmetics_opt.o submat_negsubmul T
lwe-arithmetics_opt.o submat_sub_nl T
lwe-arithmetics_opt.o submat_sub_nn T
lwe-arithmetics_opt.o submat_submul T
lwe-arithmetics_opt.o submul T
pack.o pack_128dg T
pack.o pack_128elems T
pack.o pack_64elems T
pack.o pack_ct T
pack.o pack_pk T
pack.o pack_sk T
pack.o unpack_128dg T
pack.o unpack_128elems T
pack.o unpack_64elems T
pack.o unpack_ct T
pack.o unpack_pk T
pack.o unpack_sk T
sampler.o _LOTUS_KYDG_SAMPLER_L1_pMat R
sampler.o _LOTUS_KYDG_SAMPLER_L1_weight R
sampler.o _LOTUS_KYDG_SAMPLER_LUT R
sampler.o csprng_sample_bit T
sampler.o csprng_sample_byte T
sampler.o extend_sign_with_random_bit T
sampler.o sample_discrete_gaussian T
sampler.o sample_uniform T
sampler.o sample_unit_discrete_gaussian T
sampler.o sampler_init T
sampler.o sampler_set_seed T
sampler.o scan_bit_and_output T

Number of similar (implementation,compiler) pairs: 9, namely:
Implementation  Compiler
T:opt           clang -march=native -O2 -fwrapv -Qunused-arguments -fPIC -fPIE -gdwarf-4 -Wall (Ubuntu_Clang_14.0.0)
T:opt           clang -march=native -O3 -fwrapv -Qunused-arguments -fPIC -fPIE -gdwarf-4 -Wall (Ubuntu_Clang_14.0.0)
T:opt           clang -march=native -O -fwrapv -Qunused-arguments -fPIC -fPIE -gdwarf-4 -Wall (Ubuntu_Clang_14.0.0)
T:opt           clang -march=native -Os -fwrapv -Qunused-arguments -fPIC -fPIE -gdwarf-4 -Wall (Ubuntu_Clang_14.0.0)
T:opt           clang -mcpu=native -O3 -fwrapv -Qunused-arguments -fPIC -fPIE -gdwarf-4 -Wall (Ubuntu_Clang_14.0.0)
T:opt           gcc -march=native -mtune=native -O2 -fwrapv -fPIC -fPIE -gdwarf-4 -Wall (11.4.0)
T:opt           gcc -march=native -mtune=native -O3 -fwrapv -fPIC -fPIE -gdwarf-4 -Wall (11.4.0)
T:opt           gcc -march=native -mtune=native -O -fwrapv -fPIC -fPIE -gdwarf-4 -Wall (11.4.0)
T:opt           gcc -march=native -mtune=native -Os -fwrapv -fPIC -fPIE -gdwarf-4 -Wall (11.4.0)

Namespace violations


cpa-pke.o lotus_cpa_pke_dec T
cpa-pke.o lotus_cpa_pke_enc T
cpa-pke.o lotus_cpa_pke_keypair T
crypto.o crypto_symenc_decrypt T
crypto.o crypto_symenc_encrypt T
crypto.o crypto_symenc_keysetup T
crypto.o crypto_symenc_keystream T
encrypt.o util_cmp_const T
lwe-arithmetics.o add_sigma T
lwe-arithmetics.o addmul T
lwe-arithmetics.o addmul_concat T
lwe-arithmetics.o reconstruct T
lwe-arithmetics.o redc T
lwe-arithmetics.o submul T
pack.o pack_128dg T
pack.o pack_128elems T
pack.o pack_64elems T
pack.o pack_ct T
pack.o pack_pk T
pack.o pack_sk T
pack.o unpack_128dg T
pack.o unpack_128elems T
pack.o unpack_64elems T
pack.o unpack_ct T
pack.o unpack_pk T
pack.o unpack_sk T
sampler.o _LOTUS_KYDG_SAMPLER_L1_pMat R
sampler.o _LOTUS_KYDG_SAMPLER_L1_weight R
sampler.o _LOTUS_KYDG_SAMPLER_LUT R
sampler.o csprng_sample_bit T
sampler.o csprng_sample_byte T
sampler.o extend_sign_with_random_bit T
sampler.o sample_discrete_gaussian T
sampler.o sample_uniform T
sampler.o sample_unit_discrete_gaussian T
sampler.o sampler_init T
sampler.o sampler_set_seed T
sampler.o scan_bit_and_output T

Number of similar (implementation,compiler) pairs: 9, namely:
Implementation  Compiler
T:ref           clang -march=native -O2 -fwrapv -Qunused-arguments -fPIC -fPIE -gdwarf-4 -Wall (Ubuntu_Clang_14.0.0)
T:ref           clang -march=native -O3 -fwrapv -Qunused-arguments -fPIC -fPIE -gdwarf-4 -Wall (Ubuntu_Clang_14.0.0)
T:ref           clang -march=native -O -fwrapv -Qunused-arguments -fPIC -fPIE -gdwarf-4 -Wall (Ubuntu_Clang_14.0.0)
T:ref           clang -march=native -Os -fwrapv -Qunused-arguments -fPIC -fPIE -gdwarf-4 -Wall (Ubuntu_Clang_14.0.0)
T:ref           clang -mcpu=native -O3 -fwrapv -Qunused-arguments -fPIC -fPIE -gdwarf-4 -Wall (Ubuntu_Clang_14.0.0)
T:ref           gcc -march=native -mtune=native -O2 -fwrapv -fPIC -fPIE -gdwarf-4 -Wall (11.4.0)
T:ref           gcc -march=native -mtune=native -O3 -fwrapv -fPIC -fPIE -gdwarf-4 -Wall (11.4.0)
T:ref           gcc -march=native -mtune=native -O -fwrapv -fPIC -fPIE -gdwarf-4 -Wall (11.4.0)
T:ref           gcc -march=native -mtune=native -Os -fwrapv -fPIC -fPIE -gdwarf-4 -Wall (11.4.0)
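
The namespace violations above list global symbols (T: code, R: read-only data) whose names, such as redc or pack_pk, carry no primitive-specific prefix and could therefore collide with other code linked into the same program. A minimal sketch of two usual remedies follows: renaming exported functions through a prefix macro, and giving internal data internal linkage with `static`. The macro `LOTUS192_NAMESPACE`, the prefix it produces, and the body of `redc` are assumptions for illustration; the report shows neither the exact prefix SUPERCOP expects nor what the real redc computes.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical prefix macro; the prefix itself is an assumption. */
#define LOTUS192_NAMESPACE(name) crypto_encrypt_lotus192_ref_##name

/* Callers keep writing redc(); the object file exports the prefixed name. */
#define redc LOTUS192_NAMESPACE(redc)

/* Internal constant: static gives it internal linkage, so it does not
   appear as a global symbol at all. */
static const uint32_t modulus_mask = 0x1fffu;

/* Placeholder body (keeps the low 13 bits); the real redc is not shown
   in this report. */
uint16_t redc(uint32_t x)
{
    return (uint16_t)(x & modulus_mask);
}
```

After this change, a symbol listing of the object file would be expected to show only crypto_encrypt_lotus192_ref_redc as a global symbol, while existing call sites compile unchanged.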