Test results for amd64, hydra8, crypto_sign/haetae3

[Page version: 20260312 23:03:09]

Computer: hydra8
Microarchitecture: amd64; Ivy Bridge+AES (306a9)
Architecture: amd64
CPU ID: GenuineIntel-000306a9-bfebfbff
SUPERCOP version: 20260217
Operation: crypto_sign
Primitive: haetae3

Time	Object size	Test size	Implementation	Compiler	Benchmark date	SUPERCOP version
2448687	142287 0 0	172633 1288 2312	`ref`	`clang -march=native -O3 -fwrapv -Qunused-arguments -fPIC -fPIE -gdwarf-4 -Wall`	20260305	20260217
2488071	103897 0 0	134369 1288 2312	`ref`	`clang -mcpu=native -O3 -fwrapv -Qunused-arguments -fPIC -fPIE -gdwarf-4 -Wall`	20260305	20260217
2491810	64687 0 0	94129 1288 2312	`ref`	`clang -march=native -O2 -fwrapv -Qunused-arguments -fPIC -fPIE -gdwarf-4 -Wall`	20260305	20260217
2561942	60280 0 0	90420 1240 2408	`ref`	`gcc -march=native -mtune=native -O3 -fwrapv -fPIC -fPIE -gdwarf-4 -Wall`	20260305	20260217
2699114	35553 0 0	62683 1280 2312	`ref`	`clang -march=native -Os -fwrapv -Qunused-arguments -fPIC -fPIE -gdwarf-4 -Wall`	20260305	20260217
2732370	39868 560 0	68220 1808 2408	`ref`	`gcc -march=native -mtune=native -O2 -fwrapv -fPIC -fPIE -gdwarf-4 -Wall`	20260305	20260217
2745273	40398 0 0	68153 1288 2312	`ref`	`clang -march=native -O -fwrapv -Qunused-arguments -fPIC -fPIE -gdwarf-4 -Wall`	20260305	20260217
2983036	37322 560 0	65196 1808 2408	`ref`	`gcc -march=native -mtune=native -O -fwrapv -fPIC -fPIE -gdwarf-4 -Wall`	20260305	20260217
3033212	35785 752 0	62422 1984 2376	`ref`	`gcc -march=native -mtune=native -Os -fwrapv -fPIC -fPIE -gdwarf-4 -Wall`	20260305	20260217
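
To compare the rows above, each time can be normalized against the fastest build. This is a minimal sketch and not part of SUPERCOP: the `rows` list is copied by hand from the first column of the table, and the short labels and the helper name `relative_times` are illustrative assumptions.

```python
# Times copied from the first column of the table above.
# The labels are hand-written abbreviations of each compiler line.
rows = [
    ("clang -march=native -O3", 2448687),
    ("clang -mcpu=native -O3", 2488071),
    ("clang -march=native -O2", 2491810),
    ("gcc -O3", 2561942),
    ("clang -march=native -Os", 2699114),
    ("gcc -O2", 2732370),
    ("clang -march=native -O", 2745273),
    ("gcc -O", 2983036),
    ("gcc -Os", 3033212),
]


def relative_times(rows):
    """Return (label, time, ratio) tuples, with ratio taken against the smallest time."""
    fastest = min(t for _, t in rows)
    return [(label, t, t / fastest) for label, t in rows]


for label, t, ratio in relative_times(rows):
    print(f"{label:<26} {t:>8} {ratio:.3f}")
```

On these numbers the slowest `ref` build (`gcc -Os`) takes about 24% longer than the fastest (`clang -march=native -O3`).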

Compiler output

fft.c: fft.c:230:35: error: always_inline function '_mm256_add_epi32' requires target feature 'avx2', but would be inlined into function 'complex_fp_sqabs_add' that is compiled without support for 'avx2'
fft.c:   230 |     *res = _mm256_add_epi32(*res, _mm256_add_epi32(_mulrnd16_avx(*real, *real), _mulrnd16_avx(*imag, *imag)));
fft.c:       |                                   ^
fft.c: fft.c:230:12: error: always_inline function '_mm256_add_epi32' requires target feature 'avx2', but would be inlined into function 'complex_fp_sqabs_add' that is compiled without support for 'avx2'
fft.c:   230 |     *res = _mm256_add_epi32(*res, _mm256_add_epi32(_mulrnd16_avx(*real, *real), _mulrnd16_avx(*imag, *imag)));
fft.c:       |            ^
fft.c: 2 errors generated.

Number of similar (implementation,compiler) pairs: 4, namely:

Implementation	Compiler
`avx2`	`clang -march=native -O2 -fwrapv -Qunused-arguments -fPIC -fPIE -gdwarf-4 -Wall (Ubuntu_Clang_18.1.3_(1ubuntu1))`
`avx2`	`clang -march=native -O3 -fwrapv -Qunused-arguments -fPIC -fPIE -gdwarf-4 -Wall (Ubuntu_Clang_18.1.3_(1ubuntu1))`
`avx2`	`clang -march=native -O -fwrapv -Qunused-arguments -fPIC -fPIE -gdwarf-4 -Wall (Ubuntu_Clang_18.1.3_(1ubuntu1))`
`avx2`	`clang -march=native -Os -fwrapv -Qunused-arguments -fPIC -fPIE -gdwarf-4 -Wall (Ubuntu_Clang_18.1.3_(1ubuntu1))`

Compiler output

aes256ctr.c: aes256ctr.c:91:3: error: '__builtin_ia32_aeskeygenassist128' needs target feature aes
aes256ctr.c:    91 |   BLOCK1(0x01);
aes256ctr.c:       |   ^
aes256ctr.c: aes256ctr.c:72:11: note: expanded from macro 'BLOCK1'
aes256ctr.c:    72 |   temp1 = _mm_aeskeygenassist_si128(temp2, IMM);                        \
aes256ctr.c:       |           ^
aes256ctr.c: /usr/lib/llvm-18/lib/clang/18/include/__wmmintrin_aes.h:136:13: note: expanded from macro '_mm_aeskeygenassist_si128'
aes256ctr.c:   136 |   ((__m128i)__builtin_ia32_aeskeygenassist128((__v2di)(__m128i)(C), (int)(R)))
aes256ctr.c:       |             ^
aes256ctr.c: aes256ctr.c:92:3: error: '__builtin_ia32_aeskeygenassist128' needs target feature aes
aes256ctr.c:    92 |   BLOCK2(0x01);
aes256ctr.c:       |   ^
aes256ctr.c: aes256ctr.c:82:11: note: expanded from macro 'BLOCK2'
aes256ctr.c:    82 |   temp1 = _mm_aeskeygenassist_si128(temp0, IMM);                        \
aes256ctr.c:       |           ^
aes256ctr.c: /usr/lib/llvm-18/lib/clang/18/include/__wmmintrin_aes.h:136:13: note: expanded from macro '_mm_aeskeygenassist_si128'
aes256ctr.c:   136 |   ((__m128i)__builtin_ia32_aeskeygenassist128((__v2di)(__m128i)(C), (int)(R)))
aes256ctr.c:       |             ^
aes256ctr.c: aes256ctr.c:94:3: error: '__builtin_ia32_aeskeygenassist128' needs target feature aes
aes256ctr.c:    94 |   BLOCK1(0x02);
aes256ctr.c:       |   ^
aes256ctr.c: aes256ctr.c:72:11: note: expanded from macro 'BLOCK1'
aes256ctr.c:    72 |   temp1 = _mm_aeskeygenassist_si128(temp2, IMM);                        \
aes256ctr.c:       |           ^
aes256ctr.c: /usr/lib/llvm-18/lib/clang/18/include/__wmmintrin_aes.h:136:13: note: expanded from macro '_mm_aeskeygenassist_si128'
aes256ctr.c: ...

Number of similar (implementation,compiler) pairs: 1, namely:

Implementation	Compiler
`avx2`	`clang -mcpu=native -O3 -fwrapv -Qunused-arguments -fPIC -fPIE -gdwarf-4 -Wall (Ubuntu_Clang_18.1.3_(1ubuntu1))`

Compiler output

fft.c: In file included from /usr/lib/gcc/x86_64-linux-gnu/13/include/immintrin.h:51,
fft.c:                  from align.h:5,
fft.c:                  from poly.h:6,
fft.c:                  from fft.h:4,
fft.c:                  from fft.c:1:
fft.c: /usr/lib/gcc/x86_64-linux-gnu/13/include/avx2intrin.h: In function '_mulrnd16_avx':
fft.c: /usr/lib/gcc/x86_64-linux-gnu/13/include/avx2intrin.h:974:1: error: inlining failed in call to 'always_inline' '_mm256_blend_epi32': target specific option mismatch
fft.c:   974 | _mm256_blend_epi32 (__m256i __X, __m256i __Y, const int __M)
fft.c:       | ^~~~~~~~~~~~~~~~~~
fft.c: fft.c:168:10: note: called from here
fft.c:   168 |     rl = _mm256_blend_epi32(rl, rh, 0xaa);
fft.c:       |          ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
fft.c: /usr/lib/gcc/x86_64-linux-gnu/13/include/avx2intrin.h:696:1: error: inlining failed in call to 'always_inline' '_mm256_slli_epi64': target specific option mismatch
fft.c:   696 | _mm256_slli_epi64 (__m256i __A, int __B)
fft.c:       | ^~~~~~~~~~~~~~~~~
fft.c: fft.c:167:10: note: called from here
fft.c:   167 |     rh = _mm256_slli_epi64(rh, 16); // shift up
fft.c:       |          ^~~~~~~~~~~~~~~~~~~~~~~~~
fft.c: /usr/lib/gcc/x86_64-linux-gnu/13/include/avx2intrin.h:787:1: error: inlining failed in call to 'always_inline' '_mm256_srli_epi64': target specific option mismatch
fft.c:   787 | _mm256_srli_epi64 (__m256i __A, int __B)
fft.c:       | ^~~~~~~~~~~~~~~~~
fft.c: fft.c:166:10: note: called from here
fft.c:   166 |     rl = _mm256_srli_epi64(rl, 16);
fft.c:       |          ^~~~~~~~~~~~~~~~~~~~~~~~~
fft.c: /usr/lib/gcc/x86_64-linux-gnu/13/include/avx2intrin.h:126:1: error: inlining failed in call to 'always_inline' '_mm256_add_epi64': target specific option mismatch
fft.c: ...

Number of similar (implementation,compiler) pairs: 4, namely:

Implementation	Compiler
`avx2`	`gcc -march=native -mtune=native -O2 -fwrapv -fPIC -fPIE -gdwarf-4 -Wall (13.3.0)`
`avx2`	`gcc -march=native -mtune=native -O3 -fwrapv -fPIC -fPIE -gdwarf-4 -Wall (13.3.0)`
`avx2`	`gcc -march=native -mtune=native -O -fwrapv -fPIC -fPIE -gdwarf-4 -Wall (13.3.0)`
`avx2`	`gcc -march=native -mtune=native -Os -fwrapv -fPIC -fPIE -gdwarf-4 -Wall (13.3.0)`
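
The gcc log above repeats one error pattern per intrinsic, so it can help to list which intrinsics failed to inline. This is a hedged sketch: the four `log` lines are copied verbatim from the output above, while the regular expression and the helper name `failed_intrinsics` are illustrative assumptions, not part of SUPERCOP.

```python
import re

# Error lines copied verbatim from the gcc output above.
log = """\
fft.c: /usr/lib/gcc/x86_64-linux-gnu/13/include/avx2intrin.h:974:1: error: inlining failed in call to 'always_inline' '_mm256_blend_epi32': target specific option mismatch
fft.c: /usr/lib/gcc/x86_64-linux-gnu/13/include/avx2intrin.h:696:1: error: inlining failed in call to 'always_inline' '_mm256_slli_epi64': target specific option mismatch
fft.c: /usr/lib/gcc/x86_64-linux-gnu/13/include/avx2intrin.h:787:1: error: inlining failed in call to 'always_inline' '_mm256_srli_epi64': target specific option mismatch
fft.c: /usr/lib/gcc/x86_64-linux-gnu/13/include/avx2intrin.h:126:1: error: inlining failed in call to 'always_inline' '_mm256_add_epi64': target specific option mismatch
"""

# Captures the intrinsic name from gcc's "inlining failed" message.
PATTERN = re.compile(r"inlining failed in call to 'always_inline' '(\w+)'")


def failed_intrinsics(text):
    """Return intrinsic names in order of first appearance, without duplicates."""
    seen = []
    for name in PATTERN.findall(text):
        if name not in seen:
            seen.append(name)
    return seen


print(failed_intrinsics(log))
```

Every name extracted this way is declared in `avx2intrin.h`. That is consistent with the Ivy Bridge microarchitecture listed at the top of the page, since AVX2 first appeared in the later Haswell generation and `-march=native` therefore would not enable it on this machine.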

Compiler output

sampler.c: sampler.c:230:43: warning: variable 'cnt' set but not used [-Wunused-but-set-variable]
sampler.c:   230 |     size_t bytecnt = buflen, coefcnt = 0, cnt = 0;
sampler.c:       |                                           ^
sampler.c: 1 warning generated.

Number of similar (implementation,compiler) pairs: 5, namely:

Implementation	Compiler
`ref`	`clang -march=native -O2 -fwrapv -Qunused-arguments -fPIC -fPIE -gdwarf-4 -Wall (Ubuntu_Clang_18.1.3_(1ubuntu1))`
`ref`	`clang -march=native -O3 -fwrapv -Qunused-arguments -fPIC -fPIE -gdwarf-4 -Wall (Ubuntu_Clang_18.1.3_(1ubuntu1))`
`ref`	`clang -march=native -O -fwrapv -Qunused-arguments -fPIC -fPIE -gdwarf-4 -Wall (Ubuntu_Clang_18.1.3_(1ubuntu1))`
`ref`	`clang -march=native -Os -fwrapv -Qunused-arguments -fPIC -fPIE -gdwarf-4 -Wall (Ubuntu_Clang_18.1.3_(1ubuntu1))`
`ref`	`clang -mcpu=native -O3 -fwrapv -Qunused-arguments -fPIC -fPIE -gdwarf-4 -Wall (Ubuntu_Clang_18.1.3_(1ubuntu1))`

Namespace violations

decompose.o cryptolab_haetae3_decompose_hint T
decompose.o cryptolab_haetae3_decompose_vk T
decompose.o cryptolab_haetae3_decompose_z1 T
encoding.o cryptolab_haetae3_decode_h T
encoding.o cryptolab_haetae3_decode_hb_z1 T
encoding.o cryptolab_haetae3_encode_h T
encoding.o cryptolab_haetae3_encode_hb_z1 T
fft.o brv8 R
fft.o complex_fp_sqabs T
fft.o fft T
fft.o fft_init_and_bitrev T
fips202.o haetae_fips202_KeccakF_RoundConstants R
fips202.o haetae_fips202_sha3_256 T
fips202.o haetae_fips202_sha3_512 T
fips202.o haetae_fips202_shake128 T
fips202.o haetae_fips202_shake128_absorb T
fips202.o haetae_fips202_shake128_absorb_once T
fips202.o haetae_fips202_shake128_finalize T
fips202.o haetae_fips202_shake128_init T
fips202.o haetae_fips202_shake128_squeeze T
fips202.o haetae_fips202_shake128_squeezeblocks T
fips202.o haetae_fips202_shake256 T
fips202.o haetae_fips202_shake256_absorb T
fips202.o haetae_fips202_shake256_absorb_once T
fips202.o haetae_fips202_shake256_finalize T
fips202.o haetae_fips202_shake256_init T
fips202.o haetae_fips202_shake256_squeeze T
fips202.o haetae_fips202_shake256_squeezeblocks T
fixpoint.o cryptolab_haetae3_fixpoint_add T
fixpoint.o cryptolab_haetae3_fixpoint_mul_rnd13 T
fixpoint.o cryptolab_haetae3_fixpoint_newton_invsqrt T
fixpoint.o cryptolab_haetae3_fixpoint_square T
fixpoint.o start_cube R
fixpoint.o start_times_threehalves R
ntt.o cryptolab_haetae3_invntt_tomont T
ntt.o cryptolab_haetae3_ntt T
packing.o cryptolab_haetae3_pack_pk T
packing.o cryptolab_haetae3_pack_sig T
packing.o cryptolab_haetae3_pack_sk T
packing.o cryptolab_haetae3_unpack_pk T
packing.o cryptolab_haetae3_unpack_sig T
packing.o cryptolab_haetae3_unpack_sk T
poly.o cryptolab_haetae3_poly2eta_pack T
poly.o cryptolab_haetae3_poly2eta_unpack T
poly.o cryptolab_haetae3_poly_add T
poly.o cryptolab_haetae3_poly_challenge T
poly.o cryptolab_haetae3_poly_compose T
poly.o cryptolab_haetae3_poly_decomposed_pack T
poly.o cryptolab_haetae3_poly_decomposed_unpack T
poly.o cryptolab_haetae3_poly_freeze T
poly.o cryptolab_haetae3_poly_freeze2q T
poly.o cryptolab_haetae3_poly_fromcrt T
poly.o cryptolab_haetae3_poly_fromcrt0 T
poly.o cryptolab_haetae3_poly_highbits T
poly.o cryptolab_haetae3_poly_invntt_tomont T
poly.o cryptolab_haetae3_poly_lowbits T
poly.o cryptolab_haetae3_poly_lsb T
poly.o cryptolab_haetae3_poly_ntt T
poly.o cryptolab_haetae3_poly_pack_highbits T
poly.o cryptolab_haetae3_poly_pack_lsb T
poly.o cryptolab_haetae3_poly_pointwise_montgomery T
poly.o cryptolab_haetae3_poly_reduce2q T
poly.o cryptolab_haetae3_poly_sub T
poly.o cryptolab_haetae3_poly_uniform T
poly.o cryptolab_haetae3_poly_uniform_eta T
poly.o cryptolab_haetae3_polyeta_pack T
poly.o cryptolab_haetae3_polyeta_unpack T
poly.o cryptolab_haetae3_polyq_pack T
poly.o cryptolab_haetae3_polyq_unpack T
poly.o hammingWeight_8 T
polyfix.o cryptolab_haetae3_polyfix_add T
polyfix.o cryptolab_haetae3_polyfix_round T
polyfix.o cryptolab_haetae3_polyfixfixveck_sub T
polyfix.o cryptolab_haetae3_polyfixfixvecl_sub T
polyfix.o cryptolab_haetae3_polyfixveck_add T
polyfix.o cryptolab_haetae3_polyfixveck_double T
polyfix.o cryptolab_haetae3_polyfixveck_round T
polyfix.o cryptolab_haetae3_polyfixvecl_add T
polyfix.o cryptolab_haetae3_polyfixvecl_double T
polyfix.o cryptolab_haetae3_polyfixvecl_round T
polyfix.o cryptolab_haetae3_polyfixveclk_sample_hyperball T
polyfix.o cryptolab_haetae3_polyfixveclk_sqnorm2 T
polyfix.o fix_round T
polyfix.o polyfixfix_sub T
polymat.o cryptolab_haetae3_polymatkl_double T
polymat.o cryptolab_haetae3_polymatkl_expand T
polymat.o cryptolab_haetae3_polymatkl_pointwise_montgomery T
polymat.o cryptolab_haetae3_polymatkm_expand T
polymat.o cryptolab_haetae3_polymatkm_pointwise_montgomery T
polyvec.o cryptolab_haetae3_polyveck_add T
polyvec.o cryptolab_haetae3_polyveck_caddDQ2ALPHA T
polyvec.o cryptolab_haetae3_polyveck_caddq T
polyvec.o cryptolab_haetae3_polyveck_cneg T
polyvec.o cryptolab_haetae3_polyveck_csubDQ2ALPHA T
polyvec.o cryptolab_haetae3_polyveck_decompose_vk T
polyvec.o cryptolab_haetae3_polyveck_div2 T
polyvec.o cryptolab_haetae3_polyveck_double T
polyvec.o cryptolab_haetae3_polyveck_double_negate T
polyvec.o cryptolab_haetae3_polyveck_expand T
polyvec.o cryptolab_haetae3_polyveck_freeze T
polyvec.o cryptolab_haetae3_polyveck_freeze2q T
polyvec.o cryptolab_haetae3_polyveck_frommont T
polyvec.o cryptolab_haetae3_polyveck_highbits_hint T
polyvec.o cryptolab_haetae3_polyveck_invntt_tomont T
polyvec.o cryptolab_haetae3_polyveck_mul_alpha T
polyvec.o cryptolab_haetae3_polyveck_ntt T
polyvec.o cryptolab_haetae3_polyveck_pack_highbits T
polyvec.o cryptolab_haetae3_polyveck_poly_fromcrt T
polyvec.o cryptolab_haetae3_polyveck_poly_pointwise_montgomery T
polyvec.o cryptolab_haetae3_polyveck_reduce2q T
polyvec.o cryptolab_haetae3_polyveck_sqnorm2 T
polyvec.o cryptolab_haetae3_polyveck_sub T
polyvec.o cryptolab_haetae3_polyvecl_cneg T
polyvec.o cryptolab_haetae3_polyvecl_highbits T
polyvec.o cryptolab_haetae3_polyvecl_lowbits T
polyvec.o cryptolab_haetae3_polyvecl_ntt T
polyvec.o cryptolab_haetae3_polyvecl_pointwise_acc_montgomery T
polyvec.o cryptolab_haetae3_polyvecl_sqnorm2 T
polyvec.o cryptolab_haetae3_polyvecm_ntt T
polyvec.o cryptolab_haetae3_polyvecm_pointwise_acc_montgomery T
polyvec.o cryptolab_haetae3_polyvecmk_sqsing_value T
polyvec.o cryptolab_haetae3_polyvecmk_uniform_eta T
reduce.o cryptolab_haetae3_caddq T
reduce.o cryptolab_haetae3_freeze T
reduce.o cryptolab_haetae3_freeze2q T
reduce.o cryptolab_haetae3_montgomery_reduce T
reduce.o cryptolab_haetae3_reduce32_2q T
sampler.o cryptolab_haetae3_rej_eta T
sampler.o cryptolab_haetae3_rej_uniform T
sampler.o cryptolab_haetae3_sample_gauss_N T
sampler.o sample_gauss T
sign.o cryptolab_haetae3_signature T
sign.o cryptolab_haetae3_verify T
symmetric-shake.o cryptolab_haetae3_haetae_shake128_stream_init T
symmetric-shake.o cryptolab_haetae3_haetae_shake256_absorb_twice T
symmetric-shake.o cryptolab_haetae3_haetae_shake256_stream_init T

Number of similar (implementation,compiler) pairs: 9, namely:

Implementation	Compiler
`ref`	`clang -march=native -O2 -fwrapv -Qunused-arguments -fPIC -fPIE -gdwarf-4 -Wall (Ubuntu_Clang_18.1.3_(1ubuntu1))`
`ref`	`clang -march=native -O3 -fwrapv -Qunused-arguments -fPIC -fPIE -gdwarf-4 -Wall (Ubuntu_Clang_18.1.3_(1ubuntu1))`
`ref`	`clang -march=native -O -fwrapv -Qunused-arguments -fPIC -fPIE -gdwarf-4 -Wall (Ubuntu_Clang_18.1.3_(1ubuntu1))`
`ref`	`clang -march=native -Os -fwrapv -Qunused-arguments -fPIC -fPIE -gdwarf-4 -Wall (Ubuntu_Clang_18.1.3_(1ubuntu1))`
`ref`	`clang -mcpu=native -O3 -fwrapv -Qunused-arguments -fPIC -fPIE -gdwarf-4 -Wall (Ubuntu_Clang_18.1.3_(1ubuntu1))`
`ref`	`gcc -march=native -mtune=native -O2 -fwrapv -fPIC -fPIE -gdwarf-4 -Wall (13.3.0)`
`ref`	`gcc -march=native -mtune=native -O3 -fwrapv -fPIC -fPIE -gdwarf-4 -Wall (13.3.0)`
`ref`	`gcc -march=native -mtune=native -O -fwrapv -fPIC -fPIE -gdwarf-4 -Wall (13.3.0)`
`ref`	`gcc -march=native -mtune=native -Os -fwrapv -fPIC -fPIE -gdwarf-4 -Wall (13.3.0)`
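
The symbol listing above can be summarized per object file, separating symbols that carry a project prefix from those that do not. This is a rough sketch under stated assumptions: the `listing` excerpt is copied from the lines above, the type letters are assumed to follow the `nm` convention (`T` for code, `R` for read-only data), and the helper name `summarize` and the choice of `cryptolab_haetae3_` as the prefix of interest are illustrative.

```python
from collections import Counter

# Excerpt copied from the listing above; each line is
# "<object file> <symbol> <type letter>".
listing = """\
fft.o brv8 R
fft.o complex_fp_sqabs T
fft.o fft T
fft.o fft_init_and_bitrev T
fixpoint.o cryptolab_haetae3_fixpoint_add T
fixpoint.o start_cube R
poly.o hammingWeight_8 T
reduce.o cryptolab_haetae3_caddq T
sampler.o sample_gauss T
"""

PREFIX = "cryptolab_haetae3_"  # prefix carried by most symbols in the listing


def summarize(text, prefix=PREFIX):
    """Count symbols per object file and collect the symbols lacking the prefix."""
    per_object = Counter()
    unprefixed = []
    for line in text.splitlines():
        parts = line.split()
        if len(parts) != 3:
            continue  # skip blank or malformed lines
        obj, symbol, _kind = parts
        per_object[obj] += 1
        if not symbol.startswith(prefix):
            unprefixed.append(symbol)
    return per_object, unprefixed


per_object, unprefixed = summarize(listing)
print(dict(per_object))
print(unprefixed)
```

Short generic names such as `fft` or `fix_round` are the entries most likely to clash with other code linked into the same binary, which is one reason to review the unprefixed group first.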

Passed TIMECOP

TIMECOP iterations: 10

Number of similar (implementation,compiler) pairs: 9, namely:

Implementation	Compiler
`ref`	`clang -march=native -O2 -fwrapv -Qunused-arguments -fPIC -fPIE -gdwarf-4 -Wall (Ubuntu_Clang_18.1.3_(1ubuntu1))`
`ref`	`clang -march=native -O3 -fwrapv -Qunused-arguments -fPIC -fPIE -gdwarf-4 -Wall (Ubuntu_Clang_18.1.3_(1ubuntu1))`
`ref`	`clang -march=native -O -fwrapv -Qunused-arguments -fPIC -fPIE -gdwarf-4 -Wall (Ubuntu_Clang_18.1.3_(1ubuntu1))`
`ref`	`clang -march=native -Os -fwrapv -Qunused-arguments -fPIC -fPIE -gdwarf-4 -Wall (Ubuntu_Clang_18.1.3_(1ubuntu1))`
`ref`	`clang -mcpu=native -O3 -fwrapv -Qunused-arguments -fPIC -fPIE -gdwarf-4 -Wall (Ubuntu_Clang_18.1.3_(1ubuntu1))`
`ref`	`gcc -march=native -mtune=native -O2 -fwrapv -fPIC -fPIE -gdwarf-4 -Wall (13.3.0)`
`ref`	`gcc -march=native -mtune=native -O3 -fwrapv -fPIC -fPIE -gdwarf-4 -Wall (13.3.0)`
`ref`	`gcc -march=native -mtune=native -O -fwrapv -fPIC -fPIE -gdwarf-4 -Wall (13.3.0)`
`ref`	`gcc -march=native -mtune=native -Os -fwrapv -fPIC -fPIE -gdwarf-4 -Wall (13.3.0)`