Implementation notes: amd64, rome0, crypto_sign/luov863256

Computer: rome0
Microarchitecture: amd64; Zen 2 (830f10)
Architecture: amd64
CPU ID: AuthenticAMD-00830f10-178bfbff
SUPERCOP version: 20240625
Operation: crypto_sign
Primitive: luov863256
TimeObject sizeTest sizeImplementationCompilerBenchmark dateSUPERCOP version
851783634359 32768 065339 33588 1784T:avx2gcc_-march=native_-mtune=native_-O3_-fwrapv_-fPIC_-fPIE_-gdwarf-4_-Wall2024062520240625
854103860445 0 089589 852 1752T:avx2clang_-march=native_-O2_-fwrapv_-Qunused-arguments_-fPIC_-fPIE_-gdwarf-4_-Wall2024062520240625
854109063808 0 093357 852 1752T:avx2clang_-march=native_-O3_-fwrapv_-Qunused-arguments_-fPIC_-fPIE_-gdwarf-4_-Wall2024062520240625
856132820181 32768 049491 33588 1784T:avx2gcc_-march=native_-mtune=native_-O2_-fwrapv_-fPIC_-fPIE_-gdwarf-4_-Wall2024062520240625
856510348621 0 077005 836 1720T:avx2clang_-march=native_-Os_-fwrapv_-Qunused-arguments_-fPIC_-fPIE_-gdwarf-4_-Wall2024062520240625
866169418901 32768 047811 33588 1784T:avx2gcc_-march=native_-mtune=native_-O_-fwrapv_-fPIC_-fPIE_-gdwarf-4_-Wall2024062520240625
883971114152 32768 042011 33580 1752T:avx2gcc_-march=native_-mtune=native_-Os_-fwrapv_-fPIC_-fPIE_-gdwarf-4_-Wall2024062520240625
913838349667 0 078069 836 1720T:avx2clang_-march=native_-O_-fwrapv_-Qunused-arguments_-fPIC_-fPIE_-gdwarf-4_-Wall2024062520240625
12648198551542 0 049788 852 1752T:portableclang_-march=native_-O2_-fwrapv_-Qunused-arguments_-fPIC_-fPIE_-gdwarf-4_-Wall2024062520240625
12668031553230 0 050668 852 1752T:portableclang_-march=native_-O3_-fwrapv_-Qunused-arguments_-fPIC_-fPIE_-gdwarf-4_-Wall2024062520240625
13144585554845 0 051820 852 1720T:portableclang_-mcpu=native_-O3_-fwrapv_-Qunused-arguments_-fPIC_-fPIE_-gdwarf-4_-Wall2024062520240625
13286490555678 0 052938 812 1784T:portablegcc_-march=native_-mtune=native_-O3_-fwrapv_-fPIC_-fPIE_-gdwarf-4_-Wall2024062520240625
13866627547279 0 044764 836 1720T:portableclang_-march=native_-Os_-fwrapv_-Qunused-arguments_-fPIC_-fPIE_-gdwarf-4_-Wall2024062520240625
13944598550276 0 048066 812 1784T:portablegcc_-march=native_-mtune=native_-O2_-fwrapv_-fPIC_-fPIE_-gdwarf-4_-Wall2024062520240625
14737837548248 0 045922 812 1784T:portablegcc_-march=native_-mtune=native_-O_-fwrapv_-fPIC_-fPIE_-gdwarf-4_-Wall2024062520240625
15716331546013 0 043698 804 1752T:portablegcc_-march=native_-mtune=native_-Os_-fwrapv_-fPIC_-fPIE_-gdwarf-4_-Wall2024062520240625
20927991548361 0 045740 836 1720T:portableclang_-march=native_-O_-fwrapv_-Qunused-arguments_-fPIC_-fPIE_-gdwarf-4_-Wall2024062520240625
107642968414980 36 0174234 812 1784T:refgcc_-march=native_-mtune=native_-O3_-fwrapv_-fPIC_-fPIE_-gdwarf-4_-Wall2024062520240625
108033074410874 36 0171306 812 1784T:refgcc_-march=native_-mtune=native_-O2_-fwrapv_-fPIC_-fPIE_-gdwarf-4_-Wall2024062520240625
108260148416397 36 0174716 852 1720T:refclang_-mcpu=native_-O3_-fwrapv_-Qunused-arguments_-fPIC_-fPIE_-gdwarf-4_-Wall2024062520240625
109166788409645 36 0169866 812 1784T:refgcc_-march=native_-mtune=native_-O_-fwrapv_-fPIC_-fPIE_-gdwarf-4_-Wall2024062520240625
109342826411259 36 0172036 852 1752T:refclang_-march=native_-O2_-fwrapv_-Qunused-arguments_-fPIC_-fPIE_-gdwarf-4_-Wall2024062520240625
109390203408180 36 0168164 836 1720T:refclang_-march=native_-Os_-fwrapv_-Qunused-arguments_-fPIC_-fPIE_-gdwarf-4_-Wall2024062520240625
109928081414606 36 0173620 852 1752T:refclang_-march=native_-O3_-fwrapv_-Qunused-arguments_-fPIC_-fPIE_-gdwarf-4_-Wall2024062520240625
113908230409223 36 0169260 836 1720T:refclang_-march=native_-O_-fwrapv_-Qunused-arguments_-fPIC_-fPIE_-gdwarf-4_-Wall2024062520240625
118921906407409 36 0167762 804 1752T:refgcc_-march=native_-mtune=native_-Os_-fwrapv_-fPIC_-fPIE_-gdwarf-4_-Wall2024062520240625

Compiler output


LUOV.c: LUOV.c:1156:10: warning: unused function 'read_uint64_t' [-Wunused-function]
LUOV.c: uint64_t read_uint64_t(const unsigned char *data){
LUOV.c:          ^
LUOV.c: 1 warning generated.
keccakrng.c: keccakrng.c:71:24: warning: unused function 'rotl' [-Wunused-function]
keccakrng.c: static inline uint64_t rotl(const uint64_t x, int k) {
keccakrng.c:                        ^
keccakrng.c: 1 warning generated.

Number of similar (implementation,compiler) pairs: 4, namely:
ImplementationCompiler
T:avx2clang -march=native -O2 -fwrapv -Qunused-arguments -fPIC -fPIE -gdwarf-4 -Wall (Debian_Clang_11.0.1)
T:avx2clang -march=native -O3 -fwrapv -Qunused-arguments -fPIC -fPIE -gdwarf-4 -Wall (Debian_Clang_11.0.1)
T:avx2clang -march=native -O -fwrapv -Qunused-arguments -fPIC -fPIE -gdwarf-4 -Wall (Debian_Clang_11.0.1)
T:avx2clang -march=native -Os -fwrapv -Qunused-arguments -fPIC -fPIE -gdwarf-4 -Wall (Debian_Clang_11.0.1)

Compiler output


LUOV.c: LUOV.c:38:19: error: '__builtin_ia32_permdi256' needs target feature avx2
LUOV.c:                         __m256i rrrr = _mm256_permute4x64_epi64(_mm256_loadu_si256((__m256i *)&Q1[col++]),0);
LUOV.c:                                        ^
LUOV.c: /usr/lib/llvm-11/lib/clang/11.0.1/include/avx2intrin.h:818:12: note: expanded from macro '_mm256_permute4x64_epi64'
LUOV.c:   (__m256i)__builtin_ia32_permdi256((__v4di)(__m256i)(V), (int)(M))
LUOV.c:            ^
LUOV.c: LUOV.c:38:44: error: always_inline function '_mm256_loadu_si256' requires target feature 'avx', but would be inlined into function 'calculateQ2' that is compiled without support for 'avx'
LUOV.c:                         __m256i rrrr = _mm256_permute4x64_epi64(_mm256_loadu_si256((__m256i *)&Q1[col++]),0);
LUOV.c:                                                                 ^
LUOV.c: LUOV.c:38:44: error: AVX vector return of type '__m256i' (vector of 4 'long long' values) without 'avx' enabled changes the ABI
LUOV.c: LUOV.c:43:69: error: always_inline function '_mm256_setzero_pd' requires target feature 'avx', but would be inlined into function 'calculateQ2' that is compiled without support for 'avx'
LUOV.c:                                 *((__m256i *)&TempMat[i][k*8+4]) ^=  (__m256i) _mm256_blendv_pd(_mm256_setzero_pd(),(__m256d) rrrr,(__m256d)TJ);
LUOV.c:                                                                                                 ^
LUOV.c: LUOV.c:43:69: error: AVX vector return of type '__m256d' (vector of 4 'double' values) without 'avx' enabled changes the ABI
LUOV.c: LUOV.c:43:52: error: always_inline function '_mm256_blendv_pd' requires target feature 'avx', but would be inlined into function 'calculateQ2' that is compiled without support for 'avx'
LUOV.c:                                 *((__m256i *)&TempMat[i][k*8+4]) ^=  (__m256i) _mm256_blendv_pd(_mm256_setzero_pd(),(__m256d) rrrr,(__m256d)TJ);
LUOV.c:                                                                                ^
LUOV.c: LUOV.c:43:52: error: AVX vector argument of type '__m256d' (vector of 4 'double' values) without 'avx' enabled changes the ABI
LUOV.c: LUOV.c:44:10: error: always_inline function '_mm256_slli_epi64' requires target feature 'avx2', but would be inlined into function 'calculateQ2' that is compiled without support for 'avx2'
LUOV.c:                                 TJ = _mm256_slli_epi64(TJ,4);
LUOV.c:                                      ^
LUOV.c: LUOV.c:44:10: error: AVX vector argument of type '__m256i' (vector of 4 'long long' values) without 'avx' enabled changes the ABI
LUOV.c: LUOV.c:46:66: error: always_inline function '_mm256_setzero_pd' requires target feature 'avx', but would be inlined into function 'calculateQ2' that is compiled without support for 'avx'
LUOV.c:                                 *((__m256i *)&TempMat[i][k*8]) ^= (__m256i) _mm256_blendv_pd(_mm256_setzero_pd(),(__m256d) rrrr,(__m256d)TJ);
LUOV.c:                                                                                              ^
LUOV.c: ...

Number of similar (implementation,compiler) pairs: 1, namely:
ImplementationCompiler
T:avx2clang -mcpu=native -O3 -fwrapv -Qunused-arguments -fPIC -fPIE -gdwarf-4 -Wall (Debian_Clang_11.0.1)

Compiler output


F64Field.c: F64Field.c: In function 'f64addInPlace':
F64Field.c: F64Field.c:43:4: warning: dereferencing type-punned pointer will break strict-aliasing rules [-Wstrict-aliasing]
F64Field.c:    43 |  *((uint64_t *) a->coef) ^= *((uint64_t *) b->coef);
F64Field.c:       |   ~^~~~~~~~~~~~~~~~~~~~~
F64Field.c: F64Field.c:43:31: warning: dereferencing type-punned pointer will break strict-aliasing rules [-Wstrict-aliasing]
F64Field.c:    43 |  *((uint64_t *) a->coef) ^= *((uint64_t *) b->coef);
F64Field.c:       |                              ~^~~~~~~~~~~~~~~~~~~~~
F80Field.c: F80Field.c: In function 'f80addInPlace':
F80Field.c: F80Field.c:55:4: warning: dereferencing type-punned pointer will break strict-aliasing rules [-Wstrict-aliasing]
F80Field.c:    55 |  *((uint64_t *) a->coef) ^= *((uint64_t *) b->coef);
F80Field.c:       |   ~^~~~~~~~~~~~~~~~~~~~~
F80Field.c: F80Field.c:55:31: warning: dereferencing type-punned pointer will break strict-aliasing rules [-Wstrict-aliasing]
F80Field.c:    55 |  *((uint64_t *) a->coef) ^= *((uint64_t *) b->coef);
F80Field.c:       |                              ~^~~~~~~~~~~~~~~~~~~~~

Number of similar (implementation,compiler) pairs: 3, namely:
ImplementationCompiler
T:portablegcc -march=native -mtune=native -O2 -fwrapv -fPIC -fPIE -gdwarf-4 -Wall (10.2.1_20210110)
T:portablegcc -march=native -mtune=native -O3 -fwrapv -fPIC -fPIE -gdwarf-4 -Wall (10.2.1_20210110)
T:portablegcc -march=native -mtune=native -Os -fwrapv -fPIC -fPIE -gdwarf-4 -Wall (10.2.1_20210110)