Implementation notes: amd64, comet, crypto_sign/luov890351pc

Computer: comet
Microarchitecture: amd64; Comet Lake (806ec)
Architecture: amd64
CPU ID: GenuineIntel-000806ec-bfebfbff
SUPERCOP version: 20240625
Operation: crypto_sign
Primitive: luov890351pc
Time     Object size    Test size         Implementation  Compiler                                                                        Benchmark date  SUPERCOP version
1484950  64721 0 0      96118 892 1792    T:avx2          clang_-march=native_-O3_-fwrapv_-Qunused-arguments_-fPIC_-fPIE_-gdwarf-4_-Wall  20240705        20240625
1511364  59889 0 0      90582 892 1760    T:avx2          clang_-march=native_-O2_-fwrapv_-Qunused-arguments_-fPIC_-fPIE_-gdwarf-4_-Wall  20240705        20240625
1824975  50870 0 0      80230 876 1792    T:avx2          clang_-march=native_-Os_-fwrapv_-Qunused-arguments_-fPIC_-fPIE_-gdwarf-4_-Wall  20240705        20240625
1840343  54664 0 0      82102 892 1728    T:avx2          clang_-march=native_-O_-fwrapv_-Qunused-arguments_-fPIC_-fPIE_-gdwarf-4_-Wall   20240705        20240625
2042457  20801 32768 0  49607 33588 1792  T:avx2          gcc_-march=native_-mtune=native_-O_-fwrapv_-fPIC_-fPIE_-gdwarf-4_-Wall          20240705        20240625

Test failure


error 111
crypto_sign_open returns nonzero

Number of similar (implementation,compiler) pairs: 2, namely:
Implementation  Compiler
T:avx2          gcc -march=native -mtune=native -O2 -fwrapv -fPIC -fPIE -gdwarf-4 -Wall (12.2.0)
T:avx2          gcc -march=native -mtune=native -O3 -fwrapv -fPIC -fPIE -gdwarf-4 -Wall (12.2.0)

Test failure


error 111

Number of similar (implementation,compiler) pairs: 1, namely:
Implementation  Compiler
T:avx2          gcc -march=native -mtune=native -Os -fwrapv -fPIC -fPIE -gdwarf-4 -Wall (12.2.0)

Compiler output


keccakrng.c: keccakrng.c:71:24: warning: unused function 'rotl' [-Wunused-function]
keccakrng.c: static inline uint64_t rotl(const uint64_t x, int k) {
keccakrng.c:                        ^
keccakrng.c: 1 warning generated.

Number of similar (implementation,compiler) pairs: 4, namely:
Implementation  Compiler
T:avx2          clang -march=native -O2 -fwrapv -Qunused-arguments -fPIC -fPIE -gdwarf-4 -Wall (Debian_Clang_14.0.6)
T:avx2          clang -march=native -O3 -fwrapv -Qunused-arguments -fPIC -fPIE -gdwarf-4 -Wall (Debian_Clang_14.0.6)
T:avx2          clang -march=native -O -fwrapv -Qunused-arguments -fPIC -fPIE -gdwarf-4 -Wall (Debian_Clang_14.0.6)
T:avx2          clang -march=native -Os -fwrapv -Qunused-arguments -fPIC -fPIE -gdwarf-4 -Wall (Debian_Clang_14.0.6)

Compiler output


LUOV.c: LUOV.c:110:17: error: '__builtin_ia32_permti256' needs target feature avx2
LUOV.c:                         __m256i rr = _mm256_permute2x128_si256(_mm256_loadu_si256((__m256i *)&r),_mm256_setzero_si256(),0);
LUOV.c:                                      ^
LUOV.c: /usr/lib/llvm-14/lib/clang/14.0.6/include/avx2intrin.h:821:13: note: expanded from macro '_mm256_permute2x128_si256'
LUOV.c:   ((__m256i)__builtin_ia32_permti256((__m256i)(V1), (__m256i)(V2), (int)(M)))
LUOV.c:             ^
LUOV.c: LUOV.c:110:43: error: always_inline function '_mm256_loadu_si256' requires target feature 'avx', but would be inlined into function 'calculateQ2' that is compiled without support for 'avx'
LUOV.c:                         __m256i rr = _mm256_permute2x128_si256(_mm256_loadu_si256((__m256i *)&r),_mm256_setzero_si256(),0);
LUOV.c:                                                                ^
LUOV.c: LUOV.c:110:43: error: AVX vector return of type '__m256i' (vector of 4 'long long' values) without 'avx' enabled changes the ABI
LUOV.c: LUOV.c:110:77: error: always_inline function '_mm256_setzero_si256' requires target feature 'avx', but would be inlined into function 'calculateQ2' that is compiled without support for 'avx'
LUOV.c:                         __m256i rr = _mm256_permute2x128_si256(_mm256_loadu_si256((__m256i *)&r),_mm256_setzero_si256(),0);
LUOV.c:                                                                                                  ^
LUOV.c: LUOV.c:110:77: error: AVX vector return of type '__m256i' (vector of 4 'long long' values) without 'avx' enabled changes the ABI
LUOV.c: LUOV.c:115:20: error: always_inline function '_mm256_set1_epi8' requires target feature 'avx', but would be inlined into function 'calculateQ2' that is compiled without support for 'avx'
LUOV.c:                                 __m256i tttt = _mm256_set1_epi8(t[k/8]);
LUOV.c:                                                ^
LUOV.c: LUOV.c:115:20: error: AVX vector return of type '__m256i' (vector of 4 'long long' values) without 'avx' enabled changes the ABI
LUOV.c: LUOV.c:117:54: error: always_inline function '_mm256_setzero_si256' requires target feature 'avx', but would be inlined into function 'calculateQ2' that is compiled without support for 'avx'
LUOV.c:                                 __m256i t1t2 = _mm256_cmpeq_epi8(tttt & masks[0],_mm256_setzero_si256());
LUOV.c:                                                                                  ^
LUOV.c: LUOV.c:117:54: error: AVX vector return of type '__m256i' (vector of 4 'long long' values) without 'avx' enabled changes the ABI
LUOV.c: LUOV.c:117:20: error: always_inline function '_mm256_cmpeq_epi8' requires target feature 'avx2', but would be inlined into function 'calculateQ2' that is compiled without support for 'avx2'
LUOV.c:                                 __m256i t1t2 = _mm256_cmpeq_epi8(tttt & masks[0],_mm256_setzero_si256());
LUOV.c:                                                ^
LUOV.c: ...

Number of similar (implementation,compiler) pairs: 1, namely:
Implementation  Compiler
T:avx2          clang -mcpu=native -O3 -fwrapv -Qunused-arguments -fPIC -fPIE -gdwarf-4 -Wall (Debian_Clang_14.0.6)
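
Note: the clang diagnostics above report that calculateQ2 was compiled without the 'avx' and 'avx2' target features under this flag set, so the always_inline AVX2 intrinsics cannot be inlined into it. One possible way to make such code independent of the command-line flags, sketched here under assumptions and not taken from LUOV.c, is to enable AVX2 per function with the GCC/clang target attribute and to select the path at run time. The names xor32, xor32_avx2, xor32_portable and HAVE_X86 are hypothetical illustrations.

```c
#include <stddef.h>
#include <stdint.h>

#if defined(__x86_64__) || defined(__i386__)
#include <immintrin.h>
#define HAVE_X86 1
#else
#define HAVE_X86 0
#endif

/* Portable fallback: byte-wise XOR of two 32-byte blocks. */
static void xor32_portable(uint8_t *dst, const uint8_t *a, const uint8_t *b) {
    for (size_t i = 0; i < 32; i++) {
        dst[i] = (uint8_t)(a[i] ^ b[i]);
    }
}

#if HAVE_X86
/* The target attribute enables AVX2 for this function only, so the
 * intrinsics can be inlined even when the translation unit as a whole
 * is compiled without AVX2 (GCC and clang extension). */
__attribute__((target("avx2")))
static void xor32_avx2(uint8_t *dst, const uint8_t *a, const uint8_t *b) {
    __m256i va = _mm256_loadu_si256((const __m256i *)a);
    __m256i vb = _mm256_loadu_si256((const __m256i *)b);
    _mm256_storeu_si256((__m256i *)dst, _mm256_xor_si256(va, vb));
}
#endif

/* Dispatcher: use the AVX2 path only if the running CPU reports AVX2. */
static void xor32(uint8_t *dst, const uint8_t *a, const uint8_t *b) {
#if HAVE_X86
    if (__builtin_cpu_supports("avx2")) {
        xor32_avx2(dst, a, b);
        return;
    }
#endif
    xor32_portable(dst, a, b);
}
```

A simpler alternative is a compile-time guard on the __AVX2__ macro, which rejects unsuitable flag sets up front instead of dispatching at run time.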

Compiler output


LUOV.c: In file included from /usr/lib/gcc/x86_64-linux-gnu/12/include/immintrin.h:43,
LUOV.c:                  from LUOV.h:7,
LUOV.c:                  from LUOV.c:1:
LUOV.c: In function '_mm256_loadu_si256',
LUOV.c:     inlined from 'calculateQ2' at LUOV.c:142:17:
LUOV.c: /usr/lib/gcc/x86_64-linux-gnu/12/include/avxintrin.h:929:10: warning: array subscript '__m256i_u[0]' is partly outside array bounds of '__m128i[1]' [-Warray-bounds]
LUOV.c:   929 |   return *__P;
LUOV.c:       |          ^~~~
LUOV.c: LUOV.c: In function 'calculateQ2':
LUOV.c: LUOV.c:141:38: note: object 'r' of size 16
LUOV.c:   141 |                         bitcontainer r = TempMat[j][i];
LUOV.c:       |                                      ^
LUOV.c: In function '_mm256_loadu_si256',
LUOV.c:     inlined from 'calculateQ2' at LUOV.c:110:17:
LUOV.c: /usr/lib/gcc/x86_64-linux-gnu/12/include/avxintrin.h:929:10: warning: array subscript '__m256i_u[0]' is partly outside array bounds of '__m128i[1]' [-Warray-bounds]
LUOV.c:   929 |   return *__P;
LUOV.c:       |          ^~~~
LUOV.c: LUOV.c: In function 'calculateQ2':
LUOV.c: LUOV.c:109:38: note: object 'r' of size 16
LUOV.c:   109 |                         bitcontainer r = _mm_loadu_si128(&Q1[col++]);
LUOV.c:       |                                      ^
LUOV.c: In function '_mm256_loadu_si256',
LUOV.c:     inlined from 'TransformQ1' at LUOV.c:285:17:
LUOV.c: /usr/lib/gcc/x86_64-linux-gnu/12/include/avxintrin.h:929:10: warning: array subscript '__m256i_u[0]' is partly outside array bounds of '__m128i[1]' [-Warray-bounds]
LUOV.c:   929 |   return *__P;
LUOV.c: ...

Number of similar (implementation,compiler) pairs: 3, namely:
Implementation  Compiler
T:avx2          gcc -march=native -mtune=native -O2 -fwrapv -fPIC -fPIE -gdwarf-4 -Wall (12.2.0)
T:avx2          gcc -march=native -mtune=native -O3 -fwrapv -fPIC -fPIE -gdwarf-4 -Wall (12.2.0)
T:avx2          gcc -march=native -mtune=native -Os -fwrapv -fPIC -fPIE -gdwarf-4 -Wall (12.2.0)
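
Note: the -Warray-bounds warnings above concern a 32-byte _mm256_loadu_si256 applied to the 16-byte object 'r'. At LUOV.c:110 the loaded value is passed to _mm256_permute2x128_si256 with selector 0, which copies the low 128 bits of its first operand into both halves of the result, so only the 16 valid bytes are actually used. The sketch below shows one way to obtain the same value without reading past the object, using _mm256_broadcastsi128_si256. It is an illustration under assumptions, not the author's code: the helper name broadcast128 and the non-AVX2 memcpy fallback are hypothetical additions, and whether this change relates to the test failures listed earlier is not established by this log.

```c
#include <stdint.h>
#include <string.h>

#if defined(__AVX2__)
#include <immintrin.h>
#endif

/* Duplicate a 16-byte block into both halves of a 32-byte buffer.
 * Only the 16 bytes of `in` are read, so no out-of-bounds access occurs. */
static void broadcast128(uint8_t out[32], const uint8_t in[16]) {
#if defined(__AVX2__)
    /* 16-byte load, then broadcast to both 128-bit lanes. This yields the
     * same value as _mm256_permute2x128_si256(x, zero, 0) where the low
     * half of x holds the 16 bytes of `in`. */
    __m128i r = _mm_loadu_si128((const __m128i *)in);
    __m256i rr = _mm256_broadcastsi128_si256(r);
    _mm256_storeu_si256((__m256i *)out, rr);
#else
    /* Fallback for builds without AVX2 (assumption for portability). */
    memcpy(out, in, 16);
    memcpy(out + 16, in, 16);
#endif
}
```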