Implementation notes: amd64, jasper, crypto_sign/luov890351

Computer: jasper
Microarchitecture: amd64; Tremont (906c0)
Architecture: amd64
CPU ID: GenuineIntel-000906c0-20-bfebfbff
SUPERCOP version: 20240808
Operation: crypto_sign
Primitive: luov890351
Time       Object size  Test size        Implementation  Compiler                                                                        Benchmark date  SUPERCOP version
177563522  555702 0 0   115036 852 1720  T:portable      clang_-march=native_-O2_-fwrapv_-Qunused-arguments_-fPIC_-fPIE_-gdwarf-4_-Wall  20240722        20240716
177689213  555510 0 0   117740 852 1720  T:portable      clang_-march=native_-O3_-fwrapv_-Qunused-arguments_-fPIC_-fPIE_-gdwarf-4_-Wall  20240722        20240716
178449003  555980 0 0   116980 852 1720  T:portable      clang_-mcpu=native_-O3_-fwrapv_-Qunused-arguments_-fPIC_-fPIE_-gdwarf-4_-Wall   20240722        20240716
178730338  558199 0 0   119002 812 1784  T:portable      gcc_-march=native_-mtune=native_-O3_-fwrapv_-fPIC_-fPIE_-gdwarf-4_-Wall         20240722        20240716
180353759  549589 0 0   110844 836 1720  T:portable      clang_-march=native_-Os_-fwrapv_-Qunused-arguments_-fPIC_-fPIE_-gdwarf-4_-Wall  20240722        20240716
181107658  553783 0 0   115418 812 1784  T:portable      gcc_-march=native_-mtune=native_-O2_-fwrapv_-fPIC_-fPIE_-gdwarf-4_-Wall         20240722        20240716
184310422  552290 0 0   113946 812 1784  T:portable      gcc_-march=native_-mtune=native_-O_-fwrapv_-fPIC_-fPIE_-gdwarf-4_-Wall          20240722        20240716
185087744  547822 0 0   109474 804 1752  T:portable      gcc_-march=native_-mtune=native_-Os_-fwrapv_-fPIC_-fPIE_-gdwarf-4_-Wall         20240722        20240716
195173620  550645 0 0   112172 836 1720  T:portable      clang_-march=native_-O_-fwrapv_-Qunused-arguments_-fPIC_-fPIE_-gdwarf-4_-Wall   20240722        20240716
575710245  416481 36 0  240076 852 1720  T:ref           clang_-march=native_-O3_-fwrapv_-Qunused-arguments_-fPIC_-fPIE_-gdwarf-4_-Wall  20240722        20240716
576828938  415885 36 0  236732 852 1720  T:ref           clang_-march=native_-O2_-fwrapv_-Qunused-arguments_-fPIC_-fPIE_-gdwarf-4_-Wall  20240722        20240716
579198184  416638 36 0  239002 812 1784  T:ref           gcc_-march=native_-mtune=native_-O3_-fwrapv_-fPIC_-fPIE_-gdwarf-4_-Wall         20240722        20240716
580291640  412454 36 0  236618 812 1784  T:ref           gcc_-march=native_-mtune=native_-O2_-fwrapv_-fPIC_-fPIE_-gdwarf-4_-Wall         20240722        20240716
584445208  416694 36 0  239012 852 1720  T:ref           clang_-mcpu=native_-O3_-fwrapv_-Qunused-arguments_-fPIC_-fPIE_-gdwarf-4_-Wall   20240722        20240716
590344409  410844 36 0  235090 812 1784  T:ref           gcc_-march=native_-mtune=native_-O_-fwrapv_-fPIC_-fPIE_-gdwarf-4_-Wall          20240722        20240716
629520386  409477 36 0  233316 836 1720  T:ref           clang_-march=native_-Os_-fwrapv_-Qunused-arguments_-fPIC_-fPIE_-gdwarf-4_-Wall  20240722        20240716
636272161  410393 36 0  234540 836 1720  T:ref           clang_-march=native_-O_-fwrapv_-Qunused-arguments_-fPIC_-fPIE_-gdwarf-4_-Wall   20240722        20240716
638994170  408455 36 0  232730 804 1752  T:ref           gcc_-march=native_-mtune=native_-Os_-fwrapv_-fPIC_-fPIE_-gdwarf-4_-Wall         20240722        20240716

Compiler output


LUOV.c: LUOV.c:110:17: error: '__builtin_ia32_permti256' needs target feature avx2
LUOV.c:                         __m256i rr = _mm256_permute2x128_si256(_mm256_loadu_si256((__m256i *)&r),_mm256_setzero_si256(),0);
LUOV.c:                                      ^
LUOV.c: /usr/lib/llvm-11/lib/clang/11.0.1/include/avx2intrin.h:821:12: note: expanded from macro '_mm256_permute2x128_si256'
LUOV.c:   (__m256i)__builtin_ia32_permti256((__m256i)(V1), (__m256i)(V2), (int)(M))
LUOV.c:            ^
LUOV.c: LUOV.c:110:43: error: always_inline function '_mm256_loadu_si256' requires target feature 'avx', but would be inlined into function 'calculateQ2' that is compiled without support for 'avx'
LUOV.c:                         __m256i rr = _mm256_permute2x128_si256(_mm256_loadu_si256((__m256i *)&r),_mm256_setzero_si256(),0);
LUOV.c:                                                                ^
LUOV.c: LUOV.c:110:43: error: AVX vector return of type '__m256i' (vector of 4 'long long' values) without 'avx' enabled changes the ABI
LUOV.c: LUOV.c:110:77: error: always_inline function '_mm256_setzero_si256' requires target feature 'avx', but would be inlined into function 'calculateQ2' that is compiled without support for 'avx'
LUOV.c:                         __m256i rr = _mm256_permute2x128_si256(_mm256_loadu_si256((__m256i *)&r),_mm256_setzero_si256(),0);
LUOV.c:                                                                                                  ^
LUOV.c: LUOV.c:110:77: error: AVX vector return of type '__m256i' (vector of 4 'long long' values) without 'avx' enabled changes the ABI
LUOV.c: LUOV.c:115:20: error: always_inline function '_mm256_set1_epi8' requires target feature 'avx', but would be inlined into function 'calculateQ2' that is compiled without support for 'avx'
LUOV.c:                                 __m256i tttt = _mm256_set1_epi8(t[k/8]);
LUOV.c:                                                ^
LUOV.c: LUOV.c:115:20: error: AVX vector return of type '__m256i' (vector of 4 'long long' values) without 'avx' enabled changes the ABI
LUOV.c: LUOV.c:117:54: error: always_inline function '_mm256_setzero_si256' requires target feature 'avx', but would be inlined into function 'calculateQ2' that is compiled without support for 'avx'
LUOV.c:                                 __m256i t1t2 = _mm256_cmpeq_epi8(tttt & masks[0],_mm256_setzero_si256());
LUOV.c:                                                                                  ^
LUOV.c: LUOV.c:117:54: error: AVX vector return of type '__m256i' (vector of 4 'long long' values) without 'avx' enabled changes the ABI
LUOV.c: LUOV.c:117:20: error: always_inline function '_mm256_cmpeq_epi8' requires target feature 'avx2', but would be inlined into function 'calculateQ2' that is compiled without support for 'avx2'
LUOV.c:                                 __m256i t1t2 = _mm256_cmpeq_epi8(tttt & masks[0],_mm256_setzero_si256());
LUOV.c:                                                ^
LUOV.c: ...

Number of similar (implementation,compiler) pairs: 5, namely:
Implementation  Compiler
T:avx2          clang -march=native -O2 -fwrapv -Qunused-arguments -fPIC -fPIE -gdwarf-4 -Wall (Debian_Clang_11.0.1)
T:avx2          clang -march=native -O3 -fwrapv -Qunused-arguments -fPIC -fPIE -gdwarf-4 -Wall (Debian_Clang_11.0.1)
T:avx2          clang -march=native -O -fwrapv -Qunused-arguments -fPIC -fPIE -gdwarf-4 -Wall (Debian_Clang_11.0.1)
T:avx2          clang -march=native -Os -fwrapv -Qunused-arguments -fPIC -fPIE -gdwarf-4 -Wall (Debian_Clang_11.0.1)
T:avx2          clang -mcpu=native -O3 -fwrapv -Qunused-arguments -fPIC -fPIE -gdwarf-4 -Wall (Debian_Clang_11.0.1)

Compiler output


LUOV.c: LUOV.c: In function 'calculateQ2':
LUOV.c: LUOV.c:110:12: warning: AVX vector return without AVX enabled changes the ABI [-Wpsabi]
LUOV.c:   110 |    __m256i rr = _mm256_permute2x128_si256(_mm256_loadu_si256((__m256i *)&r),_mm256_setzero_si256(),0);
LUOV.c:       |            ^~
LUOV.c: In file included from LinearAlgebra.h:9,
LUOV.c:                  from LUOV.h:13,
LUOV.c:                  from LUOV.c:1:
LUOV.c: AVX_Operations.h: In function 'scalarMul_ct':
LUOV.c: AVX_Operations.h:529:6: note: the ABI for passing parameters with 32-byte alignment has changed in GCC 4.6
LUOV.c:   529 | void scalarMul_ct(__m256i *Out, __m256i A, FELT b){
LUOV.c:       |      ^~~~~~~~~~~~
LUOV.c: In file included from /usr/lib/gcc/x86_64-linux-gnu/10/include/immintrin.h:53,
LUOV.c:                  from LUOV.h:7,
LUOV.c:                  from LUOV.c:1:
LUOV.c: AVX_Operations.h: In function 'addScalarProductAVX':
LUOV.c: /usr/lib/gcc/x86_64-linux-gnu/10/include/avx2intrin.h:186:1: error: inlining failed in call to 'always_inline' '_mm256_andnot_si256': target specific option mismatch
LUOV.c:   186 | _mm256_andnot_si256 (__m256i __A, __m256i __B)
LUOV.c:       | ^~~~~~~~~~~~~~~~~~~
LUOV.c: In file included from LinearAlgebra.h:9,
LUOV.c:                  from LUOV.h:13,
LUOV.c:                  from LUOV.c:1:
LUOV.c: AVX_Operations.h:80:9: note: called from here
LUOV.c:    80 |  avx3 = _mm256_andnot_si256(avx3,aa);
LUOV.c:       |         ^~~~~~~~~~~~~~~~~~~~~~~~~~~~
LUOV.c: In file included from /usr/lib/gcc/x86_64-linux-gnu/10/include/immintrin.h:53,
LUOV.c: ...

Number of similar (implementation,compiler) pairs: 4, namely:
Implementation  Compiler
T:avx2          gcc -march=native -mtune=native -O2 -fwrapv -fPIC -fPIE -gdwarf-4 -Wall (10.2.1_20210110)
T:avx2          gcc -march=native -mtune=native -O3 -fwrapv -fPIC -fPIE -gdwarf-4 -Wall (10.2.1_20210110)
T:avx2          gcc -march=native -mtune=native -O -fwrapv -fPIC -fPIE -gdwarf-4 -Wall (10.2.1_20210110)
T:avx2          gcc -march=native -mtune=native -Os -fwrapv -fPIC -fPIE -gdwarf-4 -Wall (10.2.1_20210110)

Compiler output


F64Field.c: F64Field.c: In function 'f64addInPlace':
F64Field.c: F64Field.c:43:4: warning: dereferencing type-punned pointer will break strict-aliasing rules [-Wstrict-aliasing]
F64Field.c:    43 |  *((uint64_t *) a->coef) ^= *((uint64_t *) b->coef);
F64Field.c:       |   ~^~~~~~~~~~~~~~~~~~~~~
F64Field.c: F64Field.c:43:31: warning: dereferencing type-punned pointer will break strict-aliasing rules [-Wstrict-aliasing]
F64Field.c:    43 |  *((uint64_t *) a->coef) ^= *((uint64_t *) b->coef);
F64Field.c:       |                              ~^~~~~~~~~~~~~~~~~~~~~
F80Field.c: F80Field.c: In function 'f80addInPlace':
F80Field.c: F80Field.c:55:4: warning: dereferencing type-punned pointer will break strict-aliasing rules [-Wstrict-aliasing]
F80Field.c:    55 |  *((uint64_t *) a->coef) ^= *((uint64_t *) b->coef);
F80Field.c:       |   ~^~~~~~~~~~~~~~~~~~~~~
F80Field.c: F80Field.c:55:31: warning: dereferencing type-punned pointer will break strict-aliasing rules [-Wstrict-aliasing]
F80Field.c:    55 |  *((uint64_t *) a->coef) ^= *((uint64_t *) b->coef);
F80Field.c:       |                              ~^~~~~~~~~~~~~~~~~~~~~

Number of similar (implementation,compiler) pairs: 3, namely:
Implementation  Compiler
T:portable      gcc -march=native -mtune=native -O2 -fwrapv -fPIC -fPIE -gdwarf-4 -Wall (10.2.1_20210110)
T:portable      gcc -march=native -mtune=native -O3 -fwrapv -fPIC -fPIE -gdwarf-4 -Wall (10.2.1_20210110)
T:portable      gcc -march=native -mtune=native -Os -fwrapv -fPIC -fPIE -gdwarf-4 -Wall (10.2.1_20210110)

Namespace violations


Bitcontainer.o deserialize_bitcontainer T
Bitcontainer.o flipBit T
Bitcontainer.o getBit T
Bitcontainer.o randomBitcontainer T
Bitcontainer.o serialize_bitcontainer T
Bitcontainer.o squeezeBitcontainerArray T
Bitcontainer.o xor T
F16Field.o f16EucildeanDivision T
F16Field.o f16ExtendedEuclideanAlgorithm T
F16Field.o f16antilog T
F16Field.o f16deserialize_FELT T
F16Field.o f16inverse T
F16Field.o f16log T
F16Field.o f16multiply T
F16Field.o f16polyAdd T
F16Field.o f16polyCopy T
F16Field.o f16polyMult T
F16Field.o f16polyOne T
F16Field.o f16polyZero T
F16Field.o f16printFELT T
F16Field.o f16scalarMultiply T
F16Field.o f16serialize_FELT T
F16Field.o isArrayOfZeros T
F32Field.o f32add T
F32Field.o f32addInPlace T
F32Field.o f32deserialize_FELT T
F32Field.o f32inverse T
F32Field.o f32isEqual T
F32Field.o f32multiply T
F32Field.o f32multiplyOld T
F32Field.o f32printFELT T
F32Field.o f32scalarMultiply T
F32Field.o f32serialize_FELT T
F32Field.o newF32FELT T
F48Field.o f48add T
F48Field.o f48addInPlace T
F48Field.o f48deserialize_FELT T
F48Field.o f48inverse T
F48Field.o f48isEqual T
F48Field.o f48multiply T
F48Field.o f48printFELT T
F48Field.o f48scalarMultiply T
F48Field.o f48serialize_FELT T
F48Field.o newF48FELT T
F64Field.o f64add T
F64Field.o f64addInPlace T
F64Field.o f64deserialize_FELT T
F64Field.o f64inverse T
F64Field.o f64isEqual T
F64Field.o f64multiply T
F64Field.o f64printFELT T
F64Field.o f64scalarMultiply T
F64Field.o f64serialize_FELT T
F64Field.o newF64FELT T
F80Field.o f80Scalarmultiply T
F80Field.o f80add T
F80Field.o f80addInPlace T
F80Field.o f80deserialize_FELT T
F80Field.o f80inverse T
F80Field.o f80isEqual T
F80Field.o f80multiply T
F80Field.o f80printFELT T
F80Field.o f80serialize_FELT T
F80Field.o newF80FELT T
F8Field.o f8antilog T
F8Field.o f8deserialize_FELT T
F8Field.o f8inverse T
F8Field.o f8log T
F8Field.o f8multiply T
F8Field.o f8printFELT T
F8Field.o f8serialize_FELT T
LUOV.o BuildAugmentedMatrix T
LUOV.o _addScalarProduct1 T
LUOV.o _addScalarProduct3 T
LUOV.o addScalarProduct T
LUOV.o addScalarProduct3 T
LUOV.o calculateQ2 T
LUOV.o computeTarget T
LUOV.o deserialize_PublicKey T
LUOV.o deserialize_SecretKey T
LUOV.o deserialize_signature T
LUOV.o destroy_PublicKey T
LUOV.o destroy_SecretKey T
LUOV.o destroy_signature T
LUOV.o evaluatePublicMap T
LUOV.o expandTable T
LUOV.o extractMessage T
LUOV.o generateKeyPair T
LUOV.o repeatTable T
LUOV.o serialize_PublicKey T
LUOV.o serialize_SecretKey T
LUOV.o serialize_signature T
LUOV.o signDocument T
LUOV.o solvePrivateUOVSystem T
LUOV.o verify T
LinearAlgebra.o destroy_matrix T
LinearAlgebra.o getUniqueSolution T
LinearAlgebra.o newMatrix T
LinearAlgebra.o printMatrix T
LinearAlgebra.o rowEchelonAugmented T
LinearAlgebra.o rowOp T
LinearAlgebra.o scaleRow T
LinearAlgebra.o swapRows T
LinearAlgebra.o zeroMatrix T
buffer.o deserialize_uint64_t T
buffer.o newReader T
buffer.o newWriter T
buffer.o readBit T
buffer.o serialize_uint64_t T
buffer.o transcribe T
buffer.o writeBit T
intermediateValues.o printAugmentedMatrix T
intermediateValues.o printEvaluation T
intermediateValues.o printPrivateSolution T
intermediateValues.o printVinegarValues T
intermediateValues.o reportSolutionFound T
keccakrng.o initializeAndAbsorb T
keccakrng.o squeezeVector T
keccakrng.o squeezeuint64_t T

Number of similar (implementation,compiler) pairs: 9, namely:
Implementation  Compiler
T:portable      clang -march=native -O2 -fwrapv -Qunused-arguments -fPIC -fPIE -gdwarf-4 -Wall (Debian_Clang_11.0.1)
T:portable      clang -march=native -O3 -fwrapv -Qunused-arguments -fPIC -fPIE -gdwarf-4 -Wall (Debian_Clang_11.0.1)
T:portable      clang -march=native -O -fwrapv -Qunused-arguments -fPIC -fPIE -gdwarf-4 -Wall (Debian_Clang_11.0.1)
T:portable      clang -march=native -Os -fwrapv -Qunused-arguments -fPIC -fPIE -gdwarf-4 -Wall (Debian_Clang_11.0.1)
T:portable      clang -mcpu=native -O3 -fwrapv -Qunused-arguments -fPIC -fPIE -gdwarf-4 -Wall (Debian_Clang_11.0.1)
T:portable      gcc -march=native -mtune=native -O2 -fwrapv -fPIC -fPIE -gdwarf-4 -Wall (10.2.1_20210110)
T:portable      gcc -march=native -mtune=native -O3 -fwrapv -fPIC -fPIE -gdwarf-4 -Wall (10.2.1_20210110)
T:portable      gcc -march=native -mtune=native -O -fwrapv -fPIC -fPIE -gdwarf-4 -Wall (10.2.1_20210110)
T:portable      gcc -march=native -mtune=native -Os -fwrapv -fPIC -fPIE -gdwarf-4 -Wall (10.2.1_20210110)

Namespace violations


Bitcontainer.o deserialize_bitcontainer T
Bitcontainer.o flipBit T
Bitcontainer.o getBit T
Bitcontainer.o randomBitcontainer T
Bitcontainer.o serialize_bitcontainer T
Bitcontainer.o squeezeBitcontainerArray T
Bitcontainer.o xor T
F16Field.o f16EucildeanDivision T
F16Field.o f16ExtendedEuclideanAlgorithm T
F16Field.o f16antilog T
F16Field.o f16deserialize_FELT T
F16Field.o f16inverse T
F16Field.o f16log T
F16Field.o f16multiply T
F16Field.o f16polyAdd T
F16Field.o f16polyCopy T
F16Field.o f16polyMult T
F16Field.o f16polyOne T
F16Field.o f16polyZero T
F16Field.o f16printFELT T
F16Field.o f16scalarMultiply T
F16Field.o f16serialize_FELT T
F16Field.o isArrayOfZeros T
F32Field.o f32add T
F32Field.o f32deserialize_FELT T
F32Field.o f32inverse T
F32Field.o f32isEqual T
F32Field.o f32multiply T
F32Field.o f32multiplyOld T
F32Field.o f32printFELT T
F32Field.o f32scalarMultiply T
F32Field.o f32serialize_FELT T
F32Field.o newF32FELT T
F48Field.o f48add T
F48Field.o f48deserialize_FELT T
F48Field.o f48inverse T
F48Field.o f48isEqual T
F48Field.o f48multiply T
F48Field.o f48printFELT T
F48Field.o f48scalarMultiply T
F48Field.o f48serialize_FELT T
F48Field.o newF48FELT T
F64Field.o f64add T
F64Field.o f64deserialize_FELT T
F64Field.o f64inverse T
F64Field.o f64isEqual T
F64Field.o f64multiply T
F64Field.o f64printFELT T
F64Field.o f64scalarMultiply T
F64Field.o f64serialize_FELT T
F64Field.o newF64FELT T
F80Field.o f80Scalarmultiply T
F80Field.o f80add T
F80Field.o f80deserialize_FELT T
F80Field.o f80inverse T
F80Field.o f80isEqual T
F80Field.o f80multiply T
F80Field.o f80printFELT T
F80Field.o f80serialize_FELT T
F80Field.o newF80FELT T
F8Field.o f8antilog T
F8Field.o f8deserialize_FELT T
F8Field.o f8inverse T
F8Field.o f8log T
F8Field.o f8multiply T
F8Field.o f8printFELT T
F8Field.o f8serialize_FELT T
LUOV.o BuildAugmentedMatrix T
LUOV.o calculateQ2 T
LUOV.o computeTarget T
LUOV.o deserialize_PublicKey T
LUOV.o deserialize_SecretKey T
LUOV.o deserialize_signature T
LUOV.o destroy_PublicKey T
LUOV.o destroy_SecretKey T
LUOV.o destroy_signature T
LUOV.o evaluatePublicMap T
LUOV.o extractMessage T
LUOV.o generateKeyPair T
LUOV.o serialize_PublicKey T
LUOV.o serialize_SecretKey T
LUOV.o serialize_signature T
LUOV.o signDocument T
LUOV.o solvePrivateUOVSystem T
LUOV.o verify T
LinearAlgebra.o destroy_matrix T
LinearAlgebra.o getUniqueSolution T
LinearAlgebra.o newMatrix T
LinearAlgebra.o printMatrix T
LinearAlgebra.o rowEchelonAugmented T
LinearAlgebra.o rowOp T
LinearAlgebra.o scaleRow T
LinearAlgebra.o swapRows T
LinearAlgebra.o zeroMatrix T
buffer.o deserialize_uint64_t T
buffer.o newReader T
buffer.o newWriter T
buffer.o readBit T
buffer.o serialize_uint64_t T
buffer.o transcribe T
buffer.o writeBit T
intermediateValues.o printAugmentedMatrix T
intermediateValues.o printEvaluation T
intermediateValues.o printPrivateSolution T
intermediateValues.o printVinegarValues T
intermediateValues.o reportSolutionFound T
keccakrng.o initializeAndAbsorb T
keccakrng.o squeezeVector T
keccakrng.o squeezeuint64_t T

Number of similar (implementation,compiler) pairs: 9, namely:
Implementation  Compiler
T:ref           clang -march=native -O2 -fwrapv -Qunused-arguments -fPIC -fPIE -gdwarf-4 -Wall (Debian_Clang_11.0.1)
T:ref           clang -march=native -O3 -fwrapv -Qunused-arguments -fPIC -fPIE -gdwarf-4 -Wall (Debian_Clang_11.0.1)
T:ref           clang -march=native -O -fwrapv -Qunused-arguments -fPIC -fPIE -gdwarf-4 -Wall (Debian_Clang_11.0.1)
T:ref           clang -march=native -Os -fwrapv -Qunused-arguments -fPIC -fPIE -gdwarf-4 -Wall (Debian_Clang_11.0.1)
T:ref           clang -mcpu=native -O3 -fwrapv -Qunused-arguments -fPIC -fPIE -gdwarf-4 -Wall (Debian_Clang_11.0.1)
T:ref           gcc -march=native -mtune=native -O2 -fwrapv -fPIC -fPIE -gdwarf-4 -Wall (10.2.1_20210110)
T:ref           gcc -march=native -mtune=native -O3 -fwrapv -fPIC -fPIE -gdwarf-4 -Wall (10.2.1_20210110)
T:ref           gcc -march=native -mtune=native -O -fwrapv -fPIC -fPIE -gdwarf-4 -Wall (10.2.1_20210110)
T:ref           gcc -march=native -mtune=native -Os -fwrapv -fPIC -fPIE -gdwarf-4 -Wall (10.2.1_20210110)