Implementation notes: amd64, jasper, crypto_sign/luov863256

Computer: jasper
Microarchitecture: amd64; Tremont (906c0)
Architecture: amd64
CPU ID: GenuineIntel-000906c0-20-bfebfbff
SUPERCOP version: 20231107
Operation: crypto_sign
Primitive: luov863256
TimeObject sizeTest sizeImplementationCompilerBenchmark dateSUPERCOP version
51228246554277 0 0113268 852 1720T:portableclang_-march=native_-O2_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2023060720230530
51236277554789 0 0116644 852 1720T:portableclang_-march=native_-O3_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2023060720230530
51463161554879 0 0115500 852 1720T:portableclang_-mcpu=native_-O3_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2023060720230530
51464897555463 0 0115898 812 1784T:portablegcc_-march=native_-mtune=native_-O3_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE2023060720230530
51981165547484 0 0108396 836 1720T:portableclang_-march=native_-Os_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2023060720230530
52443991550551 0 0111834 812 1784T:portablegcc_-march=native_-mtune=native_-O2_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE2023060720230530
53006245548554 0 0109818 812 1784T:portablegcc_-march=native_-mtune=native_-O_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE2023060720230530
55127311546023 0 0107314 804 1752T:portablegcc_-march=native_-mtune=native_-Os_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE2023060720230530
56146412548442 0 0109580 836 1720T:portableclang_-march=native_-O_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2023060720230530
135924244416511 36 0238452 852 1720T:refclang_-mcpu=native_-O3_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2023060720230530
140086319414771 36 0235300 852 1720T:refclang_-march=native_-O2_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2023060720230530
140158457416183 36 0239380 852 1720T:refclang_-march=native_-O3_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2023060720230530
140520919411309 36 0235162 812 1784T:refgcc_-march=native_-mtune=native_-O2_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE2023060720230530
142339489415549 36 0237554 812 1784T:refgcc_-march=native_-mtune=native_-O3_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE2023060720230530
142342417409306 36 0233084 836 1720T:refclang_-march=native_-O_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2023060720230530
142582405409872 36 0233722 812 1784T:refgcc_-march=native_-mtune=native_-O_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE2023060720230530
144818824408336 36 0231780 836 1720T:refclang_-march=native_-Os_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2023060720230530
155633385407446 36 0231386 804 1752T:refgcc_-march=native_-mtune=native_-Os_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE2023060720230530

Compiler output

Implementation: T:avx2
Security model: timingleaks
Compiler: clang -march=native -O2 -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE
LUOV.c: LUOV.c:38:19: error: '__builtin_ia32_permdi256' needs target feature avx2
LUOV.c: __m256i rrrr = _mm256_permute4x64_epi64(_mm256_loadu_si256((__m256i *)&Q1[col++]),0);
LUOV.c: ^
LUOV.c: /usr/lib/llvm-11/lib/clang/11.0.1/include/avx2intrin.h:818:12: note: expanded from macro '_mm256_permute4x64_epi64'
LUOV.c: (__m256i)__builtin_ia32_permdi256((__v4di)(__m256i)(V), (int)(M))
LUOV.c: ^
LUOV.c: LUOV.c:38:44: error: always_inline function '_mm256_loadu_si256' requires target feature 'avx', but would be inlined into function 'calculateQ2' that is compiled without support for 'avx'
LUOV.c: __m256i rrrr = _mm256_permute4x64_epi64(_mm256_loadu_si256((__m256i *)&Q1[col++]),0);
LUOV.c: ^
LUOV.c: LUOV.c:38:44: error: AVX vector return of type '__m256i' (vector of 4 'long long' values) without 'avx' enabled changes the ABI
LUOV.c: LUOV.c:43:69: error: always_inline function '_mm256_setzero_pd' requires target feature 'avx', but would be inlined into function 'calculateQ2' that is compiled without support for 'avx'
LUOV.c: *((__m256i *)&TempMat[i][k*8+4]) ^= (__m256i) _mm256_blendv_pd(_mm256_setzero_pd(),(__m256d) rrrr,(__m256d)TJ);
LUOV.c: ^
LUOV.c: LUOV.c:43:69: error: AVX vector return of type '__m256d' (vector of 4 'double' values) without 'avx' enabled changes the ABI
LUOV.c: LUOV.c:43:52: error: always_inline function '_mm256_blendv_pd' requires target feature 'avx', but would be inlined into function 'calculateQ2' that is compiled without support for 'avx'
LUOV.c: *((__m256i *)&TempMat[i][k*8+4]) ^= (__m256i) _mm256_blendv_pd(_mm256_setzero_pd(),(__m256d) rrrr,(__m256d)TJ);
LUOV.c: ^
LUOV.c: LUOV.c:43:52: error: AVX vector argument of type '__m256d' (vector of 4 'double' values) without 'avx' enabled changes the ABI
LUOV.c: LUOV.c:44:10: error: always_inline function '_mm256_slli_epi64' requires target feature 'avx2', but would be inlined into function 'calculateQ2' that is compiled without support for 'avx2'
LUOV.c: TJ = _mm256_slli_epi64(TJ,4);
LUOV.c: ^
LUOV.c: LUOV.c:44:10: error: AVX vector argument of type '__m256i' (vector of 4 'long long' values) without 'avx' enabled changes the ABI
LUOV.c: LUOV.c:46:66: error: always_inline function '_mm256_setzero_pd' requires target feature 'avx', but would be inlined into function 'calculateQ2' that is compiled without support for 'avx'
LUOV.c: *((__m256i *)&TempMat[i][k*8]) ^= (__m256i) _mm256_blendv_pd(_mm256_setzero_pd(),(__m256d) rrrr,(__m256d)TJ);
LUOV.c: ^
LUOV.c: ...

Number of similar (compiler,implementation) pairs: 5, namely:
CompilerImplementations
clang -march=native -O2 -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE T:avx2
clang -march=native -O3 -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE T:avx2
clang -march=native -O -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE T:avx2
clang -march=native -Os -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE T:avx2
clang -mcpu=native -O3 -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE T:avx2

Compiler output

Implementation: T:avx2
Security model: timingleaks
Compiler: gcc -march=native -mtune=native -O2 -fomit-frame-pointer -fwrapv -fPIC -fPIE
LUOV.c: LUOV.c: In function 'calculateQ2':
LUOV.c: LUOV.c:38:12: warning: AVX vector return without AVX enabled changes the ABI [-Wpsabi]
LUOV.c: 38 | __m256i rrrr = _mm256_permute4x64_epi64(_mm256_loadu_si256((__m256i *)&Q1[col++]),0);
LUOV.c: | ^~~~
LUOV.c: In file included from LinearAlgebra.h:9,
LUOV.c: from LUOV.h:13,
LUOV.c: from LUOV.c:1:
LUOV.c: AVX_Operations.h: In function 'scalarMul_ct':
LUOV.c: AVX_Operations.h:529:6: note: the ABI for passing parameters with 32-byte alignment has changed in GCC 4.6
LUOV.c: 529 | void scalarMul_ct(__m256i *Out, __m256i A, FELT b){
LUOV.c: | ^~~~~~~~~~~~
LUOV.c: In file included from /usr/lib/gcc/x86_64-linux-gnu/10/include/immintrin.h:53,
LUOV.c: from LUOV.h:7,
LUOV.c: from LUOV.c:1:
LUOV.c: AVX_Operations.h: In function 'addScalarProductAVX':
LUOV.c: /usr/lib/gcc/x86_64-linux-gnu/10/include/avx2intrin.h:186:1: error: inlining failed in call to 'always_inline' '_mm256_andnot_si256': target specific option mismatch
LUOV.c: 186 | _mm256_andnot_si256 (__m256i __A, __m256i __B)
LUOV.c: | ^~~~~~~~~~~~~~~~~~~
LUOV.c: In file included from LinearAlgebra.h:9,
LUOV.c: from LUOV.h:13,
LUOV.c: from LUOV.c:1:
LUOV.c: AVX_Operations.h:73:9: note: called from here
LUOV.c: 73 | avx2 = _mm256_andnot_si256(avx2,aa);
LUOV.c: | ^~~~~~~~~~~~~~~~~~~~~~~~~~~~
LUOV.c: In file included from /usr/lib/gcc/x86_64-linux-gnu/10/include/immintrin.h:53,
LUOV.c: ...

Number of similar (compiler,implementation) pairs: 4, namely:
CompilerImplementations
gcc -march=native -mtune=native -O2 -fomit-frame-pointer -fwrapv -fPIC -fPIE T:avx2
gcc -march=native -mtune=native -O3 -fomit-frame-pointer -fwrapv -fPIC -fPIE T:avx2
gcc -march=native -mtune=native -O -fomit-frame-pointer -fwrapv -fPIC -fPIE T:avx2
gcc -march=native -mtune=native -Os -fomit-frame-pointer -fwrapv -fPIC -fPIE T:avx2