Implementation notes: amd64, hunsnivy, crypto_sign/luov8117404
Computer: hunsnivy
Microarchitecture: amd64; Ivy Bridge+AES (306a9)
Architecture: amd64
CPU ID: GenuineIntel-000306a9-bfebfbff
SUPERCOP version: 20240625
Operation: crypto_sign
Primitive: luov8117404
Time | Object size | Test size | Implementation | Compiler | Benchmark date | SUPERCOP version |
90389243 | 556201 0 0 | 118204 852 1720 | T:portable | clang_-march=native_-O3_-fwrapv_-Qunused-arguments_-fPIC_-fPIE_-gdwarf-4_-Wall | 20240703 | 20240625 |
90487444 | 556314 0 0 | 116124 852 1720 | T:portable | clang_-march=native_-O2_-fwrapv_-Qunused-arguments_-fPIC_-fPIE_-gdwarf-4_-Wall | 20240703 | 20240625 |
91100010 | 558675 0 0 | 118730 812 1784 | T:portable | gcc_-march=native_-mtune=native_-O3_-fwrapv_-fPIC_-fPIE_-gdwarf-4_-Wall | 20240703 | 20240625 |
91241177 | 556143 0 0 | 116588 852 1720 | T:portable | clang_-mcpu=native_-O3_-fwrapv_-Qunused-arguments_-fPIC_-fPIE_-gdwarf-4_-Wall | 20240703 | 20240625 |
95456779 | 553067 0 0 | 114330 812 1784 | T:portable | gcc_-march=native_-mtune=native_-O2_-fwrapv_-fPIC_-fPIE_-gdwarf-4_-Wall | 20240703 | 20240625 |
98022975 | 549676 0 0 | 110452 836 1720 | T:portable | clang_-march=native_-Os_-fwrapv_-Qunused-arguments_-fPIC_-fPIE_-gdwarf-4_-Wall | 20240703 | 20240625 |
107246498 | 551672 0 0 | 112794 812 1784 | T:portable | gcc_-march=native_-mtune=native_-O_-fwrapv_-fPIC_-fPIE_-gdwarf-4_-Wall | 20240703 | 20240625 |
107939329 | 547818 0 0 | 108858 804 1752 | T:portable | gcc_-march=native_-mtune=native_-Os_-fwrapv_-fPIC_-fPIE_-gdwarf-4_-Wall | 20240703 | 20240625 |
133564955 | 550703 0 0 | 111596 836 1720 | T:portable | clang_-march=native_-O_-fwrapv_-Qunused-arguments_-fPIC_-fPIE_-gdwarf-4_-Wall | 20240703 | 20240625 |
866446208 | 416121 36 0 | 237852 852 1720 | T:ref | clang_-mcpu=native_-O3_-fwrapv_-Qunused-arguments_-fPIC_-fPIE_-gdwarf-4_-Wall | 20240703 | 20240625 |
871213574 | 416052 36 0 | 239388 852 1720 | T:ref | clang_-march=native_-O3_-fwrapv_-Qunused-arguments_-fPIC_-fPIE_-gdwarf-4_-Wall | 20240703 | 20240625 |
871425142 | 415980 36 0 | 237244 852 1720 | T:ref | clang_-march=native_-O2_-fwrapv_-Qunused-arguments_-fPIC_-fPIE_-gdwarf-4_-Wall | 20240703 | 20240625 |
917823855 | 411819 36 0 | 235554 812 1784 | T:ref | gcc_-march=native_-mtune=native_-O2_-fwrapv_-fPIC_-fPIE_-gdwarf-4_-Wall | 20240703 | 20240625 |
920359453 | 415883 36 0 | 237730 812 1784 | T:ref | gcc_-march=native_-mtune=native_-O3_-fwrapv_-fPIC_-fPIE_-gdwarf-4_-Wall | 20240703 | 20240625 |
936735571 | 410638 36 0 | 234290 812 1784 | T:ref | gcc_-march=native_-mtune=native_-O_-fwrapv_-fPIC_-fPIE_-gdwarf-4_-Wall | 20240703 | 20240625 |
978853733 | 410379 36 0 | 233916 836 1720 | T:ref | clang_-march=native_-O_-fwrapv_-Qunused-arguments_-fPIC_-fPIE_-gdwarf-4_-Wall | 20240703 | 20240625 |
995615880 | 409432 36 0 | 232732 836 1720 | T:ref | clang_-march=native_-Os_-fwrapv_-Qunused-arguments_-fPIC_-fPIE_-gdwarf-4_-Wall | 20240703 | 20240625 |
1021042703 | 408414 36 0 | 232178 804 1752 | T:ref | gcc_-march=native_-mtune=native_-Os_-fwrapv_-fPIC_-fPIE_-gdwarf-4_-Wall | 20240703 | 20240625 |
Compiler output
LUOV.c: LUOV.c:110:17: error: '__builtin_ia32_permti256' needs target feature avx2
LUOV.c: __m256i rr = _mm256_permute2x128_si256(_mm256_loadu_si256((__m256i *)&r),_mm256_setzero_si256(),0);
LUOV.c: ^
LUOV.c: /usr/lib/llvm-11/lib/clang/11.0.1/include/avx2intrin.h:821:12: note: expanded from macro '_mm256_permute2x128_si256'
LUOV.c: (__m256i)__builtin_ia32_permti256((__m256i)(V1), (__m256i)(V2), (int)(M))
LUOV.c: ^
LUOV.c: LUOV.c:117:20: error: always_inline function '_mm256_cmpeq_epi8' requires target feature 'avx2', but would be inlined into function 'calculateQ2' that is compiled without support for 'avx2'
LUOV.c: __m256i t1t2 = _mm256_cmpeq_epi8(tttt & masks[0],_mm256_setzero_si256());
LUOV.c: ^
LUOV.c: LUOV.c:118:39: error: always_inline function '_mm256_andnot_si256' requires target feature 'avx2', but would be inlined into function 'calculateQ2' that is compiled without support for 'avx2'
LUOV.c: *((__m256i *)&TempMat[i][k+0]) ^= _mm256_andnot_si256(t1t2,rr);
LUOV.c: ^
LUOV.c: LUOV.c:120:20: error: always_inline function '_mm256_cmpeq_epi8' requires target feature 'avx2', but would be inlined into function 'calculateQ2' that is compiled without support for 'avx2'
LUOV.c: __m256i t3t4 = _mm256_cmpeq_epi8(tttt & masks[1],_mm256_setzero_si256());
LUOV.c: ^
LUOV.c: LUOV.c:121:39: error: always_inline function '_mm256_andnot_si256' requires target feature 'avx2', but would be inlined into function 'calculateQ2' that is compiled without support for 'avx2'
LUOV.c: *((__m256i *)&TempMat[i][k+2]) ^= _mm256_andnot_si256(t3t4,rr);
LUOV.c: ^
LUOV.c: LUOV.c:123:20: error: always_inline function '_mm256_cmpeq_epi8' requires target feature 'avx2', but would be inlined into function 'calculateQ2' that is compiled without support for 'avx2'
LUOV.c: __m256i t5t6 = _mm256_cmpeq_epi8(tttt & masks[2],_mm256_setzero_si256());
LUOV.c: ^
LUOV.c: LUOV.c:124:39: error: always_inline function '_mm256_andnot_si256' requires target feature 'avx2', but would be inlined into function 'calculateQ2' that is compiled without support for 'avx2'
LUOV.c: *((__m256i *)&TempMat[i][k+4]) ^= _mm256_andnot_si256(t5t6,rr);
LUOV.c: ^
LUOV.c: LUOV.c:126:20: error: always_inline function '_mm256_cmpeq_epi8' requires target feature 'avx2', but would be inlined into function 'calculateQ2' that is compiled without support for 'avx2'
LUOV.c: ...
Number of similar (implementation,compiler) pairs: 4, namely:
Implementation | Compiler |
T:avx2 | clang -march=native -O2 -fwrapv -Qunused-arguments -fPIC -fPIE -gdwarf-4 -Wall (Debian_Clang_11.0.1) |
T:avx2 | clang -march=native -O3 -fwrapv -Qunused-arguments -fPIC -fPIE -gdwarf-4 -Wall (Debian_Clang_11.0.1) |
T:avx2 | clang -march=native -O -fwrapv -Qunused-arguments -fPIC -fPIE -gdwarf-4 -Wall (Debian_Clang_11.0.1) |
T:avx2 | clang -march=native -Os -fwrapv -Qunused-arguments -fPIC -fPIE -gdwarf-4 -Wall (Debian_Clang_11.0.1) |
Compiler output
LUOV.c: LUOV.c:110:17: error: '__builtin_ia32_permti256' needs target feature avx2
LUOV.c: __m256i rr = _mm256_permute2x128_si256(_mm256_loadu_si256((__m256i *)&r),_mm256_setzero_si256(),0);
LUOV.c: ^
LUOV.c: /usr/lib/llvm-11/lib/clang/11.0.1/include/avx2intrin.h:821:12: note: expanded from macro '_mm256_permute2x128_si256'
LUOV.c: (__m256i)__builtin_ia32_permti256((__m256i)(V1), (__m256i)(V2), (int)(M))
LUOV.c: ^
LUOV.c: LUOV.c:110:43: error: always_inline function '_mm256_loadu_si256' requires target feature 'avx', but would be inlined into function 'calculateQ2' that is compiled without support for 'avx'
LUOV.c: __m256i rr = _mm256_permute2x128_si256(_mm256_loadu_si256((__m256i *)&r),_mm256_setzero_si256(),0);
LUOV.c: ^
LUOV.c: LUOV.c:110:43: error: AVX vector return of type '__m256i' (vector of 4 'long long' values) without 'avx' enabled changes the ABI
LUOV.c: LUOV.c:110:77: error: always_inline function '_mm256_setzero_si256' requires target feature 'avx', but would be inlined into function 'calculateQ2' that is compiled without support for 'avx'
LUOV.c: __m256i rr = _mm256_permute2x128_si256(_mm256_loadu_si256((__m256i *)&r),_mm256_setzero_si256(),0);
LUOV.c: ^
LUOV.c: LUOV.c:110:77: error: AVX vector return of type '__m256i' (vector of 4 'long long' values) without 'avx' enabled changes the ABI
LUOV.c: LUOV.c:115:20: error: always_inline function '_mm256_set1_epi8' requires target feature 'avx', but would be inlined into function 'calculateQ2' that is compiled without support for 'avx'
LUOV.c: __m256i tttt = _mm256_set1_epi8(t[k/8]);
LUOV.c: ^
LUOV.c: LUOV.c:115:20: error: AVX vector return of type '__m256i' (vector of 4 'long long' values) without 'avx' enabled changes the ABI
LUOV.c: LUOV.c:117:54: error: always_inline function '_mm256_setzero_si256' requires target feature 'avx', but would be inlined into function 'calculateQ2' that is compiled without support for 'avx'
LUOV.c: __m256i t1t2 = _mm256_cmpeq_epi8(tttt & masks[0],_mm256_setzero_si256());
LUOV.c: ^
LUOV.c: LUOV.c:117:54: error: AVX vector return of type '__m256i' (vector of 4 'long long' values) without 'avx' enabled changes the ABI
LUOV.c: LUOV.c:117:20: error: always_inline function '_mm256_cmpeq_epi8' requires target feature 'avx2', but would be inlined into function 'calculateQ2' that is compiled without support for 'avx2'
LUOV.c: __m256i t1t2 = _mm256_cmpeq_epi8(tttt & masks[0],_mm256_setzero_si256());
LUOV.c: ^
LUOV.c: ...
Number of similar (implementation,compiler) pairs: 1, namely:
Implementation | Compiler |
T:avx2 | clang -mcpu=native -O3 -fwrapv -Qunused-arguments -fPIC -fPIE -gdwarf-4 -Wall (Debian_Clang_11.0.1) |
Compiler output
LUOV.c: In file included from /usr/lib/gcc/x86_64-linux-gnu/10/include/immintrin.h:53,
LUOV.c: from LUOV.h:7,
LUOV.c: from LUOV.c:1:
LUOV.c: AVX_Operations.h: In function 'addScalarProductAVX':
LUOV.c: /usr/lib/gcc/x86_64-linux-gnu/10/include/avx2intrin.h:186:1: error: inlining failed in call to 'always_inline' '_mm256_andnot_si256': target specific option mismatch
LUOV.c: 186 | _mm256_andnot_si256 (__m256i __A, __m256i __B)
LUOV.c: | ^~~~~~~~~~~~~~~~~~~
LUOV.c: In file included from LinearAlgebra.h:9,
LUOV.c: from LUOV.h:13,
LUOV.c: from LUOV.c:1:
LUOV.c: AVX_Operations.h:87:9: note: called from here
LUOV.c: 87 | avx4 = _mm256_andnot_si256(avx4,aa);
LUOV.c: | ^~~~~~~~~~~~~~~~~~~~~~~~~~~~
LUOV.c: In file included from /usr/lib/gcc/x86_64-linux-gnu/10/include/immintrin.h:53,
LUOV.c: from LUOV.h:7,
LUOV.c: from LUOV.c:1:
LUOV.c: /usr/lib/gcc/x86_64-linux-gnu/10/include/avx2intrin.h:231:1: error: inlining failed in call to 'always_inline' '_mm256_cmpeq_epi8': target specific option mismatch
LUOV.c: 231 | _mm256_cmpeq_epi8 (__m256i __A, __m256i __B)
LUOV.c: | ^~~~~~~~~~~~~~~~~
LUOV.c: In file included from LinearAlgebra.h:9,
LUOV.c: from LUOV.h:13,
LUOV.c: from LUOV.c:1:
LUOV.c: AVX_Operations.h:86:9: note: called from here
LUOV.c: 86 | avx4 = _mm256_cmpeq_epi8(avx4,_mm256_setzero_si256());
LUOV.c: | ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
LUOV.c: ...
Number of similar (implementation,compiler) pairs: 4, namely:
Implementation | Compiler |
T:avx2 | gcc -march=native -mtune=native -O2 -fwrapv -fPIC -fPIE -gdwarf-4 -Wall (10.2.1_20210110) |
T:avx2 | gcc -march=native -mtune=native -O3 -fwrapv -fPIC -fPIE -gdwarf-4 -Wall (10.2.1_20210110) |
T:avx2 | gcc -march=native -mtune=native -O -fwrapv -fPIC -fPIE -gdwarf-4 -Wall (10.2.1_20210110) |
T:avx2 | gcc -march=native -mtune=native -Os -fwrapv -fPIC -fPIE -gdwarf-4 -Wall (10.2.1_20210110) |
Compiler output
F64Field.c: F64Field.c: In function 'f64addInPlace':
F64Field.c: F64Field.c:43:4: warning: dereferencing type-punned pointer will break strict-aliasing rules [-Wstrict-aliasing]
F64Field.c: 43 | *((uint64_t *) a->coef) ^= *((uint64_t *) b->coef);
F64Field.c: | ~^~~~~~~~~~~~~~~~~~~~~
F64Field.c: F64Field.c:43:31: warning: dereferencing type-punned pointer will break strict-aliasing rules [-Wstrict-aliasing]
F64Field.c: 43 | *((uint64_t *) a->coef) ^= *((uint64_t *) b->coef);
F64Field.c: | ~^~~~~~~~~~~~~~~~~~~~~
F80Field.c: F80Field.c: In function 'f80addInPlace':
F80Field.c: F80Field.c:55:4: warning: dereferencing type-punned pointer will break strict-aliasing rules [-Wstrict-aliasing]
F80Field.c: 55 | *((uint64_t *) a->coef) ^= *((uint64_t *) b->coef);
F80Field.c: | ~^~~~~~~~~~~~~~~~~~~~~
F80Field.c: F80Field.c:55:31: warning: dereferencing type-punned pointer will break strict-aliasing rules [-Wstrict-aliasing]
F80Field.c: 55 | *((uint64_t *) a->coef) ^= *((uint64_t *) b->coef);
F80Field.c: | ~^~~~~~~~~~~~~~~~~~~~~
Number of similar (implementation,compiler) pairs: 3, namely:
Implementation | Compiler |
T:portable | gcc -march=native -mtune=native -O2 -fwrapv -fPIC -fPIE -gdwarf-4 -Wall (10.2.1_20210110) |
T:portable | gcc -march=native -mtune=native -O3 -fwrapv -fPIC -fPIE -gdwarf-4 -Wall (10.2.1_20210110) |
T:portable | gcc -march=native -mtune=native -Os -fwrapv -fPIC -fPIE -gdwarf-4 -Wall (10.2.1_20210110) |