Implementation notes: amd64, margaux, crypto_sign/rainbow5ccompres963664

Computer: margaux
Microarchitecture: amd64; Core 2 65nm (6fb)
Architecture: amd64
CPU ID: GenuineIntel-000006fb-bfebfbff
SUPERCOP version: 20240425
Operation: crypto_sign
Primitive: rainbow5ccompres963664
TimeObject sizeTest sizeImplementationCompilerBenchmark dateSUPERCOP version
91518701288687 8 52143324 956 1792T:ssse3clang_-march=native_-O_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2024050920240425
916133441147318 8 52194484 956 1792T:ssse3clang_-march=native_-O3_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2024050820240425
917785169198247 8 52229612 940 1824T:ssse3gcc_-march=native_-mtune=native_-O3_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE2024050820240425
92430736875468 8 52129982 948 1792T:ssse3clang_-march=native_-Os_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2024050920240425
925508416112578 8 52165228 956 1792T:ssse3clang_-march=native_-O2_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2024050920240425
92571144255730 8 52114052 932 1824T:ssse3gcc_-march=native_-mtune=native_-Os_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE2024050820240425
925934508104027 8 52155884 940 1824T:ssse3gcc_-march=native_-mtune=native_-O2_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE2024050820240425
93257578585173 8 52137716 940 1824T:ssse3gcc_-march=native_-mtune=native_-O_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE2024050820240425
982360137202992 0 52268540 932 1824T:amd64gcc_-march=native_-mtune=native_-O3_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE2024050820240425
985771262164049 0 52229548 932 1824T:refgcc_-march=native_-mtune=native_-O3_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE2024050820240425
98606574878779 0 52143972 948 1792T:refclang_-march=native_-O3_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2024050820240425
99533997276060 0 52141156 948 1792T:refclang_-mcpu=native_-O3_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2024050820240425
99721400860164 0 52124772 948 1792T:refclang_-march=native_-O2_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2024050820240425
1003607105102134 0 52166708 948 1792T:amd64clang_-mcpu=native_-O3_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2024050820240425
1005612905104741 0 52169396 948 1792T:amd64clang_-march=native_-O3_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2024050820240425
102393258269482 0 52133452 948 1792T:amd64clang_-march=native_-O2_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2024050820240425
107519423048804 0 52112660 948 1792T:amd64clang_-march=native_-O_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2024050820240425
107659033171015 0 52135660 932 1824T:amd64gcc_-march=native_-mtune=native_-O2_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE2024050820240425
109276446341987 0 52105150 940 1792T:amd64clang_-march=native_-Os_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2024050820240425
109351989850425 0 52114564 932 1824T:refgcc_-march=native_-mtune=native_-O_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE2024050820240425
110375795249132 0 52113676 932 1824T:refgcc_-march=native_-mtune=native_-O2_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE2024050820240425
110514849934771 0 5297942 940 1792T:refclang_-march=native_-Os_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2024050820240425
110723590663047 0 52127236 932 1824T:amd64gcc_-march=native_-mtune=native_-O_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE2024050820240425
111075726742183 0 52106044 948 1792T:refclang_-march=native_-O_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2024050820240425
126550837733553 0 5296908 924 1824T:amd64gcc_-march=native_-mtune=native_-Os_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE2024050820240425
127459352730378 0 5293588 924 1824T:refgcc_-march=native_-mtune=native_-Os_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE2024050820240425

Compiler output

Implementation: T:avx2
Security model: timingleaks
Compiler: clang -march=native -O2 -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE
blas_comm.c: In file included from blas_comm.c:6:
blas_comm.c: In file included from ./blas.h:25:
blas_comm.c: ./blas_avx2.h:88:17: error: always_inline function '_mm256_loadu_si256' requires target feature 'avx', but would be inlined into function 'gf256v_add_avx2' that is compiled without support for 'avx'
blas_comm.c: __m256i inp = _mm256_loadu_si256( (__m256i*) (a+i*32) );
blas_comm.c: ^
blas_comm.c: ./blas_avx2.h:88:17: error: AVX vector return of type '__m256i' (vector of 4 'long long' values) without 'avx' enabled changes the ABI
blas_comm.c: ./blas_avx2.h:89:17: error: always_inline function '_mm256_loadu_si256' requires target feature 'avx', but would be inlined into function 'gf256v_add_avx2' that is compiled without support for 'avx'
blas_comm.c: __m256i out = _mm256_loadu_si256( (__m256i*) (accu_b+i*32) );
blas_comm.c: ^
blas_comm.c: ./blas_avx2.h:89:17: error: AVX vector return of type '__m256i' (vector of 4 'long long' values) without 'avx' enabled changes the ABI
blas_comm.c: ./blas_avx2.h:91:3: error: always_inline function '_mm256_storeu_si256' requires target feature 'avx', but would be inlined into function 'gf256v_add_avx2' that is compiled without support for 'avx'
blas_comm.c: _mm256_storeu_si256( (__m256i*) (accu_b+i*32) , out );
blas_comm.c: ^
blas_comm.c: ./blas_avx2.h:91:3: error: AVX vector argument of type '__m256i' (vector of 4 'long long' values) without 'avx' enabled changes the ABI
blas_comm.c: 6 errors generated.

Number of similar (compiler,implementation) pairs: 5, namely:
CompilerImplementations
clang -march=native -O2 -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE T:avx2
clang -march=native -O3 -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE T:avx2
clang -march=native -O -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE T:avx2
clang -march=native -Os -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE T:avx2
clang -mcpu=native -O3 -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE T:avx2

Compiler output

Implementation: T:avx2
Security model: timingleaks
Compiler: gcc -march=native -mtune=native -O2 -fomit-frame-pointer -fwrapv -fPIC -fPIE
blas_comm.c: In file included from blas_avx2.h:15,
blas_comm.c: from blas.h:25,
blas_comm.c: from blas_comm.c:6:
blas_comm.c: gf16_avx2.h: In function 'linear_transform_8x8_256b':
blas_comm.c: gf16_avx2.h:28:1: warning: AVX vector return without AVX enabled changes the ABI [-Wpsabi]
blas_comm.c: 28 | {
blas_comm.c: | ^
blas_comm.c: In file included from /usr/lib/gcc/x86_64-linux-gnu/11/include/immintrin.h:43,
blas_comm.c: from blas_avx2.h:10,
blas_comm.c: from blas.h:25,
blas_comm.c: from blas_comm.c:6:
blas_comm.c: blas_avx2.h: In function 'gf256v_add_avx2':
blas_comm.c: /usr/lib/gcc/x86_64-linux-gnu/11/include/avxintrin.h:933:1: error: inlining failed in call to 'always_inline' '_mm256_storeu_si256': target specific option mismatch
blas_comm.c: 933 | _mm256_storeu_si256 (__m256i_u *__P, __m256i __A)
blas_comm.c: | ^~~~~~~~~~~~~~~~~~~
blas_comm.c: In file included from blas.h:25,
blas_comm.c: from blas_comm.c:6:
blas_comm.c: blas_avx2.h:91:17: note: called from here
blas_comm.c: 91 | _mm256_storeu_si256( (__m256i*) (accu_b+i*32) , out );
blas_comm.c: | ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
blas_comm.c: In file included from /usr/lib/gcc/x86_64-linux-gnu/11/include/immintrin.h:43,
blas_comm.c: from blas_avx2.h:10,
blas_comm.c: from blas.h:25,
blas_comm.c: from blas_comm.c:6:
blas_comm.c: /usr/lib/gcc/x86_64-linux-gnu/11/include/avxintrin.h:927:1: error: inlining failed in call to 'always_inline' '_mm256_loadu_si256': target specific option mismatch
blas_comm.c: ...

Number of similar (compiler,implementation) pairs: 4, namely:
CompilerImplementations
gcc -march=native -mtune=native -O2 -fomit-frame-pointer -fwrapv -fPIC -fPIE T:avx2
gcc -march=native -mtune=native -O3 -fomit-frame-pointer -fwrapv -fPIC -fPIE T:avx2
gcc -march=native -mtune=native -O -fomit-frame-pointer -fwrapv -fPIC -fPIE T:avx2
gcc -march=native -mtune=native -Os -fomit-frame-pointer -fwrapv -fPIC -fPIE T:avx2

Compiler output

Implementation: T:ssse3
Security model: timingleaks
Compiler: clang -mcpu=native -O3 -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE
blas_matrix_ref.c: In file included from blas_matrix_ref.c:6:
blas_matrix_ref.c: In file included from ./blas.h:25:
blas_matrix_ref.c: In file included from ./blas_sse.h:16:
blas_matrix_ref.c: ./gf16_sse.h:34:9: error: always_inline function '_mm_shuffle_epi8' requires target feature 'ssse3', but would be inlined into function 'linear_transform_8x8_128b' that is compiled without support for 'ssse3'
blas_matrix_ref.c: return _mm_shuffle_epi8(tab_l,v&mask_f)^_mm_shuffle_epi8(tab_h,_mm_srli_epi16(v,4)&mask_f);
blas_matrix_ref.c: ^
blas_matrix_ref.c: ./gf16_sse.h:34:42: error: always_inline function '_mm_shuffle_epi8' requires target feature 'ssse3', but would be inlined into function 'linear_transform_8x8_128b' that is compiled without support for 'ssse3'
blas_matrix_ref.c: return _mm_shuffle_epi8(tab_l,v&mask_f)^_mm_shuffle_epi8(tab_h,_mm_srli_epi16(v,4)&mask_f);
blas_matrix_ref.c: ^
blas_matrix_ref.c: 2 errors generated.

Number of similar (compiler,implementation) pairs: 1, namely:
CompilerImplementations
clang -mcpu=native -O3 -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE T:ssse3