Implementation notes: amd64, trident, crypto_sign/rainbow5cclassic963664

Computer: trident
Microarchitecture: amd64; Core 2 65nm (6fb)
Architecture: amd64
CPU ID: GenuineIntel-000006fb-bfebfbff
SUPERCOP version: 20240107
Operation: crypto_sign
Primitive: rainbow5cclassic963664
TimeObject sizeTest sizeImplementationCompilerBenchmark dateSUPERCOP version
3354306190557 8 52227578 908 1816T:ssse3gcc_-march=native_-mtune=native_-O3_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE2023122820231212
3372828141615 8 52193269 924 1784T:ssse3clang_-march=native_-O3_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2023122820231212
3382958115092 8 52169741 924 1784T:ssse3clang_-march=native_-O2_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2023122820231212
364915898375 8 52155554 908 1816T:ssse3gcc_-march=native_-mtune=native_-O2_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE2023122820231212
375956877693 8 52134483 916 1784T:ssse3clang_-march=native_-Os_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2023122820231212
390379380151 8 52137858 908 1816T:ssse3gcc_-march=native_-mtune=native_-O_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE2023122820231212
459100061634 8 52124075 916 1784T:ssse3clang_-march=native_-O_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2023122820231212
528554749361 8 52113290 900 1816T:ssse3gcc_-march=native_-mtune=native_-Os_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE2023122820231212
7076278164539 0 52234402 900 1816T:refgcc_-march=native_-mtune=native_-O3_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE2023122820231212
7083934106253 0 52174261 916 1784T:refclang_-march=native_-O2_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2023122820231212
7088523205347 0 52275018 900 1816T:amd64gcc_-march=native_-mtune=native_-O3_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE2023122820231212
7560441114931 0 52184061 916 1784T:refclang_-march=native_-O3_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2023122820231212
7574981112856 0 52181885 916 1784T:refclang_-mcpu=native_-O3_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2023122820231212
7599852107896 0 52177813 916 1784T:amd64clang_-march=native_-O3_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2023122820231212
7625637105742 0 52175509 916 1784T:amd64clang_-mcpu=native_-O3_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2023122820231212
7658748101424 0 52170333 916 1784T:amd64clang_-march=native_-O2_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2023122820231212
1065220976653 0 52145922 900 1816T:amd64gcc_-march=native_-mtune=native_-O2_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE2023122820231212
1069433660764 0 52128779 908 1784T:amd64clang_-march=native_-Os_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2023122820231212
1117914563557 0 52132410 900 1816T:amd64gcc_-march=native_-mtune=native_-O_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE2023122820231212
1800156147013 0 52115826 900 1816T:refgcc_-march=native_-mtune=native_-O_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE2023122820231212
1832060848377 0 52117522 900 1816T:refgcc_-march=native_-mtune=native_-O2_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE2023122820231212
1916149452398 0 52120403 908 1784T:refclang_-march=native_-Os_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2023122820231212
2070983230409 0 5298546 892 1816T:amd64gcc_-march=native_-mtune=native_-Os_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE2023122820231212
2394635039095 0 52108259 908 1784T:amd64clang_-march=native_-O_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2023122820231212
3458151827289 0 5295298 892 1816T:refgcc_-march=native_-mtune=native_-Os_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE2023122820231212
4392933934610 0 52103523 908 1784T:refclang_-march=native_-O_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2023122820231212

Compiler output

Implementation: T:avx2
Security model: timingleaks
Compiler: clang -march=native -O2 -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE
blas_comm.c: In file included from blas_comm.c:6:
blas_comm.c: In file included from ./blas.h:25:
blas_comm.c: ./blas_avx2.h:88:17: error: always_inline function '_mm256_loadu_si256' requires target feature 'avx', but would be inlined into function 'gf256v_add_avx2' that is compiled without support for 'avx'
blas_comm.c: __m256i inp = _mm256_loadu_si256( (__m256i*) (a+i*32) );
blas_comm.c: ^
blas_comm.c: ./blas_avx2.h:88:17: error: AVX vector return of type '__m256i' (vector of 4 'long long' values) without 'avx' enabled changes the ABI
blas_comm.c: ./blas_avx2.h:89:17: error: always_inline function '_mm256_loadu_si256' requires target feature 'avx', but would be inlined into function 'gf256v_add_avx2' that is compiled without support for 'avx'
blas_comm.c: __m256i out = _mm256_loadu_si256( (__m256i*) (accu_b+i*32) );
blas_comm.c: ^
blas_comm.c: ./blas_avx2.h:89:17: error: AVX vector return of type '__m256i' (vector of 4 'long long' values) without 'avx' enabled changes the ABI
blas_comm.c: ./blas_avx2.h:91:3: error: always_inline function '_mm256_storeu_si256' requires target feature 'avx', but would be inlined into function 'gf256v_add_avx2' that is compiled without support for 'avx'
blas_comm.c: _mm256_storeu_si256( (__m256i*) (accu_b+i*32) , out );
blas_comm.c: ^
blas_comm.c: ./blas_avx2.h:91:3: error: AVX vector argument of type '__m256i' (vector of 4 'long long' values) without 'avx' enabled changes the ABI
blas_comm.c: 6 errors generated.

Number of similar (compiler,implementation) pairs: 5, namely:
CompilerImplementations
clang -march=native -O2 -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE T:avx2
clang -march=native -O3 -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE T:avx2
clang -march=native -O -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE T:avx2
clang -march=native -Os -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE T:avx2
clang -mcpu=native -O3 -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE T:avx2

Compiler output

Implementation: T:avx2
Security model: timingleaks
Compiler: gcc -march=native -mtune=native -O2 -fomit-frame-pointer -fwrapv -fPIC -fPIE
blas_comm.c: In file included from blas_avx2.h:15,
blas_comm.c: from blas.h:25,
blas_comm.c: from blas_comm.c:6:
blas_comm.c: gf16_avx2.h: In function 'linear_transform_8x8_256b':
blas_comm.c: gf16_avx2.h:28:1: warning: AVX vector return without AVX enabled changes the ABI [-Wpsabi]
blas_comm.c: 28 | {
blas_comm.c: | ^
blas_comm.c: In file included from /usr/lib/gcc/x86_64-linux-gnu/10/include/immintrin.h:51,
blas_comm.c: from blas_avx2.h:10,
blas_comm.c: from blas.h:25,
blas_comm.c: from blas_comm.c:6:
blas_comm.c: blas_avx2.h: In function 'gf256v_add_avx2':
blas_comm.c: /usr/lib/gcc/x86_64-linux-gnu/10/include/avxintrin.h:926:1: error: inlining failed in call to 'always_inline' '_mm256_storeu_si256': target specific option mismatch
blas_comm.c: 926 | _mm256_storeu_si256 (__m256i_u *__P, __m256i __A)
blas_comm.c: | ^~~~~~~~~~~~~~~~~~~
blas_comm.c: In file included from blas.h:25,
blas_comm.c: from blas_comm.c:6:
blas_comm.c: blas_avx2.h:91:3: note: called from here
blas_comm.c: 91 | _mm256_storeu_si256( (__m256i*) (accu_b+i*32) , out );
blas_comm.c: | ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
blas_comm.c: In file included from /usr/lib/gcc/x86_64-linux-gnu/10/include/immintrin.h:51,
blas_comm.c: from blas_avx2.h:10,
blas_comm.c: from blas.h:25,
blas_comm.c: from blas_comm.c:6:
blas_comm.c: /usr/lib/gcc/x86_64-linux-gnu/10/include/avxintrin.h:920:1: error: inlining failed in call to 'always_inline' '_mm256_loadu_si256': target specific option mismatch
blas_comm.c: ...

Number of similar (compiler,implementation) pairs: 4, namely:
CompilerImplementations
gcc -march=native -mtune=native -O2 -fomit-frame-pointer -fwrapv -fPIC -fPIE T:avx2
gcc -march=native -mtune=native -O3 -fomit-frame-pointer -fwrapv -fPIC -fPIE T:avx2
gcc -march=native -mtune=native -O -fomit-frame-pointer -fwrapv -fPIC -fPIE T:avx2
gcc -march=native -mtune=native -Os -fomit-frame-pointer -fwrapv -fPIC -fPIE T:avx2

Compiler output

Implementation: T:ssse3
Security model: timingleaks
Compiler: clang -mcpu=native -O3 -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE
blas_matrix_ref.c: In file included from blas_matrix_ref.c:6:
blas_matrix_ref.c: In file included from ./blas.h:25:
blas_matrix_ref.c: In file included from ./blas_sse.h:16:
blas_matrix_ref.c: ./gf16_sse.h:34:9: error: always_inline function '_mm_shuffle_epi8' requires target feature 'ssse3', but would be inlined into function 'linear_transform_8x8_128b' that is compiled without support for 'ssse3'
blas_matrix_ref.c: return _mm_shuffle_epi8(tab_l,v&mask_f)^_mm_shuffle_epi8(tab_h,_mm_srli_epi16(v,4)&mask_f);
blas_matrix_ref.c: ^
blas_matrix_ref.c: ./gf16_sse.h:34:42: error: always_inline function '_mm_shuffle_epi8' requires target feature 'ssse3', but would be inlined into function 'linear_transform_8x8_128b' that is compiled without support for 'ssse3'
blas_matrix_ref.c: return _mm_shuffle_epi8(tab_l,v&mask_f)^_mm_shuffle_epi8(tab_h,_mm_srli_epi16(v,4)&mask_f);
blas_matrix_ref.c: ^
blas_matrix_ref.c: 2 errors generated.

Number of similar (compiler,implementation) pairs: 1, namely:
CompilerImplementations
clang -mcpu=native -O3 -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE T:ssse3