Implementation notes: amd64, jasper, crypto_sign/rainbow1a

Computer: jasper
Microarchitecture: amd64; Tremont (906c0)
Architecture: amd64
CPU ID: GenuineIntel-000906c0-20-bfebfbff
SUPERCOP version: 20240425
Operation: crypto_sign
Primitive: rainbow1a
Time | Object size | Test size | Implementation | Compiler | Benchmark date | SUPERCOP version
1259714 | 36172 0 1048628 | 55740 916 1050360 | T:portable | clang_-march=native_-O2_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE | 20240430 | 20240425
1300519 | 26913 0 1048628 | 44570 908 1050360 | T:portable | clang_-march=native_-Os_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE | 20240430 | 20240425
1300611 | 44744 0 1048628 | 65420 916 1050360 | T:portable | clang_-mcpu=native_-O3_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE | 20240430 | 20240425
1312132 | 47583 0 1048628 | 69468 916 1050360 | T:portable | clang_-march=native_-O3_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE | 20240430 | 20240425
1361534 | 18849 0 1048644 | 37708 892 1050424 | T:portable | gcc_-march=native_-mtune=native_-O2_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE | 20240430 | 20240425
1429050 | 19525 0 1048644 | 37972 892 1050424 | T:portable | gcc_-march=native_-mtune=native_-O_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE | 20240430 | 20240425
2690963 | 9027 0 1048644 | 26565 892 1050424 | T:portable | gcc_-march=native_-mtune=native_-Os_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE | 20240430 | 20240425
2801215 | 11788 0 1048628 | 30314 908 1050360 | T:portable | clang_-march=native_-O_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE | 20240430 | 20240425
5039616 | 132260 0 1048644 | 152284 892 1050424 | T:ref | gcc_-march=native_-mtune=native_-O3_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE | 20240430 | 20240425
6399365 | 58701 0 1048628 | 79620 916 1050360 | T:ref | clang_-march=native_-O3_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE | 20240430 | 20240425
6399910 | 50666 0 1048628 | 69252 916 1050360 | T:ref | clang_-march=native_-O2_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE | 20240430 | 20240425
6691744 | 61324 0 1048628 | 81044 916 1050360 | T:ref | clang_-mcpu=native_-O3_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE | 20240501 | 20240425
13597094 | 21283 0 1048644 | 40116 892 1050424 | T:ref | gcc_-march=native_-mtune=native_-O2_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE | 20240430 | 20240425
14399579 | 20735 0 1048644 | 39172 892 1050424 | T:ref | gcc_-march=native_-mtune=native_-O_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE | 20240430 | 20240425
14618841 | 37267 0 1048628 | 54410 908 1050360 | T:ref | clang_-march=native_-Os_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE | 20240430 | 20240425
23202267 | 8533 0 1048644 | 25973 892 1050424 | T:ref | gcc_-march=native_-mtune=native_-Os_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE | 20240430 | 20240425
30247119 | 11228 0 1048628 | 29770 908 1050360 | T:ref | clang_-march=native_-O_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE | 20240501 | 20240425
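
To compare the two implementations that built successfully, the best Time value of each can be set side by side. The following is a minimal sketch, not part of SUPERCOP: the four Time values are copied from the table above (the fastest gcc and clang rows for T:portable and for T:ref), the compiler labels are abbreviated, and the helper name `best_time` is hypothetical. Treating the smallest Time per implementation as representative is an assumption of this sketch.

```python
# Time values copied from the results table above (units as reported by SUPERCOP).
# Compiler labels are shortened here for readability; see the table for full flags.
times = {
    ("T:portable", "clang -O2"): 1259714,
    ("T:portable", "gcc -O2"): 1361534,
    ("T:ref", "gcc -O3"): 5039616,
    ("T:ref", "clang -O3"): 6399365,
}


def best_time(times, implementation):
    """Return the smallest Time recorded for one implementation (hypothetical helper)."""
    return min(t for (impl, _), t in times.items() if impl == implementation)


best_portable = best_time(times, "T:portable")
best_ref = best_time(times, "T:ref")
speedup = best_ref / best_portable

print(f"best T:portable:    {best_portable}")
print(f"best T:ref:         {best_ref}")
print(f"ratio ref/portable: {speedup:.2f}")
```

With these inputs the ratio comes out at about 4.00, i.e. the best T:portable build has roughly a quarter of the Time of the best T:ref build on this machine.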

Test failure

Implementation: T:portable
Security model: timingleaks
Compiler: gcc -march=native -mtune=native -O3 -fomit-frame-pointer -fwrapv -fPIC -fPIE
error 111

Number of similar (compiler,implementation) pairs: 1, namely:
Compiler | Implementations
gcc -march=native -mtune=native -O3 -fomit-frame-pointer -fwrapv -fPIC -fPIE | T:portable

Compiler output

Implementation: T:avx2
Security model: timingleaks
Compiler: clang -march=native -O2 -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE
mpkc.c: In file included from mpkc.c:4:
mpkc.c: In file included from ./blas.h:21:
mpkc.c: ./blas_avx2.h:59:18: error: always_inline function '_mm256_load_si256' requires target feature 'avx', but would be inlined into function 'gf16v_madd_avx2' that is compiled without support for 'avx'
mpkc.c: __m256i m_tab = _mm256_load_si256( (__m256i*) (__gf16_mul + 32*b) );
mpkc.c: ^
mpkc.c: ./blas_avx2.h:59:18: error: AVX vector return of type '__m256i' (vector of 4 'long long' values) without 'avx' enabled changes the ABI
mpkc.c: ./blas_avx2.h:60:15: error: '__builtin_ia32_permti256' needs target feature avx2
mpkc.c: __m256i ml = _mm256_permute2x128_si256( m_tab , m_tab , 0 );
mpkc.c: ^
mpkc.c: /usr/lib/llvm-11/lib/clang/11.0.1/include/avx2intrin.h:821:12: note: expanded from macro '_mm256_permute2x128_si256'
mpkc.c: (__m256i)__builtin_ia32_permti256((__m256i)(V1), (__m256i)(V2), (int)(M))
mpkc.c: ^
mpkc.c: In file included from mpkc.c:4:
mpkc.c: In file included from ./blas.h:21:
mpkc.c: ./blas_avx2.h:61:15: error: '__builtin_ia32_permti256' needs target feature avx2
mpkc.c: __m256i mh = _mm256_permute2x128_si256( m_tab , m_tab , 0x11 );
mpkc.c: ^
mpkc.c: /usr/lib/llvm-11/lib/clang/11.0.1/include/avx2intrin.h:821:12: note: expanded from macro '_mm256_permute2x128_si256'
mpkc.c: (__m256i)__builtin_ia32_permti256((__m256i)(V1), (__m256i)(V2), (int)(M))
mpkc.c: ^
mpkc.c: In file included from mpkc.c:4:
mpkc.c: In file included from ./blas.h:21:
mpkc.c: ./blas_avx2.h:62:17: error: always_inline function '_mm256_load_si256' requires target feature 'avx', but would be inlined into function 'gf16v_madd_avx2' that is compiled without support for 'avx'
mpkc.c: __m256i mask = _mm256_load_si256( (__m256i*) __mask_low );
mpkc.c: ^
mpkc.c: ...

Number of similar (compiler,implementation) pairs: 5, namely:
Compiler | Implementations
clang -march=native -O2 -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE | T:avx2
clang -march=native -O3 -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE | T:avx2
clang -march=native -O -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE | T:avx2
clang -march=native -Os -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE | T:avx2
clang -mcpu=native -O3 -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE | T:avx2

Compiler output

Implementation: T:avx2
Security model: timingleaks
Compiler: gcc -march=native -mtune=native -O2 -fomit-frame-pointer -fwrapv -fPIC -fPIE
blas.c: In file included from blas_avx2.h:11,
blas.c: from blas.h:21,
blas.c: from blas.c:1:
blas.c: gf16_avx2.h: In function 'tbl32_gf4_x2':
blas.c: gf16_avx2.h:25:1: warning: AVX vector return without AVX enabled changes the ABI [-Wpsabi]
blas.c: 25 | {
blas.c: | ^
mpkc.c: In file included from blas_avx2.h:11,
mpkc.c: from blas.h:21,
mpkc.c: from mpkc.c:4:
mpkc.c: gf16_avx2.h: In function 'tbl32_gf4_x2':
mpkc.c: gf16_avx2.h:25:1: warning: AVX vector return without AVX enabled changes the ABI [-Wpsabi]
mpkc.c: 25 | {
mpkc.c: | ^
mpkc.c: gf16_avx2.h: In function 'tbl32_gf16_log':
mpkc.c: gf16_avx2.h:70:23: note: the ABI for passing parameters with 32-byte alignment has changed in GCC 4.6
mpkc.c: 70 | static inline __m256i tbl32_gf16_log( __m256i a )
mpkc.c: | ^~~~~~~~~~~~~~
mpkc.c: In file included from /usr/lib/gcc/x86_64-linux-gnu/10/include/immintrin.h:53,
mpkc.c: from blas_avx2.h:6,
mpkc.c: from blas.h:21,
mpkc.c: from mpkc.c:4:
mpkc.c: /usr/lib/gcc/x86_64-linux-gnu/10/include/avx2intrin.h:588:1: error: inlining failed in call to 'always_inline' '_mm256_shuffle_epi8': target specific option mismatch
mpkc.c: 588 | _mm256_shuffle_epi8 (__m256i __X, __m256i __Y)
mpkc.c: | ^~~~~~~~~~~~~~~~~~~
mpkc.c: In file included from blas_avx2.h:11,
mpkc.c: from blas.h:21,
mpkc.c: from mpkc.c:4:
mpkc.c: gf16_avx2.h:73:9: note: called from here
mpkc.c: 73 | return _mm256_shuffle_epi8(tab_l,a);
mpkc.c: | ^~~~~~~~~~~~~~~~~~~~~~~~~~~~
mpkc.c: In file included from /usr/lib/gcc/x86_64-linux-gnu/10/include/immintrin.h:51,
mpkc.c: ...

Number of similar (compiler,implementation) pairs: 4, namely:
Compiler | Implementations
gcc -march=native -mtune=native -O2 -fomit-frame-pointer -fwrapv -fPIC -fPIE | T:avx2
gcc -march=native -mtune=native -O3 -fomit-frame-pointer -fwrapv -fPIC -fPIE | T:avx2
gcc -march=native -mtune=native -O -fomit-frame-pointer -fwrapv -fPIC -fPIE | T:avx2
gcc -march=native -mtune=native -Os -fomit-frame-pointer -fwrapv -fPIC -fPIE | T:avx2
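
Both T:avx2 failures above come from AVX/AVX2 intrinsics being compiled without the 'avx' and 'avx2' target features. With -march=native, that is what one would expect on a CPU that does not advertise those features. A quick way to confirm this on a Linux host is to look at the "flags" line of /proc/cpuinfo. The sketch below is illustrative only and is not part of SUPERCOP: the function name `missing_cpu_flags` and the sample text are hypothetical, and the sample flag list is made up for the demonstration rather than taken from jasper.

```python
def missing_cpu_flags(cpuinfo_text, required=("avx", "avx2")):
    """Return the required flags absent from the first 'flags' line of cpuinfo text."""
    for line in cpuinfo_text.splitlines():
        key, _, value = line.partition(":")
        if key.strip() == "flags":
            present = set(value.split())
            return [flag for flag in required if flag not in present]
    # No 'flags' line found: report every required flag as missing.
    return list(required)


# Hypothetical sample in /proc/cpuinfo style; NOT the real flag list of jasper.
sample = "processor : 0\nflags : fpu sse sse2 ssse3 sse4_1 sse4_2 aes\n"
print(missing_cpu_flags(sample))  # prints ['avx', 'avx2']
```

On a real Linux machine the same function can be called as `missing_cpu_flags(open("/proc/cpuinfo").read())`; an empty list would mean both features are advertised.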