Implementation notes: amd64, beelink, crypto_core/rainbowcalsecret963664

Computer: beelink
Microarchitecture: amd64; Zen3 (a50f00)
Architecture: amd64
CPU ID: AuthenticAMD-00a50f00-178bfbff
SUPERCOP version: 20221122
Operation: crypto_core
Primitive: rainbowcalsecret963664

Time (cycles)	Object size (text data bss)	Test size (text data bss)	Implementation	Compiler	Benchmark date	SUPERCOP version
38402700	38637 8 0	51239 828 992	`avx2`	`gcc_-march=native_-mtune=native_-O_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE`	20230103	20221122
39072881	65450 8 0	80031 828 992	`avx2`	`gcc_-march=native_-mtune=native_-O3_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE`	20230103	20221122
39202396	39410 8 0	51369 884 928	`avx2`	`clang_-march=native_-O_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE`	20230103	20221122
39519628	53146 8 0	66135 828 992	`avx2`	`gcc_-march=native_-mtune=native_-O2_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE`	20230103	20221122
39843633	36955 8 0	49679 876 1024	`avx2`	`clang_-march=native_-Os_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE`	20230103	20221122
40089757	282219 8 0	299849 884 960	`avx2`	`clang_-march=native_-O3_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE`	20230103	20221122
40436003	118426 8 0	136049 884 960	`avx2`	`clang_-march=native_-O2_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE`	20230103	20221122
41814400	35338 8 0	47967 828 992	`ssse3`	`gcc_-march=native_-mtune=native_-O_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE`	20230103	20221122
42063611	58433 8 0	72999 828 992	`ssse3`	`gcc_-march=native_-mtune=native_-O3_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE`	20230103	20221122
42234274	45623 8 0	58607 828 992	`ssse3`	`gcc_-march=native_-mtune=native_-O2_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE`	20230103	20221122
44272022	208033 8 0	221417 884 960	`ssse3`	`clang_-march=native_-O3_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE`	20230103	20221122
44344563	33471 8 0	45129 884 928	`ssse3`	`clang_-march=native_-O_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE`	20230103	20221122
45314387	32655 8 0	44535 876 1024	`ssse3`	`clang_-march=native_-Os_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE`	20230103	20221122
46036416	102534 8 0	115897 884 960	`ssse3`	`clang_-march=native_-O2_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE`	20230103	20221122
61418900	25250 8 0	36850 804 960	`avx2`	`gcc_-march=native_-mtune=native_-Os_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE`	20230103	20221122
68268593	25051 8 0	36658 804 960	`ssse3`	`gcc_-march=native_-mtune=native_-Os_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE`	20230103	20221122
116984921	74543 0 0	88957 804 992	`amd64`	`gcc_-march=native_-mtune=native_-O3_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE`	20230103	20221122
123975691	43796 0 0	58253 804 992	`ref`	`gcc_-march=native_-mtune=native_-O3_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE`	20230103	20221122
142802526	51279 0 0	68551 860 960	`ref`	`clang_-march=native_-O3_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE`	20230103	20221122
144666159	84061 0 0	101159 860 960	`amd64`	`clang_-march=native_-O3_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE`	20230103	20221122
145119955	35019 0 0	52343 860 960	`ref`	`clang_-march=native_-O2_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE`	20230103	20221122
171391395	43113 0 0	60391 860 960	`amd64`	`clang_-march=native_-O2_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE`	20230103	20221122
216866278	19170 0 0	32327 860 928	`ref`	`clang_-mcpu=native_-O3_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE`	20230103	20221122
218122652	28955 0 0	41983 860 928	`amd64`	`clang_-mcpu=native_-O3_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE`	20230103	20221122
397450305	11891 0 0	24333 852 1024	`amd64`	`clang_-march=native_-Os_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE`	20230103	20221122
398070809	26031 0 0	38861 804 992	`amd64`	`gcc_-march=native_-mtune=native_-O2_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE`	20230103	20221122
439427374	25271 0 0	37701 804 992	`amd64`	`gcc_-march=native_-mtune=native_-O_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE`	20230103	20221122
450591901	12659 0 0	24327 860 928	`amd64`	`clang_-march=native_-O_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE`	20230103	20221122
673230601	10598 0 0	23053 852 1024	`ref`	`clang_-march=native_-Os_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE`	20230103	20221122
685934659	14826 0 0	27277 804 992	`ref`	`gcc_-march=native_-mtune=native_-O_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE`	20230103	20221122
690694800	15374 0 0	28237 804 992	`ref`	`gcc_-march=native_-mtune=native_-O2_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE`	20230103	20221122
720584703	11180 0 0	22544 780 960	`amd64`	`gcc_-march=native_-mtune=native_-Os_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE`	20230103	20221122
748401146	11257 0 0	22951 860 928	`ref`	`clang_-march=native_-O_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE`	20230103	20221122
1369165162	10257 0 0	21632 780 960	`ref`	`gcc_-march=native_-mtune=native_-Os_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE`	20230103	20221122

Compiler output

Implementation: avx2
Security model: constbranchindex
Compiler: clang -mcpu=native -O3 -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE

blas_comm.c: In file included from blas_comm.c:6:
blas_comm.c: In file included from ./blas.h:25:
blas_comm.c: ./blas_avx2.h:88:17: error: always_inline function '_mm256_loadu_si256' requires target feature 'avx', but would be inlined into function 'gf256v_add_avx2' that is compiled without support for 'avx'
blas_comm.c: __m256i inp = _mm256_loadu_si256( (__m256i*) (a+i*32) );
blas_comm.c: ^
blas_comm.c: ./blas_avx2.h:88:17: error: AVX vector return of type '__m256i' (vector of 4 'long long' values) without 'avx' enabled changes the ABI
blas_comm.c: ./blas_avx2.h:89:17: error: always_inline function '_mm256_loadu_si256' requires target feature 'avx', but would be inlined into function 'gf256v_add_avx2' that is compiled without support for 'avx'
blas_comm.c: __m256i out = _mm256_loadu_si256( (__m256i*) (accu_b+i*32) );
blas_comm.c: ^
blas_comm.c: ./blas_avx2.h:89:17: error: AVX vector return of type '__m256i' (vector of 4 'long long' values) without 'avx' enabled changes the ABI
blas_comm.c: ./blas_avx2.h:91:3: error: always_inline function '_mm256_storeu_si256' requires target feature 'avx', but would be inlined into function 'gf256v_add_avx2' that is compiled without support for 'avx'
blas_comm.c: _mm256_storeu_si256( (__m256i*) (accu_b+i*32) , out );
blas_comm.c: ^
blas_comm.c: ./blas_avx2.h:91:3: error: AVX vector argument of type '__m256i' (vector of 4 'long long' values) without 'avx' enabled changes the ABI
blas_comm.c: 6 errors generated.

Number of similar (compiler,implementation) pairs: 1, namely:

Compiler	Implementations
clang -mcpu=native -O3 -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE	avx2
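
The six errors above all report that `gf256v_add_avx2` was compiled without AVX support, which suggests that `-mcpu=native` did not enable the host's vector extensions the way `-march=native` does in the rows of the table. One way to make such a function independent of command-line ISA flags is a per-function target attribute combined with a runtime CPU check. The following is a minimal sketch in C, not the SUPERCOP or Rainbow code: the names `xor_bytes`, `xor_bytes_avx2`, and `xor_bytes_scalar` are hypothetical, and the XOR between the loads on line 89 and the store on line 91 of `blas_avx2.h` is an assumption (addition in GF(256) is a bytewise XOR).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#if defined(__x86_64__) || defined(__i386__)
#include <immintrin.h>

/* Hypothetical helper: the target attribute enables AVX2 for this one
   function, whatever ISA flags the command line did or did not set. */
__attribute__((target("avx2")))
static void xor_bytes_avx2(uint8_t *accu_b, const uint8_t *a, size_t n)
{
    size_t i = 0;
    /* Process 32 bytes per iteration with unaligned loads and stores. */
    for (; i + 32 <= n; i += 32) {
        __m256i inp = _mm256_loadu_si256((const __m256i *)(a + i));
        __m256i out = _mm256_loadu_si256((const __m256i *)(accu_b + i));
        out = _mm256_xor_si256(out, inp); /* assumed operation */
        _mm256_storeu_si256((__m256i *)(accu_b + i), out);
    }
    /* Scalar tail for the remaining n % 32 bytes. */
    for (; i < n; i++)
        accu_b[i] ^= a[i];
}
#endif

/* Portable fallback with the same result. */
static void xor_bytes_scalar(uint8_t *accu_b, const uint8_t *a, size_t n)
{
    for (size_t i = 0; i < n; i++)
        accu_b[i] ^= a[i];
}

/* Hypothetical entry point: accu_b[i] ^= a[i] for i in [0, n). */
void xor_bytes(uint8_t *accu_b, const uint8_t *a, size_t n)
{
    assert(accu_b != NULL && a != NULL);
#if defined(__x86_64__) || defined(__i386__)
    /* Take the AVX2 path only if the running CPU reports support. */
    if (__builtin_cpu_supports("avx2")) {
        xor_bytes_avx2(accu_b, a, n);
        return;
    }
#endif
    xor_bytes_scalar(accu_b, a, n);
}
```

With this layout the translation unit should build under both GCC and Clang without any `-mavx2` or `-march` flag, at the cost of one feature test per call.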

Compiler output

Implementation: ssse3
Security model: constbranchindex
Compiler: clang -mcpu=native -O3 -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE

parallel_matrix_op.c: In file included from parallel_matrix_op.c:8:
parallel_matrix_op.c: In file included from ./blas.h:25:
parallel_matrix_op.c: In file included from ./blas_sse.h:16:
parallel_matrix_op.c: ./gf16_sse.h:34:9: error: always_inline function '_mm_shuffle_epi8' requires target feature 'ssse3', but would be inlined into function 'linear_transform_8x8_128b' that is compiled without support for 'ssse3'
parallel_matrix_op.c: return _mm_shuffle_epi8(tab_l,v&mask_f)^_mm_shuffle_epi8(tab_h,_mm_srli_epi16(v,4)&mask_f);
parallel_matrix_op.c: ^
parallel_matrix_op.c: ./gf16_sse.h:34:42: error: always_inline function '_mm_shuffle_epi8' requires target feature 'ssse3', but would be inlined into function 'linear_transform_8x8_128b' that is compiled without support for 'ssse3'
parallel_matrix_op.c: return _mm_shuffle_epi8(tab_l,v&mask_f)^_mm_shuffle_epi8(tab_h,_mm_srli_epi16(v,4)&mask_f);
parallel_matrix_op.c: ^
parallel_matrix_op.c: 2 errors generated.

Number of similar (compiler,implementation) pairs: 1, namely:

Compiler	Implementations
clang -mcpu=native -O3 -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE	ssse3
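
The expression flagged in `gf16_sse.h` line 34 uses `_mm_shuffle_epi8` as a 16-entry table lookup: one lookup on the low nibble of each byte, one on the high nibble, with the two results XORed together. A scalar C sketch of the same per-byte computation may help when reading the error. It assumes that `mask_f` holds `0x0f` in every byte, which the log does not show, and the function name `nibble_table_transform` is hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Scalar equivalent of
 *   _mm_shuffle_epi8(tab_l, v & mask_f)
 *     ^ _mm_shuffle_epi8(tab_h, _mm_srli_epi16(v, 4) & mask_f)
 * applied byte by byte, assuming mask_f is 0x0f in every byte.
 * tab_l maps the low nibble, tab_h maps the high nibble. */
void nibble_table_transform(uint8_t *out, const uint8_t *in, size_t n,
                            const uint8_t tab_l[16], const uint8_t tab_h[16])
{
    assert(out != NULL && in != NULL && tab_l != NULL && tab_h != NULL);
    for (size_t i = 0; i < n; i++) {
        uint8_t lo = in[i] & 0x0f;        /* v & mask_f */
        uint8_t hi = (in[i] >> 4) & 0x0f; /* (v >> 4) & mask_f */
        out[i] = (uint8_t)(tab_l[lo] ^ tab_h[hi]);
    }
}
```

For example, with `tab_l[x] = x << 4` and `tab_h[x] = x` the transform swaps the two nibbles of every byte. The SSSE3 version performs the same lookups for 16 bytes per instruction, which is why it needs the `ssse3` target feature that this compiler invocation did not enable.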