Implementation notes: amd64, beelink, crypto_encrypt/lotus192

Computer: beelink
Microarchitecture: amd64; Zen3 (a50f00)
Architecture: amd64
CPU ID: AuthenticAMD-00a50f00-178bfbff
SUPERCOP version: 20221122
Operation: crypto_encrypt
Primitive: lotus192

Time	Object size	Test size	Implementation	Compiler	Benchmark date	SUPERCOP version
371817	25434 0 8	54970 940 1824	`T:avx2`	`gcc_-march=native_-mtune=native_-O3_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE`	20230104	20221122
460864	39100 0 8	68650 940 1824	`T:opt`	`gcc_-march=native_-mtune=native_-O3_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE`	20230104	20221122
628446	30974 0 8	60442 940 1824	`T:ref`	`gcc_-march=native_-mtune=native_-O3_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE`	20230104	20221122
1512844	162485 0 8	195717 980 1792	`T:opt`	`clang_-march=native_-O2_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE`	20230104	20221122
1704179	47220 0 8	76197 980 1760	`T:opt`	`clang_-mcpu=native_-O3_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE`	20230104	20221122
1704524	164733 0 8	197981 980 1792	`T:opt`	`clang_-march=native_-O3_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE`	20230104	20221122
1826217	29554 0 8	58461 980 1760	`T:ref`	`clang_-mcpu=native_-O3_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE`	20230104	20221122
3052419	13829 0 8	41522 940 1824	`T:avx2`	`gcc_-march=native_-mtune=native_-O2_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE`	20230104	20221122
3086753	11825 0 8	39579 972 1824	`T:opt`	`clang_-march=native_-Os_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE`	20230104	20221122
3093443	12064 0 8	39202 940 1824	`T:avx2`	`gcc_-march=native_-mtune=native_-O_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE`	20230104	20221122
5479224	15265 0 8	42309 980 1760	`T:opt`	`clang_-march=native_-O_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE`	20230104	20221122
5504384	8458 0 8	36131 972 1824	`T:ref`	`clang_-march=native_-Os_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE`	20230104	20221122
5588357	10707 0 8	38322 940 1824	`T:ref`	`gcc_-march=native_-mtune=native_-O2_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE`	20230104	20221122
5717580	12124 0 8	39242 940 1824	`T:opt`	`gcc_-march=native_-mtune=native_-O_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE`	20230104	20221122
5928272	14374 0 8	42066 940 1824	`T:opt`	`gcc_-march=native_-mtune=native_-O2_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE`	20230104	20221122
5969706	9790 0 8	36757 980 1760	`T:ref`	`clang_-march=native_-O_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE`	20230104	20221122
6262281	11294 0 8	37426 932 1824	`T:avx2`	`gcc_-march=native_-mtune=native_-Os_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE`	20230104	20221122
6351426	9511 0 8	36554 940 1824	`T:ref`	`gcc_-march=native_-mtune=native_-O_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE`	20230104	20221122
10208085	8846 0 8	34882 932 1824	`T:ref`	`gcc_-march=native_-mtune=native_-Os_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE`	20230104	20221122
10360284	11377 0 8	37498 932 1824	`T:opt`	`gcc_-march=native_-mtune=native_-Os_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE`	20230104	20221122

Test failure

Implementation: T:ref
Security model: timingleaks
Compiler: clang -march=native -O2 -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE

error 111
crypto_encrypt_open returns nonzero

Number of similar (compiler,implementation) pairs: 2, namely:

Compiler	Implementations
clang -march=native -O2 -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE	T:ref
clang -march=native -O3 -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE	T:ref

Compiler output

Implementation: T:avx2
Security model: timingleaks
Compiler: clang -march=native -O2 -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE

lwe-arithmetics_avx2.c: lwe-arithmetics_avx2.c:211:5: warning: array index 4 is past the end of the array (which contains 4 elements) [-Warray-bounds]
lwe-arithmetics_avx2.c: c[4] = _mm256_loadu_si256((__m256i*)(C + 64));
lwe-arithmetics_avx2.c: ^ ~
lwe-arithmetics_avx2.c: lwe-arithmetics_avx2.c:204:3: note: array 'c' declared here
lwe-arithmetics_avx2.c: __m256i a, b[4], c[4];
lwe-arithmetics_avx2.c: ^
lwe-arithmetics_avx2.c: lwe-arithmetics_avx2.c:212:5: warning: array index 5 is past the end of the array (which contains 4 elements) [-Warray-bounds]
lwe-arithmetics_avx2.c: c[5] = _mm256_loadu_si256((__m256i*)(C + 80));
lwe-arithmetics_avx2.c: ^ ~
lwe-arithmetics_avx2.c: lwe-arithmetics_avx2.c:204:3: note: array 'c' declared here
lwe-arithmetics_avx2.c: __m256i a, b[4], c[4];
lwe-arithmetics_avx2.c: ^
lwe-arithmetics_avx2.c: lwe-arithmetics_avx2.c:220:7: warning: array index 4 is past the end of the array (which contains 4 elements) [-Warray-bounds]
lwe-arithmetics_avx2.c: b[4] = _mm256_loadu_si256((__m256i*)(p + 64));
lwe-arithmetics_avx2.c: ^ ~
lwe-arithmetics_avx2.c: lwe-arithmetics_avx2.c:204:3: note: array 'b' declared here
lwe-arithmetics_avx2.c: __m256i a, b[4], c[4];
lwe-arithmetics_avx2.c: ^
lwe-arithmetics_avx2.c: lwe-arithmetics_avx2.c:221:7: warning: array index 5 is past the end of the array (which contains 4 elements) [-Warray-bounds]
lwe-arithmetics_avx2.c: b[5] = _mm256_loadu_si256((__m256i*)(p + 80));
lwe-arithmetics_avx2.c: ^ ~
lwe-arithmetics_avx2.c: lwe-arithmetics_avx2.c:204:3: note: array 'b' declared here
lwe-arithmetics_avx2.c: __m256i a, b[4], c[4];
lwe-arithmetics_avx2.c: ^
lwe-arithmetics_avx2.c: lwe-arithmetics_avx2.c:226:33: warning: array index 4 is past the end of the array (which contains 4 elements) [-Warray-bounds]
lwe-arithmetics_avx2.c: ...

Number of similar (compiler,implementation) pairs: 4, namely:

Compiler	Implementations
clang -march=native -O2 -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE	T:avx2
clang -march=native -O3 -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE	T:avx2
clang -march=native -O -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE	T:avx2
clang -march=native -Os -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE	T:avx2

Compiler output

Implementation: T:avx2
Security model: timingleaks
Compiler: clang -mcpu=native -O3 -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE

lwe-arithmetics_avx2.c: lwe-arithmetics_avx2.c:158:12: error: always_inline function '_mm256_setzero_si256' requires target feature 'avx', but would be inlined into function 'submat_negmul' that is compiled without support for 'avx'
lwe-arithmetics_avx2.c: c[0] = _mm256_setzero_si256();
lwe-arithmetics_avx2.c: ^
lwe-arithmetics_avx2.c: lwe-arithmetics_avx2.c:158:12: error: AVX vector return of type '__m256i' (vector of 4 'long long' values) without 'avx' enabled changes the ABI
lwe-arithmetics_avx2.c: lwe-arithmetics_avx2.c:159:12: error: always_inline function '_mm256_setzero_si256' requires target feature 'avx', but would be inlined into function 'submat_negmul' that is compiled without support for 'avx'
lwe-arithmetics_avx2.c: c[1] = _mm256_setzero_si256();
lwe-arithmetics_avx2.c: ^
lwe-arithmetics_avx2.c: lwe-arithmetics_avx2.c:159:12: error: AVX vector return of type '__m256i' (vector of 4 'long long' values) without 'avx' enabled changes the ABI
lwe-arithmetics_avx2.c: lwe-arithmetics_avx2.c:160:12: error: always_inline function '_mm256_setzero_si256' requires target feature 'avx', but would be inlined into function 'submat_negmul' that is compiled without support for 'avx'
lwe-arithmetics_avx2.c: c[2] = _mm256_setzero_si256();
lwe-arithmetics_avx2.c: ^
lwe-arithmetics_avx2.c: lwe-arithmetics_avx2.c:160:12: error: AVX vector return of type '__m256i' (vector of 4 'long long' values) without 'avx' enabled changes the ABI
lwe-arithmetics_avx2.c: lwe-arithmetics_avx2.c:161:12: error: always_inline function '_mm256_setzero_si256' requires target feature 'avx', but would be inlined into function 'submat_negmul' that is compiled without support for 'avx'
lwe-arithmetics_avx2.c: c[3] = _mm256_setzero_si256();
lwe-arithmetics_avx2.c: ^
lwe-arithmetics_avx2.c: lwe-arithmetics_avx2.c:161:12: error: AVX vector return of type '__m256i' (vector of 4 'long long' values) without 'avx' enabled changes the ABI
lwe-arithmetics_avx2.c: lwe-arithmetics_avx2.c:162:12: error: always_inline function '_mm256_setzero_si256' requires target feature 'avx', but would be inlined into function 'submat_negmul' that is compiled without support for 'avx'
lwe-arithmetics_avx2.c: c[4] = _mm256_setzero_si256();
lwe-arithmetics_avx2.c: ^
lwe-arithmetics_avx2.c: lwe-arithmetics_avx2.c:162:12: error: AVX vector return of type '__m256i' (vector of 4 'long long' values) without 'avx' enabled changes the ABI
lwe-arithmetics_avx2.c: lwe-arithmetics_avx2.c:163:12: error: always_inline function '_mm256_setzero_si256' requires target feature 'avx', but would be inlined into function 'submat_negmul' that is compiled without support for 'avx'
lwe-arithmetics_avx2.c: c[5] = _mm256_setzero_si256();
lwe-arithmetics_avx2.c: ^
lwe-arithmetics_avx2.c: lwe-arithmetics_avx2.c:163:12: error: AVX vector return of type '__m256i' (vector of 4 'long long' values) without 'avx' enabled changes the ABI
lwe-arithmetics_avx2.c: lwe-arithmetics_avx2.c:166:11: error: always_inline function '_mm256_set1_epi16' requires target feature 'avx', but would be inlined into function 'submat_negmul' that is compiled without support for 'avx'
lwe-arithmetics_avx2.c: ...

Number of similar (compiler,implementation) pairs: 1, namely:

Compiler	Implementations
clang -mcpu=native -O3 -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE	T:avx2