Implementation notes: amd64, zen3, crypto_core/invsntrup761

Computer: zen3
Architecture: amd64
CPU ID: AuthenticAMD-00a20f10-178bfbff
SUPERCOP version: 20211108
Operation: crypto_core
Primitive: invsntrup761

Time	Object size	Test size	Implementation	Compiler	Benchmark date	SUPERCOP version
428674	276371 0 0	289840 860 952	`jumpdivsteps`	`clang_-march=native_-O3_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE`	20220206	20211108
442020	205664 0 0	218006 852 920	`jumpdivsteps`	`clang_-march=native_-Os_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE`	20220206	20211108
443879	242832 0 0	256496 836 984	`jumpdivsteps`	`gcc_-march=native_-mtune=native_-O3_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE`	20220206	20211108
475801	236015 0 0	241728 860 920	`jumpdivsteps`	`clang_-march=native_-O_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE`	20220206	20211108
477050	255269 0 0	267120 836 984	`jumpdivsteps`	`gcc_-march=native_-mtune=native_-O2_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE`	20220206	20211108
486371	278672 0 0	292384 860 952	`jumpdivsteps`	`clang_-march=native_-O2_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE`	20220206	20211108
580263	263431 0 0	274912 836 984	`jumpdivsteps`	`gcc_-march=native_-mtune=native_-O_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE`	20220206	20211108
606185	201966 0 0	212523 820 952	`jumpdivsteps`	`gcc_-march=native_-mtune=native_-Os_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE`	20220206	20211108
689498	8909 0 0	29136 860 952	`avx`	`clang_-march=native_-O3_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE`	20220206	20211108
691592	8909 0 0	29376 860 952	`avx`	`clang_-march=native_-O2_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE`	20220206	20211108
698618	1757 0 0	14216 860 920	`avx`	`clang_-march=native_-O_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE`	20220206	20211108
720555	1612 0 0	13838 852 920	`avx`	`clang_-march=native_-Os_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE`	20220206	20211108
754452	3913 0 0	18360 836 984	`avx`	`gcc_-march=native_-mtune=native_-O3_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE`	20220206	20211108
772843	1905 0 0	14464 836 984	`avx`	`gcc_-march=native_-mtune=native_-O2_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE`	20220206	20211108
818633	1790 0 0	13887 828 984	`avx`	`gcc_-march=native_-mtune=native_-O_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE`	20220206	20211108
893953	1443 0 0	12555 820 952	`avx`	`gcc_-march=native_-mtune=native_-Os_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE`	20220206	20211108
5444212	4114 0 0	18626 844 984	`ref`	`gcc_-march=native_-mtune=native_-O3_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE`	20220206	20211108
23450542	9245 0 0	29558 868 952	`ref`	`clang_-march=native_-O3_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE`	20220206	20211108
24347830	9245 0 0	29798 868 952	`ref`	`clang_-march=native_-O2_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE`	20220206	20211108
33282779	3057 0 0	18182 868 920	`ref`	`clang_-mcpu=native_-O3_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE`	20220206	20211108
38206933	1210 0 0	13726 868 920	`ref`	`clang_-march=native_-O_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE`	20220206	20211108
39208669	1143 0 0	13199 828 984	`ref`	`gcc_-march=native_-mtune=native_-O_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE`	20220206	20211108
39825054	1181 0 0	13484 860 920	`ref`	`clang_-march=native_-Os_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE`	20220206	20211108
45810195	931 0 0	12045 828 952	`ref`	`gcc_-march=native_-mtune=native_-Os_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE`	20220206	20211108
47010419	1039 0 0	13618 844 984	`ref`	`gcc_-march=native_-mtune=native_-O2_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE`	20220206	20211108

Compiler output

Implementation: avx
Security model: constbranchindex
Compiler: clang -mcpu=native -O3 -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE

recip.c: recip.c:94:19: error: always_inline function '_mm256_set1_epi16' requires target feature 'avx', but would be inlined into function 'vectormodq_swapeliminate' that is compiled without support for 'avx'
recip.c: __m256i f0vec = _mm256_set1_epi16(f0);
recip.c: ^
recip.c: recip.c:94:19: error: AVX vector return of type '__m256i' (vector of 4 'long long' values) without 'avx' enabled changes the ABI
recip.c: recip.c:95:19: error: always_inline function '_mm256_set1_epi16' requires target feature 'avx', but would be inlined into function 'vectormodq_swapeliminate' that is compiled without support for 'avx'
recip.c: __m256i g0vec = _mm256_set1_epi16(g0);
recip.c: ^
recip.c: recip.c:95:19: error: AVX vector return of type '__m256i' (vector of 4 'long long' values) without 'avx' enabled changes the ABI
recip.c: recip.c:96:48: error: always_inline function '_mm256_set1_epi16' requires target feature 'avx', but would be inlined into function 'vectormodq_swapeliminate' that is compiled without support for 'avx'
recip.c: __m256i f0vecqinv = _mm256_mullo_epi16(f0vec,qinvvec);
recip.c: ^
recip.c: recip.c:80:17: note: expanded from macro 'qinvvec'
recip.c: #define qinvvec _mm256_set1_epi16(qinv)
recip.c: ^
recip.c: recip.c:96:48: error: AVX vector return of type '__m256i' (vector of 4 'long long' values) without 'avx' enabled changes the ABI
recip.c: recip.c:80:17: note: expanded from macro 'qinvvec'
recip.c: #define qinvvec _mm256_set1_epi16(qinv)
recip.c: ^
recip.c: recip.c:96:23: error: always_inline function '_mm256_mullo_epi16' requires target feature 'avx2', but would be inlined into function 'vectormodq_swapeliminate' that is compiled without support for 'avx2'
recip.c: __m256i f0vecqinv = _mm256_mullo_epi16(f0vec,qinvvec);
recip.c: ^
recip.c: recip.c:96:23: error: AVX vector argument of type '__m256i' (vector of 4 'long long' values) without 'avx' enabled changes the ABI
recip.c: recip.c:97:48: error: always_inline function '_mm256_set1_epi16' requires target feature 'avx', but would be inlined into function 'vectormodq_swapeliminate' that is compiled without support for 'avx'
recip.c: __m256i g0vecqinv = _mm256_mullo_epi16(g0vec,qinvvec);
recip.c: ^
recip.c: ...

Number of similar (compiler,implementation) pairs: 1, namely:

Compiler	Implementations
clang -mcpu=native -O3 -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE	avx

Compiler output

Implementation: jumpdivsteps
Security model: constbranchindex
Compiler: clang -mcpu=native -O3 -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE

avx3-512.c: avx3-512.c:419:36: error: invalid output size for constraint '+x'
avx3-512.c: __asm__("vpsubw %1,%0,%0" : "+x"(a),"+x"(b));
avx3-512.c: ^
avx3-512.c: avx3-512.c:425:36: error: invalid output size for constraint '+x'
avx3-512.c: __asm__("vpaddw %1,%0,%0" : "+x"(a),"+x"(b));
avx3-512.c: ^
avx3-512.c: 2 errors generated.

Number of similar (compiler,implementation) pairs: 1, namely:

Compiler	Implementations
clang -mcpu=native -O3 -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE	jumpdivsteps