Implementation notes: amd64, intelnuci7, crypto_core/invsntrup761

Computer: intelnuci7
Architecture: amd64
CPU ID: GenuineIntel-000806e9-bfebfbff
SUPERCOP version: 20211108
Operation: crypto_core
Primitive: invsntrup761

Time (cycles)	Object size (text data bss, bytes)	Test size (text data bss, bytes)	Implementation	Compiler	Benchmark date	SUPERCOP version
583536	257951 0 0	265889 776 776	`jumpdivsteps`	`clang_-march=native_-O3_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE`	20211109	20211108
583990	247816 0 0	255729 776 776	`jumpdivsteps`	`clang_-march=native_-O_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE`	20211109	20211108
584536	247816 0 0	255729 776 776	`jumpdivsteps`	`clang_-march=native_-O2_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE`	20211109	20211108
596404	217589 0 0	228663 768 760	`jumpdivsteps`	`clang_-march=native_-Os_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE`	20211109	20211108
659630	292657 0 0	307678 776 832	`jumpdivsteps`	`gcc_-march=native_-mtune=native_-O3_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE`	20211109	20211108
669452	265187 0 0	276837 768 832	`jumpdivsteps`	`gcc_-march=native_-mtune=native_-O2_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE`	20211109	20211108
687140	3121 0 0	17377 776 776	`avx`	`clang_-march=native_-O_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE`	20211109	20211108
701986	3121 0 0	17377 776 776	`avx`	`clang_-march=native_-O2_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE`	20211109	20211108
703904	5841 0 0	20129 776 776	`avx`	`clang_-march=native_-O3_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE`	20211109	20211108
760002	1505 0 0	12119 768 760	`avx`	`clang_-march=native_-Os_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE`	20211109	20211108
762906	195800 0 0	206581 760 800	`jumpdivsteps`	`gcc_-march=native_-mtune=native_-Os_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE`	20211109	20211108
765234	271954 0 0	283501 768 832	`jumpdivsteps`	`gcc_-march=native_-mtune=native_-O_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE`	20211109	20211108
794188	1625 0 0	12565 760 800	`avx`	`gcc_-march=native_-mtune=native_-Os_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE`	20211109	20211108
965592	4416 0 0	19710 776 832	`avx`	`gcc_-march=native_-mtune=native_-O3_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE`	20211109	20211108
1009040	1936 0 0	13829 768 832	`avx`	`gcc_-march=native_-mtune=native_-O_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE`	20211109	20211108
1014494	1841 0 0	13773 768 832	`avx`	`gcc_-march=native_-mtune=native_-O2_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE`	20211109	20211108
9593016	6346 0 0	21720 784 832	`ref`	`gcc_-march=native_-mtune=native_-O3_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE`	20211109	20211108
25938794	6438 0 0	20697 776 776	`ref`	`clang_-march=native_-O3_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE`	20211109	20211108
25941590	3285 0 0	17513 776 776	`ref`	`clang_-march=native_-O_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE`	20211109	20211108
25966402	3285 0 0	17513 776 776	`ref`	`clang_-march=native_-O2_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE`	20211109	20211108
33627720	4658 0 0	18201 776 760	`ref`	`clang_-mcpu=native_-O3_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE`	20211109	20211108
36029744	1014 0 0	12909 768 832	`ref`	`gcc_-march=native_-mtune=native_-O2_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE`	20211109	20211108
36900002	1083 0 0	12925 768 832	`ref`	`gcc_-march=native_-mtune=native_-O_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE`	20211109	20211108
37791636	1102 0 0	11703 768 760	`ref`	`clang_-march=native_-Os_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE`	20211109	20211108
41705966	916 0 0	11797 760 800	`ref`	`gcc_-march=native_-mtune=native_-Os_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE`	20211109	20211108

Compiler output

Implementation: avx
Security model: constbranchindex
Compiler: clang -mcpu=native -O3 -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE

recip.c: recip.c:94:19: error: always_inline function '_mm256_set1_epi16' requires target feature 'sse4.2', but would be inlined into function 'vectormodq_swapeliminate' that is compiled without support for 'sse4.2'
recip.c: __m256i f0vec = _mm256_set1_epi16(f0);
recip.c: ^
recip.c: recip.c:95:19: error: always_inline function '_mm256_set1_epi16' requires target feature 'sse4.2', but would be inlined into function 'vectormodq_swapeliminate' that is compiled without support for 'sse4.2'
recip.c: __m256i g0vec = _mm256_set1_epi16(g0);
recip.c: ^
recip.c: recip.c:96:23: error: always_inline function '_mm256_mullo_epi16' requires target feature 'avx2', but would be inlined into function 'vectormodq_swapeliminate' that is compiled without support for 'avx2'
recip.c: __m256i f0vecqinv = _mm256_mullo_epi16(f0vec,qinvvec);
recip.c: ^
recip.c: recip.c:96:48: error: always_inline function '_mm256_set1_epi16' requires target feature 'sse4.2', but would be inlined into function 'vectormodq_swapeliminate' that is compiled without support for 'sse4.2'
recip.c: __m256i f0vecqinv = _mm256_mullo_epi16(f0vec,qinvvec);
recip.c: ^
recip.c: recip.c:80:17: note: expanded from macro 'qinvvec'
recip.c: #define qinvvec _mm256_set1_epi16(qinv)
recip.c: ^
recip.c: recip.c:97:23: error: always_inline function '_mm256_mullo_epi16' requires target feature 'avx2', but would be inlined into function 'vectormodq_swapeliminate' that is compiled without support for 'avx2'
recip.c: __m256i g0vecqinv = _mm256_mullo_epi16(g0vec,qinvvec);
recip.c: ^
recip.c: recip.c:97:48: error: always_inline function '_mm256_set1_epi16' requires target feature 'sse4.2', but would be inlined into function 'vectormodq_swapeliminate' that is compiled without support for 'sse4.2'
recip.c: __m256i g0vecqinv = _mm256_mullo_epi16(g0vec,qinvvec);
recip.c: ^
recip.c: recip.c:80:17: note: expanded from macro 'qinvvec'
recip.c: #define qinvvec _mm256_set1_epi16(qinv)
recip.c: ^
recip.c: recip.c:98:21: error: always_inline function '_mm256_set1_epi32' requires target feature 'sse4.2', but would be inlined into function 'vectormodq_swapeliminate' that is compiled without support for 'sse4.2'
recip.c: ...

Number of similar (compiler,implementation) pairs: 1, namely:

Compiler	Implementations
clang -mcpu=native -O3 -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE	avx
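
The errors above indicate that, under this compiler invocation, `vectormodq_swapeliminate` was compiled without the target features that the AVX2 intrinsics require. The following is a minimal sketch, not taken from `recip.c`, of one common way to keep such code buildable: enable AVX2 for a single function with the GCC/clang `target` attribute and choose a path at run time. The names `mullo16_avx2`, `mullo16_portable`, `mullo16` and `mullo16_selfcheck` are hypothetical, and the sketch assumes GCC or clang.

```c
#include <stdint.h>

#define LANES 16

/* Portable fallback: lanewise 16-bit multiply, keeping the low 16 bits. */
static void mullo16_portable(int16_t out[LANES], const int16_t in[LANES], int16_t k)
{
  for (int i = 0; i < LANES; ++i) {
    uint32_t p = (uint32_t) (uint16_t) in[i] * (uint16_t) k;
    out[i] = (int16_t) (uint16_t) p;
  }
}

#if defined(__x86_64__) || defined(__i386__)
#include <immintrin.h>

/* AVX2 is enabled for this function only, independent of the command-line
   flags, so the always_inline intrinsics can be inlined here. */
__attribute__((target("avx2")))
static void mullo16_avx2(int16_t out[LANES], const int16_t in[LANES], int16_t k)
{
  __m256i v = _mm256_loadu_si256((const __m256i *) in);
  __m256i kv = _mm256_set1_epi16(k);
  _mm256_storeu_si256((__m256i *) out, _mm256_mullo_epi16(v, kv));
}
#endif

/* Dispatcher: take the AVX2 path only if the running CPU reports AVX2. */
static void mullo16(int16_t out[LANES], const int16_t in[LANES], int16_t k)
{
#if defined(__x86_64__) || defined(__i386__)
  if (__builtin_cpu_supports("avx2")) {
    mullo16_avx2(out, in, k);
    return;
  }
#endif
  mullo16_portable(out, in, k);
}

/* Returns 1 if the dispatched routine agrees with the portable one. */
static int mullo16_selfcheck(void)
{
  int16_t in[LANES], got[LANES], want[LANES];
  for (int i = 0; i < LANES; ++i) in[i] = (int16_t) (1000 * i - 7000);
  mullo16(got, in, 4591);           /* arbitrary sample multiplier */
  mullo16_portable(want, in, 4591);
  for (int i = 0; i < LANES; ++i)
    if (got[i] != want[i]) return 0;
  return 1;
}
```

Whether this approach suits a benchmarked implementation is a separate question: the run-time check adds a branch, and a build that simply passes a flag enabling AVX2 avoids the attribute altogether.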

Compiler output

Implementation: jumpdivsteps
Security model: constbranchindex
Compiler: clang -mcpu=native -O3 -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE

avx-768.c: avx-768.c:544:36: error: invalid output size for constraint '+x'
avx-768.c: __asm__("vpsubw %1,%0,%0" : "+x"(a),"+x"(b));
avx-768.c: ^
avx-768.c: avx-768.c:550:36: error: invalid output size for constraint '+x'
avx-768.c: __asm__("vpaddw %1,%0,%0" : "+x"(a),"+x"(b));
avx-768.c: ^
avx-768.c: 2 errors generated.

Number of similar (compiler,implementation) pairs: 1, namely:

Compiler	Implementations
clang -mcpu=native -O3 -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE	jumpdivsteps
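
The `invalid output size for constraint '+x'` errors are consistent with a 256-bit operand being bound to an `x` (vector register) constraint in a build where AVX is not enabled. The following is a minimal sketch, not taken from `avx-768.c`, that guards the inline assembly with the `__AVX2__` predefined macro and falls back to plain C otherwise. The names `sub16` and `sub16_selfcheck` are hypothetical. Unlike the logged source, which lists both operands as `"+x"` outputs, the sketch marks the second operand as an input because the instruction only reads it.

```c
#include <stdint.h>

#define LANES 16

#if defined(__AVX2__)
#include <immintrin.h>

/* out = a - b lanewise with 16-bit wraparound. The "x" constraint can hold
   a 256-bit value here because AVX2 is enabled for this translation unit. */
static void sub16(int16_t out[LANES], const int16_t a[LANES], const int16_t b[LANES])
{
  __m256i va = _mm256_loadu_si256((const __m256i *) a);
  __m256i vb = _mm256_loadu_si256((const __m256i *) b);
  __asm__("vpsubw %1,%0,%0" : "+x"(va) : "x"(vb));
  _mm256_storeu_si256((__m256i *) out, va);
}
#else
/* Fallback used when the compiler flags do not enable AVX2. */
static void sub16(int16_t out[LANES], const int16_t a[LANES], const int16_t b[LANES])
{
  for (int i = 0; i < LANES; ++i)
    out[i] = (int16_t) (uint16_t) ((uint16_t) a[i] - (uint16_t) b[i]);
}
#endif

/* Returns 1 if sub16 produces the expected wrapped differences. */
static int sub16_selfcheck(void)
{
  int16_t a[LANES], b[LANES], out[LANES];
  for (int i = 0; i < LANES; ++i) {
    a[i] = (int16_t) (100 * i);
    b[i] = (int16_t) (i - 8);
  }
  a[0] = INT16_MIN;  /* exercise wraparound: -32768 - 1 wraps to 32767 */
  b[0] = 1;
  sub16(out, a, b);
  if (out[0] != INT16_MAX) return 0;
  for (int i = 1; i < LANES; ++i)
    if (out[i] != 99 * i + 8) return 0;
  return 1;
}
```

With a guard of this kind, a compiler invocation that leaves AVX2 disabled builds the fallback path instead of stopping at the constraint error, at the cost of silently losing the vectorized code.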