Implementation notes: amd64, par, crypto_scalarmult/kummer

Computer: par
Architecture: amd64
CPU ID: GenuineIntel-000406c3-bfebfbff
SUPERCOP version: 20161026
Operation: crypto_scalarmult
Primitive: kummer
Time     Implementation  Compiler                                           Benchmark date  SUPERCOP version
2309520  ref5            gcc -funroll-loops -march=native -mcpu=native -O2  20161214        20161026
2340480  ref5            gcc -march=native -mcpu=native -O3                 20161214        20161026
2348560  ref5u           gcc -funroll-loops -march=native -mcpu=native -O2  20161214        20161026
2350220  ref5u           gcc -march=native -mcpu=native -O3                 20161214        20161026
2357040  ref5            gcc -funroll-loops -march=native -mcpu=native -O3  20161214        20161026
2399700  ref5u           gcc -funroll-loops -march=native -mcpu=native -O3  20161214        20161026
2424580  ref5            gcc -march=native -mcpu=native -O2                 20161214        20161026
2439000  ref5u           gcc -march=native -mcpu=native -O2                 20161214        20161026
2625760  ref5            gcc -march=native -mcpu=native -Os                 20161214        20161026
2640300  ref5u           gcc -funroll-loops -march=native -mcpu=native -Os  20161214        20161026
2670080  ref5            gcc -funroll-loops -march=native -mcpu=native -Os  20161214        20161026
2679960  ref5u           gcc -march=native -mcpu=native -Os                 20161214        20161026

Test failure

Implementation: crypto_scalarmult/kummer/avx
Compiler: gcc -funroll-loops -march=native -mcpu=native -O2
error 111

Number of similar (compiler,implementation) pairs: 12, namely:
Compiler                                           Implementations
gcc -funroll-loops -march=native -mcpu=native -O2  avx avx2
gcc -funroll-loops -march=native -mcpu=native -O3  avx avx2
gcc -funroll-loops -march=native -mcpu=native -Os  avx avx2
gcc -march=native -mcpu=native -O2                 avx avx2
gcc -march=native -mcpu=native -O3                 avx avx2
gcc -march=native -mcpu=native -Os                 avx avx2

Compiler output

Implementation: crypto_scalarmult/kummer/avx2int
Compiler: gcc -funroll-loops -march=native -mcpu=native -O2
base.c: gcc: warning: '-mcpu=' is deprecated; use '-mtune=' or '-march=' instead
gfe.c: gcc: warning: '-mcpu=' is deprecated; use '-mtune=' or '-march=' instead
smult.c: gcc: warning: '-mcpu=' is deprecated; use '-mtune=' or '-march=' instead
smult.c: smult.c: In function 'gfe4x_mulconst':
smult.c: smult.c:36:6: warning: AVX vector return without AVX enabled changes the ABI [-Wpsabi]
smult.c: t0 = _mm256_mul_epi32(a->v[0],*b);
smult.c: ~~~^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
smult.c: In file included from /usr/lib/gcc/x86_64-pc-linux-gnu/6.2.1/include/immintrin.h:43:0,
smult.c: from gfe4x.h:5,
smult.c: from smult.c:3:
smult.c: /usr/lib/gcc/x86_64-pc-linux-gnu/6.2.1/include/avx2intrin.h:126:1: error: inlining failed in call to always_inline '_mm256_add_epi64': target specific option mismatch
smult.c: _mm256_add_epi64 (__m256i __A, __m256i __B)
smult.c: ^~~~~~~~~~~~~~~~
smult.c: smult.c:45:8: note: called from here
smult.c: t2 = _mm256_add_epi64(t2,_mm256_srli_epi64(t1,25)); t1 &= mask25;
smult.c: ~~~^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
smult.c: In file included from /usr/lib/gcc/x86_64-pc-linux-gnu/6.2.1/include/immintrin.h:43:0,
smult.c: from gfe4x.h:5,
smult.c: from smult.c:3:
smult.c: /usr/lib/gcc/x86_64-pc-linux-gnu/6.2.1/include/avx2intrin.h:787:1: error: inlining failed in call to always_inline '_mm256_srli_epi64': target specific option mismatch
smult.c: _mm256_srli_epi64 (__m256i __A, int __B)
smult.c: ^~~~~~~~~~~~~~~~~
smult.c: smult.c:45:10: note: called from here
smult.c: t2 = _mm256_add_epi64(t2,_mm256_srli_epi64(t1,25)); t1 &= mask25;
smult.c: ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
smult.c: In file included from /usr/lib/gcc/x86_64-pc-linux-gnu/6.2.1/include/immintrin.h:43:0,
smult.c: from gfe4x.h:5,
smult.c: ...

Number of similar (compiler,implementation) pairs: 6, namely:
Compiler                                           Implementations
gcc -funroll-loops -march=native -mcpu=native -O2  avx2int
gcc -funroll-loops -march=native -mcpu=native -O3  avx2int
gcc -funroll-loops -march=native -mcpu=native -Os  avx2int
gcc -march=native -mcpu=native -O2                 avx2int
gcc -march=native -mcpu=native -O3                 avx2int
gcc -march=native -mcpu=native -Os                 avx2int
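
Note on the avx2int errors above: GCC reports "target specific option mismatch" when an always_inline intrinsic such as _mm256_add_epi64 is called from a function compiled without the matching instruction set. Together with the -Wpsabi warning, this suggests that -march=native did not enable AVX2 on this machine. The sketch below is not taken from the kummer sources. It shows one common way to make such code compile regardless of the command-line flags: enable AVX2 per function and select the code path at run time. The names add4, add4_avx2 and add4_portable are hypothetical, and the example assumes GCC or a compatible compiler.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#if defined(__x86_64__) || defined(__i386__)
#include <immintrin.h>

/* Hypothetical helper (not from the kummer sources): adds four 64-bit lanes.
   The target attribute enables AVX2 for this function only, so the
   always_inline intrinsics can be inlined without -mavx2 on the command line. */
__attribute__((target("avx2")))
static void add4_avx2(uint64_t out[4], const uint64_t a[4], const uint64_t b[4])
{
    __m256i va = _mm256_loadu_si256((const __m256i *)a);
    __m256i vb = _mm256_loadu_si256((const __m256i *)b);
    _mm256_storeu_si256((__m256i *)out, _mm256_add_epi64(va, vb));
}
#endif

/* Portable fallback with the same result (addition modulo 2^64 per lane). */
static void add4_portable(uint64_t out[4], const uint64_t a[4], const uint64_t b[4])
{
    for (int i = 0; i < 4; i++)
        out[i] = a[i] + b[i];
}

/* Dispatches at run time; returns 1 if the AVX2 path was taken, 0 otherwise. */
static int add4(uint64_t out[4], const uint64_t a[4], const uint64_t b[4])
{
    assert(out != NULL && a != NULL && b != NULL);
#if defined(__x86_64__) || defined(__i386__)
    if (__builtin_cpu_supports("avx2")) {
        add4_avx2(out, a, b);
        return 1;
    }
#endif
    add4_portable(out, a, b);
    return 0;
}
```

On a CPU without AVX2, add4 returns 0 and uses the portable loop, so the same binary runs on both kinds of machine. Whether this approach suits the avx2int implementation itself is not something the log can tell us.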

Compiler output

Implementation: crypto_scalarmult/kummer/ref5
Compiler: gcc -funroll-loops -march=native -mcpu=native -O2
base.c: gcc: warning: '-mcpu=' is deprecated; use '-mtune=' or '-march=' instead
gfe.c: gcc: warning: '-mcpu=' is deprecated; use '-mtune=' or '-march=' instead
smult.c: gcc: warning: '-mcpu=' is deprecated; use '-mtune=' or '-march=' instead
try.c: gcc: warning: '-mcpu=' is deprecated; use '-mtune=' or '-march=' instead
try.c: gcc: warning: '-mcpu=' is deprecated; use '-mtune=' or '-march=' instead
measure.c: gcc: warning: '-mcpu=' is deprecated; use '-mtune=' or '-march=' instead

Number of similar (compiler,implementation) pairs: 12, namely:
Compiler                                           Implementations
gcc -funroll-loops -march=native -mcpu=native -O2  ref5 ref5u
gcc -funroll-loops -march=native -mcpu=native -O3  ref5 ref5u
gcc -funroll-loops -march=native -mcpu=native -Os  ref5 ref5u
gcc -march=native -mcpu=native -O2                 ref5 ref5u
gcc -march=native -mcpu=native -O3                 ref5 ref5u
gcc -march=native -mcpu=native -Os                 ref5 ref5u

Compiler output

Implementation: crypto_scalarmult/kummer/avx
Compiler: gcc -funroll-loops -march=native -mcpu=native -O2
consts.c: gcc: warning: '-mcpu=' is deprecated; use '-mtune=' or '-march=' instead
gfe.c: gcc: warning: '-mcpu=' is deprecated; use '-mtune=' or '-march=' instead
smult.c: gcc: warning: '-mcpu=' is deprecated; use '-mtune=' or '-march=' instead
gfe4x3limb_freeze.s: gcc: warning: '-mcpu=' is deprecated; use '-mtune=' or '-march=' instead
gfe_mul.s: gcc: warning: '-mcpu=' is deprecated; use '-mtune=' or '-march=' instead
gfe_nsquare.s: gcc: warning: '-mcpu=' is deprecated; use '-mtune=' or '-march=' instead
ladder.s: gcc: warning: '-mcpu=' is deprecated; use '-mtune=' or '-march=' instead
znegate.s: gcc: warning: '-mcpu=' is deprecated; use '-mtune=' or '-march=' instead
try.c: gcc: warning: '-mcpu=' is deprecated; use '-mtune=' or '-march=' instead

Number of similar (compiler,implementation) pairs: 12, namely:
Compiler                                           Implementations
gcc -funroll-loops -march=native -mcpu=native -O2  avx avx2
gcc -funroll-loops -march=native -mcpu=native -O3  avx avx2
gcc -funroll-loops -march=native -mcpu=native -Os  avx avx2
gcc -march=native -mcpu=native -O2                 avx avx2
gcc -march=native -mcpu=native -O3                 avx avx2
gcc -march=native -mcpu=native -Os                 avx avx2