Implementation notes: amd64, skylake, crypto_dh/surf2113

Computer: skylake
Architecture: amd64
CPU ID: GenuineIntel-000506e3-bfebfbff
SUPERCOP version: 20161026
Operation: crypto_dh
Primitive: surf2113
Time | Implementation | Compiler | Benchmark date | SUPERCOP version
2210028 | mpfq | gcc -m64 -march=k8 -O -fomit-frame-pointer | 20161217 | 20161026
2210266 | mpfq | gcc -m64 -march=barcelona -O -fomit-frame-pointer | 20161217 | 20161026
2211760 | mpfq | gcc -m64 -march=core2 -msse4.1 -O -fomit-frame-pointer | 20161217 | 20161026
2214000 | mpfq | gcc -fno-schedule-insns -O -fomit-frame-pointer | 20161217 | 20161026
2218740 | mpfq | gcc -O -fomit-frame-pointer | 20161217 | 20161026
2220142 | mpfq | gcc -march=k8 -O -fomit-frame-pointer | 20161217 | 20161026
2226618 | mpfq | gcc -funroll-loops -m64 -march=nocona -O -fomit-frame-pointer | 20161217 | 20161026
2227104 | mpfq | gcc -m64 -march=nocona -O -fomit-frame-pointer | 20161217 | 20161026
2227964 | mpfq | gcc -funroll-loops -m64 -march=barcelona -O -fomit-frame-pointer | 20161217 | 20161026
2228168 | mpfq | gcc -funroll-loops -march=nocona -O -fomit-frame-pointer | 20161217 | 20161026
2232426 | mpfq | gcc -funroll-loops -fno-schedule-insns -O -fomit-frame-pointer | 20161217 | 20161026
2240172 | mpfq | gcc -m64 -march=core2 -O -fomit-frame-pointer | 20161217 | 20161026
2241336 | mpfq | gcc -funroll-loops -O -fomit-frame-pointer | 20161217 | 20161026
2242670 | mpfq | gcc -m64 -march=core2 -msse4 -O -fomit-frame-pointer | 20161217 | 20161026
2247626 | mpfq | gcc -march=nocona -O -fomit-frame-pointer | 20161217 | 20161026
2248476 | mpfq | gcc -march=barcelona -O -fomit-frame-pointer | 20161217 | 20161026
2258000 | mpfq | gcc -funroll-loops -march=barcelona -O -fomit-frame-pointer | 20161217 | 20161026
2259690 | mpfq | gcc -funroll-loops -m64 -O -fomit-frame-pointer | 20161217 | 20161026
2261646 | mpfq | gcc -m64 -O -fomit-frame-pointer | 20161217 | 20161026
2262860 | mpfq | gcc -funroll-loops -march=k8 -O -fomit-frame-pointer | 20161217 | 20161026
2272480 | mpfq | gcc -funroll-loops -m64 -march=k8 -O -fomit-frame-pointer | 20161217 | 20161026
2278346 | mpfq | gcc -m64 -march=barcelona -O2 -fomit-frame-pointer | 20161217 | 20161026
2288440 | mpfq | gcc -m64 -march=core2 -msse4 -O2 -fomit-frame-pointer | 20161217 | 20161026
2288464 | mpfq | gcc -m64 -march=core2 -O2 -fomit-frame-pointer | 20161217 | 20161026
2290616 | mpfq | gcc -fno-schedule-insns -O2 -fomit-frame-pointer | 20161217 | 20161026
2291556 | mpfq | gcc -m64 -march=core2 -msse4.1 -O2 -fomit-frame-pointer | 20161217 | 20161026
2297158 | mpfq | gcc -march=barcelona -O2 -fomit-frame-pointer | 20161217 | 20161026
2298950 | mpfq | gcc -m64 -O2 -fomit-frame-pointer | 20161217 | 20161026
2301980 | mpfq | gcc -march=k8 -O2 -fomit-frame-pointer | 20161217 | 20161026
2303404 | mpfq | gcc -m64 -march=native -mtune=native -Os -fomit-frame-pointer | 20161217 | 20161026
2307474 | mpfq | gcc -march=nocona -O2 -fomit-frame-pointer | 20161217 | 20161026
2312324 | mpfq | gcc -march=native -mtune=native -Os -fomit-frame-pointer -fwrapv | 20161217 | 20161026
2314838 | mpfq | gcc -O2 -fomit-frame-pointer | 20161217 | 20161026
2315664 | mpfq | gcc -m64 -march=core-avx2 -Os -fomit-frame-pointer | 20161217 | 20161026
2322530 | mpfq | gcc -m64 -march=corei7-avx -Os -fomit-frame-pointer | 20161217 | 20161026
2323842 | mpfq | gcc -m64 -march=core2 -msse4.1 -O3 -fomit-frame-pointer | 20161217 | 20161026
2325978 | mpfq | gcc -m64 -march=core2 -O3 -fomit-frame-pointer | 20161217 | 20161026
2329414 | mpfq | gcc -m64 -march=nocona -O2 -fomit-frame-pointer | 20161217 | 20161026
2338732 | mpfq | gcc -m64 -march=core2 -msse4 -O3 -fomit-frame-pointer | 20161217 | 20161026
2338908 | mpfq | gcc -m64 -march=core-avx-i -Os -fomit-frame-pointer | 20161217 | 20161026
2341966 | mpfq | gcc -m64 -march=barcelona -O3 -fomit-frame-pointer | 20161217 | 20161026
2344772 | mpfq | gcc -march=barcelona -O3 -fomit-frame-pointer | 20161217 | 20161026
2345568 | mpfq | gcc -m64 -march=k8 -O2 -fomit-frame-pointer | 20161217 | 20161026
2346442 | mpfq | gcc -funroll-loops -m64 -march=barcelona -O3 -fomit-frame-pointer | 20161217 | 20161026
2350468 | mpfq | gcc -funroll-loops -m64 -march=k8 -O2 -fomit-frame-pointer | 20161217 | 20161026
2361498 | mpfq | gcc -funroll-loops -march=k8 -O3 -fomit-frame-pointer | 20161217 | 20161026
2362124 | mpfq | gcc -m64 -O3 -fomit-frame-pointer | 20161217 | 20161026
2362330 | mpfq | gcc -funroll-loops -m64 -march=k8 -O3 -fomit-frame-pointer | 20161217 | 20161026
2363422 | mpfq | gcc -march=k8 -O3 -fomit-frame-pointer | 20161217 | 20161026
2363542 | mpfq | gcc -funroll-loops -march=barcelona -O3 -fomit-frame-pointer | 20161217 | 20161026
2369578 | mpfq | gcc -funroll-loops -m64 -march=barcelona -Os -fomit-frame-pointer | 20161217 | 20161026
2372006 | mpfq | gcc -funroll-loops -march=k8 -O2 -fomit-frame-pointer | 20161217 | 20161026
2373116 | mpfq | gcc -fno-schedule-insns -O3 -fomit-frame-pointer | 20161217 | 20161026
2376464 | mpfq | gcc -funroll-loops -march=barcelona -Os -fomit-frame-pointer | 20161217 | 20161026
2378386 | mpfq | gcc -funroll-loops -march=nocona -O2 -fomit-frame-pointer | 20161217 | 20161026
2383702 | mpfq | gcc -funroll-loops -fno-schedule-insns -O2 -fomit-frame-pointer | 20161217 | 20161026
2386706 | mpfq | gcc -funroll-loops -march=k8 -Os -fomit-frame-pointer | 20161217 | 20161026
2387830 | mpfq | gcc -march=nocona -O3 -fomit-frame-pointer | 20161217 | 20161026
2395560 | mpfq | gcc -O3 -fomit-frame-pointer | 20161217 | 20161026
2396906 | mpfq | gcc -funroll-loops -m64 -Os -fomit-frame-pointer | 20161217 | 20161026
2397876 | mpfq | gcc -funroll-loops -Os -fomit-frame-pointer | 20161217 | 20161026
2399308 | mpfq | gcc -funroll-loops -m64 -march=k8 -Os -fomit-frame-pointer | 20161217 | 20161026
2399888 | mpfq | gcc -funroll-loops -fno-schedule-insns -Os -fomit-frame-pointer | 20161217 | 20161026
2407682 | mpfq | gcc -funroll-loops -O2 -fomit-frame-pointer | 20161217 | 20161026
2408192 | mpfq | gcc -funroll-loops -m64 -march=nocona -O2 -fomit-frame-pointer | 20161217 | 20161026
2414716 | mpfq | gcc -funroll-loops -march=barcelona -O2 -fomit-frame-pointer | 20161217 | 20161026
2416838 | mpfq | gcc -funroll-loops -O3 -fomit-frame-pointer | 20161217 | 20161026
2420320 | mpfq | gcc -funroll-loops -m64 -O3 -fomit-frame-pointer | 20161217 | 20161026
2421416 | mpfq | gcc -m64 -march=nocona -O3 -fomit-frame-pointer | 20161217 | 20161026
2427174 | mpfq | gcc -funroll-loops -march=nocona -O3 -fomit-frame-pointer | 20161217 | 20161026
2428892 | mpfq | gcc -funroll-loops -m64 -march=barcelona -O2 -fomit-frame-pointer | 20161217 | 20161026
2432160 | mpfq | gcc -funroll-loops -fno-schedule-insns -O3 -fomit-frame-pointer | 20161217 | 20161026
2433442 | mpfq | gcc -funroll-loops -m64 -O2 -fomit-frame-pointer | 20161217 | 20161026
2455570 | mpfq | gcc -funroll-loops -m64 -march=nocona -O3 -fomit-frame-pointer | 20161217 | 20161026
2460146 | mpfq | gcc -m64 -march=k8 -O3 -fomit-frame-pointer | 20161217 | 20161026
2487448 | mpfq | gcc -funroll-loops -m64 -march=nocona -Os -fomit-frame-pointer | 20161217 | 20161026
2487586 | mpfq | gcc -funroll-loops -march=nocona -Os -fomit-frame-pointer | 20161217 | 20161026
2513682 | mpfq | gcc -m64 -march=core2 -Os -fomit-frame-pointer | 20161217 | 20161026
2521778 | mpfq | gcc -m64 -march=core2 -msse4 -Os -fomit-frame-pointer | 20161217 | 20161026
2531520 | mpfq | gcc -m64 -march=core2 -msse4.1 -Os -fomit-frame-pointer | 20161217 | 20161026
2532488 | mpfq | gcc -march=k8 -Os -fomit-frame-pointer | 20161217 | 20161026
2532514 | mpfq | gcc -m64 -Os -fomit-frame-pointer | 20161217 | 20161026
2535934 | mpfq | gcc -march=barcelona -Os -fomit-frame-pointer | 20161217 | 20161026
2538360 | mpfq | gcc -m64 -march=barcelona -Os -fomit-frame-pointer | 20161217 | 20161026
2538890 | mpfq | gcc -fno-schedule-insns -Os -fomit-frame-pointer | 20161217 | 20161026
2545866 | mpfq | gcc -m64 -march=k8 -Os -fomit-frame-pointer | 20161217 | 20161026
2547632 | mpfq | gcc -march=nocona -Os -fomit-frame-pointer | 20161217 | 20161026
2547808 | mpfq | gcc -m64 -march=nocona -Os -fomit-frame-pointer | 20161217 | 20161026
2550472 | mpfq | gcc -Os -fomit-frame-pointer | 20161217 | 20161026
2659708 | mpfq | gcc -m64 -march=core-avx-i -O2 -fomit-frame-pointer | 20161217 | 20161026
2694018 | mpfq | gcc -m64 -march=core-avx2 -O2 -fomit-frame-pointer | 20161217 | 20161026
2703050 | mpfq | gcc -march=native -mtune=native -O3 -fomit-frame-pointer -fwrapv | 20161217 | 20161026
2707908 | mpfq | gcc -m64 -march=core-avx-i -O3 -fomit-frame-pointer | 20161217 | 20161026
2712608 | mpfq | gcc -m64 -march=corei7-avx -O3 -fomit-frame-pointer | 20161217 | 20161026
2717742 | mpfq | gcc -m64 -march=core-avx2 -O3 -fomit-frame-pointer | 20161217 | 20161026
2728702 | mpfq | gcc -m64 -march=native -mtune=native -O2 -fomit-frame-pointer | 20161217 | 20161026
2731074 | mpfq | gcc -march=native -mtune=native -O2 -fomit-frame-pointer -fwrapv | 20161217 | 20161026
2733480 | mpfq | gcc -m64 -march=native -mtune=native -O3 -fomit-frame-pointer | 20161217 | 20161026
2733902 | mpfq | gcc -m64 -march=corei7-avx -O2 -fomit-frame-pointer | 20161217 | 20161026
2757182 | mpfq | gcc -m64 -march=core-avx-i -O -fomit-frame-pointer | 20161217 | 20161026
2759182 | mpfq | gcc -m64 -march=corei7-avx -O -fomit-frame-pointer | 20161217 | 20161026
2762064 | mpfq | gcc -m64 -march=core-avx2 -O -fomit-frame-pointer | 20161217 | 20161026
2789406 | mpfq | gcc -m64 -march=native -mtune=native -O -fomit-frame-pointer | 20161217 | 20161026
2836132 | mpfq | gcc -march=native -mtune=native -O -fomit-frame-pointer -fwrapv | 20161217 | 20161026
2868356 | mpfq | gcc -m64 -march=corei7 -Os -fomit-frame-pointer | 20161217 | 20161026
3095066 | mpfq | gcc -m64 -march=corei7 -O3 -fomit-frame-pointer | 20161217 | 20161026
3149584 | mpfq | gcc -m64 -march=corei7 -O2 -fomit-frame-pointer | 20161217 | 20161026
3181578 | mpfq | gcc -m64 -march=corei7 -O -fomit-frame-pointer | 20161217 | 20161026
7492082 | mpfq | cc | 20161217 | 20161026
7558046 | mpfq | gcc -funroll-loops | 20161217 | 20161026
7574394 | mpfq | gcc | 20161217 | 20161026

Compiler output

Implementation: crypto_dh/surf2113/mpfq
Compiler: clang -O3 -fomit-frame-pointer -Qunused-arguments
Surf2_113.c: In file included from Surf2_113.c:11:
Surf2_113.c: In file included from ./field.h:1:
Surf2_113.c: In file included from ./mpfq_2_113.h:5:
Surf2_113.c: ./x86_64/mpfq_2_113.h:714:14: error: use of unknown builtin '__builtin_ia32_pslldqi128' [-Wimplicit-function-declaration]
Surf2_113.c: r.s = t0 ^ SHLD(t1, 64);
Surf2_113.c: ^
Surf2_113.c: ./x86_64/mpfq_2_113.h:571:25: note: expanded from macro 'SHLD'
Surf2_113.c: #define SHLD(x,r) (v2di)__builtin_ia32_pslldqi128 ((gcc43bugfix) (x),(r))
Surf2_113.c: ^
Surf2_113.c: ./x86_64/mpfq_2_113.h:714:14: error: invalid conversion between vector type 'v2di' (vector of 2 'uint64_t' values) and integer type 'int' of different size
Surf2_113.c: r.s = t0 ^ SHLD(t1, 64);
Surf2_113.c: ^~~~~~~~~~~~
Surf2_113.c: ./x86_64/mpfq_2_113.h:571:19: note: expanded from macro 'SHLD'
Surf2_113.c: #define SHLD(x,r) (v2di)__builtin_ia32_pslldqi128 ((gcc43bugfix) (x),(r))
Surf2_113.c: ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Surf2_113.c: ./x86_64/mpfq_2_113.h:721:14: error: use of unknown builtin '__builtin_ia32_psrldqi128' [-Wimplicit-function-declaration]
Surf2_113.c: r.s = t2 ^ SHRD(t1, 64);
Surf2_113.c: ^
Surf2_113.c: ./x86_64/mpfq_2_113.h:572:25: note: expanded from macro 'SHRD'
Surf2_113.c: #define SHRD(x,r) (v2di)__builtin_ia32_psrldqi128 ((gcc43bugfix) (x),(r))
Surf2_113.c: ^
Surf2_113.c: ./x86_64/mpfq_2_113.h:721:14: error: invalid conversion between vector type 'v2di' (vector of 2 'uint64_t' values) and integer type 'int' of different size
Surf2_113.c: r.s = t2 ^ SHRD(t1, 64);
Surf2_113.c: ^~~~~~~~~~~~
Surf2_113.c: ./x86_64/mpfq_2_113.h:572:19: note: expanded from macro 'SHRD'
Surf2_113.c: ...

Number of similar (compiler,implementation) pairs: 10, namely:
Compiler | Implementations
clang -O3 -fomit-frame-pointer -Qunused-arguments | mpfq
clang -O3 -fwrapv -march=native -fomit-frame-pointer -Qunused-arguments | mpfq
clang -O3 -fwrapv -march=x86-64 -mcpu=core-avx2 -mavx2 -maes -mpclmul -fomit-frame-pointer -Qunused-arguments | mpfq
clang -O3 -fwrapv -mavx2 -fomit-frame-pointer -Qunused-arguments | mpfq
clang -O3 -fwrapv -mavx -fomit-frame-pointer -Qunused-arguments | mpfq
clang -O3 -fwrapv -mavx -maes -mpclmul -fomit-frame-pointer -Qunused-arguments | mpfq
clang -march=native -O3 -fomit-frame-pointer -fwrapv -Qunused-arguments | mpfq
clang -mcpu=cortex-a8 -mfpu=neon -O3 -fomit-frame-pointer -fwrapv -Qunused-arguments | mpfq
clang -mcpu=cortex-a9 -mfpu=neon -O3 -fomit-frame-pointer -fwrapv -Qunused-arguments | mpfq
clang -mcpu=native -mfpu=neon -O3 -fomit-frame-pointer -fwrapv -Qunused-arguments | mpfq
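
Note on the clang failures: the log shows that clang rejects the GCC-specific builtins __builtin_ia32_pslldqi128 and __builtin_ia32_psrldqi128 used by the SHLD and SHRD macros. The follow-up "invalid conversion" errors appear to be a consequence of the same problem, since the unknown name is treated as an implicitly declared function returning int. The sketch below is not taken from mpfq. It shows one way such whole-register shifts could be written with the SSE2 intrinsics _mm_slli_si128 and _mm_srli_si128 from <emmintrin.h>, which both gcc and clang provide on x86. The macro and function names (SHLD_SKETCH, SHRD_SKETCH, shl64_demo, shr64_demo, check_shifts) are hypothetical, and the sketch assumes the shift count is a compile-time constant given in bits and is a multiple of 8, as in the SHLD(t1, 64) call in the log.

```c
#include <assert.h>
#include <emmintrin.h>  /* SSE2 intrinsics, available in both gcc and clang on x86 */
#include <stdint.h>

/* Hypothetical stand-ins for the SHLD/SHRD macros seen in the log.
 * Assumption: r is a constant bit count that is a multiple of 8.
 * The intrinsics take a byte count, hence the division by 8. */
#define SHLD_SKETCH(x, r) _mm_slli_si128((x), (r) / 8)
#define SHRD_SKETCH(x, r) _mm_srli_si128((x), (r) / 8)

/* Shift the 128-bit value {in[0] = low, in[1] = high} left by 64 bits. */
static void shl64_demo(const uint64_t in[2], uint64_t out[2])
{
    __m128i v = _mm_loadu_si128((const __m128i *)in);
    _mm_storeu_si128((__m128i *)out, SHLD_SKETCH(v, 64));
}

/* Shift the same 128-bit value right by 64 bits. */
static void shr64_demo(const uint64_t in[2], uint64_t out[2])
{
    __m128i v = _mm_loadu_si128((const __m128i *)in);
    _mm_storeu_si128((__m128i *)out, SHRD_SKETCH(v, 64));
}

/* Self-check: a 64-bit left shift moves the low word into the high word,
 * and a 64-bit right shift moves the high word into the low word. */
static void check_shifts(void)
{
    const uint64_t in[2] = { 0x0123456789abcdefULL, 0xfedcba9876543210ULL };
    uint64_t left[2], right[2];

    shl64_demo(in, left);
    shr64_demo(in, right);

    assert(left[0] == 0 && left[1] == in[0]);
    assert(right[0] == in[1] && right[1] == 0);
}
```

Calling check_shifts() from a test program exercises both shifts. Applying the same idea to the real header would additionally need a cast between the v2di type and __m128i, which is not shown here because the rest of mpfq_2_113.h is not visible in the log.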