Implementation notes: amd64, par, crypto_hash/keccakc256

Computer: par
Architecture: amd64
CPU ID: GenuineIntel-000406c3-bfebfbff
SUPERCOP version: 20161026
Operation: crypto_hash
Primitive: keccakc256
Time    Implementation   Compiler                                           Benchmark date  SUPERCOP version
31440   opt64lcu24       gcc -funroll-loops -march=native -mcpu=native -Os  20161214        20161026
31620   opt64lcu24       gcc -funroll-loops -march=native -mcpu=native -O2  20161214        20161026
31620   opt64lcu24       gcc -funroll-loops -march=native -mcpu=native -O3  20161214        20161026
31720   x86_64_asm       gcc -march=native -mcpu=native -O2                 20161214        20161026
31760   x86_64_asm       gcc -funroll-loops -march=native -mcpu=native -O2  20161214        20161026
31760   x86_64_asm       gcc -funroll-loops -march=native -mcpu=native -O3  20161214        20161026
31780   x86_64_asm       gcc -march=native -mcpu=native -O3                 20161214        20161026
31820   opt64lcu24       gcc -march=native -mcpu=native -Os                 20161214        20161026
31900   x86_64_asm       gcc -funroll-loops -march=native -mcpu=native -Os  20161214        20161026
31900   x86_64_asm       gcc -march=native -mcpu=native -Os                 20161214        20161026
31980   opt64lcu6        gcc -funroll-loops -march=native -mcpu=native -O2  20161214        20161026
32100   opt64lcu6        gcc -funroll-loops -march=native -mcpu=native -Os  20161214        20161026
32180   opt64lcu6        gcc -funroll-loops -march=native -mcpu=native -O3  20161214        20161026
32360   opt64lcu6        gcc -march=native -mcpu=native -Os                 20161214        20161026
33140   opt64lcu24       gcc -march=native -mcpu=native -O3                 20161214        20161026
33280   opt64lcu6        gcc -march=native -mcpu=native -O3                 20161214        20161026
33320   opt64lcu24       gcc -march=native -mcpu=native -O2                 20161214        20161026
33500   opt64lcu6        gcc -march=native -mcpu=native -O2                 20161214        20161026
34560   opt64u6          gcc -funroll-loops -march=native -mcpu=native -Os  20161214        20161026
34900   opt64u6          gcc -march=native -mcpu=native -Os                 20161214        20161026
35260   inplace          gcc -march=native -mcpu=native -Os                 20161214        20161026
35280   inplace          gcc -funroll-loops -march=native -mcpu=native -Os  20161214        20161026
35340   opt64u6          gcc -funroll-loops -march=native -mcpu=native -O2  20161214        20161026
35400   simple           gcc -funroll-loops -march=native -mcpu=native -Os  20161214        20161026
35600   simple           gcc -march=native -mcpu=native -Os                 20161214        20161026
35740   opt64u6          gcc -funroll-loops -march=native -mcpu=native -O3  20161214        20161026
35960   simple           gcc -funroll-loops -march=native -mcpu=native -O2  20161214        20161026
36200   inplace          gcc -funroll-loops -march=native -mcpu=native -O3  20161214        20161026
36220   inplace          gcc -funroll-loops -march=native -mcpu=native -O2  20161214        20161026
36280   simple           gcc -march=native -mcpu=native -O2                 20161214        20161026
36420   opt64u6          gcc -march=native -mcpu=native -O2                 20161214        20161026
36720   inplace          gcc -march=native -mcpu=native -O3                 20161214        20161026
37260   simple           gcc -funroll-loops -march=native -mcpu=native -O3  20161214        20161026
37560   opt64u6          gcc -march=native -mcpu=native -O3                 20161214        20161026
37700   inplace          gcc -march=native -mcpu=native -O2                 20161214        20161026
38120   simple           gcc -march=native -mcpu=native -O3                 20161214        20161026
46060   sseu2            gcc -funroll-loops -march=native -mcpu=native -Os  20161214        20161026
46520   sseu2            gcc -march=native -mcpu=native -Os                 20161214        20161026
46920   sseu2            gcc -funroll-loops -march=native -mcpu=native -O3  20161214        20161026
48040   sseu2            gcc -funroll-loops -march=native -mcpu=native -O2  20161214        20161026
48960   sseu2            gcc -march=native -mcpu=native -O3                 20161214        20161026
49620   sseu2            gcc -march=native -mcpu=native -O2                 20161214        20161026
51920   mmxu1            gcc -funroll-loops -march=native -mcpu=native -Os  20161214        20161026
52600   mmxu1            gcc -funroll-loops -march=native -mcpu=native -O2  20161214        20161026
53500   mmxu1            gcc -march=native -mcpu=native -Os                 20161214        20161026
54740   mmxu1            gcc -funroll-loops -march=native -mcpu=native -O3  20161214        20161026
58140   compact          gcc -funroll-loops -march=native -mcpu=native -O2  20161214        20161026
58380   mmxu1            gcc -march=native -mcpu=native -O3                 20161214        20161026
58540   mmxu1            gcc -march=native -mcpu=native -O2                 20161214        20161026
62880   compact          gcc -funroll-loops -march=native -mcpu=native -O3  20161214        20161026
70060   opt32bi-s2lcu4   gcc -funroll-loops -march=native -mcpu=native -O3  20161214        20161026
73860   opt32bi-s2lcu4   gcc -march=native -mcpu=native -O3                 20161214        20161026
74480   opt32biT-s2lcu4  gcc -funroll-loops -march=native -mcpu=native -Os  20161214        20161026
74480   opt32biT-s2lcu4  gcc -march=native -mcpu=native -Os                 20161214        20161026
75520   opt32biT-s2lcu4  gcc -funroll-loops -march=native -mcpu=native -O3  20161214        20161026
76520   opt32biT-s2lcu4  gcc -funroll-loops -march=native -mcpu=native -O2  20161214        20161026
76760   compact          gcc -march=native -mcpu=native -O3                 20161214        20161026
76840   opt32bi-s2lcu4   gcc -funroll-loops -march=native -mcpu=native -Os  20161214        20161026
77400   opt32bi-s2lcu4   gcc -march=native -mcpu=native -Os                 20161214        20161026
78080   opt32biT-s2lcu4  gcc -march=native -mcpu=native -O3                 20161214        20161026
78700   opt32bi-s2lcu4   gcc -funroll-loops -march=native -mcpu=native -O2  20161214        20161026
79760   opt32biT-s2lcu4  gcc -march=native -mcpu=native -O2                 20161214        20161026
81060   simple32bi       gcc -funroll-loops -march=native -mcpu=native -O3  20161214        20161026
81680   opt32bi-rvku2    gcc -funroll-loops -march=native -mcpu=native -Os  20161214        20161026
81920   simple32bi       gcc -funroll-loops -march=native -mcpu=native -Os  20161214        20161026
82440   opt32bi-rvku2    gcc -march=native -mcpu=native -Os                 20161214        20161026
82480   opt32bi-rvku2    gcc -funroll-loops -march=native -mcpu=native -O3  20161214        20161026
82520   opt32bi-s2lcu4   gcc -march=native -mcpu=native -O2                 20161214        20161026
83420   inplace32bi      gcc -funroll-loops -march=native -mcpu=native -O3  20161214        20161026
84340   simple32bi       gcc -march=native -mcpu=native -Os                 20161214        20161026
84700   inplace32bi      gcc -funroll-loops -march=native -mcpu=native -Os  20161214        20161026
85160   simple32bi       gcc -march=native -mcpu=native -O3                 20161214        20161026
85460   opt32bi-rvku2    gcc -march=native -mcpu=native -O3                 20161214        20161026
85560   opt32bi-rvku2    gcc -funroll-loops -march=native -mcpu=native -O2  20161214        20161026
86180   inplace32bi      gcc -march=native -mcpu=native -Os                 20161214        20161026
87060   inplace32bi      gcc -march=native -mcpu=native -O3                 20161214        20161026
88580   opt32bi-rvku2    gcc -march=native -mcpu=native -O2                 20161214        20161026
88740   simple32bi       gcc -funroll-loops -march=native -mcpu=native -O2  20161214        20161026
92000   simple32bi       gcc -march=native -mcpu=native -O2                 20161214        20161026
92440   inplace32bi      gcc -funroll-loops -march=native -mcpu=native -O2  20161214        20161026
96380   inplace32bi      gcc -march=native -mcpu=native -O2                 20161214        20161026
100100  opt64lcu24shld   gcc -funroll-loops -march=native -mcpu=native -Os  20161214        20161026
100600  opt64lcu24shld   gcc -funroll-loops -march=native -mcpu=native -O3  20161214        20161026
100640  opt64lcu24shld   gcc -funroll-loops -march=native -mcpu=native -O2  20161214        20161026
100680  x86_64_shld      gcc -funroll-loops -march=native -mcpu=native -O2  20161214        20161026
100680  x86_64_shld      gcc -funroll-loops -march=native -mcpu=native -O3  20161214        20161026
100720  x86_64_shld      gcc -march=native -mcpu=native -O2                 20161214        20161026
100740  x86_64_shld      gcc -march=native -mcpu=native -O3                 20161214        20161026
100820  x86_64_shld      gcc -funroll-loops -march=native -mcpu=native -Os  20161214        20161026
100840  x86_64_shld      gcc -march=native -mcpu=native -Os                 20161214        20161026
101400  opt64lcu24shld   gcc -march=native -mcpu=native -Os                 20161214        20161026
101720  opt64lcu24shld   gcc -march=native -mcpu=native -O3                 20161214        20161026
101740  opt64lcu24shld   gcc -march=native -mcpu=native -O2                 20161214        20161026
144660  compact          gcc -march=native -mcpu=native -Os                 20161214        20161026
144740  compact          gcc -funroll-loops -march=native -mcpu=native -Os  20161214        20161026
147960  compact          gcc -march=native -mcpu=native -O2                 20161214        20161026
221700  compact8         gcc -funroll-loops -march=native -mcpu=native -O3  20161214        20161026
234900  compact8         gcc -march=native -mcpu=native -O3                 20161214        20161026
257000  compact8         gcc -funroll-loops -march=native -mcpu=native -O2  20161214        20161026
295480  compact8         gcc -march=native -mcpu=native -O2                 20161214        20161026
331420  compact8         gcc -funroll-loops -march=native -mcpu=native -Os  20161214        20161026
337680  compact8         gcc -march=native -mcpu=native -Os                 20161214        20161026
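
The timing rows above can be split into their five fields mechanically, which helps when comparing implementations. The following is a minimal sketch, not part of SUPERCOP: the names `ROW`, `parse_row` and `best_per_implementation` are hypothetical, and the pattern assumes that implementation names start with a letter, that every compiler command starts with `gcc`, and that each row ends in two 8-digit fields (benchmark date and SUPERCOP version). Whitespace between fields is optional, so the pattern tolerates rows whose columns are run together.

```python
import re

# One timing row: time, implementation, compiler command, benchmark date,
# SUPERCOP version. Whitespace between the fields is optional.
# Assumptions: implementation names start with a letter, the compiler
# command starts with "gcc", and the row ends in two 8-digit fields.
ROW = re.compile(
    r"^(\d+)\s*([A-Za-z]\S*?)\s*(gcc\b.*?)\s*(\d{8})\s*(\d{8})$"
)


def parse_row(line):
    """Return (time, implementation, compiler, date, version), or None."""
    m = ROW.match(line.strip())
    if m is None:
        return None  # header, blank line, or anything that is not a row
    time, impl, compiler, date, version = m.groups()
    return int(time), impl, compiler, date, version


def best_per_implementation(lines):
    """Map each implementation to (lowest time, compiler that achieved it)."""
    best = {}
    for line in lines:
        row = parse_row(line)
        if row is None:
            continue
        time, impl, compiler = row[:3]
        if impl not in best or time < best[impl][0]:
            best[impl] = (time, compiler)
    return best


# A few rows copied from the table above.
SAMPLE = [
    "31440opt64lcu24gcc -funroll-loops -march=native -mcpu=native -Os2016121420161026",
    "31720x86_64_asmgcc -march=native -mcpu=native -O22016121420161026",
    "31760x86_64_asmgcc -funroll-loops -march=native -mcpu=native -O22016121420161026",
    "337680compact8gcc -march=native -mcpu=native -Os2016121420161026",
]

if __name__ == "__main__":
    for impl, (time, compiler) in sorted(best_per_implementation(SAMPLE).items()):
        print(impl, time, compiler)
```

Feeding every row of the table to `best_per_implementation` would give one best time per implementation, which is often easier to read than the full list.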

Compiler output

Implementation: crypto_hash/keccakc256/compact
Compiler: gcc -funroll-loops -march=native -mcpu=native -O2
Keccak-compact.c: gcc: warning: '-mcpu=' is deprecated; use '-mtune=' or '-march=' instead
try.c: gcc: warning: '-mcpu=' is deprecated; use '-mtune=' or '-march=' instead
try.c: gcc: warning: '-mcpu=' is deprecated; use '-mtune=' or '-march=' instead
measure.c: gcc: warning: '-mcpu=' is deprecated; use '-mtune=' or '-march=' instead

Number of similar (compiler,implementation) pairs: 6, namely:
Compiler  Implementations
gcc -funroll-loops -march=native -mcpu=native -O2 compact
gcc -funroll-loops -march=native -mcpu=native -O3 compact
gcc -funroll-loops -march=native -mcpu=native -Os compact
gcc -march=native -mcpu=native -O2 compact
gcc -march=native -mcpu=native -O3 compact
gcc -march=native -mcpu=native -Os compact

Compiler output

Implementation: crypto_hash/keccakc256/compact8
Compiler: gcc -funroll-loops -march=native -mcpu=native -O2
Keccak-compact8.c: gcc: warning: '-mcpu=' is deprecated; use '-mtune=' or '-march=' instead
try.c: gcc: warning: '-mcpu=' is deprecated; use '-mtune=' or '-march=' instead
try.c: gcc: warning: '-mcpu=' is deprecated; use '-mtune=' or '-march=' instead
measure.c: gcc: warning: '-mcpu=' is deprecated; use '-mtune=' or '-march=' instead

Number of similar (compiler,implementation) pairs: 6, namely:
Compiler  Implementations
gcc -funroll-loops -march=native -mcpu=native -O2 compact8
gcc -funroll-loops -march=native -mcpu=native -O3 compact8
gcc -funroll-loops -march=native -mcpu=native -Os compact8
gcc -march=native -mcpu=native -O2 compact8
gcc -march=native -mcpu=native -O3 compact8
gcc -march=native -mcpu=native -Os compact8

Compiler output

Implementation: crypto_hash/keccakc256/inplace
Compiler: gcc -funroll-loops -march=native -mcpu=native -O2
Keccak-inplace.c: gcc: warning: '-mcpu=' is deprecated; use '-mtune=' or '-march=' instead
try.c: gcc: warning: '-mcpu=' is deprecated; use '-mtune=' or '-march=' instead
try.c: gcc: warning: '-mcpu=' is deprecated; use '-mtune=' or '-march=' instead
measure.c: gcc: warning: '-mcpu=' is deprecated; use '-mtune=' or '-march=' instead

Number of similar (compiler,implementation) pairs: 6, namely:
Compiler  Implementations
gcc -funroll-loops -march=native -mcpu=native -O2 inplace
gcc -funroll-loops -march=native -mcpu=native -O3 inplace
gcc -funroll-loops -march=native -mcpu=native -Os inplace
gcc -march=native -mcpu=native -O2 inplace
gcc -march=native -mcpu=native -O3 inplace
gcc -march=native -mcpu=native -Os inplace

Compiler output

Implementation: crypto_hash/keccakc256/inplace32bi
Compiler: gcc -funroll-loops -march=native -mcpu=native -O2
Keccak-inplace32BI.c: gcc: warning: '-mcpu=' is deprecated; use '-mtune=' or '-march=' instead
try.c: gcc: warning: '-mcpu=' is deprecated; use '-mtune=' or '-march=' instead
try.c: gcc: warning: '-mcpu=' is deprecated; use '-mtune=' or '-march=' instead
measure.c: gcc: warning: '-mcpu=' is deprecated; use '-mtune=' or '-march=' instead

Number of similar (compiler,implementation) pairs: 6, namely:
Compiler  Implementations
gcc -funroll-loops -march=native -mcpu=native -O2 inplace32bi
gcc -funroll-loops -march=native -mcpu=native -O3 inplace32bi
gcc -funroll-loops -march=native -mcpu=native -Os inplace32bi
gcc -march=native -mcpu=native -O2 inplace32bi
gcc -march=native -mcpu=native -O3 inplace32bi
gcc -march=native -mcpu=native -Os inplace32bi

Compiler output

Implementation: crypto_hash/keccakc256/simple
Compiler: gcc -funroll-loops -march=native -mcpu=native -O2
Keccak-simple.c: gcc: warning: '-mcpu=' is deprecated; use '-mtune=' or '-march=' instead
try.c: gcc: warning: '-mcpu=' is deprecated; use '-mtune=' or '-march=' instead
try.c: gcc: warning: '-mcpu=' is deprecated; use '-mtune=' or '-march=' instead
measure.c: gcc: warning: '-mcpu=' is deprecated; use '-mtune=' or '-march=' instead

Number of similar (compiler,implementation) pairs: 6, namely:
Compiler  Implementations
gcc -funroll-loops -march=native -mcpu=native -O2 simple
gcc -funroll-loops -march=native -mcpu=native -O3 simple
gcc -funroll-loops -march=native -mcpu=native -Os simple
gcc -march=native -mcpu=native -O2 simple
gcc -march=native -mcpu=native -O3 simple
gcc -march=native -mcpu=native -Os simple

Compiler output

Implementation: crypto_hash/keccakc256/simple32bi
Compiler: gcc -funroll-loops -march=native -mcpu=native -O2
Keccak-simple32BI.c: gcc: warning: '-mcpu=' is deprecated; use '-mtune=' or '-march=' instead
try.c: gcc: warning: '-mcpu=' is deprecated; use '-mtune=' or '-march=' instead
try.c: gcc: warning: '-mcpu=' is deprecated; use '-mtune=' or '-march=' instead
measure.c: gcc: warning: '-mcpu=' is deprecated; use '-mtune=' or '-march=' instead

Number of similar (compiler,implementation) pairs: 6, namely:
Compiler  Implementations
gcc -funroll-loops -march=native -mcpu=native -O2 simple32bi
gcc -funroll-loops -march=native -mcpu=native -O3 simple32bi
gcc -funroll-loops -march=native -mcpu=native -Os simple32bi
gcc -march=native -mcpu=native -O2 simple32bi
gcc -march=native -mcpu=native -O3 simple32bi
gcc -march=native -mcpu=native -Os simple32bi

Compiler output

Implementation: crypto_hash/keccakc256/opt32bi-rvku2
Compiler: gcc -funroll-loops -march=native -mcpu=native -O2
KeccakF-1600-opt32.c: gcc: warning: '-mcpu=' is deprecated; use '-mtune=' or '-march=' instead
KeccakSponge.c: gcc: warning: '-mcpu=' is deprecated; use '-mtune=' or '-march=' instead
hash.c: gcc: warning: '-mcpu=' is deprecated; use '-mtune=' or '-march=' instead
try.c: gcc: warning: '-mcpu=' is deprecated; use '-mtune=' or '-march=' instead
try.c: gcc: warning: '-mcpu=' is deprecated; use '-mtune=' or '-march=' instead
measure.c: gcc: warning: '-mcpu=' is deprecated; use '-mtune=' or '-march=' instead

Number of similar (compiler,implementation) pairs: 18, namely:
Compiler  Implementations
gcc -funroll-loops -march=native -mcpu=native -O2 opt32bi-rvku2 opt32bi-s2lcu4 opt32biT-s2lcu4
gcc -funroll-loops -march=native -mcpu=native -O3 opt32bi-rvku2 opt32bi-s2lcu4 opt32biT-s2lcu4
gcc -funroll-loops -march=native -mcpu=native -Os opt32bi-rvku2 opt32bi-s2lcu4 opt32biT-s2lcu4
gcc -march=native -mcpu=native -O2 opt32bi-rvku2 opt32bi-s2lcu4 opt32biT-s2lcu4
gcc -march=native -mcpu=native -O3 opt32bi-rvku2 opt32bi-s2lcu4 opt32biT-s2lcu4
gcc -march=native -mcpu=native -Os opt32bi-rvku2 opt32bi-s2lcu4 opt32biT-s2lcu4
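
The count of 18 is consistent with the listing above: six gcc command lines (with or without `-funroll-loops`, at `-O2`, `-O3` or `-Os`) times the three implementations named on each line. The same arithmetic gives 6 for a single implementation and 36 for six. A small sketch of that cross product follows; the helper names `compiler_variants` and `similar_pairs` are hypothetical and only illustrate the counting, not how SUPERCOP groups its output.

```python
from itertools import product


def compiler_variants():
    """The six gcc command lines that appear in this report."""
    unroll = ["-funroll-loops ", ""]   # with and without loop unrolling
    opt = ["-O2", "-O3", "-Os"]        # the three optimization levels used
    return [
        f"gcc {u}-march=native -mcpu=native {o}"
        for u, o in product(unroll, opt)
    ]


def similar_pairs(implementations):
    """All (compiler, implementation) pairs for the given implementations."""
    return [(c, i) for c in compiler_variants() for i in implementations]


if __name__ == "__main__":
    pairs = similar_pairs(["opt32bi-rvku2", "opt32bi-s2lcu4", "opt32biT-s2lcu4"])
    print(len(pairs))  # 6 compilers x 3 implementations
```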

Compiler output

Implementation: crypto_hash/keccakc256/xopu24
Compiler: gcc -funroll-loops -march=native -mcpu=native -O2
KeccakF-1600-opt64.c: gcc: warning: '-mcpu=' is deprecated; use '-mtune=' or '-march=' instead
KeccakF-1600-opt64.c: In file included from /usr/lib/gcc/x86_64-pc-linux-gnu/6.2.1/include/x86intrin.h:54:0,
KeccakF-1600-opt64.c: from KeccakF-1600-opt64.c:74:
KeccakF-1600-opt64.c: KeccakF-1600-opt64.c: In function 'KeccakPermutationOnWords':
KeccakF-1600-opt64.c: /usr/lib/gcc/x86_64-pc-linux-gnu/6.2.1/include/xopintrin.h:266:1: error: inlining failed in call to always_inline '_mm_roti_epi64': target specific option mismatch
KeccakF-1600-opt64.c: _mm_roti_epi64(__m128i __A, const int __B)
KeccakF-1600-opt64.c: ^~~~~~~~~~~~~~
KeccakF-1600-opt64.c: In file included from KeccakF-1600-opt64.c:130:0:
KeccakF-1600-opt64.c: KeccakF-1600-xop.macros:103:11: note: called from here
KeccakF-1600-opt64.c: Bsusa = ROL6464same(Bsusa, 2); \
KeccakF-1600-opt64.c:
KeccakF-1600-opt64.c: KeccakF-1600-xop.macros:123:36: note: in expansion of macro 'thetaRhoPiChiIotaPrepareTheta'
KeccakF-1600-opt64.c: #define thetaRhoPiChiIota(i, A, E) thetaRhoPiChiIotaPrepareTheta(i, A, E)
KeccakF-1600-opt64.c: ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~
KeccakF-1600-opt64.c: KeccakF-1600-unrolling.macros:40:5: note: in expansion of macro 'thetaRhoPiChiIota'
KeccakF-1600-opt64.c: thetaRhoPiChiIota(23, E, A) \
KeccakF-1600-opt64.c: ^~~~~~~~~~~~~~~~~
KeccakF-1600-opt64.c: KeccakF-1600-opt64.c:185:5: note: in expansion of macro 'rounds'
KeccakF-1600-opt64.c: rounds
KeccakF-1600-opt64.c: ^~~~~~
KeccakF-1600-opt64.c: In file included from /usr/lib/gcc/x86_64-pc-linux-gnu/6.2.1/include/x86intrin.h:54:0,
KeccakF-1600-opt64.c: from KeccakF-1600-opt64.c:74:
KeccakF-1600-opt64.c: /usr/lib/gcc/x86_64-pc-linux-gnu/6.2.1/include/xopintrin.h:239:1: error: inlining failed in call to always_inline '_mm_rot_epi64': target specific option mismatch
KeccakF-1600-opt64.c: _mm_rot_epi64(__m128i __A, __m128i __B)
KeccakF-1600-opt64.c: ^~~~~~~~~~~~~
KeccakF-1600-opt64.c: ...

Number of similar (compiler,implementation) pairs: 6, namely:
Compiler  Implementations
gcc -funroll-loops -march=native -mcpu=native -O2 xopu24
gcc -funroll-loops -march=native -mcpu=native -O3 xopu24
gcc -funroll-loops -march=native -mcpu=native -Os xopu24
gcc -march=native -mcpu=native -O2 xopu24
gcc -march=native -mcpu=native -O3 xopu24
gcc -march=native -mcpu=native -Os xopu24

Compiler output

Implementation: crypto_hash/keccakc256/mmxu1
Compiler: gcc -funroll-loops -march=native -mcpu=native -O2
KeccakF-1600-opt64.c: gcc: warning: '-mcpu=' is deprecated; use '-mtune=' or '-march=' instead
KeccakSponge.c: gcc: warning: '-mcpu=' is deprecated; use '-mtune=' or '-march=' instead
hash.c: gcc: warning: '-mcpu=' is deprecated; use '-mtune=' or '-march=' instead
try.c: gcc: warning: '-mcpu=' is deprecated; use '-mtune=' or '-march=' instead
try.c: gcc: warning: '-mcpu=' is deprecated; use '-mtune=' or '-march=' instead
measure.c: gcc: warning: '-mcpu=' is deprecated; use '-mtune=' or '-march=' instead

Number of similar (compiler,implementation) pairs: 36, namely:
Compiler  Implementations
gcc -funroll-loops -march=native -mcpu=native -O2 mmxu1 opt64lcu24 opt64lcu24shld opt64lcu6 opt64u6 sseu2
gcc -funroll-loops -march=native -mcpu=native -O3 mmxu1 opt64lcu24 opt64lcu24shld opt64lcu6 opt64u6 sseu2
gcc -funroll-loops -march=native -mcpu=native -Os mmxu1 opt64lcu24 opt64lcu24shld opt64lcu6 opt64u6 sseu2
gcc -march=native -mcpu=native -O2 mmxu1 opt64lcu24 opt64lcu24shld opt64lcu6 opt64u6 sseu2
gcc -march=native -mcpu=native -O3 mmxu1 opt64lcu24 opt64lcu24shld opt64lcu6 opt64u6 sseu2
gcc -march=native -mcpu=native -Os mmxu1 opt64lcu24 opt64lcu24shld opt64lcu6 opt64u6 sseu2

Compiler output

Implementation: crypto_hash/keccakc256/x86_64_asm
Compiler: gcc -funroll-loops -march=native -mcpu=native -O2
KeccakF-1600-x86-64-asm.c: gcc: warning: '-mcpu=' is deprecated; use '-mtune=' or '-march=' instead
KeccakSponge.c: gcc: warning: '-mcpu=' is deprecated; use '-mtune=' or '-march=' instead
hash.c: gcc: warning: '-mcpu=' is deprecated; use '-mtune=' or '-march=' instead
KeccakF-1600-x86-64-gas.s: gcc: warning: '-mcpu=' is deprecated; use '-mtune=' or '-march=' instead
try.c: gcc: warning: '-mcpu=' is deprecated; use '-mtune=' or '-march=' instead
try.c: gcc: warning: '-mcpu=' is deprecated; use '-mtune=' or '-march=' instead
measure.c: gcc: warning: '-mcpu=' is deprecated; use '-mtune=' or '-march=' instead

Number of similar (compiler,implementation) pairs: 6, namely:
Compiler  Implementations
gcc -funroll-loops -march=native -mcpu=native -O2 x86_64_asm
gcc -funroll-loops -march=native -mcpu=native -O3 x86_64_asm
gcc -funroll-loops -march=native -mcpu=native -Os x86_64_asm
gcc -march=native -mcpu=native -O2 x86_64_asm
gcc -march=native -mcpu=native -O3 x86_64_asm
gcc -march=native -mcpu=native -Os x86_64_asm

Compiler output

Implementation: crypto_hash/keccakc256/x86_64_shld
Compiler: gcc -funroll-loops -march=native -mcpu=native -O2
KeccakF-1600-x86-64-asm.c: gcc: warning: '-mcpu=' is deprecated; use '-mtune=' or '-march=' instead
KeccakSponge.c: gcc: warning: '-mcpu=' is deprecated; use '-mtune=' or '-march=' instead
hash.c: gcc: warning: '-mcpu=' is deprecated; use '-mtune=' or '-march=' instead
KeccakF-1600-x86-64-shld-gas.s: gcc: warning: '-mcpu=' is deprecated; use '-mtune=' or '-march=' instead
try.c: gcc: warning: '-mcpu=' is deprecated; use '-mtune=' or '-march=' instead
try.c: gcc: warning: '-mcpu=' is deprecated; use '-mtune=' or '-march=' instead
measure.c: gcc: warning: '-mcpu=' is deprecated; use '-mtune=' or '-march=' instead

Number of similar (compiler,implementation) pairs: 6, namely:
Compiler  Implementations
gcc -funroll-loops -march=native -mcpu=native -O2 x86_64_shld
gcc -funroll-loops -march=native -mcpu=native -O3 x86_64_shld
gcc -funroll-loops -march=native -mcpu=native -Os x86_64_shld
gcc -march=native -mcpu=native -O2 x86_64_shld
gcc -march=native -mcpu=native -O3 x86_64_shld
gcc -march=native -mcpu=native -Os x86_64_shld