Implementation notes: amd64, colossus6, crypto_hashblocks/sha512

Computer: colossus6
Architecture: amd64
CPU ID: AuthenticAMD-00830f10-178bfbff
SUPERCOP version: 20210125
Operation: crypto_hashblocks
Primitive: sha512
TimeObject sizeTest sizeImplementationCompilerBenchmark dateSUPERCOP version
50185390 0 017776 792 752avx2clang_-march=native_-O3_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2021031020210125
50405390 0 017744 792 752avx2clang_-march=native_-O2_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2021031020210125
50405390 0 017744 792 752avx2clang_-march=native_-O_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2021031020210125
50405376 0 014878 784 736avx2clang_-march=native_-Os_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2021031020210125
50405390 0 017904 792 736avx2clang_-mcpu=native_-O3_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2021031020210125
50405377 0 015997 768 808avx2gcc_-march=native_-mtune=native_-O2_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE2021031020210125
50405377 0 018214 776 808avx2gcc_-march=native_-mtune=native_-O3_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE2021031020210125
50405330 0 015821 768 808avx2gcc_-march=native_-mtune=native_-O_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE2021031020210125
50405355 0 014993 752 776avx2gcc_-march=native_-mtune=native_-Os_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE2021031020210125
55135507 0 017856 792 752avxclang_-march=native_-O2_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2021031020210125
55135459 0 014862 784 736avxclang_-march=native_-Os_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2021031020210125
56925507 0 017856 792 752avxclang_-march=native_-O_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2021031020210125
56935507 0 017888 792 752avxclang_-march=native_-O3_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2021031020210125
60755757 0 016125 768 808avxgcc_-march=native_-mtune=native_-O_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE2021031020210125
63005931 0 015457 752 776avxgcc_-march=native_-mtune=native_-Os_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE2021031020210125
67285968 0 018734 776 808avxgcc_-march=native_-mtune=native_-O3_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE2021031020210125
67505927 0 016485 768 808avxgcc_-march=native_-mtune=native_-O2_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE2021031020210125
70882961 0 012366 784 736wflipclang_-march=native_-Os_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2021031020210125
72002996 0 015344 792 752wflipclang_-march=native_-O_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2021031020210125
72222996 0 015376 792 752wflipclang_-march=native_-O3_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2021031020210125
72232996 0 015344 792 752wflipclang_-march=native_-O2_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2021031020210125
75385293 0 018110 776 808compactgcc_-march=native_-mtune=native_-O3_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE2021031020210125
75833295 0 016062 776 808wflipgcc_-march=native_-mtune=native_-O3_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE2021031020210125
75833415 0 013781 768 808wflipgcc_-march=native_-mtune=native_-O_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE2021031020210125
76283500 0 014053 768 808wflipgcc_-march=native_-mtune=native_-O2_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE2021031020210125
771715165 0 027926 776 808refgcc_-march=native_-mtune=native_-O3_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE2021031020210125
771815164 0 027958 776 808inplacegcc_-march=native_-mtune=native_-O3_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE2021031020210125
77403496 0 013025 752 776wflipgcc_-march=native_-mtune=native_-Os_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE2021031020210125
80774298 0 013857 752 776compactgcc_-march=native_-mtune=native_-Os_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE2021031020210125
807815444 0 025989 768 808refgcc_-march=native_-mtune=native_-O2_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE2021031020210125
816815452 0 026037 768 808inplacegcc_-march=native_-mtune=native_-O2_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE2021031020210125
819015457 0 024985 752 776inplacegcc_-march=native_-mtune=native_-Os_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE2021031020210125
82125296 0 017648 792 752compactclang_-march=native_-O2_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2021031020210125
82805296 0 017648 792 752compactclang_-march=native_-O_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2021031020210125
82805108 0 017632 792 736compactclang_-mcpu=native_-O3_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2021031020210125
828015414 0 024937 752 776refgcc_-march=native_-mtune=native_-Os_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE2021031020210125
83704429 0 015061 768 808compactgcc_-march=native_-mtune=native_-O2_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE2021031020210125
83935680 0 018064 792 752compactclang_-march=native_-O3_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2021031020210125
84153026 0 015408 792 752compact4clang_-march=native_-O3_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2021031020210125
84374442 0 014885 768 808compactgcc_-march=native_-mtune=native_-O_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE2021031020210125
85052690 0 015040 792 752compact4clang_-march=native_-O2_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2021031020210125
85052690 0 015040 792 752compact4clang_-march=native_-O_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2021031020210125
85733517 0 016016 792 736wflipclang_-mcpu=native_-O3_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2021031020210125
86404053 0 013502 784 736compactclang_-march=native_-Os_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2021031020210125
864016126 0 025558 784 736refclang_-march=native_-Os_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2021031020210125
870716106 0 025542 784 736inplaceclang_-march=native_-Os_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2021031020210125
875215123 0 025525 768 808inplacegcc_-march=native_-mtune=native_-O_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE2021031020210125
877515042 0 025413 768 808refgcc_-march=native_-mtune=native_-O_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE2021031020210125
89102697 0 015510 776 808compact4gcc_-march=native_-mtune=native_-O3_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE2021031020210125
92032020 0 014528 792 736compact4clang_-mcpu=native_-O3_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2021031020210125
933817852 0 030192 792 752refclang_-march=native_-O_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2021031020210125
936017860 0 030208 792 752inplaceclang_-march=native_-O_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2021031020210125
940517860 0 030208 792 752inplaceclang_-march=native_-O2_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2021031020210125
940517860 0 030240 792 752inplaceclang_-march=native_-O3_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2021031020210125
940517852 0 030192 792 752refclang_-march=native_-O2_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2021031020210125
940517852 0 030224 792 752refclang_-march=native_-O3_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2021031020210125
95171383 0 010806 784 736compact4clang_-march=native_-Os_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2021031020210125
96071399 0 010953 752 776compact4gcc_-march=native_-mtune=native_-Os_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE2021031020210125
96301546 0 012157 768 808compact4gcc_-march=native_-mtune=native_-O2_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE2021031020210125
105971487 0 011901 768 808compact4gcc_-march=native_-mtune=native_-O_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE2021031020210125
1120518692 0 031200 792 736refclang_-mcpu=native_-O3_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2021031020210125
1125018697 0 031200 792 736inplaceclang_-mcpu=native_-O3_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2021031020210125
131622611 0 014960 792 752compact3clang_-march=native_-O2_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2021031020210125
131622916 0 015296 792 752compact3clang_-march=native_-O3_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2021031020210125
132522611 0 014960 792 752compact3clang_-march=native_-O_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2021031020210125
134103006 0 015822 776 808compact2gcc_-march=native_-mtune=native_-O3_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE2021031020210125
135222055 0 014560 792 736compact3clang_-mcpu=native_-O3_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2021031020210125
137032623 0 015136 792 736compact2clang_-mcpu=native_-O3_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2021031020210125
139281618 0 011177 752 776compact2gcc_-march=native_-mtune=native_-Os_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE2021031020210125
139951402 0 010838 784 736compact3clang_-march=native_-Os_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2021031020210125
140401765 0 012397 768 808compact2gcc_-march=native_-mtune=native_-O2_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE2021031020210125
144001763 0 012205 768 808compact2gcc_-march=native_-mtune=native_-O_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE2021031020210125
149631647 0 012061 768 808compact3gcc_-march=native_-mtune=native_-O_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE2021031020210125
151431490 0 011033 752 776compact3gcc_-march=native_-mtune=native_-Os_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE2021031020210125
153003300 0 015680 792 752compact2clang_-march=native_-O_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2021031020210125
153453652 0 016064 792 752compact2clang_-march=native_-O3_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2021031020210125
153903300 0 015680 792 752compact2clang_-march=native_-O2_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2021031020210125
153901681 0 011142 784 736compact2clang_-march=native_-Os_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2021031020210125
153901692 0 012301 768 808compact3gcc_-march=native_-mtune=native_-O2_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE2021031020210125
154802737 0 015542 776 808compact3gcc_-march=native_-mtune=native_-O3_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE2021031020210125

Compiler output

Implementation: avx
Security model: constbranchindex
Compiler: clang -mcpu=native -O3 -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE
inner.c: inner.c:94:14: error: always_inline function '_mm256_loadu_si256' requires target feature 'sse4.2', but would be inlined into function 'crypto_hashblocks_sha512_avx_constbranchindex_inner' that is compiled without support for 'sse4.2'
inner.c: X0 = _mm256_loadu_si256((void *) (in+0));
inner.c: ^
inner.c: inner.c:95:14: error: always_inline function '_mm256_shuffle_epi8' requires target feature 'avx2', but would be inlined into function 'crypto_hashblocks_sha512_avx_constbranchindex_inner' that is compiled without support for 'avx2'
inner.c: X0 = _mm256_shuffle_epi8(X0,bigendian64);
inner.c: ^
inner.c: inner.c:95:37: error: always_inline function '_mm256_set_epi8' requires target feature 'sse4.2', but would be inlined into function 'crypto_hashblocks_sha512_avx_constbranchindex_inner' that is compiled without support for 'sse4.2'
inner.c: X0 = _mm256_shuffle_epi8(X0,bigendian64);
inner.c: ^
inner.c: inner.c:32:21: note: expanded from macro 'bigendian64'
inner.c: #define bigendian64 _mm256_set_epi8(8,9,10,11,12,13,14,15,0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,0,1,2,3,4,5,6,7)
inner.c: ^
inner.c: inner.c:96:14: error: always_inline function '_mm256_loadu_si256' requires target feature 'sse4.2', but would be inlined into function 'crypto_hashblocks_sha512_avx_constbranchindex_inner' that is compiled without support for 'sse4.2'
inner.c: D0 = _mm256_loadu_si256((void *) &constants[0]);
inner.c: ^
inner.c: inner.c:97:14: error: always_inline function '_mm256_add_epi64' requires target feature 'avx2', but would be inlined into function 'crypto_hashblocks_sha512_avx_constbranchindex_inner' that is compiled without support for 'avx2'
inner.c: D0 = _mm256_add_epi64(X0,D0);
inner.c: ^
inner.c: inner.c:102:14: error: always_inline function '_mm256_loadu_si256' requires target feature 'sse4.2', but would be inlined into function 'crypto_hashblocks_sha512_avx_constbranchindex_inner' that is compiled without support for 'sse4.2'
inner.c: X4 = _mm256_loadu_si256((void *) (in+32));
inner.c: ^
inner.c: inner.c:103:14: error: always_inline function '_mm256_shuffle_epi8' requires target feature 'avx2', but would be inlined into function 'crypto_hashblocks_sha512_avx_constbranchindex_inner' that is compiled without support for 'avx2'
inner.c: X4 = _mm256_shuffle_epi8(X4,bigendian64);
inner.c: ^
inner.c: inner.c:103:37: error: always_inline function '_mm256_set_epi8' requires target feature 'sse4.2', but would be inlined into function 'crypto_hashblocks_sha512_avx_constbranchindex_inner' that is compiled without support for 'sse4.2'
inner.c: ...

Number of similar (compiler,implementation) pairs: 1, namely:
CompilerImplementations
clang -mcpu=native -O3 -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE avx

Compiler output

Implementation: dolbeau/intelavx2rorxasm
Security model: constbranchindex
Compiler: clang -march=native -O2 -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE
try.c: crypto_hashblocks_sha512.a(blocks.o): In function `crypto_hashblocks_sha512_dolbeau_intelavx2rorxasm_constbranchindex':
try.c: blocks.c:(.text+0x...): undefined reference to `sha512_rorx'
try.c: clang: error: linker command failed with exit code 1 (use -v to see invocation)

Number of similar (compiler,implementation) pairs: 5, namely:
CompilerImplementations
clang -march=native -O2 -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE dolbeau/intelavx2rorxasm
clang -march=native -O3 -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE dolbeau/intelavx2rorxasm
clang -march=native -O -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE dolbeau/intelavx2rorxasm
clang -march=native -Os -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE dolbeau/intelavx2rorxasm
clang -mcpu=native -O3 -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE dolbeau/intelavx2rorxasm

Compiler output

Implementation: dolbeau/intelavx2rorxasm
Security model: constbranchindex
Compiler: gcc -march=native -mtune=native -O2 -fomit-frame-pointer -fwrapv -fPIC -fPIE
try.c: crypto_hashblocks_sha512.a(blocks.o): In function `crypto_hashblocks_sha512_dolbeau_intelavx2rorxasm_constbranchindex':
try.c: blocks.c:(.text+0x...): undefined reference to `sha512_rorx'
try.c: collect2: error: ld returned 1 exit status

Number of similar (compiler,implementation) pairs: 4, namely:
CompilerImplementations
gcc -march=native -mtune=native -O2 -fomit-frame-pointer -fwrapv -fPIC -fPIE dolbeau/intelavx2rorxasm
gcc -march=native -mtune=native -O3 -fomit-frame-pointer -fwrapv -fPIC -fPIE dolbeau/intelavx2rorxasm
gcc -march=native -mtune=native -O -fomit-frame-pointer -fwrapv -fPIC -fPIE dolbeau/intelavx2rorxasm
gcc -march=native -mtune=native -Os -fomit-frame-pointer -fwrapv -fPIC -fPIE dolbeau/intelavx2rorxasm

Compiler output

Implementation: dolbeau/intelavxasm
Security model: constbranchindex
Compiler: clang -march=native -O2 -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE
try.c: crypto_hashblocks_sha512.a(blocks.o): In function `crypto_hashblocks_sha512_dolbeau_intelavxasm_constbranchindex':
try.c: blocks.c:(.text+0x...): undefined reference to `sha512_avx'
try.c: clang: error: linker command failed with exit code 1 (use -v to see invocation)

Number of similar (compiler,implementation) pairs: 5, namely:
CompilerImplementations
clang -march=native -O2 -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE dolbeau/intelavxasm
clang -march=native -O3 -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE dolbeau/intelavxasm
clang -march=native -O -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE dolbeau/intelavxasm
clang -march=native -Os -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE dolbeau/intelavxasm
clang -mcpu=native -O3 -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE dolbeau/intelavxasm

Compiler output

Implementation: dolbeau/intelavxasm
Security model: constbranchindex
Compiler: gcc -march=native -mtune=native -O2 -fomit-frame-pointer -fwrapv -fPIC -fPIE
try.c: crypto_hashblocks_sha512.a(blocks.o): In function `crypto_hashblocks_sha512_dolbeau_intelavxasm_constbranchindex':
try.c: blocks.c:(.text+0x...): undefined reference to `sha512_avx'
try.c: collect2: error: ld returned 1 exit status

Number of similar (compiler,implementation) pairs: 4, namely:
CompilerImplementations
gcc -march=native -mtune=native -O2 -fomit-frame-pointer -fwrapv -fPIC -fPIE dolbeau/intelavxasm
gcc -march=native -mtune=native -O3 -fomit-frame-pointer -fwrapv -fPIC -fPIE dolbeau/intelavxasm
gcc -march=native -mtune=native -O -fomit-frame-pointer -fwrapv -fPIC -fPIE dolbeau/intelavxasm
gcc -march=native -mtune=native -Os -fomit-frame-pointer -fwrapv -fPIC -fPIE dolbeau/intelavxasm

Compiler output

Implementation: dolbeau/intelsse4asm
Security model: constbranchindex
Compiler: clang -march=native -O2 -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE
try.c: crypto_hashblocks_sha512.a(blocks.o): In function `crypto_hashblocks_sha512_dolbeau_intelsse4asm_constbranchindex':
try.c: blocks.c:(.text+0x...): undefined reference to `sha512_sse4'
try.c: clang: error: linker command failed with exit code 1 (use -v to see invocation)

Number of similar (compiler,implementation) pairs: 5, namely:
CompilerImplementations
clang -march=native -O2 -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE dolbeau/intelsse4asm
clang -march=native -O3 -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE dolbeau/intelsse4asm
clang -march=native -O -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE dolbeau/intelsse4asm
clang -march=native -Os -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE dolbeau/intelsse4asm
clang -mcpu=native -O3 -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE dolbeau/intelsse4asm

Compiler output

Implementation: dolbeau/intelsse4asm
Security model: constbranchindex
Compiler: gcc -march=native -mtune=native -O2 -fomit-frame-pointer -fwrapv -fPIC -fPIE
try.c: crypto_hashblocks_sha512.a(blocks.o): In function `crypto_hashblocks_sha512_dolbeau_intelsse4asm_constbranchindex':
try.c: blocks.c:(.text+0x...): undefined reference to `sha512_sse4'
try.c: collect2: error: ld returned 1 exit status

Number of similar (compiler,implementation) pairs: 4, namely:
CompilerImplementations
gcc -march=native -mtune=native -O2 -fomit-frame-pointer -fwrapv -fPIC -fPIE dolbeau/intelsse4asm
gcc -march=native -mtune=native -O3 -fomit-frame-pointer -fwrapv -fPIC -fPIE dolbeau/intelsse4asm
gcc -march=native -mtune=native -O -fomit-frame-pointer -fwrapv -fPIC -fPIE dolbeau/intelsse4asm
gcc -march=native -mtune=native -Os -fomit-frame-pointer -fwrapv -fPIC -fPIE dolbeau/intelsse4asm