Implementation notes: aarch64, pi3bplus, crypto_stream/chacha12

Computer: pi3bplus
Architecture: aarch64
CPU ID: 410fd034
SUPERCOP version: 20200702
Operation: crypto_stream
Primitive: chacha12
TimeObject sizeTest sizeImplementationCompilerBenchmark dateSUPERCOP version
55524436 0 415800 856 808dolbeau/arm-neongcc_-march=native_-mtune=native_-O3_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE2020042220200417
55683776 0 413799 848 792dolbeau/arm-neongcc_-march=native_-mtune=native_-O2_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE2020042220200417
55963732 0 115805 768 808dolbeau/arm-neonclang_-mcpu=native_-O3_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2020042220200417
63873172 0 412271 832 784dolbeau/arm-neongcc_-march=native_-mtune=native_-Os_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE2020042220200417
84704132 0 414223 848 792dolbeau/arm-neongcc_-march=native_-mtune=native_-O_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE2020042220200417
86833076 0 414432 856 808e/mergedgcc_-march=native_-mtune=native_-O3_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE2020042220200417
87082224 0 412231 848 792e/mergedgcc_-march=native_-mtune=native_-O2_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE2020042220200417
90211892 0 410999 832 784e/mergedgcc_-march=native_-mtune=native_-Os_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE2020042220200417
90432772 0 414144 856 808dolbeau/mipsel-msagcc_-march=native_-mtune=native_-O3_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE2020042220200417
91072780 0 414144 856 808e/refgcc_-march=native_-mtune=native_-O3_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE2020042220200417
91072780 0 414144 856 808e/regsgcc_-march=native_-mtune=native_-O3_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE2020042220200417
101452564 0 114621 768 808e/mergedclang_-mcpu=native_-O3_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2020042220200417
105452340 0 114397 768 808e/refclang_-mcpu=native_-O3_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2020042220200417
105512340 0 114405 768 808dolbeau/mipsel-msaclang_-mcpu=native_-O3_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2020042220200417
106652412 0 114477 768 808e/regsclang_-mcpu=native_-O3_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2020042220200417
166322012 0 412047 848 792e/regsgcc_-march=native_-mtune=native_-O2_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE2020042220200417
231113556 0 413639 848 792e/mergedgcc_-march=native_-mtune=native_-O_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE2020042220200417
240921664 0 410759 832 784e/refgcc_-march=native_-mtune=native_-Os_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE2020042220200417
241191664 0 410775 832 784dolbeau/mipsel-msagcc_-march=native_-mtune=native_-Os_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE2020042220200417
255041824 0 410903 832 784e/regsgcc_-march=native_-mtune=native_-Os_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE2020042220200417
258482032 0 412063 848 792dolbeau/mipsel-msagcc_-march=native_-mtune=native_-O2_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE2020042220200417
258581996 0 412023 848 792e/refgcc_-march=native_-mtune=native_-O2_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE2020042220200417
337452924 0 412999 848 792e/regsgcc_-march=native_-mtune=native_-O_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE2020042220200417
399832424 0 412519 848 792dolbeau/mipsel-msagcc_-march=native_-mtune=native_-O_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE2020042220200417
400142424 0 412503 848 792e/refgcc_-march=native_-mtune=native_-O_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE2020042220200417

Compiler output

Implementation: crypto_stream/chacha12/amd64-ssse3
Compiler: clang -mcpu=native -O3 -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE
chacha.s: chacha.s:19:5: error: unknown token in expression
chacha.s: mov %rsp,%r11
chacha.s: ^
chacha.s: chacha.s:19:5: error: invalid operand
chacha.s: mov %rsp,%r11
chacha.s: ^
chacha.s: chacha.s:20:5: error: invalid token in expression
chacha.s: and $31,%r11
chacha.s: ^
chacha.s: chacha.s:20:5: error: invalid operand
chacha.s: and $31,%r11
chacha.s: ^
chacha.s: chacha.s:21:5: error: invalid token in expression
chacha.s: add $384,%r11
chacha.s: ^
chacha.s: chacha.s:21:5: error: invalid operand
chacha.s: add $384,%r11
chacha.s: ^
chacha.s: chacha.s:22:5: error: unknown token in expression
chacha.s: sub %r11,%rsp
chacha.s: ^
chacha.s: chacha.s:22:5: error: invalid operand
chacha.s: sub %r11,%rsp
chacha.s: ^
chacha.s: chacha.s:23:6: error: unknown token in expression
chacha.s: ...

Number of similar (compiler,implementation) pairs: 1, namely:
CompilerImplementations
clang -mcpu=native -O3 -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE amd64-ssse3

Compiler output

Implementation: crypto_stream/chacha12/goll_gueron
Compiler: clang -mcpu=native -O3 -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE
stream.c: In file included from stream.c:11:
stream.c: In file included from /usr/lib/llvm-7/lib/clang/7.0.1/include/immintrin.h:28:
stream.c: /usr/lib/llvm-7/lib/clang/7.0.1/include/mmintrin.h:64:12: error: invalid conversion between vector type '__m64' (vector of 1 'long long' value) and integer type 'int' of different size
stream.c: return (__m64)__builtin_ia32_vec_init_v2si(__i, 0);
stream.c: ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
stream.c: /usr/lib/llvm-7/lib/clang/7.0.1/include/mmintrin.h:143:12: error: invalid conversion between vector type '__m64' (vector of 1 'long long' value) and integer type 'int' of different size
stream.c: return (__m64)__builtin_ia32_packsswb((__v4hi)__m1, (__v4hi)__m2);
stream.c: ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
stream.c: /usr/lib/llvm-7/lib/clang/7.0.1/include/mmintrin.h:173:12: error: invalid conversion between vector type '__m64' (vector of 1 'long long' value) and integer type 'int' of different size
stream.c: return (__m64)__builtin_ia32_packssdw((__v2si)__m1, (__v2si)__m2);
stream.c: ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
stream.c: /usr/lib/llvm-7/lib/clang/7.0.1/include/mmintrin.h:203:12: error: invalid conversion between vector type '__m64' (vector of 1 'long long' value) and integer type 'int' of different size
stream.c: return (__m64)__builtin_ia32_packuswb((__v4hi)__m1, (__v4hi)__m2);
stream.c: ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
stream.c: /usr/lib/llvm-7/lib/clang/7.0.1/include/mmintrin.h:230:12: error: invalid conversion between vector type '__m64' (vector of 1 'long long' value) and integer type 'int' of different size
stream.c: return (__m64)__builtin_ia32_punpckhbw((__v8qi)__m1, (__v8qi)__m2);
stream.c: ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
stream.c: /usr/lib/llvm-7/lib/clang/7.0.1/include/mmintrin.h:253:12: error: invalid conversion between vector type '__m64' (vector of 1 'long long' value) and integer type 'int' of different size
stream.c: return (__m64)__builtin_ia32_punpckhwd((__v4hi)__m1, (__v4hi)__m2);
stream.c: ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
stream.c: /usr/lib/llvm-7/lib/clang/7.0.1/include/mmintrin.h:274:12: error: invalid conversion between vector type '__m64' (vector of 1 'long long' value) and integer type 'int' of different size
stream.c: return (__m64)__builtin_ia32_punpckhdq((__v2si)__m1, (__v2si)__m2);
stream.c: ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
stream.c: /usr/lib/llvm-7/lib/clang/7.0.1/include/mmintrin.h:301:12: error: invalid conversion between vector type '__m64' (vector of 1 'long long' value) and integer type 'int' of different size
stream.c: return (__m64)__builtin_ia32_punpcklbw((__v8qi)__m1, (__v8qi)__m2);
stream.c: ...

Number of similar (compiler,implementation) pairs: 1, namely:
CompilerImplementations
clang -mcpu=native -O3 -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE goll_gueron

Compiler output

Implementation: crypto_stream/chacha12/krovetz/avx2
Compiler: clang -mcpu=native -O3 -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE
stream.c: In file included from stream.c:8:
stream.c: In file included from /usr/lib/llvm-7/lib/clang/7.0.1/include/immintrin.h:28:
stream.c: /usr/lib/llvm-7/lib/clang/7.0.1/include/mmintrin.h:64:12: error: invalid conversion between vector type '__m64' (vector of 1 'long long' value) and integer type 'int' of different size
stream.c: return (__m64)__builtin_ia32_vec_init_v2si(__i, 0);
stream.c: ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
stream.c: /usr/lib/llvm-7/lib/clang/7.0.1/include/mmintrin.h:143:12: error: invalid conversion between vector type '__m64' (vector of 1 'long long' value) and integer type 'int' of different size
stream.c: return (__m64)__builtin_ia32_packsswb((__v4hi)__m1, (__v4hi)__m2);
stream.c: ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
stream.c: /usr/lib/llvm-7/lib/clang/7.0.1/include/mmintrin.h:173:12: error: invalid conversion between vector type '__m64' (vector of 1 'long long' value) and integer type 'int' of different size
stream.c: return (__m64)__builtin_ia32_packssdw((__v2si)__m1, (__v2si)__m2);
stream.c: ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
stream.c: /usr/lib/llvm-7/lib/clang/7.0.1/include/mmintrin.h:203:12: error: invalid conversion between vector type '__m64' (vector of 1 'long long' value) and integer type 'int' of different size
stream.c: return (__m64)__builtin_ia32_packuswb((__v4hi)__m1, (__v4hi)__m2);
stream.c: ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
stream.c: /usr/lib/llvm-7/lib/clang/7.0.1/include/mmintrin.h:230:12: error: invalid conversion between vector type '__m64' (vector of 1 'long long' value) and integer type 'int' of different size
stream.c: return (__m64)__builtin_ia32_punpckhbw((__v8qi)__m1, (__v8qi)__m2);
stream.c: ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
stream.c: /usr/lib/llvm-7/lib/clang/7.0.1/include/mmintrin.h:253:12: error: invalid conversion between vector type '__m64' (vector of 1 'long long' value) and integer type 'int' of different size
stream.c: return (__m64)__builtin_ia32_punpckhwd((__v4hi)__m1, (__v4hi)__m2);
stream.c: ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
stream.c: /usr/lib/llvm-7/lib/clang/7.0.1/include/mmintrin.h:274:12: error: invalid conversion between vector type '__m64' (vector of 1 'long long' value) and integer type 'int' of different size
stream.c: return (__m64)__builtin_ia32_punpckhdq((__v2si)__m1, (__v2si)__m2);
stream.c: ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
stream.c: /usr/lib/llvm-7/lib/clang/7.0.1/include/mmintrin.h:301:12: error: invalid conversion between vector type '__m64' (vector of 1 'long long' value) and integer type 'int' of different size
stream.c: return (__m64)__builtin_ia32_punpcklbw((__v8qi)__m1, (__v8qi)__m2);
stream.c: ...

Number of similar (compiler,implementation) pairs: 1, namely:
CompilerImplementations
clang -mcpu=native -O3 -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE krovetz/avx2

Compiler output

Implementation: crypto_stream/chacha12/krovetz/vec128
Compiler: clang -mcpu=native -O3 -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE
stream.c: stream.c:80:2: error: -- Implementation supports only machines with neon, altivec or SSE2
stream.c: #error -- Implementation supports only machines with neon, altivec or SSE2
stream.c: ^
stream.c: stream.c:151:14: warning: implicit declaration of function 'NONCE' is invalid in C99 [-Wimplicit-function-declaration]
stream.c: vec s3 = NONCE(np);
stream.c: ^
stream.c: stream.c:151:9: error: initializing 'vec' (vector of 4 'unsigned int' values) with an expression of incompatible type 'int'
stream.c: vec s3 = NONCE(np);
stream.c: ^ ~~~~~~~~~
stream.c: stream.c:152:36: error: use of undeclared identifier 'VBPI'
stream.c: for (iters = 0; iters < inlen/(BPI*64); iters++) {
stream.c: ^
stream.c: stream.c:91:19: note: expanded from macro 'BPI'
stream.c: #define BPI (VBPI + GPR_TOO) /* Blocks computed per loop iteration */
stream.c: ^
stream.c: stream.c:152:36: error: use of undeclared identifier 'GPR_TOO'
stream.c: stream.c:91:26: note: expanded from macro 'BPI'
stream.c: #define BPI (VBPI + GPR_TOO) /* Blocks computed per loop iteration */
stream.c: ^
stream.c: stream.c:155:19: error: use of undeclared identifier 'ONE'
stream.c: v7 = v3 + ONE;
stream.c: ^
stream.c: stream.c:176:13: warning: implicit declaration of function 'ROTW16' is invalid in C99 [-Wimplicit-function-declaration]
stream.c: DQROUND_VECTORS(v0,v1,v2,v3)
stream.c: ^
stream.c: ...

Number of similar (compiler,implementation) pairs: 1, namely:
CompilerImplementations
clang -mcpu=native -O3 -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE krovetz/vec128

Compiler output

Implementation: crypto_stream/chacha12/amd64-ssse3
Compiler: gcc -march=native -mtune=native -O2 -fomit-frame-pointer -fwrapv -fPIC -fPIE
chacha.s: chacha.s: Assembler messages:
chacha.s: chacha.s:19: Error: operand 1 must be an integer register -- `mov %rsp,%r11'
chacha.s: chacha.s:20: Error: operand 1 must be an integer or stack pointer register -- `and $31,%r11'
chacha.s: chacha.s:21: Error: operand 1 must be an integer or stack pointer register -- `add $384,%r11'
chacha.s: chacha.s:22: Error: operand 1 must be an integer or stack pointer register -- `sub %r11,%rsp'
chacha.s: chacha.s:23: Error: operand 1 must be an integer register -- `mov %rdi,%r8'
chacha.s: chacha.s:24: Error: operand 1 must be an integer register -- `mov %rsi,%rsi'
chacha.s: chacha.s:25: Error: operand 1 must be an integer register -- `mov %rsi,%rdi'
chacha.s: chacha.s:26: Error: operand 1 must be an integer register -- `mov %rdx,%rdx'
chacha.s: chacha.s:27: Error: operand 1 must be an integer or stack pointer register -- `cmp $0,%rdx'
chacha.s: chacha.s:29: Error: unknown mnemonic `jbe' -- `jbe ._done'
chacha.s: chacha.s:31: Error: operand 1 must be an integer register -- `mov $0,%rax'
chacha.s: chacha.s:33: Error: operand 1 must be an integer register -- `mov %rdx,%rcx'
chacha.s: chacha.s:35: Error: unknown mnemonic `rep' -- `rep stosb'
chacha.s: chacha.s:37: Error: operand 1 must be an integer or stack pointer register -- `sub %rdx,%rdi'
chacha.s: chacha.s:39: Error: unknown mnemonic `jmp' -- `jmp ._start'
chacha.s: chacha.s:47: Error: operand 1 must be an integer register -- `mov %rsp,%r11'
chacha.s: chacha.s:48: Error: operand 1 must be an integer or stack pointer register -- `and $31,%r11'
chacha.s: chacha.s:49: Error: operand 1 must be an integer or stack pointer register -- `add $384,%r11'
chacha.s: chacha.s:50: Error: operand 1 must be an integer or stack pointer register -- `sub %r11,%rsp'
chacha.s: chacha.s:52: Error: operand 1 must be an integer register -- `mov %rdi,%r8'
chacha.s: chacha.s:54: Error: operand 1 must be an integer register -- `mov %rsi,%rsi'
chacha.s: chacha.s:56: Error: operand 1 must be an integer register -- `mov %rdx,%rdi'
chacha.s: chacha.s:58: Error: operand 1 must be an integer register -- `mov %rcx,%rdx'
chacha.s: chacha.s:60: Error: operand 1 must be an integer or stack pointer register -- `cmp $0,%rdx'
chacha.s: ...

Number of similar (compiler,implementation) pairs: 4, namely:
CompilerImplementations
gcc -march=native -mtune=native -O2 -fomit-frame-pointer -fwrapv -fPIC -fPIE amd64-ssse3
gcc -march=native -mtune=native -O3 -fomit-frame-pointer -fwrapv -fPIC -fPIE amd64-ssse3
gcc -march=native -mtune=native -O -fomit-frame-pointer -fwrapv -fPIC -fPIE amd64-ssse3
gcc -march=native -mtune=native -Os -fomit-frame-pointer -fwrapv -fPIC -fPIE amd64-ssse3

Compiler output

Implementation: crypto_stream/chacha12/goll_gueron
Compiler: gcc -march=native -mtune=native -O2 -fomit-frame-pointer -fwrapv -fPIC -fPIE
stream.c: stream.c:11:10: fatal error: immintrin.h: No such file or directory
stream.c: #include <immintrin.h>
stream.c: ^~~~~~~~~~~~~
stream.c: compilation terminated.

Number of similar (compiler,implementation) pairs: 4, namely:
CompilerImplementations
gcc -march=native -mtune=native -O2 -fomit-frame-pointer -fwrapv -fPIC -fPIE goll_gueron
gcc -march=native -mtune=native -O3 -fomit-frame-pointer -fwrapv -fPIC -fPIE goll_gueron
gcc -march=native -mtune=native -O -fomit-frame-pointer -fwrapv -fPIC -fPIE goll_gueron
gcc -march=native -mtune=native -Os -fomit-frame-pointer -fwrapv -fPIC -fPIE goll_gueron

Compiler output

Implementation: crypto_stream/chacha12/krovetz/vec128
Compiler: gcc -march=native -mtune=native -O2 -fomit-frame-pointer -fwrapv -fPIC -fPIE
stream.c: stream.c:80:2: error: #error -- Implementation supports only machines with neon, altivec or SSE2
stream.c: #error -- Implementation supports only machines with neon, altivec or SSE2
stream.c: ^~~~~
stream.c: stream.c: In function 'crypto_stream_chacha12_krovetz_vec128_xor':
stream.c: stream.c:151:14: warning: implicit declaration of function 'NONCE' [-Wimplicit-function-declaration]
stream.c: vec s3 = NONCE(np);
stream.c: ^~~~~
stream.c: stream.c:151:14: error: incompatible types when initializing type 'vec' {aka '__vector(4) unsigned int'} using type 'int'
stream.c: stream.c:91:19: error: 'VBPI' undeclared (first use in this function); did you mean 'BPI'?
stream.c: #define BPI (VBPI + GPR_TOO) /* Blocks computed per loop iteration */
stream.c: ^~~~
stream.c: stream.c:152:36: note: in expansion of macro 'BPI'
stream.c: for (iters = 0; iters < inlen/(BPI*64); iters++) {
stream.c: ^~~
stream.c: stream.c:91:19: note: each undeclared identifier is reported only once for each function it appears in
stream.c: #define BPI (VBPI + GPR_TOO) /* Blocks computed per loop iteration */
stream.c: ^~~~
stream.c: stream.c:152:36: note: in expansion of macro 'BPI'
stream.c: for (iters = 0; iters < inlen/(BPI*64); iters++) {
stream.c: ^~~
stream.c: stream.c:91:26: error: 'GPR_TOO' undeclared (first use in this function)
stream.c: #define BPI (VBPI + GPR_TOO) /* Blocks computed per loop iteration */
stream.c: ^~~~~~~
stream.c: stream.c:152:36: note: in expansion of macro 'BPI'
stream.c: for (iters = 0; iters < inlen/(BPI*64); iters++) {
stream.c: ...

Number of similar (compiler,implementation) pairs: 4, namely:
CompilerImplementations
gcc -march=native -mtune=native -O2 -fomit-frame-pointer -fwrapv -fPIC -fPIE krovetz/vec128
gcc -march=native -mtune=native -O3 -fomit-frame-pointer -fwrapv -fPIC -fPIE krovetz/vec128
gcc -march=native -mtune=native -O -fomit-frame-pointer -fwrapv -fPIC -fPIE krovetz/vec128
gcc -march=native -mtune=native -Os -fomit-frame-pointer -fwrapv -fPIC -fPIE krovetz/vec128

Compiler output

Implementation: crypto_stream/chacha12/krovetz/avx2
Compiler: gcc -march=native -mtune=native -O2 -fomit-frame-pointer -fwrapv -fPIC -fPIE
stream.c: stream.c:8:10: fatal error: immintrin.h: No such file or directory
stream.c: #include <immintrin.h>
stream.c: ^~~~~~~~~~~~~
stream.c: compilation terminated.

Number of similar (compiler,implementation) pairs: 4, namely:
CompilerImplementations
gcc -march=native -mtune=native -O2 -fomit-frame-pointer -fwrapv -fPIC -fPIE krovetz/avx2
gcc -march=native -mtune=native -O3 -fomit-frame-pointer -fwrapv -fPIC -fPIE krovetz/avx2
gcc -march=native -mtune=native -O -fomit-frame-pointer -fwrapv -fPIC -fPIE krovetz/avx2
gcc -march=native -mtune=native -Os -fomit-frame-pointer -fwrapv -fPIC -fPIE krovetz/avx2