Implementation notes: aarch64, supercoplxc, crypto_stream/chacha20

Computer: supercoplxc
Architecture: aarch64
CPU ID: 410fd034
SUPERCOP version: 20190816
Operation: crypto_stream
Primitive: chacha20
Time | Object size, Test size | Implementation | Compiler | Benchmark date | SUPERCOP version
10640 | 4388 0 417489 912 824 | dolbeau/arm-neon | gcc_-O3_-fomit-frame-pointer | 20190902 | 20190816
10640 | 5288 0 419120 904 808 | dolbeau/arm-neon | gcc_-funroll-loops_-O2_-fomit-frame-pointer | 20190902 | 20190816
10640 | 5744 0 420233 912 824 | dolbeau/arm-neon | gcc_-funroll-loops_-O3_-fomit-frame-pointer | 20190902 | 20190816
10720 | 3792 0 117764 800 840 | dolbeau/arm-neon | clang_-O3_-fomit-frame-pointer_-Qunused-arguments | 20190902 | 20190816
10720 | 3808 0 117836 800 840 | dolbeau/arm-neon | clang_-O3_-fwrapv_-mavx2_-fomit-frame-pointer_-Qunused-arguments | 20190902 | 20190816
10720 | 3808 0 117836 800 840 | dolbeau/arm-neon | clang_-O3_-fwrapv_-mavx_-maes_-mpclmul_-fomit-frame-pointer_-Qunused-arguments | 20190902 | 20190816
10720 | 3732 0 117748 800 840 | dolbeau/arm-neon | clang_-mcpu=native_-mfpu=neon_-O3_-fomit-frame-pointer_-fwrapv_-Qunused-arguments | 20190902 | 20190816
10720 | 3712 0 415464 904 808 | dolbeau/arm-neon | gcc_-O2_-fomit-frame-pointer | 20190902 | 20190816
10800 | 3808 0 117836 800 840 | dolbeau/arm-neon | clang_-O3_-fwrapv_-mavx_-fomit-frame-pointer_-Qunused-arguments | 20190902 | 20190816
12480 | 3172 0 414086 888 800 | dolbeau/arm-neon | gcc_-Os_-fomit-frame-pointer | 20190902 | 20190816
12480 | 3172 0 414086 888 800 | dolbeau/arm-neon | gcc_-fno-schedule-insns_-Os_-fomit-frame-pointer | 20190902 | 20190816
12560 | 3168 0 414142 888 800 | dolbeau/arm-neon | gcc_-funroll-loops_-Os_-fomit-frame-pointer | 20190902 | 20190816
12560 | 3168 0 414142 888 800 | dolbeau/arm-neon | gcc_-funroll-loops_-fno-schedule-insns_-Os_-fomit-frame-pointer | 20190902 | 20190816
12880 | 5284 0 419176 904 808 | dolbeau/arm-neon | gcc_-funroll-loops_-fno-schedule-insns_-O2_-fomit-frame-pointer | 20190902 | 20190816
12880 | 5820 0 420233 912 824 | dolbeau/arm-neon | gcc_-funroll-loops_-fno-schedule-insns_-O3_-fomit-frame-pointer | 20190902 | 20190816
13520 | 3632 0 415312 904 808 | dolbeau/arm-neon | gcc_-fno-schedule-insns_-O2_-fomit-frame-pointer | 20190902 | 20190816
13520 | 4452 0 417505 912 824 | dolbeau/arm-neon | gcc_-fno-schedule-insns_-O3_-fomit-frame-pointer | 20190902 | 20190816
13680 | 3980 0 117972 800 840 | dolbeau/generic-gccsimd128 | clang_-O3_-fomit-frame-pointer_-Qunused-arguments | 20190902 | 20190816
13680 | 3992 0 118036 800 840 | dolbeau/generic-gccsimd128 | clang_-O3_-fwrapv_-mavx2_-fomit-frame-pointer_-Qunused-arguments | 20190902 | 20190816
13680 | 3992 0 118036 800 840 | dolbeau/generic-gccsimd128 | clang_-O3_-fwrapv_-mavx_-fomit-frame-pointer_-Qunused-arguments | 20190902 | 20190816
13680 | 3992 0 118036 800 840 | dolbeau/generic-gccsimd128 | clang_-O3_-fwrapv_-mavx_-maes_-mpclmul_-fomit-frame-pointer_-Qunused-arguments | 20190902 | 20190816
13680 | 3900 0 117932 800 840 | dolbeau/generic-gccsimd128 | clang_-mcpu=native_-mfpu=neon_-O3_-fomit-frame-pointer_-fwrapv_-Qunused-arguments | 20190902 | 20190816
13680 | 3856 0 415624 904 808 | dolbeau/generic-gccsimd128 | gcc_-O2_-fomit-frame-pointer | 20190902 | 20190816
13680 | 4524 0 417641 912 824 | dolbeau/generic-gccsimd128 | gcc_-O3_-fomit-frame-pointer | 20190902 | 20190816
13680 | 5432 0 419280 904 808 | dolbeau/generic-gccsimd128 | gcc_-funroll-loops_-O2_-fomit-frame-pointer | 20190902 | 20190816
13680 | 5904 0 420409 912 824 | dolbeau/generic-gccsimd128 | gcc_-funroll-loops_-O3_-fomit-frame-pointer | 20190902 | 20190816
16240 | 6196 0 421328 904 808 | dolbeau/arm-neon | gcc_-funroll-loops_-O_-fomit-frame-pointer | 20190902 | 20190816
16240 | 6196 0 421328 904 808 | dolbeau/arm-neon | gcc_-funroll-loops_-fno-schedule-insns_-O_-fomit-frame-pointer | 20190902 | 20190816
16320 | 4124 0 416032 904 808 | dolbeau/arm-neon | gcc_-O_-fomit-frame-pointer | 20190902 | 20190816
16320 | 4124 0 416032 904 808 | dolbeau/arm-neon | gcc_-fno-schedule-insns_-O_-fomit-frame-pointer | 20190902 | 20190816
16800 | 3288 0 414278 888 800 | dolbeau/generic-gccsimd128 | gcc_-funroll-loops_-Os_-fomit-frame-pointer | 20190902 | 20190816
16800 | 3288 0 414278 888 800 | dolbeau/generic-gccsimd128 | gcc_-funroll-loops_-fno-schedule-insns_-Os_-fomit-frame-pointer | 20190902 | 20190816
16880 | 2176 0 413912 904 808 | e/merged | gcc_-O2_-fomit-frame-pointer | 20190902 | 20190816
16880 | 3020 0 416113 912 824 | e/merged | gcc_-O3_-fomit-frame-pointer | 20190902 | 20190816
16880 | 3840 0 417664 904 808 | e/merged | gcc_-funroll-loops_-O2_-fomit-frame-pointer | 20190902 | 20190816
16880 | 4408 0 418881 912 824 | e/merged | gcc_-funroll-loops_-O3_-fomit-frame-pointer | 20190902 | 20190816
16960 | 3744 0 415440 904 808 | dolbeau/generic-gccsimd128 | gcc_-fno-schedule-insns_-O2_-fomit-frame-pointer | 20190902 | 20190816
16960 | 4548 0 417609 912 824 | dolbeau/generic-gccsimd128 | gcc_-fno-schedule-insns_-O3_-fomit-frame-pointer | 20190902 | 20190816
16960 | 5396 0 419304 904 808 | dolbeau/generic-gccsimd128 | gcc_-funroll-loops_-fno-schedule-insns_-O2_-fomit-frame-pointer | 20190902 | 20190816
16960 | 5924 0 424449 912 824 | dolbeau/generic-gccsimd128 | gcc_-funroll-loops_-fno-schedule-insns_-O3_-fomit-frame-pointer | 20190902 | 20190816
17040 | 3288 0 414214 888 800 | dolbeau/generic-gccsimd128 | gcc_-Os_-fomit-frame-pointer | 20190902 | 20190816
17040 | 3288 0 414214 888 800 | dolbeau/generic-gccsimd128 | gcc_-fno-schedule-insns_-Os_-fomit-frame-pointer | 20190902 | 20190816
17360 | 2120 0 413776 904 808 | e/merged | gcc_-fno-schedule-insns_-O2_-fomit-frame-pointer | 20190902 | 20190816
17360 | 2956 0 415993 912 824 | e/merged | gcc_-fno-schedule-insns_-O3_-fomit-frame-pointer | 20190902 | 20190816
17600 | 2748 0 415849 912 824 | e/ref | gcc_-O3_-fomit-frame-pointer | 20190902 | 20190816
17600 | 2764 0 415809 912 824 | e/ref | gcc_-fno-schedule-insns_-O3_-fomit-frame-pointer | 20190902 | 20190816
17600 | 4104 0 418585 912 824 | e/ref | gcc_-funroll-loops_-O3_-fomit-frame-pointer | 20190902 | 20190816
17600 | 2748 0 415849 912 824 | e/regs | gcc_-O3_-fomit-frame-pointer | 20190902 | 20190816
17600 | 2756 0 415801 912 824 | e/regs | gcc_-fno-schedule-insns_-O3_-fomit-frame-pointer | 20190902 | 20190816
17600 | 4104 0 418585 912 824 | e/regs | gcc_-funroll-loops_-O3_-fomit-frame-pointer | 20190902 | 20190816
17680 | 1892 0 412806 888 800 | e/merged | gcc_-Os_-fomit-frame-pointer | 20190902 | 20190816
17680 | 1892 0 412806 888 800 | e/merged | gcc_-fno-schedule-insns_-Os_-fomit-frame-pointer | 20190902 | 20190816
17680 | 1892 0 412870 888 800 | e/merged | gcc_-funroll-loops_-Os_-fomit-frame-pointer | 20190902 | 20190816
17680 | 1892 0 412870 888 800 | e/merged | gcc_-funroll-loops_-fno-schedule-insns_-Os_-fomit-frame-pointer | 20190902 | 20190816
18720 | 2564 0 116572 800 840 | e/merged | clang_-mcpu=native_-mfpu=neon_-O3_-fomit-frame-pointer_-fwrapv_-Qunused-arguments | 20190902 | 20190816
19040 | 2516 0 116532 800 840 | e/merged | clang_-O3_-fwrapv_-mavx2_-fomit-frame-pointer_-Qunused-arguments | 20190902 | 20190816
19040 | 2516 0 116532 800 840 | e/merged | clang_-O3_-fwrapv_-mavx_-fomit-frame-pointer_-Qunused-arguments | 20190902 | 20190816
19040 | 2516 0 116532 800 840 | e/merged | clang_-O3_-fwrapv_-mavx_-maes_-mpclmul_-fomit-frame-pointer_-Qunused-arguments | 20190902 | 20190816
19200 | 2472 0 116436 800 840 | e/merged | clang_-O3_-fomit-frame-pointer_-Qunused-arguments | 20190902 | 20190816
19360 | 4204 0 416128 904 808 | dolbeau/generic-gccsimd128 | gcc_-O_-fomit-frame-pointer | 20190902 | 20190816
19360 | 4204 0 416128 904 808 | dolbeau/generic-gccsimd128 | gcc_-fno-schedule-insns_-O_-fomit-frame-pointer | 20190902 | 20190816
19360 | 6292 0 421440 904 808 | dolbeau/generic-gccsimd128 | gcc_-funroll-loops_-O_-fomit-frame-pointer | 20190902 | 20190816
19360 | 6292 0 421440 904 808 | dolbeau/generic-gccsimd128 | gcc_-funroll-loops_-fno-schedule-insns_-O_-fomit-frame-pointer | 20190902 | 20190816
19360 | 2340 0 116348 800 840 | e/ref | clang_-mcpu=native_-mfpu=neon_-O3_-fomit-frame-pointer_-fwrapv_-Qunused-arguments | 20190902 | 20190816
19600 | 2412 0 116420 800 840 | e/regs | clang_-mcpu=native_-mfpu=neon_-O3_-fomit-frame-pointer_-fwrapv_-Qunused-arguments | 20190902 | 20190816
20240 | 2404 0 116420 800 840 | e/ref | clang_-O3_-fwrapv_-mavx2_-fomit-frame-pointer_-Qunused-arguments | 20190902 | 20190816
20240 | 2404 0 116420 800 840 | e/ref | clang_-O3_-fwrapv_-mavx_-fomit-frame-pointer_-Qunused-arguments | 20190902 | 20190816
20240 | 2404 0 116420 800 840 | e/ref | clang_-O3_-fwrapv_-mavx_-maes_-mpclmul_-fomit-frame-pointer_-Qunused-arguments | 20190902 | 20190816
20320 | 2420 0 116388 800 840 | e/ref | clang_-O3_-fomit-frame-pointer_-Qunused-arguments | 20190902 | 20190816
20880 | 2544 0 116508 800 840 | e/regs | clang_-O3_-fomit-frame-pointer_-Qunused-arguments | 20190902 | 20190816
20960 | 2512 0 116532 800 840 | e/regs | clang_-O3_-fwrapv_-mavx2_-fomit-frame-pointer_-Qunused-arguments | 20190902 | 20190816
20960 | 2512 0 116532 800 840 | e/regs | clang_-O3_-fwrapv_-mavx_-fomit-frame-pointer_-Qunused-arguments | 20190902 | 20190816
20960 | 2512 0 116532 800 840 | e/regs | clang_-O3_-fwrapv_-mavx_-maes_-mpclmul_-fomit-frame-pointer_-Qunused-arguments | 20190902 | 20190816
24560 | 3664 0 417544 904 808 | e/ref | gcc_-funroll-loops_-fno-schedule-insns_-O2_-fomit-frame-pointer | 20190902 | 20190816
24800 | 3660 0 417480 904 808 | e/ref | gcc_-funroll-loops_-O2_-fomit-frame-pointer | 20190902 | 20190816
25120 | 4340 0 418745 912 824 | e/merged | gcc_-funroll-loops_-fno-schedule-insns_-O3_-fomit-frame-pointer | 20190902 | 20190816
25200 | 3780 0 417664 904 808 | e/merged | gcc_-funroll-loops_-fno-schedule-insns_-O2_-fomit-frame-pointer | 20190902 | 20190816
26000 | 3644 0 417472 904 808 | e/regs | gcc_-funroll-loops_-O2_-fomit-frame-pointer | 20190902 | 20190816
26880 | 1964 0 413720 904 808 | e/regs | gcc_-O2_-fomit-frame-pointer | 20190902 | 20190816
26960 | 4100 0 418513 912 824 | e/regs | gcc_-funroll-loops_-fno-schedule-insns_-O3_-fomit-frame-pointer | 20190902 | 20190816
27040 | 4108 0 418521 912 824 | e/ref | gcc_-funroll-loops_-fno-schedule-insns_-O3_-fomit-frame-pointer | 20190902 | 20190816
28640 | 1972 0 413648 904 808 | e/regs | gcc_-fno-schedule-insns_-O2_-fomit-frame-pointer | 20190902 | 20190816
33760 | 3656 0 417536 904 808 | e/regs | gcc_-funroll-loops_-fno-schedule-insns_-O2_-fomit-frame-pointer | 20190902 | 20190816
34400 | 1824 0 412782 888 800 | e/regs | gcc_-funroll-loops_-Os_-fomit-frame-pointer | 20190902 | 20190816
34400 | 1824 0 412782 888 800 | e/regs | gcc_-funroll-loops_-fno-schedule-insns_-Os_-fomit-frame-pointer | 20190902 | 20190816
36320 | 1664 0 412566 888 800 | e/ref | gcc_-Os_-fomit-frame-pointer | 20190902 | 20190816
36320 | 1664 0 412566 888 800 | e/ref | gcc_-fno-schedule-insns_-Os_-fomit-frame-pointer | 20190902 | 20190816
36800 | 1664 0 412630 888 800 | e/ref | gcc_-funroll-loops_-Os_-fomit-frame-pointer | 20190902 | 20190816
36800 | 1664 0 412630 888 800 | e/ref | gcc_-funroll-loops_-fno-schedule-insns_-Os_-fomit-frame-pointer | 20190902 | 20190816
38080 | 1824 0 412718 888 800 | e/regs | gcc_-Os_-fomit-frame-pointer | 20190902 | 20190816
38080 | 1824 0 412718 888 800 | e/regs | gcc_-fno-schedule-insns_-Os_-fomit-frame-pointer | 20190902 | 20190816
38560 | 1964 0 413712 904 808 | e/ref | gcc_-O2_-fomit-frame-pointer | 20190902 | 20190816
38880 | 1948 0 413624 904 808 | e/ref | gcc_-fno-schedule-insns_-O2_-fomit-frame-pointer | 20190902 | 20190816
43040 | 3552 0 415448 904 808 | e/merged | gcc_-O_-fomit-frame-pointer | 20190902 | 20190816
43040 | 3552 0 415448 904 808 | e/merged | gcc_-fno-schedule-insns_-O_-fomit-frame-pointer | 20190902 | 20190816
43040 | 5144 0 420256 904 808 | e/merged | gcc_-funroll-loops_-O_-fomit-frame-pointer | 20190902 | 20190816
43040 | 5144 0 420256 904 808 | e/merged | gcc_-funroll-loops_-fno-schedule-insns_-O_-fomit-frame-pointer | 20190902 | 20190816
55360 | 4560 0 419672 904 808 | e/regs | gcc_-funroll-loops_-O_-fomit-frame-pointer | 20190902 | 20190816
55360 | 4560 0 419672 904 808 | e/regs | gcc_-funroll-loops_-fno-schedule-insns_-O_-fomit-frame-pointer | 20190902 | 20190816
55680 | 2892 0 414784 904 808 | e/regs | gcc_-O_-fomit-frame-pointer | 20190902 | 20190816
55680 | 2892 0 414784 904 808 | e/regs | gcc_-fno-schedule-insns_-O_-fomit-frame-pointer | 20190902 | 20190816
59040 | 2432 0 414328 904 808 | e/ref | gcc_-O_-fomit-frame-pointer | 20190902 | 20190816
59040 | 2432 0 414328 904 808 | e/ref | gcc_-fno-schedule-insns_-O_-fomit-frame-pointer | 20190902 | 20190816
63360 | 4612 0 419728 904 808 | e/ref | gcc_-funroll-loops_-O_-fomit-frame-pointer | 20190902 | 20190816
63360 | 4612 0 419728 904 808 | e/ref | gcc_-funroll-loops_-fno-schedule-insns_-O_-fomit-frame-pointer | 20190902 | 20190816
66080 | 9164 0 428376 888 816 | dolbeau/generic-gccsimd128 | cc | 20190902 | 20190816
66080 | 9164 0 428376 888 816 | dolbeau/generic-gccsimd128 | gcc | 20190902 | 20190816
66080 | 9164 0 428376 888 816 | dolbeau/generic-gccsimd128 | gcc_-funroll-loops | 20190902 | 20190816
137120 | 7192 0 426368 888 816 | e/merged | cc | 20190902 | 20190816
137120 | 7192 0 426368 888 816 | e/merged | gcc | 20190902 | 20190816
137120 | 7192 0 426368 888 816 | e/merged | gcc_-funroll-loops | 20190902 | 20190816
151440 | 13832 0 433016 888 816 | dolbeau/arm-neon | cc | 20190902 | 20190816
151440 | 13832 0 433016 888 816 | dolbeau/arm-neon | gcc | 20190902 | 20190816
151440 | 13832 0 433016 888 816 | dolbeau/arm-neon | gcc_-funroll-loops | 20190902 | 20190816
174080 | 5844 0 425032 888 816 | e/regs | cc | 20190902 | 20190816
174080 | 5844 0 425032 888 816 | e/regs | gcc | 20190902 | 20190816
174080 | 5844 0 425032 888 816 | e/regs | gcc_-funroll-loops | 20190902 | 20190816
221440 | 4192 0 423376 888 816 | e/ref | cc | 20190902 | 20190816
221440 | 4192 0 423376 888 816 | e/ref | gcc | 20190902 | 20190816
221440 | 4192 0 423376 888 816 | e/ref | gcc_-funroll-loops | 20190902 | 20190816

Compiler output

Implementation: dolbeau/arm-sve
Security model: unknown
Compiler: cc
chacha.c: chacha.c:12:10: fatal error: arm_sve.h: No such file or directory
chacha.c: #include <arm_sve.h>
chacha.c: ^~~~~~~~~~~
chacha.c: compilation terminated.

Number of similar (compiler,implementation) pairs: 19, namely:
Compiler | Implementations
cc | dolbeau/arm-sve
gcc | dolbeau/arm-sve
gcc -O2 -fomit-frame-pointer | dolbeau/arm-sve
gcc -O3 -fomit-frame-pointer | dolbeau/arm-sve
gcc -O -fomit-frame-pointer | dolbeau/arm-sve
gcc -Os -fomit-frame-pointer | dolbeau/arm-sve
gcc -fno-schedule-insns -O2 -fomit-frame-pointer | dolbeau/arm-sve
gcc -fno-schedule-insns -O3 -fomit-frame-pointer | dolbeau/arm-sve
gcc -fno-schedule-insns -O -fomit-frame-pointer | dolbeau/arm-sve
gcc -fno-schedule-insns -Os -fomit-frame-pointer | dolbeau/arm-sve
gcc -funroll-loops | dolbeau/arm-sve
gcc -funroll-loops -O2 -fomit-frame-pointer | dolbeau/arm-sve
gcc -funroll-loops -O3 -fomit-frame-pointer | dolbeau/arm-sve
gcc -funroll-loops -O -fomit-frame-pointer | dolbeau/arm-sve
gcc -funroll-loops -Os -fomit-frame-pointer | dolbeau/arm-sve
gcc -funroll-loops -fno-schedule-insns -O2 -fomit-frame-pointer | dolbeau/arm-sve
gcc -funroll-loops -fno-schedule-insns -O3 -fomit-frame-pointer | dolbeau/arm-sve
gcc -funroll-loops -fno-schedule-insns -O -fomit-frame-pointer | dolbeau/arm-sve
gcc -funroll-loops -fno-schedule-insns -Os -fomit-frame-pointer | dolbeau/arm-sve

Compiler output

Implementation: dolbeau/arm-sve
Security model: unknown
Compiler: clang -O3 -fomit-frame-pointer -Qunused-arguments
chacha.c: chacha.c:12:10: fatal error: 'arm_sve.h' file not found
chacha.c: #include <arm_sve.h>
chacha.c: ^~~~~~~~~~~
chacha.c: 1 error generated.

Number of similar (compiler,implementation) pairs: 5, namely:
Compiler | Implementations
clang -O3 -fomit-frame-pointer -Qunused-arguments | dolbeau/arm-sve
clang -O3 -fwrapv -mavx2 -fomit-frame-pointer -Qunused-arguments | dolbeau/arm-sve
clang -O3 -fwrapv -mavx -fomit-frame-pointer -Qunused-arguments | dolbeau/arm-sve
clang -O3 -fwrapv -mavx -maes -mpclmul -fomit-frame-pointer -Qunused-arguments | dolbeau/arm-sve
clang -mcpu=native -mfpu=neon -O3 -fomit-frame-pointer -fwrapv -Qunused-arguments | dolbeau/arm-sve
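
Both gcc and clang stop at the same point for dolbeau/arm-sve: the `#include <arm_sve.h>` on line 12 of chacha.c names a header that this toolchain does not provide. The following is a minimal sketch of a compile-time guard for that situation, not code from the dolbeau/arm-sve implementation. The macro `HAVE_SVE_HEADER` and the function `sve_header_available` are hypothetical names introduced for illustration. It assumes a compiler that supports `__has_include` (a GCC and Clang extension, standardized in C23) and that defines the ACLE macro `__ARM_FEATURE_SVE` when SVE code generation is enabled.

```c
/* Hypothetical guard: include the SVE intrinsics header only when the
 * compiler ships it and is actually targeting SVE. */
#if defined(__has_include)
#  if __has_include(<arm_sve.h>) && defined(__ARM_FEATURE_SVE)
#    include <arm_sve.h>
#    define HAVE_SVE_HEADER 1
#  endif
#endif

/* Fallback when the header is missing or SVE is not enabled. */
#ifndef HAVE_SVE_HEADER
#  define HAVE_SVE_HEADER 0
#endif

/* Returns 1 if arm_sve.h was included at compile time, 0 otherwise. */
int sve_header_available(void) {
    return HAVE_SVE_HEADER;
}
```

With a guard of this kind a build without SVE support would compile and report 0 from `sve_header_available()` instead of failing with a fatal error. Whether that is preferable to a hard failure depends on how the benchmark framework is expected to discover unusable implementations.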

Compiler output

Implementation: krovetz/vec128
Security model: unknown
Compiler: cc
stream.c: stream.c:80:2: error: #error -- Implementation supports only machines with neon, altivec or SSE2
stream.c: #error -- Implementation supports only machines with neon, altivec or SSE2
stream.c: ^~~~~
stream.c: stream.c: In function 'crypto_stream_chacha20_krovetz_vec128_xor':
stream.c: stream.c:151:14: warning: implicit declaration of function 'NONCE' [-Wimplicit-function-declaration]
stream.c: vec s3 = NONCE(np);
stream.c: ^~~~~
stream.c: stream.c:151:14: error: incompatible types when initializing type 'vec' {aka '__vector(4) unsigned int'} using type 'int'
stream.c: stream.c:91:19: error: 'VBPI' undeclared (first use in this function); did you mean 'BPI'?
stream.c: #define BPI (VBPI + GPR_TOO) /* Blocks computed per loop iteration */
stream.c: ^~~~
stream.c: stream.c:152:36: note: in expansion of macro 'BPI'
stream.c: for (iters = 0; iters < inlen/(BPI*64); iters++) {
stream.c: ^~~
stream.c: stream.c:91:19: note: each undeclared identifier is reported only once for each function it appears in
stream.c: #define BPI (VBPI + GPR_TOO) /* Blocks computed per loop iteration */
stream.c: ^~~~
stream.c: stream.c:152:36: note: in expansion of macro 'BPI'
stream.c: for (iters = 0; iters < inlen/(BPI*64); iters++) {
stream.c: ^~~
stream.c: stream.c:91:26: error: 'GPR_TOO' undeclared (first use in this function)
stream.c: #define BPI (VBPI + GPR_TOO) /* Blocks computed per loop iteration */
stream.c: ^~~~~~~
stream.c: stream.c:152:36: note: in expansion of macro 'BPI'
stream.c: for (iters = 0; iters < inlen/(BPI*64); iters++) {
stream.c: ...

Number of similar (compiler,implementation) pairs: 19, namely:
Compiler | Implementations
cc | krovetz/vec128
gcc | krovetz/vec128
gcc -O2 -fomit-frame-pointer | krovetz/vec128
gcc -O3 -fomit-frame-pointer | krovetz/vec128
gcc -O -fomit-frame-pointer | krovetz/vec128
gcc -Os -fomit-frame-pointer | krovetz/vec128
gcc -fno-schedule-insns -O2 -fomit-frame-pointer | krovetz/vec128
gcc -fno-schedule-insns -O3 -fomit-frame-pointer | krovetz/vec128
gcc -fno-schedule-insns -O -fomit-frame-pointer | krovetz/vec128
gcc -fno-schedule-insns -Os -fomit-frame-pointer | krovetz/vec128
gcc -funroll-loops | krovetz/vec128
gcc -funroll-loops -O2 -fomit-frame-pointer | krovetz/vec128
gcc -funroll-loops -O3 -fomit-frame-pointer | krovetz/vec128
gcc -funroll-loops -O -fomit-frame-pointer | krovetz/vec128
gcc -funroll-loops -Os -fomit-frame-pointer | krovetz/vec128
gcc -funroll-loops -fno-schedule-insns -O2 -fomit-frame-pointer | krovetz/vec128
gcc -funroll-loops -fno-schedule-insns -O3 -fomit-frame-pointer | krovetz/vec128
gcc -funroll-loops -fno-schedule-insns -O -fomit-frame-pointer | krovetz/vec128
gcc -funroll-loops -fno-schedule-insns -Os -fomit-frame-pointer | krovetz/vec128

Compiler output

Implementation: krovetz/vec128
Security model: unknown
Compiler: clang -O3 -fomit-frame-pointer -Qunused-arguments
stream.c: stream.c:80:2: error: -- Implementation supports only machines with neon, altivec or SSE2
stream.c: #error -- Implementation supports only machines with neon, altivec or SSE2
stream.c: ^
stream.c: stream.c:151:14: warning: implicit declaration of function 'NONCE' is invalid in C99 [-Wimplicit-function-declaration]
stream.c: vec s3 = NONCE(np);
stream.c: ^
stream.c: stream.c:151:9: error: initializing 'vec' (vector of 4 'unsigned int' values) with an expression of incompatible type 'int'
stream.c: vec s3 = NONCE(np);
stream.c: ^ ~~~~~~~~~
stream.c: stream.c:152:36: error: use of undeclared identifier 'VBPI'
stream.c: for (iters = 0; iters < inlen/(BPI*64); iters++) {
stream.c: ^
stream.c: stream.c:91:19: note: expanded from macro 'BPI'
stream.c: #define BPI (VBPI + GPR_TOO) /* Blocks computed per loop iteration */
stream.c: ^
stream.c: stream.c:152:36: error: use of undeclared identifier 'GPR_TOO'
stream.c: stream.c:91:26: note: expanded from macro 'BPI'
stream.c: #define BPI (VBPI + GPR_TOO) /* Blocks computed per loop iteration */
stream.c: ^
stream.c: stream.c:155:19: error: use of undeclared identifier 'ONE'
stream.c: v7 = v3 + ONE;
stream.c: ^
stream.c: stream.c:176:13: warning: implicit declaration of function 'ROTW16' is invalid in C99 [-Wimplicit-function-declaration]
stream.c: DQROUND_VECTORS(v0,v1,v2,v3)
stream.c: ^
stream.c: ...

Number of similar (compiler,implementation) pairs: 5, namely:
Compiler | Implementations
clang -O3 -fomit-frame-pointer -Qunused-arguments | krovetz/vec128
clang -O3 -fwrapv -mavx2 -fomit-frame-pointer -Qunused-arguments | krovetz/vec128
clang -O3 -fwrapv -mavx -fomit-frame-pointer -Qunused-arguments | krovetz/vec128
clang -O3 -fwrapv -mavx -maes -mpclmul -fomit-frame-pointer -Qunused-arguments | krovetz/vec128
clang -mcpu=native -mfpu=neon -O3 -fomit-frame-pointer -fwrapv -Qunused-arguments | krovetz/vec128
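
In the krovetz/vec128 logs, the diagnostics after the first one look like consequences of the `#error` on line 80 of stream.c. Names such as `NONCE`, `VBPI`, `GPR_TOO`, `ONE` and `ROTW16` are presumably defined only inside the per-architecture branches that the guard rejected, so the compiler then reports them as undeclared. One possible cause on aarch64, stated here as an assumption and not as a reading of stream.c, is a guard that tests only the 32-bit ARM macro `__ARM_NEON__`, while AArch64 compilers generally define `__ARM_NEON`. A minimal sketch of a backend selection that accepts both spellings follows; `VEC_BACKEND` and `vec_backend_name` are hypothetical names.

```c
/* Hypothetical backend selection. Accepts both spellings of the NEON
 * feature macro so that 32-bit ARM and AArch64 builds take the same branch. */
#if defined(__ARM_NEON) || defined(__ARM_NEON__)
#  define VEC_BACKEND "neon"
#elif defined(__ALTIVEC__)
#  define VEC_BACKEND "altivec"
#elif defined(__SSE2__)
#  define VEC_BACKEND "sse2"
#else
/* A real implementation might use #error here; this sketch falls back
 * to a marker string so that it compiles on any target. */
#  define VEC_BACKEND "none"
#endif

/* Returns the name of the vector backend chosen at compile time. */
const char *vec_backend_name(void) {
    return VEC_BACKEND;
}
```

On the aarch64 machine benchmarked here this selection would be expected to pick the "neon" branch, provided the compiler defines `__ARM_NEON` as assumed.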