Implementation notes: aarch64, hikey970, crypto_stream/chacha8

Computer: hikey970
Architecture: aarch64
CPU ID: 410fd034
SUPERCOP version: 20191221
Operation: crypto_stream
Primitive: chacha8
Time   Object size  Test size      Implementation      Compiler                                                                      Benchmark date  SUPERCOP version
3842   3092 0 4     14663 928 808  dolbeau/arm-neon    gcc_-march=native_-mtune=native_-O2_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE  20191213        20191017
3842   4360 0 4     17616 936 840  dolbeau/arm-neon    gcc_-march=native_-mtune=native_-O3_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE  20191213        20191017
3842   2884 0 4     13699 912 808  dolbeau/arm-neon    gcc_-march=native_-mtune=native_-Os_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE  20191213        20191017
5763   3784 0 4     15431 928 808  dolbeau/arm-neon    gcc_-march=native_-mtune=native_-O_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE   20191213        20191017
7684   2824 0 4     16080 936 840  dolbeau/mipsel-msa  gcc_-march=native_-mtune=native_-O3_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE  20191213        20191017
7684   2252 0 4     13815 928 808  e/merged            gcc_-march=native_-mtune=native_-O2_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE  20191213        20191017
7684   3464 0 4     16704 936 840  e/merged            gcc_-march=native_-mtune=native_-O3_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE  20191213        20191017
7684   2048 0 4     12851 912 808  e/merged            gcc_-march=native_-mtune=native_-Os_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE  20191213        20191017
7684   2812 0 4     16056 936 840  e/ref               gcc_-march=native_-mtune=native_-O3_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE  20191213        20191017
7684   2780 0 4     16024 936 840  e/regs              gcc_-march=native_-mtune=native_-O3_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE  20191213        20191017
9605   3256 0 4     14895 928 808  e/merged            gcc_-march=native_-mtune=native_-O_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE   20191213        20191017
11526  1936 0 4     12739 912 808  e/regs              gcc_-march=native_-mtune=native_-Os_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE  20191213        20191017
13447  2076 0 4     13711 928 808  e/ref               gcc_-march=native_-mtune=native_-O_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE   20191213        20191017
13447  1380 0 4     12187 912 808  e/ref               gcc_-march=native_-mtune=native_-Os_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE  20191213        20191017
15368  2076 0 4     13727 928 808  dolbeau/mipsel-msa  gcc_-march=native_-mtune=native_-O_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE   20191213        20191017
17289  2112 0 4     13671 928 808  e/regs              gcc_-march=native_-mtune=native_-O2_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE  20191213        20191017
23052  1596 0 4     13175 928 808  dolbeau/mipsel-msa  gcc_-march=native_-mtune=native_-O2_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE  20191213        20191017
23052  1380 0 4     12203 912 808  dolbeau/mipsel-msa  gcc_-march=native_-mtune=native_-Os_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE  20191213        20191017
23052  1576 0 4     13135 928 808  e/ref               gcc_-march=native_-mtune=native_-O2_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE  20191213        20191017
24973  2560 0 4     14199 928 808  e/regs              gcc_-march=native_-mtune=native_-O_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE   20191213        20191017
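
The rows above can be summarized per implementation. The sketch below transcribes the Time, Implementation and optimization-flag fields of each row and reports the smallest time per implementation, plus the greatest common divisor of all times. The names ROWS, best_per_implementation and time_step are illustrative only and are not part of SUPERCOP; treating a smaller Time as better is an assumption.

```python
from functools import reduce
from math import gcd

# (time, implementation, optimization flag), transcribed from the table above.
ROWS = [
    (3842, "dolbeau/arm-neon", "-O2"),
    (3842, "dolbeau/arm-neon", "-O3"),
    (3842, "dolbeau/arm-neon", "-Os"),
    (5763, "dolbeau/arm-neon", "-O"),
    (7684, "dolbeau/mipsel-msa", "-O3"),
    (7684, "e/merged", "-O2"),
    (7684, "e/merged", "-O3"),
    (7684, "e/merged", "-Os"),
    (7684, "e/ref", "-O3"),
    (7684, "e/regs", "-O3"),
    (9605, "e/merged", "-O"),
    (11526, "e/regs", "-Os"),
    (13447, "e/ref", "-O"),
    (13447, "e/ref", "-Os"),
    (15368, "dolbeau/mipsel-msa", "-O"),
    (17289, "e/regs", "-O2"),
    (23052, "dolbeau/mipsel-msa", "-O2"),
    (23052, "dolbeau/mipsel-msa", "-Os"),
    (23052, "e/ref", "-O2"),
    (24973, "e/regs", "-O"),
]


def best_per_implementation(rows):
    """Return {implementation: smallest time listed for it}."""
    best = {}
    for time, impl, _flag in rows:
        best[impl] = min(time, best.get(impl, time))
    return best


def time_step(rows):
    """Return the greatest common divisor of all listed times."""
    return reduce(gcd, (time for time, _impl, _flag in rows))


if __name__ == "__main__":
    for impl, time in sorted(best_per_implementation(ROWS).items()):
        print(f"{impl}: {time}")
    print("common divisor of all times:", time_step(ROWS))
```

With these rows the sketch reports 3842 for dolbeau/arm-neon and 7684 for each of the other four implementations. Every listed time is a multiple of 1921, which may reflect the resolution of the timer used on this machine; the page itself does not say.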

Compiler output

Implementation: amd64-ssse3
Security model: unknown
Compiler: gcc -march=native -mtune=native -O2 -fomit-frame-pointer -fwrapv -fPIC -fPIE
chacha.s: chacha.s: Assembler messages:
chacha.s: chacha.s:19: Error: operand 1 must be an integer register -- `mov %rsp,%r11'
chacha.s: chacha.s:20: Error: operand 1 must be an integer or stack pointer register -- `and $31,%r11'
chacha.s: chacha.s:21: Error: operand 1 must be an integer or stack pointer register -- `add $384,%r11'
chacha.s: chacha.s:22: Error: operand 1 must be an integer or stack pointer register -- `sub %r11,%rsp'
chacha.s: chacha.s:23: Error: operand 1 must be an integer register -- `mov %rdi,%r8'
chacha.s: chacha.s:24: Error: operand 1 must be an integer register -- `mov %rsi,%rsi'
chacha.s: chacha.s:25: Error: operand 1 must be an integer register -- `mov %rsi,%rdi'
chacha.s: chacha.s:26: Error: operand 1 must be an integer register -- `mov %rdx,%rdx'
chacha.s: chacha.s:27: Error: operand 1 must be an integer or stack pointer register -- `cmp $0,%rdx'
chacha.s: chacha.s:29: Error: unknown mnemonic `jbe' -- `jbe ._done'
chacha.s: chacha.s:31: Error: operand 1 must be an integer register -- `mov $0,%rax'
chacha.s: chacha.s:33: Error: operand 1 must be an integer register -- `mov %rdx,%rcx'
chacha.s: chacha.s:35: Error: unknown mnemonic `rep' -- `rep stosb'
chacha.s: chacha.s:37: Error: operand 1 must be an integer or stack pointer register -- `sub %rdx,%rdi'
chacha.s: chacha.s:39: Error: unknown mnemonic `jmp' -- `jmp ._start'
chacha.s: chacha.s:47: Error: operand 1 must be an integer register -- `mov %rsp,%r11'
chacha.s: chacha.s:48: Error: operand 1 must be an integer or stack pointer register -- `and $31,%r11'
chacha.s: chacha.s:49: Error: operand 1 must be an integer or stack pointer register -- `add $384,%r11'
chacha.s: chacha.s:50: Error: operand 1 must be an integer or stack pointer register -- `sub %r11,%rsp'
chacha.s: chacha.s:52: Error: operand 1 must be an integer register -- `mov %rdi,%r8'
chacha.s: chacha.s:54: Error: operand 1 must be an integer register -- `mov %rsi,%rsi'
chacha.s: chacha.s:56: Error: operand 1 must be an integer register -- `mov %rdx,%rdi'
chacha.s: chacha.s:58: Error: operand 1 must be an integer register -- `mov %rcx,%rdx'
chacha.s: chacha.s:60: Error: operand 1 must be an integer or stack pointer register -- `cmp $0,%rdx'
chacha.s: ...

Number of similar (compiler,implementation) pairs: 4, namely:
Compiler                                                                      Implementations
gcc -march=native -mtune=native -O2 -fomit-frame-pointer -fwrapv -fPIC -fPIE  amd64-ssse3
gcc -march=native -mtune=native -O3 -fomit-frame-pointer -fwrapv -fPIC -fPIE  amd64-ssse3
gcc -march=native -mtune=native -O -fomit-frame-pointer -fwrapv -fPIC -fPIE   amd64-ssse3
gcc -march=native -mtune=native -Os -fomit-frame-pointer -fwrapv -fPIC -fPIE  amd64-ssse3
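
The assembler messages for amd64-ssse3 fall into a few repeated kinds, all raised against x86-64 instructions (mov, and, jbe, rep, ...) that the aarch64 assembler rejects. A minimal sketch for tallying them follows. The regular expression is inferred from the line shape in the excerpt above, and the names AS_ERROR, SAMPLE and summarize are hypothetical; none of this is a SUPERCOP tool.

```python
import re
from collections import Counter

# Line shape seen in the excerpt (an inference, not a documented format):
#   <file>: <file>:<line>: Error: <message> -- `<instruction>'
AS_ERROR = re.compile(
    r"^(?P<file>\S+): (?P=file):(?P<line>\d+): Error: "
    r"(?P<msg>.*?) -- `(?P<insn>.*)'$"
)

# A few lines copied from the log above.
SAMPLE = [
    "chacha.s: chacha.s: Assembler messages:",
    "chacha.s: chacha.s:19: Error: operand 1 must be an integer register -- `mov %rsp,%r11'",
    "chacha.s: chacha.s:20: Error: operand 1 must be an integer or stack pointer register -- `and $31,%r11'",
    "chacha.s: chacha.s:29: Error: unknown mnemonic `jbe' -- `jbe ._done'",
    "chacha.s: chacha.s:35: Error: unknown mnemonic `rep' -- `rep stosb'",
]


def summarize(log_lines):
    """Count error kinds and rejected mnemonics in assembler output."""
    kinds = Counter()
    mnemonics = Counter()
    for line in log_lines:
        m = AS_ERROR.match(line.strip())
        if m is None:
            continue  # header lines and the trailing "..." do not match
        # Drop the quoted mnemonic so all "unknown mnemonic" errors group together.
        kinds[m["msg"].split(" `")[0]] += 1
        mnemonics[m["insn"].split()[0]] += 1
    return kinds, mnemonics


if __name__ == "__main__":
    kinds, mnemonics = summarize(SAMPLE)
    for kind, count in kinds.most_common():
        print(count, kind)
    print(sorted(mnemonics))
```

Grouping by message kind makes it easier to see that every reported line fails for the same reason, wrong-architecture assembly, and not because of a defect at one particular line.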

Compiler output

Implementation: goll_gueron
Security model: unknown
Compiler: gcc -march=native -mtune=native -O2 -fomit-frame-pointer -fwrapv -fPIC -fPIE
stream.c: stream.c:11:23: fatal error: immintrin.h: No such file or directory
stream.c: #include <immintrin.h>
stream.c: ^
stream.c: compilation terminated.

Number of similar (compiler,implementation) pairs: 4, namely:
Compiler                                                                      Implementations
gcc -march=native -mtune=native -O2 -fomit-frame-pointer -fwrapv -fPIC -fPIE  goll_gueron
gcc -march=native -mtune=native -O3 -fomit-frame-pointer -fwrapv -fPIC -fPIE  goll_gueron
gcc -march=native -mtune=native -O -fomit-frame-pointer -fwrapv -fPIC -fPIE   goll_gueron
gcc -march=native -mtune=native -Os -fomit-frame-pointer -fwrapv -fPIC -fPIE  goll_gueron

Compiler output

Implementation: krovetz/avx2
Security model: unknown
Compiler: gcc -march=native -mtune=native -O2 -fomit-frame-pointer -fwrapv -fPIC -fPIE
stream.c: stream.c:8:23: fatal error: immintrin.h: No such file or directory
stream.c: #include <immintrin.h>
stream.c: ^
stream.c: compilation terminated.

Number of similar (compiler,implementation) pairs: 4, namely:
Compiler                                                                      Implementations
gcc -march=native -mtune=native -O2 -fomit-frame-pointer -fwrapv -fPIC -fPIE  krovetz/avx2
gcc -march=native -mtune=native -O3 -fomit-frame-pointer -fwrapv -fPIC -fPIE  krovetz/avx2
gcc -march=native -mtune=native -O -fomit-frame-pointer -fwrapv -fPIC -fPIE   krovetz/avx2
gcc -march=native -mtune=native -Os -fomit-frame-pointer -fwrapv -fPIC -fPIE  krovetz/avx2

Compiler output

Implementation: krovetz/vec128
Security model: unknown
Compiler: gcc -march=native -mtune=native -O2 -fomit-frame-pointer -fwrapv -fPIC -fPIE
stream.c: stream.c:80:2: error: #error -- Implementation supports only machines with neon, altivec or SSE2
stream.c: #error -- Implementation supports only machines with neon, altivec or SSE2
stream.c: ^~~~~
stream.c: stream.c: In function 'crypto_stream_chacha8_krovetz_vec128_xor':
stream.c: stream.c:151:14: warning: implicit declaration of function 'NONCE' [-Wimplicit-function-declaration]
stream.c: vec s3 = NONCE(np);
stream.c: ^~~~~
stream.c: stream.c:151:14: error: incompatible types when initializing type 'vec {aka __vector(4) unsigned int}' using type 'int'
stream.c: stream.c:91:19: error: 'VBPI' undeclared (first use in this function)
stream.c: #define BPI (VBPI + GPR_TOO) /* Blocks computed per loop iteration */
stream.c: ^
stream.c: stream.c:152:36: note: in expansion of macro 'BPI'
stream.c: for (iters = 0; iters < inlen/(BPI*64); iters++) {
stream.c: ^~~
stream.c: stream.c:91:19: note: each undeclared identifier is reported only once for each function it appears in
stream.c: #define BPI (VBPI + GPR_TOO) /* Blocks computed per loop iteration */
stream.c: ^
stream.c: stream.c:152:36: note: in expansion of macro 'BPI'
stream.c: for (iters = 0; iters < inlen/(BPI*64); iters++) {
stream.c: ^~~
stream.c: stream.c:91:26: error: 'GPR_TOO' undeclared (first use in this function)
stream.c: #define BPI (VBPI + GPR_TOO) /* Blocks computed per loop iteration */
stream.c: ^
stream.c: stream.c:152:36: note: in expansion of macro 'BPI'
stream.c: for (iters = 0; iters < inlen/(BPI*64); iters++) {
stream.c: ...

Number of similar (compiler,implementation) pairs: 4, namely:
Compiler                                                                      Implementations
gcc -march=native -mtune=native -O2 -fomit-frame-pointer -fwrapv -fPIC -fPIE  krovetz/vec128
gcc -march=native -mtune=native -O3 -fomit-frame-pointer -fwrapv -fPIC -fPIE  krovetz/vec128
gcc -march=native -mtune=native -O -fomit-frame-pointer -fwrapv -fPIC -fPIE   krovetz/vec128
gcc -march=native -mtune=native -Os -fomit-frame-pointer -fwrapv -fPIC -fPIE  krovetz/vec128
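
In the gcc logs above, the first error or fatal error is usually the informative one. In the krovetz/vec128 output, for example, the later "undeclared" errors plausibly follow from the initial #error, although the log alone does not prove this. The sketch below picks out that first diagnostic. The pattern is inferred from the lines shown here, and GCC_DIAG, first_error and the two sample lists are hypothetical names.

```python
import re

# Line shape seen in the excerpts (an inference from this page):
#   <file>: <file>:<line>:<col>: <severity>: <message>
GCC_DIAG = re.compile(
    r"^(?P<file>\S+): (?P=file):(?P<line>\d+):(?P<col>\d+): "
    r"(?P<severity>fatal error|error|warning|note): (?P<msg>.*)$"
)

# Lines copied from the krovetz/vec128 and krovetz/avx2 logs above.
VEC128 = [
    "stream.c: stream.c:80:2: error: #error -- Implementation supports only machines with neon, altivec or SSE2",
    "stream.c: stream.c: In function 'crypto_stream_chacha8_krovetz_vec128_xor':",
    "stream.c: stream.c:151:14: warning: implicit declaration of function 'NONCE' [-Wimplicit-function-declaration]",
    "stream.c: stream.c:91:19: error: 'VBPI' undeclared (first use in this function)",
]
AVX2 = [
    "stream.c: stream.c:8:23: fatal error: immintrin.h: No such file or directory",
    "stream.c: #include <immintrin.h>",
    "stream.c: compilation terminated.",
]


def first_error(log_lines):
    """Return (line, column, severity, message) of the first error, or None."""
    for line in log_lines:
        m = GCC_DIAG.match(line.strip())
        if m and m["severity"] in ("error", "fatal error"):
            return int(m["line"]), int(m["col"]), m["severity"], m["msg"]
    return None


if __name__ == "__main__":
    print(first_error(VEC128))
    print(first_error(AVX2))
```

Warnings and notes are skipped on purpose, so a log that contains only those yields None.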