Implementation notes: aarch64, warbear0, crypto_stream/chacha8

Computer: warbear0
Architecture: aarch64
CPU ID: 411fd072
SUPERCOP version: 20200826
Operation: crypto_stream
Primitive: chacha8
Time   Object size  Test size      Implementation      Compiler                                                                      Benchmark date  SUPERCOP version
3664   3820 0 4     16996 816 808  dolbeau/arm-neon    gcc_-march=native_-mtune=native_-O2_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE  20200830        20200826
3664   4548 0 4     18953 824 824  dolbeau/arm-neon    gcc_-march=native_-mtune=native_-O3_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE  20200830        20200826
4368   3344 0 4     15572 800 800  dolbeau/arm-neon    gcc_-march=native_-mtune=native_-Os_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE  20200830        20200826
5536   2908 0 4     17329 824 824  dolbeau/mipsel-msa  gcc_-march=native_-mtune=native_-O3_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE  20200830        20200826
5552   2420 0 4     15572 816 808  e/merged            gcc_-march=native_-mtune=native_-O2_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE  20200830        20200826
5552   3180 0 4     17585 824 824  e/merged            gcc_-march=native_-mtune=native_-O3_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE  20200830        20200826
5584   2932 0 4     17337 824 824  e/ref               gcc_-march=native_-mtune=native_-O3_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE  20200830        20200826
5600   2104 0 4     14332 800 800  e/merged            gcc_-march=native_-mtune=native_-Os_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE  20200830        20200826
5744   4536 0 4     18937 824 824  e/regs              gcc_-march=native_-mtune=native_-O3_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE  20200830        20200826
6064   4276 0 4     17452 816 808  dolbeau/arm-neon    gcc_-march=native_-mtune=native_-O_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE   20200830        20200826
8768   3736 0 4     16884 816 808  e/merged            gcc_-march=native_-mtune=native_-O_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE   20200830        20200826
10176  2184 0 4     15348 816 808  e/regs              gcc_-march=native_-mtune=native_-O2_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE  20200830        20200826
12160  3140 0 4     16300 816 808  e/regs              gcc_-march=native_-mtune=native_-O_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE   20200830        20200826
13392  1976 0 4     14204 800 800  e/regs              gcc_-march=native_-mtune=native_-Os_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE  20200830        20200826
14640  2136 0 4     15292 816 808  e/ref               gcc_-march=native_-mtune=native_-O2_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE  20200830        20200826
14688  2196 0 4     15372 816 808  dolbeau/mipsel-msa  gcc_-march=native_-mtune=native_-O2_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE  20200830        20200826
14704  1840 0 4     14076 800 800  dolbeau/mipsel-msa  gcc_-march=native_-mtune=native_-Os_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE  20200830        20200826
14720  1840 0 4     14068 800 800  e/ref               gcc_-march=native_-mtune=native_-Os_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE  20200830        20200826
15296  2612 0 4     15788 816 808  dolbeau/mipsel-msa  gcc_-march=native_-mtune=native_-O_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE   20200830        20200826
15296  2612 0 4     15772 816 808  e/ref               gcc_-march=native_-mtune=native_-O_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE   20200830        20200826

Compiler output

Implementation: amd64-ssse3
Security model: constbranchindex
Compiler: gcc -march=native -mtune=native -O2 -fomit-frame-pointer -fwrapv -fPIC -fPIE
chacha.S: chacha.S: Assembler messages:
chacha.S: chacha.S:19: Error: operand 1 must be an integer register -- `mov %rsp,%r11'
chacha.S: chacha.S:20: Error: operand 1 must be an integer or stack pointer register -- `and $31,%r11'
chacha.S: chacha.S:21: Error: operand 1 must be an integer or stack pointer register -- `add $384,%r11'
chacha.S: chacha.S:22: Error: operand 1 must be an integer or stack pointer register -- `sub %r11,%rsp'
chacha.S: chacha.S:23: Error: operand 1 must be an integer register -- `mov %rdi,%r8'
chacha.S: chacha.S:24: Error: operand 1 must be an integer register -- `mov %rsi,%rsi'
chacha.S: chacha.S:25: Error: operand 1 must be an integer register -- `mov %rsi,%rdi'
chacha.S: chacha.S:26: Error: operand 1 must be an integer register -- `mov %rdx,%rdx'
chacha.S: chacha.S:27: Error: operand 1 must be an integer or stack pointer register -- `cmp $0,%rdx'
chacha.S: chacha.S:29: Error: unknown mnemonic `jbe' -- `jbe ._done'
chacha.S: chacha.S:31: Error: operand 1 must be an integer register -- `mov $0,%rax'
chacha.S: chacha.S:33: Error: operand 1 must be an integer register -- `mov %rdx,%rcx'
chacha.S: chacha.S:35: Error: unknown mnemonic `rep' -- `rep stosb'
chacha.S: chacha.S:37: Error: operand 1 must be an integer or stack pointer register -- `sub %rdx,%rdi'
chacha.S: chacha.S:39: Error: unknown mnemonic `jmp' -- `jmp ._start'
chacha.S: chacha.S:47: Error: operand 1 must be an integer register -- `mov %rsp,%r11'
chacha.S: chacha.S:48: Error: operand 1 must be an integer or stack pointer register -- `and $31,%r11'
chacha.S: chacha.S:49: Error: operand 1 must be an integer or stack pointer register -- `add $384,%r11'
chacha.S: chacha.S:50: Error: operand 1 must be an integer or stack pointer register -- `sub %r11,%rsp'
chacha.S: chacha.S:52: Error: operand 1 must be an integer register -- `mov %rdi,%r8'
chacha.S: chacha.S:54: Error: operand 1 must be an integer register -- `mov %rsi,%rsi'
chacha.S: chacha.S:56: Error: operand 1 must be an integer register -- `mov %rdx,%rdi'
chacha.S: chacha.S:58: Error: operand 1 must be an integer register -- `mov %rcx,%rdx'
chacha.S: chacha.S:60: Error: operand 1 must be an integer or stack pointer register -- `cmp $0,%rdx'
chacha.S: ...

Number of similar (compiler,implementation) pairs: 4, namely:
Compiler                                                                      Implementations
gcc -march=native -mtune=native -O2 -fomit-frame-pointer -fwrapv -fPIC -fPIE  amd64-ssse3
gcc -march=native -mtune=native -O3 -fomit-frame-pointer -fwrapv -fPIC -fPIE  amd64-ssse3
gcc -march=native -mtune=native -O -fomit-frame-pointer -fwrapv -fPIC -fPIE   amd64-ssse3
gcc -march=native -mtune=native -Os -fomit-frame-pointer -fwrapv -fPIC -fPIE  amd64-ssse3

Compiler output

Implementation: goll_gueron
Security model: constbranchindex
Compiler: gcc -march=native -mtune=native -O2 -fomit-frame-pointer -fwrapv -fPIC -fPIE
stream.c: stream.c:11:10: fatal error: immintrin.h: No such file or directory
stream.c:    11 | #include <immintrin.h>
stream.c:       |          ^~~~~~~~~~~~~
stream.c: compilation terminated.

Number of similar (compiler,implementation) pairs: 4, namely:
Compiler                                                                      Implementations
gcc -march=native -mtune=native -O2 -fomit-frame-pointer -fwrapv -fPIC -fPIE  goll_gueron
gcc -march=native -mtune=native -O3 -fomit-frame-pointer -fwrapv -fPIC -fPIE  goll_gueron
gcc -march=native -mtune=native -O -fomit-frame-pointer -fwrapv -fPIC -fPIE   goll_gueron
gcc -march=native -mtune=native -Os -fomit-frame-pointer -fwrapv -fPIC -fPIE  goll_gueron

Compiler output

Implementation: krovetz/avx2
Security model: constbranchindex
Compiler: gcc -march=native -mtune=native -O2 -fomit-frame-pointer -fwrapv -fPIC -fPIE
stream.c: stream.c:8:10: fatal error: immintrin.h: No such file or directory
stream.c:     8 | #include <immintrin.h>
stream.c:       |          ^~~~~~~~~~~~~
stream.c: compilation terminated.

Number of similar (compiler,implementation) pairs: 4, namely:
Compiler                                                                      Implementations
gcc -march=native -mtune=native -O2 -fomit-frame-pointer -fwrapv -fPIC -fPIE  krovetz/avx2
gcc -march=native -mtune=native -O3 -fomit-frame-pointer -fwrapv -fPIC -fPIE  krovetz/avx2
gcc -march=native -mtune=native -O -fomit-frame-pointer -fwrapv -fPIC -fPIE   krovetz/avx2
gcc -march=native -mtune=native -Os -fomit-frame-pointer -fwrapv -fPIC -fPIE  krovetz/avx2

Compiler output

Implementation: krovetz/vec128
Security model: constbranchindex
Compiler: gcc -march=native -mtune=native -O2 -fomit-frame-pointer -fwrapv -fPIC -fPIE
stream.c: stream.c:80:2: error: #error -- Implementation supports only machines with neon, altivec or SSE2
stream.c: 80 | #error -- Implementation supports only machines with neon, altivec or SSE2
stream.c: | ^~~~~
stream.c: stream.c: In function 'crypto_stream_chacha8_krovetz_vec128_constbranchindex_xor':
stream.c: stream.c:151:14: warning: implicit declaration of function 'NONCE' [-Wimplicit-function-declaration]
stream.c: 151 | vec s3 = NONCE(np);
stream.c: | ^~~~~
stream.c: stream.c:151:14: error: incompatible types when initializing type 'vec' {aka '__vector(4) unsigned int'} using type 'int'
stream.c: stream.c:91:19: error: 'VBPI' undeclared (first use in this function); did you mean 'BPI'?
stream.c: 91 | #define BPI (VBPI + GPR_TOO) /* Blocks computed per loop iteration */
stream.c: | ^~~~
stream.c: stream.c:152:36: note: in expansion of macro 'BPI'
stream.c: 152 | for (iters = 0; iters < inlen/(BPI*64); iters++) {
stream.c: | ^~~
stream.c: stream.c:91:19: note: each undeclared identifier is reported only once for each function it appears in
stream.c: 91 | #define BPI (VBPI + GPR_TOO) /* Blocks computed per loop iteration */
stream.c: | ^~~~
stream.c: stream.c:152:36: note: in expansion of macro 'BPI'
stream.c: 152 | for (iters = 0; iters < inlen/(BPI*64); iters++) {
stream.c: | ^~~
stream.c: stream.c:91:26: error: 'GPR_TOO' undeclared (first use in this function)
stream.c: 91 | #define BPI (VBPI + GPR_TOO) /* Blocks computed per loop iteration */
stream.c: | ^~~~~~~
stream.c: stream.c:152:36: note: in expansion of macro 'BPI'
stream.c: 152 | for (iters = 0; iters < inlen/(BPI*64); iters++) {
stream.c: ...

Number of similar (compiler,implementation) pairs: 4, namely:
Compiler                                                                      Implementations
gcc -march=native -mtune=native -O2 -fomit-frame-pointer -fwrapv -fPIC -fPIE  krovetz/vec128
gcc -march=native -mtune=native -O3 -fomit-frame-pointer -fwrapv -fPIC -fPIE  krovetz/vec128
gcc -march=native -mtune=native -O -fomit-frame-pointer -fwrapv -fPIC -fPIE   krovetz/vec128
gcc -march=native -mtune=native -Os -fomit-frame-pointer -fwrapv -fPIC -fPIE  krovetz/vec128
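
Note on the failures above

The compiler output above appears to fall into two groups: x86-only sources built on an aarch64 host (the x86-64 assembly in amd64-ssse3, and immintrin.h in goll_gueron and krovetz/avx2), and a SIMD feature check that did not match this compiler (krovetz/vec128). The sketch below is not taken from SUPERCOP. It uses only compiler-predefined macros to report which case applies to a given build. The function names simd_family and targets_x86_64 are hypothetical. Both __ARM_NEON and __ARM_NEON__ are tested on the assumption that an aarch64 compiler may define the first without the second; whether that is what tripped the krovetz/vec128 check is not confirmed by the log.

```c
#include <string.h>

/* Hypothetical helper (not part of SUPERCOP): names the SIMD family the
   compiler is targeting, based on predefined macros only. */
const char *simd_family(void)
{
#if defined(__ARM_NEON) || defined(__ARM_NEON__)
    return "neon";      /* ARM/AArch64 Advanced SIMD */
#elif defined(__ALTIVEC__)
    return "altivec";   /* PowerPC vector extension */
#elif defined(__SSE2__)
    return "sse2";      /* x86 / x86-64 */
#else
    return "none";
#endif
}

/* Hypothetical helper: returns 1 when the target is x86-64, where sources
   that include immintrin.h or contain x86-64 assembly can be expected to
   build, and 0 on any other target. */
int targets_x86_64(void)
{
#if defined(__x86_64__)
    return 1;
#else
    return 0;
#endif
}
```

Built with the same gcc flags on an aarch64 host, simd_family() would be expected to return "neon" and targets_x86_64() to return 0, which is consistent with the arm-neon implementation building while the x86-only ones do not.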