Implementation notes: aarch64, warbear0, crypto_stream/chacha12

Computer: warbear0
Architecture: aarch64
CPU ID: 411fd072
SUPERCOP version: 20200826
Operation: crypto_stream
Primitive: chacha12
TimeObject sizeTest sizeImplementationCompilerBenchmark dateSUPERCOP version
51683836 0 417012 816 808dolbeau/arm-neongcc_-march=native_-mtune=native_-O2_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE2020083020200826
51684644 0 419041 824 824dolbeau/arm-neongcc_-march=native_-mtune=native_-O3_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE2020083020200826
62563344 0 415572 800 800dolbeau/arm-neongcc_-march=native_-mtune=native_-Os_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE2020083020200826
78562908 0 417329 824 824dolbeau/mipsel-msagcc_-march=native_-mtune=native_-O3_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE2020083020200826
79042932 0 417337 824 824e/refgcc_-march=native_-mtune=native_-O3_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE2020083020200826
79362104 0 414332 800 800e/mergedgcc_-march=native_-mtune=native_-Os_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE2020083020200826
79523212 0 417617 824 824e/mergedgcc_-march=native_-mtune=native_-O3_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE2020083020200826
79682436 0 415588 816 808e/mergedgcc_-march=native_-mtune=native_-O2_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE2020083020200826
80804536 0 418937 824 824e/regsgcc_-march=native_-mtune=native_-O3_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE2020083020200826
87044276 0 417452 816 808dolbeau/arm-neongcc_-march=native_-mtune=native_-O_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE2020083020200826
113923736 0 416884 816 808e/mergedgcc_-march=native_-mtune=native_-O_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE2020083020200826
126402200 0 415364 816 808e/regsgcc_-march=native_-mtune=native_-O2_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE2020083020200826
147683140 0 416300 816 808e/regsgcc_-march=native_-mtune=native_-O_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE2020083020200826
156961976 0 414204 800 800e/regsgcc_-march=native_-mtune=native_-Os_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE2020083020200826
170241840 0 414076 800 800dolbeau/mipsel-msagcc_-march=native_-mtune=native_-Os_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE2020083020200826
170241840 0 414068 800 800e/refgcc_-march=native_-mtune=native_-Os_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE2020083020200826
170402180 0 415356 816 808dolbeau/mipsel-msagcc_-march=native_-mtune=native_-O2_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE2020083020200826
174722152 0 415308 816 808e/refgcc_-march=native_-mtune=native_-O2_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE2020083020200826
178242612 0 415788 816 808dolbeau/mipsel-msagcc_-march=native_-mtune=native_-O_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE2020083020200826
178242612 0 415772 816 808e/refgcc_-march=native_-mtune=native_-O_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE2020083020200826

Compiler output

Implementation: amd64-ssse3
Security model: constbranchindex
Compiler: gcc -march=native -mtune=native -O2 -fomit-frame-pointer -fwrapv -fPIC -fPIE
chacha.S: chacha.S: Assembler messages:
chacha.S: chacha.S:19: Error: operand 1 must be an integer register -- `mov %rsp,%r11'
chacha.S: chacha.S:20: Error: operand 1 must be an integer or stack pointer register -- `and $31,%r11'
chacha.S: chacha.S:21: Error: operand 1 must be an integer or stack pointer register -- `add $384,%r11'
chacha.S: chacha.S:22: Error: operand 1 must be an integer or stack pointer register -- `sub %r11,%rsp'
chacha.S: chacha.S:23: Error: operand 1 must be an integer register -- `mov %rdi,%r8'
chacha.S: chacha.S:24: Error: operand 1 must be an integer register -- `mov %rsi,%rsi'
chacha.S: chacha.S:25: Error: operand 1 must be an integer register -- `mov %rsi,%rdi'
chacha.S: chacha.S:26: Error: operand 1 must be an integer register -- `mov %rdx,%rdx'
chacha.S: chacha.S:27: Error: operand 1 must be an integer or stack pointer register -- `cmp $0,%rdx'
chacha.S: chacha.S:29: Error: unknown mnemonic `jbe' -- `jbe ._done'
chacha.S: chacha.S:31: Error: operand 1 must be an integer register -- `mov $0,%rax'
chacha.S: chacha.S:33: Error: operand 1 must be an integer register -- `mov %rdx,%rcx'
chacha.S: chacha.S:35: Error: unknown mnemonic `rep' -- `rep stosb'
chacha.S: chacha.S:37: Error: operand 1 must be an integer or stack pointer register -- `sub %rdx,%rdi'
chacha.S: chacha.S:39: Error: unknown mnemonic `jmp' -- `jmp ._start'
chacha.S: chacha.S:47: Error: operand 1 must be an integer register -- `mov %rsp,%r11'
chacha.S: chacha.S:48: Error: operand 1 must be an integer or stack pointer register -- `and $31,%r11'
chacha.S: chacha.S:49: Error: operand 1 must be an integer or stack pointer register -- `add $384,%r11'
chacha.S: chacha.S:50: Error: operand 1 must be an integer or stack pointer register -- `sub %r11,%rsp'
chacha.S: chacha.S:52: Error: operand 1 must be an integer register -- `mov %rdi,%r8'
chacha.S: chacha.S:54: Error: operand 1 must be an integer register -- `mov %rsi,%rsi'
chacha.S: chacha.S:56: Error: operand 1 must be an integer register -- `mov %rdx,%rdi'
chacha.S: chacha.S:58: Error: operand 1 must be an integer register -- `mov %rcx,%rdx'
chacha.S: chacha.S:60: Error: operand 1 must be an integer or stack pointer register -- `cmp $0,%rdx'
chacha.S: ...

Number of similar (compiler,implementation) pairs: 4, namely:
CompilerImplementations
gcc -march=native -mtune=native -O2 -fomit-frame-pointer -fwrapv -fPIC -fPIE amd64-ssse3
gcc -march=native -mtune=native -O3 -fomit-frame-pointer -fwrapv -fPIC -fPIE amd64-ssse3
gcc -march=native -mtune=native -O -fomit-frame-pointer -fwrapv -fPIC -fPIE amd64-ssse3
gcc -march=native -mtune=native -Os -fomit-frame-pointer -fwrapv -fPIC -fPIE amd64-ssse3

Compiler output

Implementation: goll_gueron
Security model: constbranchindex
Compiler: gcc -march=native -mtune=native -O2 -fomit-frame-pointer -fwrapv -fPIC -fPIE
stream.c: stream.c:11:10: fatal error: immintrin.h: No such file or directory
stream.c: 11 | #include <immintrin.h>
stream.c: | ^~~~~~~~~~~~~
stream.c: compilation terminated.

Number of similar (compiler,implementation) pairs: 4, namely:
CompilerImplementations
gcc -march=native -mtune=native -O2 -fomit-frame-pointer -fwrapv -fPIC -fPIE goll_gueron
gcc -march=native -mtune=native -O3 -fomit-frame-pointer -fwrapv -fPIC -fPIE goll_gueron
gcc -march=native -mtune=native -O -fomit-frame-pointer -fwrapv -fPIC -fPIE goll_gueron
gcc -march=native -mtune=native -Os -fomit-frame-pointer -fwrapv -fPIC -fPIE goll_gueron

Compiler output

Implementation: krovetz/avx2
Security model: constbranchindex
Compiler: gcc -march=native -mtune=native -O2 -fomit-frame-pointer -fwrapv -fPIC -fPIE
stream.c: stream.c:8:10: fatal error: immintrin.h: No such file or directory
stream.c: 8 | #include <immintrin.h>
stream.c: | ^~~~~~~~~~~~~
stream.c: compilation terminated.

Number of similar (compiler,implementation) pairs: 4, namely:
CompilerImplementations
gcc -march=native -mtune=native -O2 -fomit-frame-pointer -fwrapv -fPIC -fPIE krovetz/avx2
gcc -march=native -mtune=native -O3 -fomit-frame-pointer -fwrapv -fPIC -fPIE krovetz/avx2
gcc -march=native -mtune=native -O -fomit-frame-pointer -fwrapv -fPIC -fPIE krovetz/avx2
gcc -march=native -mtune=native -Os -fomit-frame-pointer -fwrapv -fPIC -fPIE krovetz/avx2

Compiler output

Implementation: krovetz/vec128
Security model: constbranchindex
Compiler: gcc -march=native -mtune=native -O2 -fomit-frame-pointer -fwrapv -fPIC -fPIE
stream.c: stream.c:80:2: error: #error -- Implementation supports only machines with neon, altivec or SSE2
stream.c: 80 | #error -- Implementation supports only machines with neon, altivec or SSE2
stream.c: | ^~~~~
stream.c: stream.c: In function 'crypto_stream_chacha12_krovetz_vec128_constbranchindex_xor':
stream.c: stream.c:151:14: warning: implicit declaration of function 'NONCE' [-Wimplicit-function-declaration]
stream.c: 151 | vec s3 = NONCE(np);
stream.c: | ^~~~~
stream.c: stream.c:151:14: error: incompatible types when initializing type 'vec' {aka '__vector(4) unsigned int'} using type 'int'
stream.c: stream.c:91:19: error: 'VBPI' undeclared (first use in this function); did you mean 'BPI'?
stream.c: 91 | #define BPI (VBPI + GPR_TOO) /* Blocks computed per loop iteration */
stream.c: | ^~~~
stream.c: stream.c:152:36: note: in expansion of macro 'BPI'
stream.c: 152 | for (iters = 0; iters < inlen/(BPI*64); iters++) {
stream.c: | ^~~
stream.c: stream.c:91:19: note: each undeclared identifier is reported only once for each function it appears in
stream.c: 91 | #define BPI (VBPI + GPR_TOO) /* Blocks computed per loop iteration */
stream.c: | ^~~~
stream.c: stream.c:152:36: note: in expansion of macro 'BPI'
stream.c: 152 | for (iters = 0; iters < inlen/(BPI*64); iters++) {
stream.c: | ^~~
stream.c: stream.c:91:26: error: 'GPR_TOO' undeclared (first use in this function)
stream.c: 91 | #define BPI (VBPI + GPR_TOO) /* Blocks computed per loop iteration */
stream.c: | ^~~~~~~
stream.c: stream.c:152:36: note: in expansion of macro 'BPI'
stream.c: 152 | for (iters = 0; iters < inlen/(BPI*64); iters++) {
stream.c: ...

Number of similar (compiler,implementation) pairs: 4, namely:
CompilerImplementations
gcc -march=native -mtune=native -O2 -fomit-frame-pointer -fwrapv -fPIC -fPIE krovetz/vec128
gcc -march=native -mtune=native -O3 -fomit-frame-pointer -fwrapv -fPIC -fPIE krovetz/vec128
gcc -march=native -mtune=native -O -fomit-frame-pointer -fwrapv -fPIC -fPIE krovetz/vec128
gcc -march=native -mtune=native -Os -fomit-frame-pointer -fwrapv -fPIC -fPIE krovetz/vec128