Implementation notes: aarch64, lionheart30, crypto_stream/chacha20

Computer: lionheart30
Architecture: aarch64
CPU ID: unknown CPU ID
SUPERCOP version: 20160806
Operation: crypto_stream
Primitive: chacha20
 Time  Implementation      Compiler                                                                 Benchmark date  SUPERCOP version
 7909  dolbeau/arm-neon    gcc -march=native -mtune=native -O2 -fomit-frame-pointer -fwrapv         20160820        20160806
 7968  dolbeau/arm-neon    gcc -march=native -mtune=native -O3 -fomit-frame-pointer -fwrapv         20160820        20160806
10933  dolbeau/arm-neon    gcc -march=native -mtune=native -Os -fomit-frame-pointer -fwrapv         20160820        20160806
13715  dolbeau/mipsel-msa  gcc -march=native -mtune=native -O3 -fomit-frame-pointer -fwrapv         20160820        20160806
13749  e/regs              gcc -march=native -mtune=native -O3 -fomit-frame-pointer -fwrapv         20160820        20160806
13826  e/merged            gcc -march=native -mtune=native -Os -fomit-frame-pointer -fwrapv         20160820        20160806
13886  e/merged            gcc -march=native -mtune=native -O3 -fomit-frame-pointer -fwrapv         20160820        20160806
13898  e/ref               gcc -march=native -mtune=native -O3 -fomit-frame-pointer -fwrapv         20160820        20160806
14003  e/merged            gcc -march=native -mtune=native -O2 -fomit-frame-pointer -fwrapv         20160820        20160806
14386  dolbeau/arm-neon    gcc -march=native -mtune=native -O -fomit-frame-pointer -fwrapv          20160820        20160806
16575  e/merged            gcc -march=native -mtune=native -O -fomit-frame-pointer -fwrapv          20160820        20160806
16617  e/merged            clang -march=native -O3 -fomit-frame-pointer -fwrapv -Qunused-arguments  20160820        20160806
18343  e/regs              gcc -march=native -mtune=native -O2 -fomit-frame-pointer -fwrapv         20160820        20160806
18999  e/regs              clang -march=native -O3 -fomit-frame-pointer -fwrapv -Qunused-arguments  20160820        20160806
20102  e/regs              gcc -march=native -mtune=native -O -fomit-frame-pointer -fwrapv          20160820        20160806
20141  e/regs              gcc -march=native -mtune=native -Os -fomit-frame-pointer -fwrapv         20160820        20160806
20616  e/ref               clang -march=native -O3 -fomit-frame-pointer -fwrapv -Qunused-arguments  20160820        20160806
21376  dolbeau/mipsel-msa  gcc -march=native -mtune=native -O2 -fomit-frame-pointer -fwrapv         20160820        20160806
21802  e/ref               gcc -march=native -mtune=native -Os -fomit-frame-pointer -fwrapv         20160820        20160806
21803  dolbeau/mipsel-msa  gcc -march=native -mtune=native -Os -fomit-frame-pointer -fwrapv         20160820        20160806
21882  e/ref               gcc -march=native -mtune=native -O2 -fomit-frame-pointer -fwrapv         20160820        20160806
23294  dolbeau/mipsel-msa  gcc -march=native -mtune=native -O -fomit-frame-pointer -fwrapv          20160820        20160806
23346  e/ref               gcc -march=native -mtune=native -O -fomit-frame-pointer -fwrapv          20160820        20160806

Compiler output

Implementation: crypto_stream/chacha20/dolbeau/ppc-altivec
Compiler: clang -march=native -O3 -fomit-frame-pointer -fwrapv -Qunused-arguments
chacha.c: In file included from chacha.c:11:
chacha.c: /usr/bin/../lib/clang/3.4/include/altivec.h:27:2: error: "AltiVec support not enabled"
chacha.c: #error "AltiVec support not enabled"
chacha.c: ^
chacha.c: /usr/bin/../lib/clang/3.4/include/altivec.h:39:8: error: unknown type name 'vector'
chacha.c: static vector signed char __ATTRS_o_ai
chacha.c: ^
chacha.c: /usr/bin/../lib/clang/3.4/include/altivec.h:39:15: error: expected identifier or '('
chacha.c: static vector signed char __ATTRS_o_ai
chacha.c: ^
chacha.c: /usr/bin/../lib/clang/3.4/include/altivec.h:42:8: error: unknown type name 'vector'
chacha.c: static vector unsigned char __ATTRS_o_ai
chacha.c: ^
chacha.c: /usr/bin/../lib/clang/3.4/include/altivec.h:42:15: error: expected identifier or '('
chacha.c: static vector unsigned char __ATTRS_o_ai
chacha.c: ^
chacha.c: /usr/bin/../lib/clang/3.4/include/altivec.h:47:8: error: unknown type name 'vector'
chacha.c: static vector bool char __ATTRS_o_ai
chacha.c: ^
chacha.c: /usr/bin/../lib/clang/3.4/include/altivec.h:47:19: error: expected ';' after top level declarator
chacha.c: static vector bool char __ATTRS_o_ai
chacha.c: ^
chacha.c: /usr/bin/../lib/clang/3.4/include/altivec.h:48:10: error: unknown type name 'vector'
chacha.c: vec_perm(vector bool char __a, vector bool char __b, vector unsigned char __c);
chacha.c: ^
chacha.c: ...

Number of similar (compiler,implementation) pairs: 1, namely:
Compiler    Implementations
clang -march=native -O3 -fomit-frame-pointer -fwrapv -Qunused-arguments dolbeau/ppc-altivec

Compiler output

Implementation: crypto_stream/chacha20/dolbeau/arm-neon
Compiler: clang -march=native -O3 -fomit-frame-pointer -fwrapv -Qunused-arguments
chacha.c: In file included from chacha.c:11:
chacha.c: /usr/bin/../lib/clang/3.4/include/arm_neon.h:28:2: error: "NEON support not enabled"
chacha.c: #error "NEON support not enabled"
chacha.c: ^
chacha.c: /usr/bin/../lib/clang/3.4/include/arm_neon.h:47:24: error: 'neon_vector_type' attribute is not supported for this target
chacha.c: typedef __attribute__((neon_vector_type(8))) int8_t int8x8_t;
chacha.c: ^
chacha.c: /usr/bin/../lib/clang/3.4/include/arm_neon.h:48:24: error: 'neon_vector_type' attribute is not supported for this target
chacha.c: typedef __attribute__((neon_vector_type(16))) int8_t int8x16_t;
chacha.c: ^
chacha.c: /usr/bin/../lib/clang/3.4/include/arm_neon.h:49:24: error: 'neon_vector_type' attribute is not supported for this target
chacha.c: typedef __attribute__((neon_vector_type(4))) int16_t int16x4_t;
chacha.c: ^
chacha.c: /usr/bin/../lib/clang/3.4/include/arm_neon.h:50:24: error: 'neon_vector_type' attribute is not supported for this target
chacha.c: typedef __attribute__((neon_vector_type(8))) int16_t int16x8_t;
chacha.c: ^
chacha.c: /usr/bin/../lib/clang/3.4/include/arm_neon.h:51:24: error: 'neon_vector_type' attribute is not supported for this target
chacha.c: typedef __attribute__((neon_vector_type(2))) int32_t int32x2_t;
chacha.c: ^
chacha.c: /usr/bin/../lib/clang/3.4/include/arm_neon.h:52:24: error: 'neon_vector_type' attribute is not supported for this target
chacha.c: typedef __attribute__((neon_vector_type(4))) int32_t int32x4_t;
chacha.c: ^
chacha.c: /usr/bin/../lib/clang/3.4/include/arm_neon.h:53:24: error: 'neon_vector_type' attribute is not supported for this target
chacha.c: typedef __attribute__((neon_vector_type(1))) int64_t int64x1_t;
chacha.c: ^
chacha.c: ...

Number of similar (compiler,implementation) pairs: 2, namely:
Compiler    Implementations
clang -march=native -O3 -fomit-frame-pointer -fwrapv -Qunused-arguments dolbeau/arm-neon dolbeau/mipsel-msa

Compiler output

Implementation: crypto_stream/chacha20/amd64-ssse3
Compiler: clang -march=native -O3 -fomit-frame-pointer -fwrapv -Qunused-arguments
chacha.s: chacha.s:22:5: error: unexpected token in operand
chacha.s: mov %rsp,%r11
chacha.s: ^
chacha.s: chacha.s:23:5: error: invalid token in expression
chacha.s: and $31,%r11
chacha.s: ^
chacha.s: chacha.s:24:5: error: invalid token in expression
chacha.s: add $384,%r11
chacha.s: ^
chacha.s: chacha.s:25:5: error: unexpected token in operand
chacha.s: sub %r11,%rsp
chacha.s: ^
chacha.s: chacha.s:26:6: error: unexpected token in operand
chacha.s: mov %rdi,%r8
chacha.s: ^
chacha.s: chacha.s:27:6: error: unexpected token in operand
chacha.s: mov %rsi,%rsi
chacha.s: ^
chacha.s: chacha.s:28:6: error: unexpected token in operand
chacha.s: mov %rsi,%rdi
chacha.s: ^
chacha.s: chacha.s:29:6: error: unexpected token in operand
chacha.s: mov %rdx,%rdx
chacha.s: ^
chacha.s: chacha.s:30:6: error: invalid token in expression
chacha.s: ...

Number of similar (compiler,implementation) pairs: 1, namely:
Compiler    Implementations
clang -march=native -O3 -fomit-frame-pointer -fwrapv -Qunused-arguments amd64-ssse3

Compiler output

Implementation: crypto_stream/chacha20/goll_gueron
Compiler: clang -march=native -O3 -fomit-frame-pointer -fwrapv -Qunused-arguments
stream.c: stream.c:126:2: error: -- Implementation supports only microarchitectures with support for Advanced Vector Extensions (AVX2 or AVX512).
stream.c: #error -- Implementation supports only microarchitectures with support for Advanced Vector Extensions (AVX2 or AVX512).
stream.c: ^
stream.c: 1 error generated.

Number of similar (compiler,implementation) pairs: 1, namely:
Compiler    Implementations
clang -march=native -O3 -fomit-frame-pointer -fwrapv -Qunused-arguments goll_gueron

Compiler output

Implementation: crypto_stream/chacha20/krovetz/avx2
Compiler: clang -march=native -O3 -fomit-frame-pointer -fwrapv -Qunused-arguments
stream.c: stream.c:54:5: error: use of undeclared identifier '__m256i'
stream.c: __m256i v0,v1,v2,v3,v4,v5,v6,v7,v8,v9,v10,v11;
stream.c: ^
stream.c: stream.c:56:5: error: use of undeclared identifier '__m256i'
stream.c: __m256i s0 = _mm_broadcastsi128_si256((__m128i *)sigma);
stream.c: ^
stream.c: stream.c:60:5: error: use of undeclared identifier '__m256i'
stream.c: __m256i s1 = _mm256_loadu_si256((__m256i *)k);
stream.c: ^
stream.c: stream.c:61:5: error: use of undeclared identifier '__m256i'
stream.c: __m256i s2 = _mm256_permute2x128_si256(s1,s1,0x11);
stream.c: ^
stream.c: stream.c:62:5: error: use of undeclared identifier 's1'
stream.c: s1 = _mm256_permute2x128_si256(s1,s1,0x00);
stream.c: ^
stream.c: stream.c:62:10: warning: implicit declaration of function '_mm256_permute2x128_si256' is invalid in C99 [-Wimplicit-function-declaration]
stream.c: s1 = _mm256_permute2x128_si256(s1,s1,0x00);
stream.c: ^
stream.c: stream.c:62:36: error: use of undeclared identifier 's1'
stream.c: s1 = _mm256_permute2x128_si256(s1,s1,0x00);
stream.c: ^
stream.c: stream.c:63:5: error: use of undeclared identifier '__m256i'
stream.c: __m256i s3 = _mm256_or_si256(
stream.c: ^
stream.c: stream.c:68:9: error: use of undeclared identifier 'v8'
stream.c: ...

Number of similar (compiler,implementation) pairs: 1, namely:
Compiler    Implementations
clang -march=native -O3 -fomit-frame-pointer -fwrapv -Qunused-arguments krovetz/avx2

Compiler output

Implementation: crypto_stream/chacha20/krovetz/vec128
Compiler: clang -march=native -O3 -fomit-frame-pointer -fwrapv -Qunused-arguments
stream.c: stream.c:80:2: error: -- Implementation supports only machines with neon, altivec or SSE2
stream.c: #error -- Implementation supports only machines with neon, altivec or SSE2
stream.c: ^
stream.c: stream.c:151:14: warning: implicit declaration of function 'NONCE' is invalid in C99 [-Wimplicit-function-declaration]
stream.c: vec s3 = NONCE(np);
stream.c: ^
stream.c: stream.c:151:9: error: initializing 'vec' with an expression of incompatible type 'int'
stream.c: vec s3 = NONCE(np);
stream.c: ^ ~~~~~~~~~
stream.c: stream.c:152:36: error: use of undeclared identifier 'VBPI'
stream.c: for (iters = 0; iters
stream.c: ^
stream.c: stream.c:91:19: note: expanded from macro 'BPI'
stream.c: #define BPI (VBPI + GPR_TOO) /* Blocks computed per loop iteration */
stream.c: ^
stream.c: stream.c:152:36: error: use of undeclared identifier 'GPR_TOO'
stream.c: stream.c:91:26: note: expanded from macro 'BPI'
stream.c: #define BPI (VBPI + GPR_TOO) /* Blocks computed per loop iteration */
stream.c: ^
stream.c: stream.c:155:19: error: use of undeclared identifier 'ONE'
stream.c: v7 = v3 + ONE;
stream.c: ^
stream.c: stream.c:176:13: warning: implicit declaration of function 'ROTW16' is invalid in C99 [-Wimplicit-function-declaration]
stream.c: DQROUND_VECTORS(v0,v1,v2,v3)
stream.c: ^
stream.c: ...

Number of similar (compiler,implementation) pairs: 1, namely:
Compiler    Implementations
clang -march=native -O3 -fomit-frame-pointer -fwrapv -Qunused-arguments krovetz/vec128
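
Note on the failure above: the #error at stream.c:80 is a compile-time guard that fires when the source finds none of the vector back ends it supports (NEON, AltiVec, or SSE2). A minimal sketch of such a guard follows, assuming the commonly predefined macros __ARM_NEON, __ARM_NEON__, __ALTIVEC__ and __SSE2__. The helper names simd_backend and have_simd_backend are hypothetical, and the macros that krovetz/vec128 actually tests are not shown in this log and may differ.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical helper: report which 128-bit vector back end the
   compiler's predefined macros advertise for the current target.
   The macro list is an assumption, not taken from krovetz/vec128. */
static const char *simd_backend(void)
{
#if defined(__ARM_NEON) || defined(__ARM_NEON__)
    return "neon";     /* ARM NEON / AArch64 Advanced SIMD */
#elif defined(__ALTIVEC__)
    return "altivec";  /* PowerPC AltiVec */
#elif defined(__SSE2__)
    return "sse2";     /* x86 SSE2 */
#else
    return "none";     /* a guarded implementation would #error here */
#endif
}

/* Hypothetical helper: nonzero when some vector back end was detected. */
static int have_simd_backend(void)
{
    const char *name = simd_backend();
    assert(name != NULL);
    return strcmp(name, "none") != 0;
}
```

Calling have_simd_backend() on the build host shows whether any of these macros is visible to the compiler; the result depends on the compiler and the flags in use.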

Compiler output

Implementation: crypto_stream/chacha20/dolbeau/ppc-altivec
Compiler: gcc -march=native -mtune=native -O2 -fomit-frame-pointer -fwrapv
chacha.c: chacha.c:11:21: fatal error: altivec.h: No such file or directory
chacha.c: #include <altivec.h>
chacha.c: ^
chacha.c: compilation terminated.

Number of similar (compiler,implementation) pairs: 4, namely:
Compiler    Implementations
gcc -march=native -mtune=native -O2 -fomit-frame-pointer -fwrapv dolbeau/ppc-altivec
gcc -march=native -mtune=native -O3 -fomit-frame-pointer -fwrapv dolbeau/ppc-altivec
gcc -march=native -mtune=native -O -fomit-frame-pointer -fwrapv dolbeau/ppc-altivec
gcc -march=native -mtune=native -Os -fomit-frame-pointer -fwrapv dolbeau/ppc-altivec

Compiler output

Implementation: crypto_stream/chacha20/amd64-ssse3
Compiler: gcc -march=native -mtune=native -O2 -fomit-frame-pointer -fwrapv
chacha.s: chacha.s: Assembler messages:
chacha.s: chacha.s:22: Error: operand 1 should be an integer register -- `mov %rsp,%r11'
chacha.s: chacha.s:23: Error: operand 1 should be an integer or stack pointer register -- `and $31,%r11'
chacha.s: chacha.s:24: Error: operand 1 should be an integer or stack pointer register -- `add $384,%r11'
chacha.s: chacha.s:25: Error: operand 1 should be an integer or stack pointer register -- `sub %r11,%rsp'
chacha.s: chacha.s:26: Error: operand 1 should be an integer register -- `mov %rdi,%r8'
chacha.s: chacha.s:27: Error: operand 1 should be an integer register -- `mov %rsi,%rsi'
chacha.s: chacha.s:28: Error: operand 1 should be an integer register -- `mov %rsi,%rdi'
chacha.s: chacha.s:29: Error: operand 1 should be an integer register -- `mov %rdx,%rdx'
chacha.s: chacha.s:30: Error: operand 1 should be an integer or stack pointer register -- `cmp $0,%rdx'
chacha.s: chacha.s:32: Error: unknown mnemonic `jbe' -- `jbe ._done'
chacha.s: chacha.s:34: Error: operand 1 should be an integer register -- `mov $0,%rax'
chacha.s: chacha.s:36: Error: operand 1 should be an integer register -- `mov %rdx,%rcx'
chacha.s: chacha.s:38: Error: unknown mnemonic `rep' -- `rep stosb'
chacha.s: chacha.s:40: Error: operand 1 should be an integer or stack pointer register -- `sub %rdx,%rdi'
chacha.s: chacha.s:42: Error: unknown mnemonic `jmp' -- `jmp ._start'
chacha.s: chacha.s:50: Error: operand 1 should be an integer register -- `mov %rsp,%r11'
chacha.s: chacha.s:51: Error: operand 1 should be an integer or stack pointer register -- `and $31,%r11'
chacha.s: chacha.s:52: Error: operand 1 should be an integer or stack pointer register -- `add $384,%r11'
chacha.s: chacha.s:53: Error: operand 1 should be an integer or stack pointer register -- `sub %r11,%rsp'
chacha.s: chacha.s:55: Error: operand 1 should be an integer register -- `mov %rdi,%r8'
chacha.s: chacha.s:57: Error: operand 1 should be an integer register -- `mov %rsi,%rsi'
chacha.s: chacha.s:59: Error: operand 1 should be an integer register -- `mov %rdx,%rdi'
chacha.s: chacha.s:61: Error: operand 1 should be an integer register -- `mov %rcx,%rdx'
chacha.s: chacha.s:63: Error: operand 1 should be an integer or stack pointer register -- `cmp $0,%rdx'
chacha.s: ...

Number of similar (compiler,implementation) pairs: 4, namely:
Compiler    Implementations
gcc -march=native -mtune=native -O2 -fomit-frame-pointer -fwrapv amd64-ssse3
gcc -march=native -mtune=native -O3 -fomit-frame-pointer -fwrapv amd64-ssse3
gcc -march=native -mtune=native -O -fomit-frame-pointer -fwrapv amd64-ssse3
gcc -march=native -mtune=native -Os -fomit-frame-pointer -fwrapv amd64-ssse3

Compiler output

Implementation: crypto_stream/chacha20/goll_gueron
Compiler: gcc -march=native -mtune=native -O2 -fomit-frame-pointer -fwrapv
stream.c: stream.c:11:23: fatal error: immintrin.h: No such file or directory
stream.c: #include <immintrin.h>
stream.c: ^
stream.c: compilation terminated.

Number of similar (compiler,implementation) pairs: 4, namely:
Compiler    Implementations
gcc -march=native -mtune=native -O2 -fomit-frame-pointer -fwrapv goll_gueron
gcc -march=native -mtune=native -O3 -fomit-frame-pointer -fwrapv goll_gueron
gcc -march=native -mtune=native -O -fomit-frame-pointer -fwrapv goll_gueron
gcc -march=native -mtune=native -Os -fomit-frame-pointer -fwrapv goll_gueron

Compiler output

Implementation: crypto_stream/chacha20/krovetz/vec128
Compiler: gcc -march=native -mtune=native -O2 -fomit-frame-pointer -fwrapv
stream.c: stream.c:80:2: error: #error -- Implementation supports only machines with neon, altivec or SSE2
stream.c: #error -- Implementation supports only machines with neon, altivec or SSE2
stream.c: ^~~~~
stream.c: stream.c: In function 'crypto_stream_chacha20_krovetz_vec128_xor':
stream.c: stream.c:151:14: warning: implicit declaration of function 'NONCE' [-Wimplicit-function-declaration]
stream.c: vec s3 = NONCE(np);
stream.c: ^~~~~
stream.c: stream.c:151:14: error: incompatible types when initializing type 'vec {aka __vector(4) unsigned int}' using type 'int'
stream.c: stream.c:91:19: error: 'VBPI' undeclared (first use in this function)
stream.c: #define BPI (VBPI + GPR_TOO) /* Blocks computed per loop iteration */
stream.c: ^
stream.c: stream.c:152:36: note: in expansion of macro 'BPI'
stream.c: for (iters = 0; iters
stream.c: ^~~
stream.c: stream.c:91:19: note: each undeclared identifier is reported only once for each function it appears in
stream.c: #define BPI (VBPI + GPR_TOO) /* Blocks computed per loop iteration */
stream.c: ^
stream.c: stream.c:152:36: note: in expansion of macro 'BPI'
stream.c: for (iters = 0; iters
stream.c: ^~~
stream.c: stream.c:91:26: error: 'GPR_TOO' undeclared (first use in this function)
stream.c: #define BPI (VBPI + GPR_TOO) /* Blocks computed per loop iteration */
stream.c: ^
stream.c: stream.c:152:36: note: in expansion of macro 'BPI'
stream.c: for (iters = 0; iters
stream.c: ...

Number of similar (compiler,implementation) pairs: 4, namely:
Compiler    Implementations
gcc -march=native -mtune=native -O2 -fomit-frame-pointer -fwrapv krovetz/vec128
gcc -march=native -mtune=native -O3 -fomit-frame-pointer -fwrapv krovetz/vec128
gcc -march=native -mtune=native -O -fomit-frame-pointer -fwrapv krovetz/vec128
gcc -march=native -mtune=native -Os -fomit-frame-pointer -fwrapv krovetz/vec128

Compiler output

Implementation: crypto_stream/chacha20/krovetz/avx2
Compiler: gcc -march=native -mtune=native -O2 -fomit-frame-pointer -fwrapv
stream.c: stream.c:8:23: fatal error: immintrin.h: No such file or directory
stream.c: #include <immintrin.h>
stream.c: ^
stream.c: compilation terminated.

Number of similar (compiler,implementation) pairs: 4, namely:
Compiler    Implementations
gcc -march=native -mtune=native -O2 -fomit-frame-pointer -fwrapv krovetz/avx2
gcc -march=native -mtune=native -O3 -fomit-frame-pointer -fwrapv krovetz/avx2
gcc -march=native -mtune=native -O -fomit-frame-pointer -fwrapv krovetz/avx2
gcc -march=native -mtune=native -Os -fomit-frame-pointer -fwrapv krovetz/avx2