Implementation notes: amd64, waldorf, crypto_stream/chacha12

Computer: waldorf
Architecture: amd64
CPU ID: GenuineIntel-000106e5-bfebfbff
SUPERCOP version: 20160715
Operation: crypto_stream
Primitive: chacha12
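
The CPU ID line above can be unpacked with a short sketch. It assumes the string has the layout vendor-EAX-EDX taken from CPUID leaf 1, which this report does not state. The function name `decode_cpu_signature` is hypothetical and is not part of SUPERCOP. AVX and AVX2 are reported in other CPUID registers that are not present in this string, so the sketch only decodes family, model, stepping and the SSE2 bit (EDX bit 26).

```python
def decode_cpu_signature(cpu_id: str) -> dict:
    """Decode a 'vendor-eax-edx' CPU ID string (assumed CPUID leaf 1 layout)."""
    vendor, eax_hex, edx_hex = cpu_id.split("-")
    eax = int(eax_hex, 16)
    edx = int(edx_hex, 16)

    stepping = eax & 0xF
    base_model = (eax >> 4) & 0xF
    base_family = (eax >> 8) & 0xF
    ext_model = (eax >> 16) & 0xF
    ext_family = (eax >> 20) & 0xFF

    # The extended family is added only when the base family is 0xF.
    family = base_family + ext_family if base_family == 0xF else base_family
    # The extended model supplies the high nibble for families 6 and 0xF.
    if base_family in (0x6, 0xF):
        model = (ext_model << 4) | base_model
    else:
        model = base_model

    return {
        "vendor": vendor,
        "family": family,
        "model": model,
        "stepping": stepping,
        "sse2": bool((edx >> 26) & 1),  # CPUID.1:EDX bit 26
    }


info = decode_cpu_signature("GenuineIntel-000106e5-bfebfbff")
print(info["family"], info["model"], info["stepping"], info["sse2"])
# prints: 6 30 5 True
```

Under that assumption the machine reports family 6, model 30, stepping 5, with SSE2 available.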
Time | Implementation | Compiler | Benchmark date | SUPERCOP version
4532 | krovetz/vec128 | gcc -march=native -mtune=native -O3 -fomit-frame-pointer -fwrapv | 20160718 | 20160715
4548 | moon/ssse3/64 | gcc -march=native -mtune=native -O -fomit-frame-pointer -fwrapv | 20160718 | 20160715
4580 | dolbeau/amd64-avx2 | clang -march=native -O3 -fomit-frame-pointer -fwrapv -Qunused-arguments | 20160718 | 20160715
4584 | krovetz/vec128 | gcc -march=native -mtune=native -O -fomit-frame-pointer -fwrapv | 20160718 | 20160715
4848 | dolbeau/amd64-avx2 | gcc -march=native -mtune=native -O -fomit-frame-pointer -fwrapv | 20160718 | 20160715
5056 | krovetz/vec128 | clang -march=native -O3 -fomit-frame-pointer -fwrapv -Qunused-arguments | 20160718 | 20160715
5104 | dolbeau/amd64-avx2 | gcc -march=native -mtune=native -O3 -fomit-frame-pointer -fwrapv | 20160718 | 20160715
5200 | moon/ssse3/64 | gcc -march=native -mtune=native -O3 -fomit-frame-pointer -fwrapv | 20160718 | 20160715
5208 | moon/ssse3/64 | gcc -march=native -mtune=native -Os -fomit-frame-pointer -fwrapv | 20160718 | 20160715
5224 | amd64-ssse3 | gcc -march=native -mtune=native -O -fomit-frame-pointer -fwrapv | 20160718 | 20160715
5232 | moon/ssse3/64 | clang -march=native -O3 -fomit-frame-pointer -fwrapv -Qunused-arguments | 20160718 | 20160715
5248 | moon/ssse3/64 | gcc -march=native -mtune=native -O2 -fomit-frame-pointer -fwrapv | 20160718 | 20160715
5292 | krovetz/vec128 | gcc -march=native -mtune=native -O2 -fomit-frame-pointer -fwrapv | 20160718 | 20160715
5404 | amd64-ssse3 | gcc -march=native -mtune=native -O2 -fomit-frame-pointer -fwrapv | 20160718 | 20160715
5408 | dolbeau/amd64-avx2 | gcc -march=native -mtune=native -O2 -fomit-frame-pointer -fwrapv | 20160718 | 20160715
5408 | amd64-ssse3 | gcc -march=native -mtune=native -Os -fomit-frame-pointer -fwrapv | 20160718 | 20160715
5440 | moon/sse2/64 | gcc -march=native -mtune=native -O2 -fomit-frame-pointer -fwrapv | 20160718 | 20160715
5480 | amd64-ssse3 | gcc -march=native -mtune=native -O3 -fomit-frame-pointer -fwrapv | 20160718 | 20160715
5492 | e/amd64-xmm6 | gcc -march=native -mtune=native -O2 -fomit-frame-pointer -fwrapv | 20160718 | 20160715
5556 | moon/sse2/64 | clang -march=native -O3 -fomit-frame-pointer -fwrapv -Qunused-arguments | 20160718 | 20160715
5736 | moon/sse2/64 | gcc -march=native -mtune=native -O -fomit-frame-pointer -fwrapv | 20160718 | 20160715
5756 | moon/sse2/64 | gcc -march=native -mtune=native -O3 -fomit-frame-pointer -fwrapv | 20160718 | 20160715
5792 | moon/sse2/64 | gcc -march=native -mtune=native -Os -fomit-frame-pointer -fwrapv | 20160718 | 20160715
6088 | e/amd64-xmm6 | gcc -march=native -mtune=native -O -fomit-frame-pointer -fwrapv | 20160718 | 20160715
6096 | e/amd64-xmm6 | gcc -march=native -mtune=native -O3 -fomit-frame-pointer -fwrapv | 20160718 | 20160715
6120 | e/amd64-xmm6 | gcc -march=native -mtune=native -Os -fomit-frame-pointer -fwrapv | 20160718 | 20160715
6896 | dolbeau/amd64-avx2 | gcc -march=native -mtune=native -Os -fomit-frame-pointer -fwrapv | 20160718 | 20160715
7972 | krovetz/vec128 | gcc -march=native -mtune=native -Os -fomit-frame-pointer -fwrapv | 20160718 | 20160715
11628 | e/amd64-3 | gcc -march=native -mtune=native -O -fomit-frame-pointer -fwrapv | 20160718 | 20160715
13460 | e/merged | clang -march=native -O3 -fomit-frame-pointer -fwrapv -Qunused-arguments | 20160718 | 20160715
13468 | e/merged | gcc -march=native -mtune=native -O -fomit-frame-pointer -fwrapv | 20160718 | 20160715
13616 | e/amd64-3 | gcc -march=native -mtune=native -O3 -fomit-frame-pointer -fwrapv | 20160718 | 20160715
13664 | e/amd64-3 | clang -march=native -O3 -fomit-frame-pointer -fwrapv -Qunused-arguments | 20160718 | 20160715
13688 | e/amd64-3 | gcc -march=native -mtune=native -O2 -fomit-frame-pointer -fwrapv | 20160718 | 20160715
13932 | e/amd64-3 | gcc -march=native -mtune=native -Os -fomit-frame-pointer -fwrapv | 20160718 | 20160715
15596 | e/merged | gcc -march=native -mtune=native -O3 -fomit-frame-pointer -fwrapv | 20160718 | 20160715
16052 | e/ref | gcc -march=native -mtune=native -O3 -fomit-frame-pointer -fwrapv | 20160718 | 20160715
16292 | e/regs | gcc -march=native -mtune=native -O3 -fomit-frame-pointer -fwrapv | 20160718 | 20160715
16612 | e/merged | gcc -march=native -mtune=native -O2 -fomit-frame-pointer -fwrapv | 20160718 | 20160715
17176 | e/merged | gcc -march=native -mtune=native -Os -fomit-frame-pointer -fwrapv | 20160718 | 20160715
19692 | e/regs | gcc -march=native -mtune=native -O -fomit-frame-pointer -fwrapv | 20160718 | 20160715
21400 | e/ref | clang -march=native -O3 -fomit-frame-pointer -fwrapv -Qunused-arguments | 20160718 | 20160715
21608 | e/regs | gcc -march=native -mtune=native -O2 -fomit-frame-pointer -fwrapv | 20160718 | 20160715
21676 | e/regs | clang -march=native -O3 -fomit-frame-pointer -fwrapv -Qunused-arguments | 20160718 | 20160715
24480 | e/ref | gcc -march=native -mtune=native -O -fomit-frame-pointer -fwrapv | 20160718 | 20160715
26228 | e/ref | gcc -march=native -mtune=native -O2 -fomit-frame-pointer -fwrapv | 20160718 | 20160715
26716 | e/regs | gcc -march=native -mtune=native -Os -fomit-frame-pointer -fwrapv | 20160718 | 20160715
30024 | e/ref | gcc -march=native -mtune=native -Os -fomit-frame-pointer -fwrapv | 20160718 | 20160715

Test failure

Implementation: crypto_stream/chacha12/amd64-ssse3
Compiler: clang -march=native -O3 -fomit-frame-pointer -fwrapv -Qunused-arguments
error 111

Number of similar (compiler,implementation) pairs: 17, namely:
Compiler | Implementations
clang -march=native -O3 -fomit-frame-pointer -fwrapv -Qunused-arguments | amd64-ssse3 e/amd64-xmm6 moon/avx/64 moon/avx2/64 moon/xop/64
gcc -march=native -mtune=native -O2 -fomit-frame-pointer -fwrapv | moon/avx/64 moon/avx2/64 moon/xop/64
gcc -march=native -mtune=native -O3 -fomit-frame-pointer -fwrapv | moon/avx/64 moon/avx2/64 moon/xop/64
gcc -march=native -mtune=native -O -fomit-frame-pointer -fwrapv | moon/avx/64 moon/avx2/64 moon/xop/64
gcc -march=native -mtune=native -Os -fomit-frame-pointer -fwrapv | moon/avx/64 moon/avx2/64 moon/xop/64

Compiler output

Implementation: crypto_stream/chacha12/dolbeau/ppc-altivec
Compiler: clang -march=native -O3 -fomit-frame-pointer -fwrapv -Qunused-arguments
chacha.c: In file included from chacha.c:11:
chacha.c: /usr/include/clang/3.5.0/include/altivec.h:27:2: error: "AltiVec support not enabled"
chacha.c: #error "AltiVec support not enabled"
chacha.c: ^
chacha.c: /usr/include/clang/3.5.0/include/altivec.h:39:8: error: unknown type name 'vector'
chacha.c: static vector signed char __ATTRS_o_ai
chacha.c: ^
chacha.c: /usr/include/clang/3.5.0/include/altivec.h:39:15: error: expected identifier or '('
chacha.c: static vector signed char __ATTRS_o_ai
chacha.c: ^
chacha.c: /usr/include/clang/3.5.0/include/altivec.h:42:8: error: unknown type name 'vector'
chacha.c: static vector unsigned char __ATTRS_o_ai
chacha.c: ^
chacha.c: /usr/include/clang/3.5.0/include/altivec.h:42:15: error: expected identifier or '('
chacha.c: static vector unsigned char __ATTRS_o_ai
chacha.c: ^
chacha.c: /usr/include/clang/3.5.0/include/altivec.h:47:8: error: unknown type name 'vector'
chacha.c: static vector bool char __ATTRS_o_ai
chacha.c: ^
chacha.c: /usr/include/clang/3.5.0/include/altivec.h:47:19: error: expected ';' after top level declarator
chacha.c: static vector bool char __ATTRS_o_ai
chacha.c: ^
chacha.c: /usr/include/clang/3.5.0/include/altivec.h:48:10: error: unknown type name 'vector'
chacha.c: vec_perm(vector bool char __a, vector bool char __b, vector unsigned char __c);
chacha.c: ^
chacha.c: ...

Number of similar (compiler,implementation) pairs: 1, namely:
Compiler | Implementations
clang -march=native -O3 -fomit-frame-pointer -fwrapv -Qunused-arguments | dolbeau/ppc-altivec

Compiler output

Implementation: crypto_stream/chacha12/dolbeau/mipsel-msa
Compiler: clang -march=native -O3 -fomit-frame-pointer -fwrapv -Qunused-arguments
chacha.c: In file included from chacha.c:11:
chacha.c: /usr/include/clang/3.5.0/include/arm_neon.h:28:2: error: "NEON support not enabled"
chacha.c: #error "NEON support not enabled"
chacha.c: ^
chacha.c: /usr/include/clang/3.5.0/include/arm_neon.h:48:24: error: 'neon_vector_type' attribute is not supported for this target
chacha.c: typedef __attribute__((neon_vector_type(8))) int8_t int8x8_t;
chacha.c: ^
chacha.c: /usr/include/clang/3.5.0/include/arm_neon.h:49:24: error: 'neon_vector_type' attribute is not supported for this target
chacha.c: typedef __attribute__((neon_vector_type(16))) int8_t int8x16_t;
chacha.c: ^
chacha.c: /usr/include/clang/3.5.0/include/arm_neon.h:50:24: error: 'neon_vector_type' attribute is not supported for this target
chacha.c: typedef __attribute__((neon_vector_type(4))) int16_t int16x4_t;
chacha.c: ^
chacha.c: /usr/include/clang/3.5.0/include/arm_neon.h:51:24: error: 'neon_vector_type' attribute is not supported for this target
chacha.c: typedef __attribute__((neon_vector_type(8))) int16_t int16x8_t;
chacha.c: ^
chacha.c: /usr/include/clang/3.5.0/include/arm_neon.h:52:24: error: 'neon_vector_type' attribute is not supported for this target
chacha.c: typedef __attribute__((neon_vector_type(2))) int32_t int32x2_t;
chacha.c: ^
chacha.c: /usr/include/clang/3.5.0/include/arm_neon.h:53:24: error: 'neon_vector_type' attribute is not supported for this target
chacha.c: typedef __attribute__((neon_vector_type(4))) int32_t int32x4_t;
chacha.c: ^
chacha.c: /usr/include/clang/3.5.0/include/arm_neon.h:54:24: error: 'neon_vector_type' attribute is not supported for this target
chacha.c: typedef __attribute__((neon_vector_type(1))) int64_t int64x1_t;
chacha.c: ^
chacha.c: ...

Number of similar (compiler,implementation) pairs: 1, namely:
Compiler | Implementations
clang -march=native -O3 -fomit-frame-pointer -fwrapv -Qunused-arguments | dolbeau/mipsel-msa

Compiler output

Implementation: crypto_stream/chacha12/goll_gueron
Compiler: clang -march=native -O3 -fomit-frame-pointer -fwrapv -Qunused-arguments
stream.c: stream.c:126:2: error: -- Implementation supports only microarchitectures with support for Advanced Vector Extensions (AVX2 or AVX512).
stream.c: #error -- Implementation supports only microarchitectures with support for Advanced Vector Extensions (AVX2 or AVX512).
stream.c: ^
stream.c: 1 error generated.

Number of similar (compiler,implementation) pairs: 1, namely:
Compiler | Implementations
clang -march=native -O3 -fomit-frame-pointer -fwrapv -Qunused-arguments | goll_gueron

Compiler output

Implementation: crypto_stream/chacha12/krovetz/avx2
Compiler: clang -march=native -O3 -fomit-frame-pointer -fwrapv -Qunused-arguments
stream.c: stream.c:54:5: error: use of undeclared identifier '__m256i'
stream.c: __m256i v0,v1,v2,v3,v4,v5,v6,v7,v8,v9,v10,v11;
stream.c: ^
stream.c: stream.c:56:5: error: use of undeclared identifier '__m256i'
stream.c: __m256i s0 = _mm_broadcastsi128_si256((__m128i *)sigma);
stream.c: ^
stream.c: stream.c:60:5: error: use of undeclared identifier '__m256i'
stream.c: __m256i s1 = _mm256_loadu_si256((__m256i *)k);
stream.c: ^
stream.c: stream.c:61:5: error: use of undeclared identifier '__m256i'
stream.c: __m256i s2 = _mm256_permute2x128_si256(s1,s1,0x11);
stream.c: ^
stream.c: stream.c:62:5: error: use of undeclared identifier 's1'
stream.c: s1 = _mm256_permute2x128_si256(s1,s1,0x00);
stream.c: ^
stream.c: stream.c:62:10: warning: implicit declaration of function '_mm256_permute2x128_si256' is invalid in C99 [-Wimplicit-function-declaration]
stream.c: s1 = _mm256_permute2x128_si256(s1,s1,0x00);
stream.c: ^
stream.c: stream.c:62:36: error: use of undeclared identifier 's1'
stream.c: s1 = _mm256_permute2x128_si256(s1,s1,0x00);
stream.c: ^
stream.c: stream.c:62:39: error: use of undeclared identifier 's1'
stream.c: s1 = _mm256_permute2x128_si256(s1,s1,0x00);
stream.c: ^
stream.c: stream.c:63:5: error: use of undeclared identifier '__m256i'
stream.c: ...

Number of similar (compiler,implementation) pairs: 1, namely:
Compiler | Implementations
clang -march=native -O3 -fomit-frame-pointer -fwrapv -Qunused-arguments | krovetz/avx2

Compiler output

Implementation: crypto_stream/chacha12/dolbeau/ppc-altivec
Compiler: gcc -march=native -mtune=native -O2 -fomit-frame-pointer -fwrapv
chacha.c: chacha.c:11:21: fatal error: altivec.h: No such file or directory
chacha.c: #include <altivec.h>
chacha.c: ^
chacha.c: compilation terminated.

Number of similar (compiler,implementation) pairs: 4, namely:
Compiler | Implementations
gcc -march=native -mtune=native -O2 -fomit-frame-pointer -fwrapv | dolbeau/ppc-altivec
gcc -march=native -mtune=native -O3 -fomit-frame-pointer -fwrapv | dolbeau/ppc-altivec
gcc -march=native -mtune=native -O -fomit-frame-pointer -fwrapv | dolbeau/ppc-altivec
gcc -march=native -mtune=native -Os -fomit-frame-pointer -fwrapv | dolbeau/ppc-altivec

Compiler output

Implementation: crypto_stream/chacha12/dolbeau/mipsel-msa
Compiler: gcc -march=native -mtune=native -O2 -fomit-frame-pointer -fwrapv
chacha.c: chacha.c:11:22: fatal error: arm_neon.h: No such file or directory
chacha.c: #include <arm_neon.h>
chacha.c: ^
chacha.c: compilation terminated.

Number of similar (compiler,implementation) pairs: 4, namely:
Compiler | Implementations
gcc -march=native -mtune=native -O2 -fomit-frame-pointer -fwrapv | dolbeau/mipsel-msa
gcc -march=native -mtune=native -O3 -fomit-frame-pointer -fwrapv | dolbeau/mipsel-msa
gcc -march=native -mtune=native -O -fomit-frame-pointer -fwrapv | dolbeau/mipsel-msa
gcc -march=native -mtune=native -Os -fomit-frame-pointer -fwrapv | dolbeau/mipsel-msa

Compiler output

Implementation: crypto_stream/chacha12/krovetz/avx2
Compiler: gcc -march=native -mtune=native -O2 -fomit-frame-pointer -fwrapv
stream.c: stream.c: In function 'crypto_stream_chacha12_krovetz_avx2_xor':
stream.c: stream.c:58:13: warning: AVX vector return without AVX enabled changes the ABI [-Wpsabi]
stream.c: __m256i s0 = _mm256_broadcastsi128_si256(*(__m128i *)sigma);
stream.c: ^
stream.c: In file included from /usr/lib/gcc/x86_64-linux-gnu/4.9/include/immintrin.h:43:0,
stream.c: from stream.c:8:
stream.c: /usr/lib/gcc/x86_64-linux-gnu/4.9/include/avx2intrin.h:933:1: error: inlining failed in call to always_inline '_mm256_broadcastsi128_si256': target specific option mismatch
stream.c: _mm256_broadcastsi128_si256 (__m128i __X)
stream.c: ^
stream.c: stream.c:58:13: error: called from here
stream.c: __m256i s0 = _mm256_broadcastsi128_si256(*(__m128i *)sigma);
stream.c: ^
stream.c: In file included from /usr/lib/gcc/x86_64-linux-gnu/4.9/include/immintrin.h:41:0,
stream.c: from stream.c:8:
stream.c: /usr/lib/gcc/x86_64-linux-gnu/4.9/include/avxintrin.h:890:1: error: inlining failed in call to always_inline '_mm256_loadu_si256': target specific option mismatch
stream.c: _mm256_loadu_si256 (__m256i const *__P)
stream.c: ^
stream.c: stream.c:60:13: error: called from here
stream.c: __m256i s1 = _mm256_loadu_si256((__m256i *)k);
stream.c: ^
stream.c: In file included from /usr/lib/gcc/x86_64-linux-gnu/4.9/include/immintrin.h:43:0,
stream.c: from stream.c:8:
stream.c: /usr/lib/gcc/x86_64-linux-gnu/4.9/include/avx2intrin.h:1066:1: error: inlining failed in call to always_inline '_mm256_permute2x128_si256': target specific option mismatch
stream.c: _mm256_permute2x128_si256 (__m256i __X, __m256i __Y, const int __M)
stream.c: ^
stream.c: ...

Number of similar (compiler,implementation) pairs: 4, namely:
Compiler | Implementations
gcc -march=native -mtune=native -O2 -fomit-frame-pointer -fwrapv | krovetz/avx2
gcc -march=native -mtune=native -O3 -fomit-frame-pointer -fwrapv | krovetz/avx2
gcc -march=native -mtune=native -O -fomit-frame-pointer -fwrapv | krovetz/avx2
gcc -march=native -mtune=native -Os -fomit-frame-pointer -fwrapv | krovetz/avx2

Compiler output

Implementation: crypto_stream/chacha12/goll_gueron
Compiler: gcc -march=native -mtune=native -O2 -fomit-frame-pointer -fwrapv
stream.c: stream.c:126:2: error: #error -- Implementation supports only microarchitectures with support for Advanced Vector Extensions (AVX2 or AVX512).
stream.c: #error -- Implementation supports only microarchitectures with support for Advanced Vector Extensions (AVX2 or AVX512).
stream.c: ^

Number of similar (compiler,implementation) pairs: 4, namely:
Compiler | Implementations
gcc -march=native -mtune=native -O2 -fomit-frame-pointer -fwrapv | goll_gueron
gcc -march=native -mtune=native -O3 -fomit-frame-pointer -fwrapv | goll_gueron
gcc -march=native -mtune=native -O -fomit-frame-pointer -fwrapv | goll_gueron
gcc -march=native -mtune=native -Os -fomit-frame-pointer -fwrapv | goll_gueron