Implementation notes: amd64, colossus7, crypto_stream/chacha8

Computer: colossus7
Architecture: amd64
CPU ID: AuthenticAMD-00830f10-178bfbff
SUPERCOP version: 20210125
Operation: crypto_stream
Primitive: chacha8
TimeObject sizeTest sizeImplementationCompilerBenchmark dateSUPERCOP version
6305851 0 117542 744 808dolbeau/amd64-avx2clang_-march=native_-Os_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2021031120210125
6305577 0 019860 752 800moon/avx2/64clang_-march=native_-O2_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2021031120210125
6305577 0 019716 752 800moon/avx2/64clang_-march=native_-O3_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2021031120210125
6305566 0 017262 744 800moon/avx2/64clang_-march=native_-Os_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2021031120210125
6305577 0 019772 752 800moon/avx2/64clang_-mcpu=native_-O3_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2021031120210125
6526639 0 120908 752 808dolbeau/amd64-avx2clang_-march=native_-O2_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2021031120210125
6527599 0 121756 752 808dolbeau/amd64-avx2clang_-march=native_-O3_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2021031120210125
6526639 0 120908 752 808dolbeau/amd64-avx2clang_-march=native_-O_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2021031120210125
6525577 0 019860 752 800moon/avx2/64clang_-march=native_-O_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2021031120210125
7423143 0 014830 744 800goll_gueronclang_-march=native_-Os_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2021031120210125
7433552 0 017788 752 800goll_gueronclang_-march=native_-O2_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2021031120210125
7433552 0 017788 752 800goll_gueronclang_-march=native_-O_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2021031120210125
7654560 0 018684 752 800goll_gueronclang_-march=native_-O3_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2021031120210125
11473753 0 018036 752 800moon/ssse3/64clang_-march=native_-O2_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2021031120210125
11473753 0 018036 752 800moon/ssse3/64clang_-march=native_-O_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2021031120210125
11483289 0 017572 752 800moon/avx/64clang_-march=native_-O2_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2021031120210125
11483289 0 017428 752 800moon/avx/64clang_-march=native_-O3_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2021031120210125
11703289 0 017572 752 800moon/avx/64clang_-march=native_-O_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2021031120210125
11703753 0 017948 752 800moon/ssse3/64clang_-mcpu=native_-O3_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2021031120210125
11933742 0 015438 744 800moon/ssse3/64clang_-march=native_-Os_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2021031120210125
12153278 0 014974 744 800moon/avx/64clang_-march=native_-Os_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2021031120210125
12373289 0 017484 752 800moon/avx/64clang_-mcpu=native_-O3_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2021031120210125
12603753 0 017892 752 800moon/ssse3/64clang_-march=native_-O3_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2021031120210125
13953941 0 015646 744 800moon/sse2/64clang_-march=native_-Os_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2021031120210125
14183952 0 018244 752 800moon/sse2/64clang_-march=native_-O_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2021031120210125
14403952 0 018156 752 800moon/sse2/64clang_-mcpu=native_-O3_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2021031120210125
14623952 0 018100 752 800moon/sse2/64clang_-march=native_-O3_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2021031120210125
14854479 0 118716 752 808amd64-ssse3clang_-mcpu=native_-O3_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2021031120210125
15074479 0 118756 752 808amd64-ssse3clang_-march=native_-O2_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2021031120210125
15074479 0 118756 752 808amd64-ssse3clang_-march=native_-O_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2021031120210125
15304479 0 118628 752 808amd64-ssse3clang_-march=native_-O3_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2021031120210125
15304441 0 116158 744 808amd64-ssse3clang_-march=native_-Os_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2021031120210125
17553952 0 018244 752 800moon/sse2/64clang_-march=native_-O2_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2021031120210125
19584671 0 118940 752 808e/amd64-xmm6clang_-march=native_-O2_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2021031120210125
20024671 0 118796 752 808e/amd64-xmm6clang_-march=native_-O3_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2021031120210125
20024671 0 118940 752 808e/amd64-xmm6clang_-march=native_-O_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2021031120210125
20024671 0 118884 752 808e/amd64-xmm6clang_-mcpu=native_-O3_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2021031120210125
20034633 0 116342 744 808e/amd64-xmm6clang_-march=native_-Os_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2021031120210125
27221742 0 113470 744 808e/mergedclang_-march=native_-Os_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2021031120210125
27232855 0 116980 752 808e/mergedclang_-march=native_-O3_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2021031120210125
28122311 0 116564 752 808e/mergedclang_-march=native_-O_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2021031120210125
28132311 0 116564 752 808e/mergedclang_-march=native_-O2_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2021031120210125
29702770 0 117020 752 808e/amd64-3clang_-march=native_-O_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2021031120210125
29932770 0 117020 752 808e/amd64-3clang_-march=native_-O2_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2021031120210125
30382770 0 116964 752 808e/amd64-3clang_-mcpu=native_-O3_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2021031120210125
30602702 0 114406 744 808e/amd64-3clang_-march=native_-Os_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2021031120210125
31052770 0 116876 752 808e/amd64-3clang_-march=native_-O3_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2021031120210125
34653527 0 117740 752 808e/mergedclang_-mcpu=native_-O3_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2021031120210125
37352231 0 116484 752 808e/regsclang_-march=native_-O_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2021031120210125
38932759 0 116884 752 808e/regsclang_-march=native_-O3_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2021031120210125
39152935 0 117060 752 808e/refclang_-march=native_-O3_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2021031120210125
39602231 0 116484 752 808e/regsclang_-march=native_-O2_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2021031120210125
40053359 0 117572 752 808e/regsclang_-mcpu=native_-O3_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2021031120210125
45232535 0 116804 752 808e/refclang_-march=native_-O2_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2021031120210125
45232535 0 116804 752 808e/refclang_-march=native_-O_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2021031120210125
52651622 0 113342 744 808e/regsclang_-march=native_-Os_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2021031120210125
54232831 0 117044 752 808e/refclang_-mcpu=native_-O3_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2021031120210125
65251758 0 113478 744 808e/refclang_-march=native_-Os_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2021031120210125

Test failure

Implementation: krovetz/vec128
Security model: constbranchindex
Compiler: clang -march=native -O2 -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE
error 111

Number of similar (compiler,implementation) pairs: 10, namely:
CompilerImplementations
clang -march=native -O2 -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE krovetz/vec128
clang -march=native -O3 -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE krovetz/vec128
clang -march=native -O -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE krovetz/vec128
clang -march=native -Os -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE krovetz/vec128
clang -mcpu=native -O3 -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE krovetz/vec128
clang -march=native -O2 -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE moon/xop/64
clang -march=native -O3 -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE moon/xop/64
clang -march=native -O -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE moon/xop/64
clang -march=native -Os -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE moon/xop/64
clang -mcpu=native -O3 -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE moon/xop/64

Compiler output

Implementation: dolbeau/amd64-avx2
Security model: constbranchindex
Compiler: clang -mcpu=native -O3 -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE
chacha.c: In file included from chacha.c:103:
chacha.c: ./u4.h:122:7: error: always_inline function '_mm_shuffle_epi8' requires target feature 'ssse3', but would be inlined into function 'crypto_stream_chacha8_dolbeau_amd64_avx2_constbranchindex_ECRYPT_encrypt_bytes' that is compiled without support for 'ssse3'
chacha.c: VEC4_QUARTERROUND( 0, 4, 8,12);
chacha.c: ^
chacha.c: ./u4.h:17:36: note: expanded from macro 'VEC4_QUARTERROUND'
chacha.c: #define VEC4_QUARTERROUND(a,b,c,d) VEC4_QUARTERROUND_SHUFFLE(a,b,c,d)
chacha.c: ^
chacha.c: ./u4.h:12:86: note: expanded from macro 'VEC4_QUARTERROUND_SHUFFLE'
chacha.c: x_##a = _mm_add_epi32(x_##a, x_##b); t_##a = _mm_xor_si128(x_##d, x_##a); x_##d = _mm_shuffle_epi8(t_##a, rot16); \
chacha.c: ^
chacha.c: ./u4.h:122:7: error: always_inline function '_mm_shuffle_epi8' requires target feature 'ssse3', but would be inlined into function 'crypto_stream_chacha8_dolbeau_amd64_avx2_constbranchindex_ECRYPT_encrypt_bytes' that is compiled without support for 'ssse3'
chacha.c: ./u4.h:17:36: note: expanded from macro 'VEC4_QUARTERROUND'
chacha.c: #define VEC4_QUARTERROUND(a,b,c,d) VEC4_QUARTERROUND_SHUFFLE(a,b,c,d)
chacha.c: ^
chacha.c: ./u4.h:14:86: note: expanded from macro 'VEC4_QUARTERROUND_SHUFFLE'
chacha.c: x_##a = _mm_add_epi32(x_##a, x_##b); t_##a = _mm_xor_si128(x_##d, x_##a); x_##d = _mm_shuffle_epi8(t_##a, rot8); \
chacha.c: ^
chacha.c: ./u4.h:123:7: error: always_inline function '_mm_shuffle_epi8' requires target feature 'ssse3', but would be inlined into function 'crypto_stream_chacha8_dolbeau_amd64_avx2_constbranchindex_ECRYPT_encrypt_bytes' that is compiled without support for 'ssse3'
chacha.c: VEC4_QUARTERROUND( 1, 5, 9,13);
chacha.c: ^
chacha.c: ./u4.h:17:36: note: expanded from macro 'VEC4_QUARTERROUND'
chacha.c: #define VEC4_QUARTERROUND(a,b,c,d) VEC4_QUARTERROUND_SHUFFLE(a,b,c,d)
chacha.c: ^
chacha.c: ./u4.h:12:86: note: expanded from macro 'VEC4_QUARTERROUND_SHUFFLE'
chacha.c: x_##a = _mm_add_epi32(x_##a, x_##b); t_##a = _mm_xor_si128(x_##d, x_##a); x_##d = _mm_shuffle_epi8(t_##a, rot16); \
chacha.c: ...

Number of similar (compiler,implementation) pairs: 1, namely:
CompilerImplementations
clang -mcpu=native -O3 -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE dolbeau/amd64-avx2

Compiler output

Implementation: dolbeau/mipsel-msa
Security model: constbranchindex
Compiler: clang -march=native -O2 -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE
chacha.c: In file included from chacha.c:11:
chacha.c: /usr/lib/llvm-10/lib/clang/10.0.0/include/arm_neon.h:28:2: error: "NEON support not enabled"
chacha.c: #error "NEON support not enabled"
chacha.c: ^
chacha.c: /usr/lib/llvm-10/lib/clang/10.0.0/include/arm_neon.h:48:24: error: 'neon_vector_type' attribute is not supported for this target
chacha.c: typedef __attribute__((neon_vector_type(8))) int8_t int8x8_t;
chacha.c: ^
chacha.c: /usr/lib/llvm-10/lib/clang/10.0.0/include/arm_neon.h:49:24: error: 'neon_vector_type' attribute is not supported for this target
chacha.c: typedef __attribute__((neon_vector_type(16))) int8_t int8x16_t;
chacha.c: ^
chacha.c: /usr/lib/llvm-10/lib/clang/10.0.0/include/arm_neon.h:50:24: error: 'neon_vector_type' attribute is not supported for this target
chacha.c: typedef __attribute__((neon_vector_type(4))) int16_t int16x4_t;
chacha.c: ^
chacha.c: /usr/lib/llvm-10/lib/clang/10.0.0/include/arm_neon.h:51:24: error: 'neon_vector_type' attribute is not supported for this target
chacha.c: typedef __attribute__((neon_vector_type(8))) int16_t int16x8_t;
chacha.c: ^
chacha.c: /usr/lib/llvm-10/lib/clang/10.0.0/include/arm_neon.h:52:24: error: 'neon_vector_type' attribute is not supported for this target
chacha.c: typedef __attribute__((neon_vector_type(2))) int32_t int32x2_t;
chacha.c: ^
chacha.c: /usr/lib/llvm-10/lib/clang/10.0.0/include/arm_neon.h:53:24: error: 'neon_vector_type' attribute is not supported for this target
chacha.c: typedef __attribute__((neon_vector_type(4))) int32_t int32x4_t;
chacha.c: ^
chacha.c: /usr/lib/llvm-10/lib/clang/10.0.0/include/arm_neon.h:54:24: error: 'neon_vector_type' attribute is not supported for this target
chacha.c: typedef __attribute__((neon_vector_type(1))) int64_t int64x1_t;
chacha.c: ^
chacha.c: ...

Number of similar (compiler,implementation) pairs: 5, namely:
CompilerImplementations
clang -march=native -O2 -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE dolbeau/mipsel-msa
clang -march=native -O3 -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE dolbeau/mipsel-msa
clang -march=native -O -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE dolbeau/mipsel-msa
clang -march=native -Os -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE dolbeau/mipsel-msa
clang -mcpu=native -O3 -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE dolbeau/mipsel-msa

Compiler output

Implementation: goll_gueron
Security model: constbranchindex
Compiler: clang -mcpu=native -O3 -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE
stream.c: stream.c:126:2: error: -- Implementation supports only microarchitectures with support for Advanced Vector Extensions (AVX2 or AVX512).
stream.c: #error -- Implementation supports only microarchitectures with support for Advanced Vector Extensions (AVX2 or AVX512).
stream.c: ^
stream.c: 1 error generated.

Number of similar (compiler,implementation) pairs: 1, namely:
CompilerImplementations
clang -mcpu=native -O3 -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE goll_gueron

Compiler output

Implementation: krovetz/avx2
Security model: constbranchindex
Compiler: clang -march=native -O2 -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE
stream.c: stream.c:56:18: warning: implicit declaration of function '_mm_broadcastsi128_si256' is invalid in C99 [-Wimplicit-function-declaration]
stream.c: __m256i s0 = _mm_broadcastsi128_si256((__m128i *)sigma);
stream.c: ^
stream.c: stream.c:56:13: error: initializing '__m256i' (vector of 4 'long long' values) with an expression of incompatible type 'int'
stream.c: __m256i s0 = _mm_broadcastsi128_si256((__m128i *)sigma);
stream.c: ^ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
stream.c: 1 warning and 1 error generated.

Number of similar (compiler,implementation) pairs: 5, namely:
CompilerImplementations
clang -march=native -O2 -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE krovetz/avx2
clang -march=native -O3 -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE krovetz/avx2
clang -march=native -O -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE krovetz/avx2
clang -march=native -Os -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE krovetz/avx2
clang -mcpu=native -O3 -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE krovetz/avx2