Implementation notes: amd64, colossus7, crypto_stream/chacha20

Computer: colossus7
Architecture: amd64
CPU ID: AuthenticAMD-00830f10-178bfbff
SUPERCOP version: 20210125
Operation: crypto_stream
Primitive: chacha20
TimeObject sizeTest sizeImplementationCompilerBenchmark dateSUPERCOP version
12155577 0 019860 752 800moon/avx2/64clang_-march=native_-O_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2021031120210125
12375577 0 019772 752 800moon/avx2/64clang_-mcpu=native_-O3_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2021031120210125
12605577 0 019860 752 800moon/avx2/64clang_-march=native_-O2_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2021031120210125
12605577 0 019716 752 800moon/avx2/64clang_-march=native_-O3_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2021031120210125
13055851 0 117542 744 808dolbeau/amd64-avx2clang_-march=native_-Os_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2021031120210125
13285566 0 017262 744 800moon/avx2/64clang_-march=native_-Os_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2021031120210125
13726639 0 120796 752 808dolbeau/amd64-avx2clang_-march=native_-O3_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2021031120210125
14176639 0 120908 752 808dolbeau/amd64-avx2clang_-march=native_-O2_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2021031120210125
14186639 0 120908 752 808dolbeau/amd64-avx2clang_-march=native_-O_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2021031120210125
16653143 0 014830 744 800goll_gueronclang_-march=native_-Os_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2021031120210125
17103552 0 017788 752 800goll_gueronclang_-march=native_-O2_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2021031120210125
17103552 0 017788 752 800goll_gueronclang_-march=native_-O_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2021031120210125
17323552 0 017676 752 800goll_gueronclang_-march=native_-O3_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2021031120210125
23403289 0 017572 752 800moon/avx/64clang_-march=native_-O2_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2021031120210125
23623289 0 017572 752 800moon/avx/64clang_-march=native_-O_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2021031120210125
23623289 0 017484 752 800moon/avx/64clang_-mcpu=native_-O3_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2021031120210125
23853278 0 014974 744 800moon/avx/64clang_-march=native_-Os_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2021031120210125
23853753 0 018036 752 800moon/ssse3/64clang_-march=native_-O_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2021031120210125
24083753 0 017948 752 800moon/ssse3/64clang_-mcpu=native_-O3_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2021031120210125
24523289 0 017428 752 800moon/avx/64clang_-march=native_-O3_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2021031120210125
24983753 0 017892 752 800moon/ssse3/64clang_-march=native_-O3_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2021031120210125
25423753 0 018036 752 800moon/ssse3/64clang_-march=native_-O2_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2021031120210125
25873742 0 015438 744 800moon/ssse3/64clang_-march=native_-Os_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2021031120210125
29474479 0 118716 752 808amd64-ssse3clang_-mcpu=native_-O3_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2021031120210125
29484441 0 116158 744 808amd64-ssse3clang_-march=native_-Os_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2021031120210125
29704479 0 118756 752 808amd64-ssse3clang_-march=native_-O2_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2021031120210125
29703952 0 018244 752 800moon/sse2/64clang_-march=native_-O2_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2021031120210125
29923952 0 018244 752 800moon/sse2/64clang_-march=native_-O_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2021031120210125
29933952 0 018156 752 800moon/sse2/64clang_-mcpu=native_-O3_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2021031120210125
30154479 0 118628 752 808amd64-ssse3clang_-march=native_-O3_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2021031120210125
30154479 0 118756 752 808amd64-ssse3clang_-march=native_-O_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2021031120210125
30833952 0 018100 752 800moon/sse2/64clang_-march=native_-O3_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2021031120210125
31053941 0 015646 744 800moon/sse2/64clang_-march=native_-Os_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2021031120210125
39154671 0 118796 752 808e/amd64-xmm6clang_-march=native_-O3_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2021031120210125
40054633 0 116342 744 808e/amd64-xmm6clang_-march=native_-Os_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2021031120210125
40724671 0 118884 752 808e/amd64-xmm6clang_-mcpu=native_-O3_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2021031120210125
40954671 0 118940 752 808e/amd64-xmm6clang_-march=native_-O2_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2021031120210125
40954671 0 118940 752 808e/amd64-xmm6clang_-march=native_-O_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2021031120210125
64351742 0 113470 744 808e/mergedclang_-march=native_-Os_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2021031120210125
64802702 0 114406 744 808e/amd64-3clang_-march=native_-Os_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2021031120210125
65252311 0 116436 752 808e/mergedclang_-march=native_-O3_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2021031120210125
68852311 0 116564 752 808e/mergedclang_-march=native_-O2_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2021031120210125
68852311 0 116564 752 808e/mergedclang_-march=native_-O_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2021031120210125
69302770 0 116876 752 808e/amd64-3clang_-march=native_-O3_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2021031120210125
70872770 0 116964 752 808e/amd64-3clang_-mcpu=native_-O3_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2021031120210125
71332770 0 117036 752 808e/amd64-3clang_-march=native_-O2_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2021031120210125
71332770 0 117036 752 808e/amd64-3clang_-march=native_-O_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2021031120210125
77852231 0 116356 752 808e/regsclang_-march=native_-O3_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2021031120210125
80102231 0 116484 752 808e/regsclang_-march=native_-O2_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2021031120210125
83252231 0 116484 752 808e/regsclang_-march=native_-O_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2021031120210125
85502895 0 117108 752 808e/mergedclang_-mcpu=native_-O3_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2021031120210125
87072535 0 116660 752 808e/refclang_-march=native_-O3_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2021031120210125
90232719 0 116932 752 808e/regsclang_-mcpu=native_-O3_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2021031120210125
90452535 0 116804 752 808e/refclang_-march=native_-O_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2021031120210125
90902535 0 116804 752 808e/refclang_-march=native_-O2_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2021031120210125
91131622 0 113342 744 808e/regsclang_-march=native_-Os_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2021031120210125
117671758 0 113478 744 808e/refclang_-march=native_-Os_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2021031120210125
133882831 0 117044 752 808e/refclang_-mcpu=native_-O3_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2021031120210125

Test failure

Implementation: dolbeau/generic-gccsimd128
Security model: constbranchindex
Compiler: clang -march=native -O2 -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE
error 111

Number of similar (compiler,implementation) pairs: 20, namely:
CompilerImplementations
clang -march=native -O2 -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE dolbeau/generic-gccsimd128
clang -march=native -O3 -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE dolbeau/generic-gccsimd128
clang -march=native -O -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE dolbeau/generic-gccsimd128
clang -march=native -Os -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE dolbeau/generic-gccsimd128
clang -mcpu=native -O3 -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE dolbeau/generic-gccsimd128
clang -march=native -O2 -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE dolbeau/generic-gccsimd256
clang -march=native -O3 -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE dolbeau/generic-gccsimd256
clang -march=native -O -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE dolbeau/generic-gccsimd256
clang -march=native -Os -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE dolbeau/generic-gccsimd256
clang -mcpu=native -O3 -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE dolbeau/generic-gccsimd256
clang -march=native -O2 -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE krovetz/vec128
clang -march=native -O3 -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE krovetz/vec128
clang -march=native -O -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE krovetz/vec128
clang -march=native -Os -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE krovetz/vec128
clang -mcpu=native -O3 -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE krovetz/vec128
clang -march=native -O2 -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE moon/xop/64
clang -march=native -O3 -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE moon/xop/64
clang -march=native -O -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE moon/xop/64
clang -march=native -Os -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE moon/xop/64
clang -mcpu=native -O3 -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE moon/xop/64

Compiler output

Implementation: dolbeau/amd64-avx2
Security model: constbranchindex
Compiler: clang -mcpu=native -O3 -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE
chacha.c: In file included from chacha.c:103:
chacha.c: ./u4.h:122:7: error: always_inline function '_mm_shuffle_epi8' requires target feature 'ssse3', but would be inlined into function 'crypto_stream_chacha20_dolbeau_amd64_avx2_constbranchindex_ECRYPT_encrypt_bytes' that is compiled without support for 'ssse3'
chacha.c: VEC4_QUARTERROUND( 0, 4, 8,12);
chacha.c: ^
chacha.c: ./u4.h:17:36: note: expanded from macro 'VEC4_QUARTERROUND'
chacha.c: #define VEC4_QUARTERROUND(a,b,c,d) VEC4_QUARTERROUND_SHUFFLE(a,b,c,d)
chacha.c: ^
chacha.c: ./u4.h:12:86: note: expanded from macro 'VEC4_QUARTERROUND_SHUFFLE'
chacha.c: x_##a = _mm_add_epi32(x_##a, x_##b); t_##a = _mm_xor_si128(x_##d, x_##a); x_##d = _mm_shuffle_epi8(t_##a, rot16); \
chacha.c: ^
chacha.c: ./u4.h:122:7: error: always_inline function '_mm_shuffle_epi8' requires target feature 'ssse3', but would be inlined into function 'crypto_stream_chacha20_dolbeau_amd64_avx2_constbranchindex_ECRYPT_encrypt_bytes' that is compiled without support for 'ssse3'
chacha.c: ./u4.h:17:36: note: expanded from macro 'VEC4_QUARTERROUND'
chacha.c: #define VEC4_QUARTERROUND(a,b,c,d) VEC4_QUARTERROUND_SHUFFLE(a,b,c,d)
chacha.c: ^
chacha.c: ./u4.h:14:86: note: expanded from macro 'VEC4_QUARTERROUND_SHUFFLE'
chacha.c: x_##a = _mm_add_epi32(x_##a, x_##b); t_##a = _mm_xor_si128(x_##d, x_##a); x_##d = _mm_shuffle_epi8(t_##a, rot8); \
chacha.c: ^
chacha.c: ./u4.h:123:7: error: always_inline function '_mm_shuffle_epi8' requires target feature 'ssse3', but would be inlined into function 'crypto_stream_chacha20_dolbeau_amd64_avx2_constbranchindex_ECRYPT_encrypt_bytes' that is compiled without support for 'ssse3'
chacha.c: VEC4_QUARTERROUND( 1, 5, 9,13);
chacha.c: ^
chacha.c: ./u4.h:17:36: note: expanded from macro 'VEC4_QUARTERROUND'
chacha.c: #define VEC4_QUARTERROUND(a,b,c,d) VEC4_QUARTERROUND_SHUFFLE(a,b,c,d)
chacha.c: ^
chacha.c: ./u4.h:12:86: note: expanded from macro 'VEC4_QUARTERROUND_SHUFFLE'
chacha.c: x_##a = _mm_add_epi32(x_##a, x_##b); t_##a = _mm_xor_si128(x_##d, x_##a); x_##d = _mm_shuffle_epi8(t_##a, rot16); \
chacha.c: ...

Number of similar (compiler,implementation) pairs: 1, namely:
CompilerImplementations
clang -mcpu=native -O3 -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE dolbeau/amd64-avx2

Compiler output

Implementation: goll_gueron
Security model: constbranchindex
Compiler: clang -mcpu=native -O3 -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE
stream.c: stream.c:126:2: error: -- Implementation supports only microarchitectures with support for Advanced Vector Extensions (AVX2 or AVX512).
stream.c: #error -- Implementation supports only microarchitectures with support for Advanced Vector Extensions (AVX2 or AVX512).
stream.c: ^
stream.c: 1 error generated.

Number of similar (compiler,implementation) pairs: 1, namely:
CompilerImplementations
clang -mcpu=native -O3 -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE goll_gueron

Compiler output

Implementation: krovetz/avx2
Security model: constbranchindex
Compiler: clang -march=native -O2 -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE
stream.c: stream.c:56:18: warning: implicit declaration of function '_mm_broadcastsi128_si256' is invalid in C99 [-Wimplicit-function-declaration]
stream.c: __m256i s0 = _mm_broadcastsi128_si256((__m128i *)sigma);
stream.c: ^
stream.c: stream.c:56:13: error: initializing '__m256i' (vector of 4 'long long' values) with an expression of incompatible type 'int'
stream.c: __m256i s0 = _mm_broadcastsi128_si256((__m128i *)sigma);
stream.c: ^ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
stream.c: 1 warning and 1 error generated.

Number of similar (compiler,implementation) pairs: 5, namely:
CompilerImplementations
clang -march=native -O2 -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE krovetz/avx2
clang -march=native -O3 -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE krovetz/avx2
clang -march=native -O -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE krovetz/avx2
clang -march=native -Os -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE krovetz/avx2
clang -mcpu=native -O3 -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE krovetz/avx2