Implementation notes: amd64, zen3, crypto_aead/hs1sivhiv2

Computer: zen3
Architecture: amd64
CPU ID: AuthenticAMD-00a20f10-178bfbff
SUPERCOP version: 20211108
Operation: crypto_aead
Primitive: hs1sivhiv2
TimeObject sizeTest sizeImplementationCompilerBenchmark dateSUPERCOP version
797313238 0 033920 876 1016T:dolbeau/amd64-avx2clang_-march=native_-O_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2022020620211108
876516359 0 045208 876 1048T:fasterclang_-march=native_-O2_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2022020620211108
878516215 0 044824 876 1048T:fasterclang_-march=native_-O3_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2022020620211108
920810338 0 031256 876 1016T:fasterclang_-march=native_-O_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2022020620211108
925625616 0 054208 876 1048T:dolbeau/amd64-avx2clang_-march=native_-O2_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2022020620211108
927725616 0 053968 876 1048T:dolbeau/amd64-avx2clang_-march=native_-O3_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2022020620211108
938114294 0 035175 844 1080T:dolbeau/amd64-avx2gcc_-march=native_-mtune=native_-O_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE2022020620211108
944213722 0 035560 852 1080T:dolbeau/amd64-avx2gcc_-march=native_-mtune=native_-O2_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE2022020620211108
948520510 0 044200 852 1080T:dolbeau/amd64-avx2gcc_-march=native_-mtune=native_-O3_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE2022020620211108
953312383 0 036096 852 1080T:fastergcc_-march=native_-mtune=native_-O3_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE2022020620211108
97779025 0 029774 868 1016T:fasterclang_-march=native_-Os_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2022020620211108
100049224 0 028495 828 1048T:fastergcc_-march=native_-mtune=native_-Os_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE2022020620211108
1002810436 0 032248 852 1080T:fastergcc_-march=native_-mtune=native_-O2_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE2022020620211108
104299827 0 030792 852 1080T:fastergcc_-march=native_-mtune=native_-O_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE2022020620211108
1398911478 0 032200 876 1016T:dolbeau/amd64-sseclang_-march=native_-O_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2022020620211108
1461321928 0 050656 876 1048T:dolbeau/amd64-sseclang_-march=native_-O2_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2022020620211108
1464413373 0 034239 844 1080T:dolbeau/amd64-ssegcc_-march=native_-mtune=native_-O_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE2022020620211108
1469921928 0 050416 876 1048T:dolbeau/amd64-sseclang_-march=native_-O3_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2022020620211108
1514620490 0 044192 852 1080T:dolbeau/amd64-ssegcc_-march=native_-mtune=native_-O3_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE2022020620211108
1525512914 0 034736 852 1080T:dolbeau/amd64-ssegcc_-march=native_-mtune=native_-O2_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE2022020620211108
326178612 0 032288 876 1016T:refclang_-mcpu=native_-O3_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2022020620211108
3428313692 0 042376 876 1048T:refclang_-march=native_-O3_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2022020620211108
3439114620 0 043576 876 1048T:refclang_-march=native_-O2_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2022020620211108
400667656 0 031456 852 1080T:refgcc_-march=native_-mtune=native_-O3_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE2022020620211108
435745427 0 026272 876 1016T:refclang_-march=native_-O_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2022020620211108
439126504 0 028336 852 1080T:refgcc_-march=native_-mtune=native_-O2_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE2022020620211108
607276202 0 027168 852 1080T:refgcc_-march=native_-mtune=native_-O_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE2022020620211108

Test failure

Implementation: T:dolbeau/amd64-avx2
Security model: timingleaks
Compiler: clang -march=native -Os -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE
error 111
crypto_aead_decrypt allows trivial forgeries

Number of similar (compiler,implementation) pairs: 4, namely:
CompilerImplementations
clang -march=native -Os -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE T:dolbeau/amd64-avx2 T:dolbeau/amd64-sse T:ref
gcc -march=native -mtune=native -Os -fomit-frame-pointer -fwrapv -fPIC -fPIE T:ref

Test failure

Implementation: T:dolbeau/amd64-avx2
Security model: timingleaks
Compiler: gcc -march=native -mtune=native -Os -fomit-frame-pointer -fwrapv -fPIC -fPIE
error 111
crypto_aead_decrypt returns nonzero

Number of similar (compiler,implementation) pairs: 2, namely:
CompilerImplementations
gcc -march=native -mtune=native -Os -fomit-frame-pointer -fwrapv -fPIC -fPIE T:dolbeau/amd64-avx2 T:dolbeau/amd64-sse

Test failure

Implementation: T:faster
Security model: timingleaks
Compiler: clang -mcpu=native -O3 -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE
error 111

Number of similar (compiler,implementation) pairs: 1, namely:
CompilerImplementations
clang -mcpu=native -O3 -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE T:faster

Compiler output

Implementation: T:dolbeau/amd64-avx2
Security model: timingleaks
Compiler: clang -mcpu=native -O3 -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE
encrypt.c: encrypt.c:90:2: error: "This code requires AVX2 to work"
encrypt.c: #error "This code requires AVX2 to work"
encrypt.c: ^
encrypt.c: 1 error generated.

Number of similar (compiler,implementation) pairs: 1, namely:
CompilerImplementations
clang -mcpu=native -O3 -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE T:dolbeau/amd64-avx2

Compiler output

Implementation: T:dolbeau/amd64-avx2
Security model: timingleaks
Compiler: gcc -march=native -mtune=native -O2 -fomit-frame-pointer -fwrapv -fPIC -fPIE
encrypt.c: encrypt.c:85: warning: "_bswap64" redefined
encrypt.c: 85 | #define _bswap64(a) __builtin_bswap64(a)
encrypt.c: |
encrypt.c: In file included from /usr/lib/gcc/x86_64-pc-linux-gnu/11.1.0/include/x86gprintrin.h:27,
encrypt.c: from /usr/lib/gcc/x86_64-pc-linux-gnu/11.1.0/include/immintrin.h:27,
encrypt.c: from encrypt.c:54:
encrypt.c: /usr/lib/gcc/x86_64-pc-linux-gnu/11.1.0/include/ia32intrin.h:263: note: this is the location of the previous definition
encrypt.c: 263 | #define _bswap64(a) __bswapq(a)
encrypt.c: |
encrypt.c: encrypt.c:86: warning: "_bswap" redefined
encrypt.c: 86 | #define _bswap(a) __builtin_bswap(a)
encrypt.c: |
encrypt.c: In file included from /usr/lib/gcc/x86_64-pc-linux-gnu/11.1.0/include/x86gprintrin.h:27,
encrypt.c: from /usr/lib/gcc/x86_64-pc-linux-gnu/11.1.0/include/immintrin.h:27,
encrypt.c: from encrypt.c:54:
encrypt.c: /usr/lib/gcc/x86_64-pc-linux-gnu/11.1.0/include/ia32intrin.h:297: note: this is the location of the previous definition
encrypt.c: 297 | #define _bswap(a) __bswapd(a)
encrypt.c: |

Number of similar (compiler,implementation) pairs: 4, namely:
CompilerImplementations
gcc -march=native -mtune=native -O2 -fomit-frame-pointer -fwrapv -fPIC -fPIE T:dolbeau/amd64-avx2
gcc -march=native -mtune=native -O3 -fomit-frame-pointer -fwrapv -fPIC -fPIE T:dolbeau/amd64-avx2
gcc -march=native -mtune=native -O -fomit-frame-pointer -fwrapv -fPIC -fPIE T:dolbeau/amd64-avx2
gcc -march=native -mtune=native -Os -fomit-frame-pointer -fwrapv -fPIC -fPIE T:dolbeau/amd64-avx2

Compiler output

Implementation: T:dolbeau/amd64-avx512
Security model: timingleaks
Compiler: clang -march=native -O2 -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE
encrypt.c: encrypt.c:90:2: error: "This code requires AVX512F to work"
encrypt.c: #error "This code requires AVX512F to work"
encrypt.c: ^
encrypt.c: encrypt.c:330:20: error: conflicting types for '_mm512_reduce_add_epi64'
encrypt.c: unsigned long long _mm512_reduce_add_epi64 (__m512i a) {
encrypt.c: ^
encrypt.c: /usr/lib/clang/13.0.0/include/avx512fintrin.h:9314:51: note: previous definition is here
encrypt.c: static __inline__ long long __DEFAULT_FN_ATTRS512 _mm512_reduce_add_epi64(__m512i __W) {
encrypt.c: ^
encrypt.c: encrypt.c:343:21: error: invalid input size for constraint 'Yz'
encrypt.c: : [a] "Yz" (a)
encrypt.c: ^
encrypt.c: 3 errors generated.

Number of similar (compiler,implementation) pairs: 5, namely:
CompilerImplementations
clang -march=native -O2 -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE T:dolbeau/amd64-avx512
clang -march=native -O3 -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE T:dolbeau/amd64-avx512
clang -march=native -O -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE T:dolbeau/amd64-avx512
clang -march=native -Os -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE T:dolbeau/amd64-avx512
clang -mcpu=native -O3 -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE T:dolbeau/amd64-avx512

Compiler output

Implementation: T:dolbeau/amd64-avx512
Security model: timingleaks
Compiler: gcc -march=native -mtune=native -O2 -fomit-frame-pointer -fwrapv -fPIC -fPIE
encrypt.c: encrypt.c:85: warning: "_bswap64" redefined
encrypt.c: 85 | #define _bswap64(a) __builtin_bswap64(a)
encrypt.c: |
encrypt.c: In file included from /usr/lib/gcc/x86_64-pc-linux-gnu/11.1.0/include/x86gprintrin.h:27,
encrypt.c: from /usr/lib/gcc/x86_64-pc-linux-gnu/11.1.0/include/immintrin.h:27,
encrypt.c: from encrypt.c:54:
encrypt.c: /usr/lib/gcc/x86_64-pc-linux-gnu/11.1.0/include/ia32intrin.h:263: note: this is the location of the previous definition
encrypt.c: 263 | #define _bswap64(a) __bswapq(a)
encrypt.c: |
encrypt.c: encrypt.c:86: warning: "_bswap" redefined
encrypt.c: 86 | #define _bswap(a) __builtin_bswap(a)
encrypt.c: |
encrypt.c: In file included from /usr/lib/gcc/x86_64-pc-linux-gnu/11.1.0/include/x86gprintrin.h:27,
encrypt.c: from /usr/lib/gcc/x86_64-pc-linux-gnu/11.1.0/include/immintrin.h:27,
encrypt.c: from encrypt.c:54:
encrypt.c: /usr/lib/gcc/x86_64-pc-linux-gnu/11.1.0/include/ia32intrin.h:297: note: this is the location of the previous definition
encrypt.c: 297 | #define _bswap(a) __bswapd(a)
encrypt.c: |
encrypt.c: encrypt.c:90:2: error: #error "This code requires AVX512F to work"
encrypt.c: 90 | #error "This code requires AVX512F to work"
encrypt.c: | ^~~~~
encrypt.c: encrypt.c:330:20: error: conflicting types for ‘_mm512_reduce_add_epi64’; have ‘long long unsigned int(__m512i)’ {aka ‘long long unsigned int(__vector(8) long long int)’}
encrypt.c: 330 | unsigned long long _mm512_reduce_add_epi64 (__m512i a) {
encrypt.c: | ^~~~~~~~~~~~~~~~~~~~~~~
encrypt.c: In file included from /usr/lib/gcc/x86_64-pc-linux-gnu/11.1.0/include/immintrin.h:49,
encrypt.c: ...

Number of similar (compiler,implementation) pairs: 4, namely:
CompilerImplementations
gcc -march=native -mtune=native -O2 -fomit-frame-pointer -fwrapv -fPIC -fPIE T:dolbeau/amd64-avx512
gcc -march=native -mtune=native -O3 -fomit-frame-pointer -fwrapv -fPIC -fPIE T:dolbeau/amd64-avx512
gcc -march=native -mtune=native -O -fomit-frame-pointer -fwrapv -fPIC -fPIE T:dolbeau/amd64-avx512
gcc -march=native -mtune=native -Os -fomit-frame-pointer -fwrapv -fPIC -fPIE T:dolbeau/amd64-avx512

Compiler output

Implementation: T:dolbeau/amd64-sse
Security model: timingleaks
Compiler: clang -mcpu=native -O3 -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE
encrypt.c: In file included from encrypt.c:192:
encrypt.c: ./c256.h:98:7: error: always_inline function '_mm_shuffle_epi8' requires target feature 'ssse3', but would be inlined into function 'chacha_noxor368' that is compiled without support for 'ssse3'
encrypt.c: VEC4_QUARTERROUND( 0, 4, 8,12);
encrypt.c: ^
encrypt.c: ./c256.h:17:36: note: expanded from macro 'VEC4_QUARTERROUND'
encrypt.c: #define VEC4_QUARTERROUND(a,b,c,d) VEC4_QUARTERROUND_SHUFFLE(a,b,c,d)
encrypt.c: ^
encrypt.c: ./c256.h:12:86: note: expanded from macro 'VEC4_QUARTERROUND_SHUFFLE'
encrypt.c: x_##a = _mm_add_epi32(x_##a, x_##b); t_##a = _mm_xor_si128(x_##d, x_##a); x_##d = _mm_shuffle_epi8(t_##a, rot16); \
encrypt.c: ^
encrypt.c: ./c256.h:98:7: error: always_inline function '_mm_shuffle_epi8' requires target feature 'ssse3', but would be inlined into function 'chacha_noxor368' that is compiled without support for 'ssse3'
encrypt.c: ./c256.h:17:36: note: expanded from macro 'VEC4_QUARTERROUND'
encrypt.c: #define VEC4_QUARTERROUND(a,b,c,d) VEC4_QUARTERROUND_SHUFFLE(a,b,c,d)
encrypt.c: ^
encrypt.c: ./c256.h:14:86: note: expanded from macro 'VEC4_QUARTERROUND_SHUFFLE'
encrypt.c: x_##a = _mm_add_epi32(x_##a, x_##b); t_##a = _mm_xor_si128(x_##d, x_##a); x_##d = _mm_shuffle_epi8(t_##a, rot8); \
encrypt.c: ^
encrypt.c: ./c256.h:99:7: error: always_inline function '_mm_shuffle_epi8' requires target feature 'ssse3', but would be inlined into function 'chacha_noxor368' that is compiled without support for 'ssse3'
encrypt.c: VEC4_QUARTERROUND( 1, 5, 9,13);
encrypt.c: ^
encrypt.c: ./c256.h:17:36: note: expanded from macro 'VEC4_QUARTERROUND'
encrypt.c: #define VEC4_QUARTERROUND(a,b,c,d) VEC4_QUARTERROUND_SHUFFLE(a,b,c,d)
encrypt.c: ^
encrypt.c: ./c256.h:12:86: note: expanded from macro 'VEC4_QUARTERROUND_SHUFFLE'
encrypt.c: x_##a = _mm_add_epi32(x_##a, x_##b); t_##a = _mm_xor_si128(x_##d, x_##a); x_##d = _mm_shuffle_epi8(t_##a, rot16); \
encrypt.c: ...

Number of similar (compiler,implementation) pairs: 1, namely:
CompilerImplementations
clang -mcpu=native -O3 -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE T:dolbeau/amd64-sse