Implementation notes: amd64, latour, crypto_aead/hs1sivv2

Computer: latour
Architecture: amd64
CPU ID: GenuineIntel-000006fb-bfebfbff
SUPERCOP version: 20201130
Operation: crypto_aead
Primitive: hs1sivv2
TimeObject sizeTest sizeImplementationCompilerBenchmark dateSUPERCOP version
2172611541 0 031976 808 856T:dolbeau/amd64-sseclang_-march=native_-O3_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2020083020200826
2173511541 0 031040 808 856T:dolbeau/amd64-sseclang_-march=native_-O_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2020083020200826
2176210791 0 029110 800 856T:dolbeau/amd64-sseclang_-march=native_-Os_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2020083020200826
2181611541 0 031040 808 856T:dolbeau/amd64-sseclang_-march=native_-O2_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2020083020200826
2387724265 0 044787 744 928T:dolbeau/amd64-ssegcc_-march=native_-mtune=native_-O2_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE2020083020200826
2393124714 0 047444 752 928T:dolbeau/amd64-ssegcc_-march=native_-mtune=native_-O3_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE2020083020200826
2401223525 0 043379 744 928T:dolbeau/amd64-ssegcc_-march=native_-mtune=native_-O_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE2020083020200826
245619433 0 028231 728 896T:dolbeau/amd64-ssegcc_-march=native_-mtune=native_-Os_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE2020083020200826
398705523 0 025160 808 856T:refclang_-march=native_-O_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2020083020200826
403747030 0 027488 808 856T:refclang_-march=native_-O3_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2020083020200826
403836958 0 027280 808 856T:refclang_-mcpu=native_-O3_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2020083020200826
404015523 0 025160 808 856T:refclang_-march=native_-O2_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2020083020200826
421116383 0 029268 752 928T:refgcc_-march=native_-mtune=native_-O3_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE2020083020200826
470884462 0 022950 800 856T:refclang_-march=native_-Os_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2020083020200826
546215703 0 026379 744 928T:refgcc_-march=native_-mtune=native_-O2_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE2020083020200826
596524530 0 023359 728 896T:refgcc_-march=native_-mtune=native_-Os_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE2020083020200826
670055843 0 025964 752 928T:refgcc_-march=native_-mtune=native_-O_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE2020083020200826

Test failure

Implementation: T:faster
Security model: timingleaks
Compiler: clang -march=native -O2 -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE
error 111

Number of similar (compiler,implementation) pairs: 9, namely:
CompilerImplementations
clang -march=native -O2 -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE T:faster
clang -march=native -O3 -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE T:faster
clang -march=native -O -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE T:faster
clang -march=native -Os -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE T:faster
clang -mcpu=native -O3 -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE T:faster
gcc -march=native -mtune=native -O2 -fomit-frame-pointer -fwrapv -fPIC -fPIE T:faster
gcc -march=native -mtune=native -O3 -fomit-frame-pointer -fwrapv -fPIC -fPIE T:faster
gcc -march=native -mtune=native -O -fomit-frame-pointer -fwrapv -fPIC -fPIE T:faster
gcc -march=native -mtune=native -Os -fomit-frame-pointer -fwrapv -fPIC -fPIE T:faster

Compiler output

Implementation: T:dolbeau/amd64-avx2
Security model: timingleaks
Compiler: clang -march=native -O2 -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE
encrypt.c: encrypt.c:90:2: error: "This code requires AVX2 to work"
encrypt.c: #error "This code requires AVX2 to work"
encrypt.c: ^
encrypt.c: 1 error generated.

Number of similar (compiler,implementation) pairs: 5, namely:
CompilerImplementations
clang -march=native -O2 -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE T:dolbeau/amd64-avx2
clang -march=native -O3 -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE T:dolbeau/amd64-avx2
clang -march=native -O -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE T:dolbeau/amd64-avx2
clang -march=native -Os -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE T:dolbeau/amd64-avx2
clang -mcpu=native -O3 -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE T:dolbeau/amd64-avx2

Compiler output

Implementation: T:dolbeau/amd64-avx2
Security model: timingleaks
Compiler: gcc -march=native -mtune=native -O2 -fomit-frame-pointer -fwrapv -fPIC -fPIE
encrypt.c: encrypt.c:90:2: error: #error "This code requires AVX2 to work"
encrypt.c: #error "This code requires AVX2 to work"
encrypt.c: ^
encrypt.c: encrypt.c: In function 'prf_hash2_2':
encrypt.c: encrypt.c:425:19: warning: AVX vector return without AVX enabled changes the ABI [-Wpsabi]
encrypt.c: __m256i kv0 = _mm256_loadu_si256((const __m256i*)(nhkey+ 0)); // 1
encrypt.c: ^

Number of similar (compiler,implementation) pairs: 4, namely:
CompilerImplementations
gcc -march=native -mtune=native -O2 -fomit-frame-pointer -fwrapv -fPIC -fPIE T:dolbeau/amd64-avx2
gcc -march=native -mtune=native -O3 -fomit-frame-pointer -fwrapv -fPIC -fPIE T:dolbeau/amd64-avx2
gcc -march=native -mtune=native -O -fomit-frame-pointer -fwrapv -fPIC -fPIE T:dolbeau/amd64-avx2
gcc -march=native -mtune=native -Os -fomit-frame-pointer -fwrapv -fPIC -fPIE T:dolbeau/amd64-avx2

Compiler output

Implementation: T:dolbeau/amd64-avx512
Security model: timingleaks
Compiler: clang -march=native -O2 -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE
encrypt.c: encrypt.c:90:2: error: "This code requires AVX512F to work"
encrypt.c: #error "This code requires AVX512F to work"
encrypt.c: ^
encrypt.c: encrypt.c:335:15: error: invalid input constraint 'Yz' in asm
encrypt.c: : [a] "Yz" (a)
encrypt.c: ^
encrypt.c: encrypt.c:483:26: warning: implicit declaration of function '_mm512_loadu_si512' is invalid in C99 [-Wimplicit-function-declaration]
encrypt.c: __m512i kv0 = _mm512_loadu_si512((const __m512i*)(nhkey+ 0)); // 1
encrypt.c: ^
encrypt.c: encrypt.c:483:19: error: initializing '__m512i' (vector of 8 'long long' values) with an expression of incompatible type 'int'
encrypt.c: __m512i kv0 = _mm512_loadu_si512((const __m512i*)(nhkey+ 0)); // 1
encrypt.c: ^ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
encrypt.c: encrypt.c:484:19: error: initializing '__m512i' (vector of 8 'long long' values) with an expression of incompatible type 'int'
encrypt.c: __m512i kv4 = _mm512_loadu_si512((const __m512i*)(nhkey+ 4)); // 1
encrypt.c: ^ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
encrypt.c: encrypt.c:485:19: error: initializing '__m512i' (vector of 8 'long long' values) with an expression of incompatible type 'int'
encrypt.c: __m512i kv8 = _mm512_loadu_si512((const __m512i*)(nhkey+ 8)); // 1, 2
encrypt.c: ^ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
encrypt.c: encrypt.c:486:19: error: initializing '__m512i' (vector of 8 'long long' values) with an expression of incompatible type 'int'
encrypt.c: __m512i kv12 = _mm512_loadu_si512((const __m512i*)(nhkey+12)); // 1, 2
encrypt.c: ^ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
encrypt.c: encrypt.c:487:19: error: initializing '__m512i' (vector of 8 'long long' values) with an expression of incompatible type 'int'
encrypt.c: __m512i kv16 = _mm512_loadu_si512((const __m512i*)(nhkey+16)); // 2
encrypt.c: ^ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
encrypt.c: encrypt.c:488:19: error: initializing '__m512i' (vector of 8 'long long' values) with an expression of incompatible type 'int'
encrypt.c: ...

Number of similar (compiler,implementation) pairs: 5, namely:
CompilerImplementations
clang -march=native -O2 -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE T:dolbeau/amd64-avx512
clang -march=native -O3 -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE T:dolbeau/amd64-avx512
clang -march=native -O -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE T:dolbeau/amd64-avx512
clang -march=native -Os -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE T:dolbeau/amd64-avx512
clang -mcpu=native -O3 -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE T:dolbeau/amd64-avx512

Compiler output

Implementation: T:dolbeau/amd64-avx512
Security model: timingleaks
Compiler: gcc -march=native -mtune=native -O2 -fomit-frame-pointer -fwrapv -fPIC -fPIE
encrypt.c: encrypt.c:90:2: error: #error "This code requires AVX512F to work"
encrypt.c: #error "This code requires AVX512F to work"
encrypt.c: ^
encrypt.c: encrypt.c: In function '_mm512_reduce_add_epi64':
encrypt.c: encrypt.c:322:20: note: The ABI for passing parameters with 64-byte alignment has changed in GCC 4.6
encrypt.c: unsigned long long _mm512_reduce_add_epi64 (__m512i a) {
encrypt.c: ^
encrypt.c: encrypt.c: In function 'prf_hash2_2':
encrypt.c: encrypt.c:483:19: warning: AVX512F vector return without AVX512F enabled changes the ABI [-Wpsabi]
encrypt.c: __m512i kv0 = _mm512_loadu_si512((const __m512i*)(nhkey+ 0)); // 1
encrypt.c: ^
encrypt.c: encrypt.c:502:19: warning: AVX vector return without AVX enabled changes the ABI [-Wpsabi]
encrypt.c: __m256i inv0 = _mm256_inserti128_si256(_mm256_castsi128_si256(inv0lo), inv0lo, 1);
encrypt.c: ^

Number of similar (compiler,implementation) pairs: 4, namely:
CompilerImplementations
gcc -march=native -mtune=native -O2 -fomit-frame-pointer -fwrapv -fPIC -fPIE T:dolbeau/amd64-avx512
gcc -march=native -mtune=native -O3 -fomit-frame-pointer -fwrapv -fPIC -fPIE T:dolbeau/amd64-avx512
gcc -march=native -mtune=native -O -fomit-frame-pointer -fwrapv -fPIC -fPIE T:dolbeau/amd64-avx512
gcc -march=native -mtune=native -Os -fomit-frame-pointer -fwrapv -fPIC -fPIE T:dolbeau/amd64-avx512

Compiler output

Implementation: T:dolbeau/amd64-sse
Security model: timingleaks
Compiler: clang -mcpu=native -O3 -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE
encrypt.c: In file included from encrypt.c:190:
encrypt.c: ./c176.h:99:7: error: always_inline function '_mm_shuffle_epi8' requires target feature 'ssse3', but would be inlined into function 'chacha_noxor176' that is compiled without support for 'ssse3'
encrypt.c: VEC4_QUARTERROUND( 0, 4, 8,12);
encrypt.c: ^
encrypt.c: ./c176.h:17:36: note: expanded from macro 'VEC4_QUARTERROUND'
encrypt.c: #define VEC4_QUARTERROUND(a,b,c,d) VEC4_QUARTERROUND_SHUFFLE(a,b,c,d)
encrypt.c: ^
encrypt.c: ./c176.h:12:86: note: expanded from macro 'VEC4_QUARTERROUND_SHUFFLE'
encrypt.c: x_##a = _mm_add_epi32(x_##a, x_##b); t_##a = _mm_xor_si128(x_##d, x_##a); x_##d = _mm_shuffle_epi8(t_##a, rot16); \
encrypt.c: ^
encrypt.c: ./c176.h:99:7: error: always_inline function '_mm_shuffle_epi8' requires target feature 'ssse3', but would be inlined into function 'chacha_noxor176' that is compiled without support for 'ssse3'
encrypt.c: ./c176.h:17:36: note: expanded from macro 'VEC4_QUARTERROUND'
encrypt.c: #define VEC4_QUARTERROUND(a,b,c,d) VEC4_QUARTERROUND_SHUFFLE(a,b,c,d)
encrypt.c: ^
encrypt.c: ./c176.h:14:86: note: expanded from macro 'VEC4_QUARTERROUND_SHUFFLE'
encrypt.c: x_##a = _mm_add_epi32(x_##a, x_##b); t_##a = _mm_xor_si128(x_##d, x_##a); x_##d = _mm_shuffle_epi8(t_##a, rot8); \
encrypt.c: ^
encrypt.c: ./c176.h:100:7: error: always_inline function '_mm_shuffle_epi8' requires target feature 'ssse3', but would be inlined into function 'chacha_noxor176' that is compiled without support for 'ssse3'
encrypt.c: VEC4_QUARTERROUND( 1, 5, 9,13);
encrypt.c: ^
encrypt.c: ./c176.h:17:36: note: expanded from macro 'VEC4_QUARTERROUND'
encrypt.c: #define VEC4_QUARTERROUND(a,b,c,d) VEC4_QUARTERROUND_SHUFFLE(a,b,c,d)
encrypt.c: ^
encrypt.c: ./c176.h:12:86: note: expanded from macro 'VEC4_QUARTERROUND_SHUFFLE'
encrypt.c: x_##a = _mm_add_epi32(x_##a, x_##b); t_##a = _mm_xor_si128(x_##d, x_##a); x_##d = _mm_shuffle_epi8(t_##a, rot16); \
encrypt.c: ...

Number of similar (compiler,implementation) pairs: 1, namely:
CompilerImplementations
clang -mcpu=native -O3 -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE T:dolbeau/amd64-sse

Namespace violations

Implementation: T:ref
Security model: timingleaks
Compiler: clang -march=native -O2 -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE
encrypt.o chacha T
encrypt.o chacha_ivsetup T
encrypt.o chacha_keysetup T
encrypt.o hs1 T
encrypt.o hs1_hash T
encrypt.o hs1siv_chacha256 T
encrypt.o hs1siv_decrypt T
encrypt.o hs1siv_encrypt T
encrypt.o hs1siv_subkeygen T
encrypt.o prf_hash2 T

Number of similar (compiler,implementation) pairs: 9, namely:
CompilerImplementations
clang -march=native -O2 -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE T:ref
clang -march=native -O3 -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE T:ref
clang -march=native -O -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE T:ref
clang -march=native -Os -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE T:ref
clang -mcpu=native -O3 -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE T:ref
gcc -march=native -mtune=native -O2 -fomit-frame-pointer -fwrapv -fPIC -fPIE T:ref
gcc -march=native -mtune=native -O3 -fomit-frame-pointer -fwrapv -fPIC -fPIE T:ref
gcc -march=native -mtune=native -O -fomit-frame-pointer -fwrapv -fPIC -fPIE T:ref
gcc -march=native -mtune=native -Os -fomit-frame-pointer -fwrapv -fPIC -fPIE T:ref