Implementation notes: amd64, par, crypto_aead/hs1sivhiv2

Computer: par
Architecture: amd64
CPU ID: GenuineIntel-000406c3-bfebfbff
SUPERCOP version: 20161026
Operation: crypto_aead
Primitive: hs1sivhiv2

Time	Implementation	Compiler	Benchmark date	SUPERCOP version
67660	`dolbeau/amd64-sse`	`gcc -funroll-loops -march=native -mcpu=native -O2`	20161214	20161026
68140	`dolbeau/amd64-sse`	`gcc -funroll-loops -march=native -mcpu=native -O3`	20161214	20161026
70600	`dolbeau/amd64-sse`	`gcc -funroll-loops -march=native -mcpu=native -Os`	20161214	20161026
70600	`dolbeau/amd64-sse`	`gcc -march=native -mcpu=native -Os`	20161214	20161026
73060	`faster`	`gcc -funroll-loops -march=native -mcpu=native -O3`	20161214	20161026
73060	`dolbeau/amd64-sse`	`gcc -march=native -mcpu=native -O2`	20161214	20161026
73080	`faster`	`gcc -march=native -mcpu=native -O3`	20161214	20161026
73160	`dolbeau/amd64-sse`	`gcc -march=native -mcpu=native -O3`	20161214	20161026
73380	`faster`	`gcc -funroll-loops -march=native -mcpu=native -O2`	20161214	20161026
73640	`faster`	`gcc -march=native -mcpu=native -O2`	20161214	20161026
77260	`faster`	`gcc -funroll-loops -march=native -mcpu=native -Os`	20161214	20161026
78040	`faster`	`gcc -march=native -mcpu=native -Os`	20161214	20161026
101120	`ref`	`gcc -funroll-loops -march=native -mcpu=native -O3`	20161214	20161026
102820	`ref`	`gcc -funroll-loops -march=native -mcpu=native -O2`	20161214	20161026
103900	`ref`	`gcc -march=native -mcpu=native -O3`	20161214	20161026
129160	`ref`	`gcc -march=native -mcpu=native -O2`	20161214	20161026
140600	`ref`	`gcc -funroll-loops -march=native -mcpu=native -Os`	20161214	20161026
141140	`ref`	`gcc -march=native -mcpu=native -Os`	20161214	20161026
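
As a rough reading aid for the table above, the sketch below compares the best listed time of each implementation against the best `ref` time. The three values are copied from the table. The helper name `speedup_over` is hypothetical and not part of SUPERCOP, and since the table does not state the unit of the Time column, only ratios are computed.

```c
#include <assert.h>

/* Best (lowest) Time values copied from the table above. */
static const double BEST_SSE    = 67660.0;   /* dolbeau/amd64-sse, -funroll-loops -O2 */
static const double BEST_FASTER = 73060.0;   /* faster, -funroll-loops -O3 */
static const double BEST_REF    = 101120.0;  /* ref, -funroll-loops -O3 */

/* Hypothetical helper: how many times faster `candidate` is than `baseline`. */
static double speedup_over(double baseline, double candidate)
{
    assert(candidate > 0.0);
    return baseline / candidate;
}
```

With these inputs, `speedup_over(BEST_REF, BEST_SSE)` comes out at roughly 1.49 and `speedup_over(BEST_REF, BEST_FASTER)` at roughly 1.38. These ratios describe this machine and this set of compiler options only.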

Compiler output

Implementation: crypto_aead/hs1sivhiv2/dolbeau/amd64-avx2
Compiler: gcc -funroll-loops -march=native -mcpu=native -O2

encrypt.c: gcc: warning: '-mcpu=' is deprecated; use '-mtune=' or '-march=' instead
encrypt.c: encrypt.c:90:2: error: #error "This code requires AVX2 to work"
encrypt.c: #error "This code requires AVX2 to work"
encrypt.c: ^~~~~
encrypt.c: In file included from encrypt.c:195:0:
encrypt.c: c368.h: In function 'chacha_noxor368':
encrypt.c: c368.h:110:11: warning: AVX vector return without AVX enabled changes the ABI [-Wpsabi]
encrypt.c: __m256i rot16 = _mm256_set_epi8(13,12,15,14,9,8,11,10,5,4,7,6,1,0,3,2,13,12,15,14,9,8,11,10,5,4,7,6,1,0,3,2);
encrypt.c: ^~~~~

Number of similar (compiler,implementation) pairs: 6, namely:

Compiler	Implementations
gcc -funroll-loops -march=native -mcpu=native -O2	dolbeau/amd64-avx2
gcc -funroll-loops -march=native -mcpu=native -O3	dolbeau/amd64-avx2
gcc -funroll-loops -march=native -mcpu=native -Os	dolbeau/amd64-avx2
gcc -march=native -mcpu=native -O2	dolbeau/amd64-avx2
gcc -march=native -mcpu=native -O3	dolbeau/amd64-avx2
gcc -march=native -mcpu=native -Os	dolbeau/amd64-avx2
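
The `#error` in the log above is consistent with a compile-time guard on the compiler's predefined feature macros: with `-march=native` on this host, GCC evidently did not define `__AVX2__`, so the AVX2 variant refused to build for every option set listed. The following is a minimal sketch of that kind of check, not the contents of `encrypt.c`. The function names `widest_vector_isa` and `variant_can_build` are hypothetical; the macros `__SSE2__`, `__AVX2__` and `__AVX512F__` are the ones GCC and Clang predefine when the corresponding instruction set is enabled.

```c
#include <string.h>

/* Hypothetical helper: names the widest vector extension the compiler
   was told to target, based on its predefined feature macros. */
static const char *widest_vector_isa(void)
{
#if defined(__AVX512F__)
    return "avx512f";
#elif defined(__AVX2__)
    return "avx2";
#elif defined(__SSE2__)
    return "sse2";
#else
    return "scalar";
#endif
}

/* Hypothetical helper: nonzero if a variant that needs `required`
   ("scalar", "sse2", "avx2" or "avx512f") could be compiled with the
   current flags. Unknown names are reported as not buildable. */
static int variant_can_build(const char *required)
{
    static const char *const order[] = { "scalar", "sse2", "avx2", "avx512f" };
    const char *available = widest_vector_isa();
    int have = -1, need = -1;
    int i;

    for (i = 0; i < 4; i++) {
        if (strcmp(order[i], available) == 0) have = i;
        if (strcmp(order[i], required) == 0)  need = i;
    }
    return need >= 0 && need <= have;
}
```

A source file that hard-requires one extension would typically replace the runtime answer with `#error`, which is what stops the build here. On a target where `__AVX2__` is not defined, `variant_can_build("avx2")` returns 0 while `variant_can_build("sse2")` still returns 1 on amd64, matching the pattern in this report where only the SSE variant was measured.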

Compiler output

Implementation: crypto_aead/hs1sivhiv2/dolbeau/amd64-avx512
Compiler: gcc -funroll-loops -march=native -mcpu=native -O2

encrypt.c: gcc: warning: '-mcpu=' is deprecated; use '-mtune=' or '-march=' instead
encrypt.c: encrypt.c:90:2: error: #error "This code requires AVX512F to work"
encrypt.c: #error "This code requires AVX512F to work"
encrypt.c: ^~~~~
encrypt.c: encrypt.c: In function '_mm512_reduce_add_epi64':
encrypt.c: encrypt.c:329:20: note: The ABI for passing parameters with 64-byte alignment has changed in GCC 4.6
encrypt.c: unsigned long long _mm512_reduce_add_epi64 (__m512i a) {
encrypt.c: ^~~~~~~~~~~~~~~~~~~~~~~
encrypt.c: In file included from encrypt.c:195:0:
encrypt.c: c368.h: In function 'chacha_noxor368':
encrypt.c: c368.h:110:11: warning: AVX vector return without AVX enabled changes the ABI [-Wpsabi]
encrypt.c: __m256i rot16 = _mm256_set_epi8(13,12,15,14,9,8,11,10,5,4,7,6,1,0,3,2,13,12,15,14,9,8,11,10,5,4,7,6,1,0,3,2);
encrypt.c: ^~~~~
encrypt.c: encrypt.c: In function 'prf_hash2_3':
encrypt.c: encrypt.c:505:19: warning: AVX512F vector return without AVX512F enabled changes the ABI [-Wpsabi]
encrypt.c: __m512i kv0 = _mm512_loadu_si512((const __m512i*)(nhkey+ 0)); // 1
encrypt.c: ^~~

Number of similar (compiler,implementation) pairs: 6, namely:

Compiler	Implementations
gcc -funroll-loops -march=native -mcpu=native -O2	dolbeau/amd64-avx512
gcc -funroll-loops -march=native -mcpu=native -O3	dolbeau/amd64-avx512
gcc -funroll-loops -march=native -mcpu=native -Os	dolbeau/amd64-avx512
gcc -march=native -mcpu=native -O2	dolbeau/amd64-avx512
gcc -march=native -mcpu=native -O3	dolbeau/amd64-avx512
gcc -march=native -mcpu=native -Os	dolbeau/amd64-avx512

Compiler output

Implementation: crypto_aead/hs1sivhiv2/dolbeau/amd64-sse
Compiler: gcc -funroll-loops -march=native -mcpu=native -O2

encrypt.c: gcc: warning: '-mcpu=' is deprecated; use '-mtune=' or '-march=' instead
try.c: gcc: warning: '-mcpu=' is deprecated; use '-mtune=' or '-march=' instead
try.c: gcc: warning: '-mcpu=' is deprecated; use '-mtune=' or '-march=' instead
measure.c: gcc: warning: '-mcpu=' is deprecated; use '-mtune=' or '-march=' instead

Number of similar (compiler,implementation) pairs: 12, namely:

Compiler	Implementations
gcc -funroll-loops -march=native -mcpu=native -O2	dolbeau/amd64-sse ref
gcc -funroll-loops -march=native -mcpu=native -O3	dolbeau/amd64-sse ref
gcc -funroll-loops -march=native -mcpu=native -Os	dolbeau/amd64-sse ref
gcc -march=native -mcpu=native -O2	dolbeau/amd64-sse ref
gcc -march=native -mcpu=native -O3	dolbeau/amd64-sse ref
gcc -march=native -mcpu=native -Os	dolbeau/amd64-sse ref

Compiler output

Implementation: crypto_aead/hs1sivhiv2/faster
Compiler: gcc -funroll-loops -march=native -mcpu=native -O2

hs1.c: gcc: warning: '-mcpu=' is deprecated; use '-mtune=' or '-march=' instead
chacha_moon.S: gcc: warning: '-mcpu=' is deprecated; use '-mtune=' or '-march=' instead
try.c: gcc: warning: '-mcpu=' is deprecated; use '-mtune=' or '-march=' instead
try.c: gcc: warning: '-mcpu=' is deprecated; use '-mtune=' or '-march=' instead
measure.c: gcc: warning: '-mcpu=' is deprecated; use '-mtune=' or '-march=' instead

Number of similar (compiler,implementation) pairs: 6, namely:

Compiler	Implementations
gcc -funroll-loops -march=native -mcpu=native -O2	faster
gcc -funroll-loops -march=native -mcpu=native -O3	faster
gcc -funroll-loops -march=native -mcpu=native -Os	faster
gcc -march=native -mcpu=native -O2	faster
gcc -march=native -mcpu=native -O3	faster
gcc -march=native -mcpu=native -Os	faster