Implementation notes: amd64, slide, crypto_hash/blake512

Computer: slide
Architecture: amd64
CPU ID: GenuineIntel-00040651-bfebfbff
SUPERCOP version: 20160806
Operation: crypto_hash
Primitive: blake512
Time   Implementation   Compiler                                                          Benchmark date  SUPERCOP version
8904   avxicc           gcc -march=native -mtune=native -O3 -fomit-frame-pointer -fwrapv  20160906        20160806
9360   vect128          gcc -march=native -mtune=native -Os -fomit-frame-pointer -fwrapv  20160906        20160806
9788   sse41            gcc -march=native -mtune=native -O2 -fomit-frame-pointer -fwrapv  20160906        20160806
9876   sse41            gcc -march=native -mtune=native -O3 -fomit-frame-pointer -fwrapv  20160906        20160806
9988   avxicc           gcc -march=native -mtune=native -O2 -fomit-frame-pointer -fwrapv  20160906        20160806
10024  avxicc           gcc -march=native -mtune=native -O -fomit-frame-pointer -fwrapv   20160906        20160806
10104  sse41            gcc -march=native -mtune=native -Os -fomit-frame-pointer -fwrapv  20160906        20160806
10308  avxicc           gcc -march=native -mtune=native -Os -fomit-frame-pointer -fwrapv  20160906        20160806
10432  sse41            gcc -march=native -mtune=native -O -fomit-frame-pointer -fwrapv   20160906        20160806
10520  vect128          gcc -march=native -mtune=native -O3 -fomit-frame-pointer -fwrapv  20160906        20160806
10592  vect128          gcc -march=native -mtune=native -O2 -fomit-frame-pointer -fwrapv  20160906        20160806
10600  sphlib           gcc -march=native -mtune=native -O3 -fomit-frame-pointer -fwrapv  20160906        20160806
10684  vect128          gcc -march=native -mtune=native -O -fomit-frame-pointer -fwrapv   20160906        20160806
10708  sphlib           gcc -march=native -mtune=native -O2 -fomit-frame-pointer -fwrapv  20160906        20160806
10852  bswap            gcc -march=native -mtune=native -Os -fomit-frame-pointer -fwrapv  20160906        20160806
10868  vect128-inplace  gcc -march=native -mtune=native -O3 -fomit-frame-pointer -fwrapv  20160906        20160806
10900  bswap            gcc -march=native -mtune=native -O3 -fomit-frame-pointer -fwrapv  20160906        20160806
10912  bswap            gcc -march=native -mtune=native -O -fomit-frame-pointer -fwrapv   20160906        20160806
10920  sandy            gcc -march=native -mtune=native -Os -fomit-frame-pointer -fwrapv  20160906        20160806
10960  bswap            gcc -march=native -mtune=native -O2 -fomit-frame-pointer -fwrapv  20160906        20160806
10964  vect128-inplace  gcc -march=native -mtune=native -O -fomit-frame-pointer -fwrapv   20160906        20160806
10984  vect128-inplace  gcc -march=native -mtune=native -O2 -fomit-frame-pointer -fwrapv  20160906        20160806
10984  sphlib           gcc -march=native -mtune=native -O -fomit-frame-pointer -fwrapv   20160906        20160806
11252  vect128-inplace  gcc -march=native -mtune=native -Os -fomit-frame-pointer -fwrapv  20160906        20160806
11352  sphlib           gcc -march=native -mtune=native -Os -fomit-frame-pointer -fwrapv  20160906        20160806
11580  sandy            gcc -march=native -mtune=native -O3 -fomit-frame-pointer -fwrapv  20160906        20160806
11608  sandy            gcc -march=native -mtune=native -O2 -fomit-frame-pointer -fwrapv  20160906        20160806
11760  sandy            gcc -march=native -mtune=native -O -fomit-frame-pointer -fwrapv   20160906        20160806
11952  regs             gcc -march=native -mtune=native -Os -fomit-frame-pointer -fwrapv  20160906        20160806
12048  regs             gcc -march=native -mtune=native -O3 -fomit-frame-pointer -fwrapv  20160906        20160806
12072  regs             gcc -march=native -mtune=native -O2 -fomit-frame-pointer -fwrapv  20160906        20160806
12084  sphlib-small     gcc -march=native -mtune=native -O3 -fomit-frame-pointer -fwrapv  20160906        20160806
12148  regs             gcc -march=native -mtune=native -O -fomit-frame-pointer -fwrapv   20160906        20160806
12387  sphlib-small     gcc -march=native -mtune=native -Os -fomit-frame-pointer -fwrapv  20160906        20160806
13252  sse2             gcc -march=native -mtune=native -O -fomit-frame-pointer -fwrapv   20160906        20160806
13312  sse2             gcc -march=native -mtune=native -Os -fomit-frame-pointer -fwrapv  20160906        20160806
13344  sse2             gcc -march=native -mtune=native -O3 -fomit-frame-pointer -fwrapv  20160906        20160806
13608  sphlib-small     gcc -march=native -mtune=native -O2 -fomit-frame-pointer -fwrapv  20160906        20160806
13724  sse2             gcc -march=native -mtune=native -O2 -fomit-frame-pointer -fwrapv  20160906        20160806
13977  ref              gcc -march=native -mtune=native -Os -fomit-frame-pointer -fwrapv  20160906        20160806
14232  sphlib-small     gcc -march=native -mtune=native -O -fomit-frame-pointer -fwrapv   20160906        20160806
14628  ssse3            gcc -march=native -mtune=native -O3 -fomit-frame-pointer -fwrapv  20160906        20160806
14712  ssse3            gcc -march=native -mtune=native -O -fomit-frame-pointer -fwrapv   20160906        20160806
14744  ssse3            gcc -march=native -mtune=native -Os -fomit-frame-pointer -fwrapv  20160906        20160806
14756  ssse3            gcc -march=native -mtune=native -O2 -fomit-frame-pointer -fwrapv  20160906        20160806
14952  ref              gcc -march=native -mtune=native -O3 -fomit-frame-pointer -fwrapv  20160906        20160806
15420  ref              gcc -march=native -mtune=native -O2 -fomit-frame-pointer -fwrapv  20160906        20160806
15476  sse2s            gcc -march=native -mtune=native -O3 -fomit-frame-pointer -fwrapv  20160906        20160806
15504  sse2s            gcc -march=native -mtune=native -Os -fomit-frame-pointer -fwrapv  20160906        20160806
15524  sse2s            gcc -march=native -mtune=native -O -fomit-frame-pointer -fwrapv   20160906        20160806
15528  sse2s            gcc -march=native -mtune=native -O2 -fomit-frame-pointer -fwrapv  20160906        20160806
15800  ref              gcc -march=native -mtune=native -O -fomit-frame-pointer -fwrapv   20160906        20160806

Compiler output

Implementation: crypto_hash/blake512/xop-2
Compiler: gcc -march=native -mtune=native -O2 -fomit-frame-pointer -fwrapv
hash.c: In file included from /usr/lib/gcc/x86_64-linux-gnu/4.9/include/x86intrin.h:52:0,
hash.c: from hash.c:5:
hash.c: hash.c: In function 'blake512_compress':
hash.c: /usr/lib/gcc/x86_64-linux-gnu/4.9/include/xopintrin.h:212:1: error: inlining failed in call to always_inline '_mm_perm_epi8': target specific option mismatch
hash.c: _mm_perm_epi8(__m128i __A, __m128i __B, __m128i __C)
hash.c: ^
hash.c: In file included from hash.c:8:0:
hash.c: rounds.h:15:21: error: called from here
hash.c: #define BSWAP64(x) _mm_perm_epi8((x),(x),u8to64)
hash.c: ^
hash.c: hash.c:92:15: note: in expansion of macro 'BSWAP64'
hash.c: m.u128[0] = BSWAP64(m.u128[0]);
hash.c: ^
hash.c: In file included from /usr/lib/gcc/x86_64-linux-gnu/4.9/include/x86intrin.h:52:0,
hash.c: from hash.c:5:
hash.c: /usr/lib/gcc/x86_64-linux-gnu/4.9/include/xopintrin.h:212:1: error: inlining failed in call to always_inline '_mm_perm_epi8': target specific option mismatch
hash.c: _mm_perm_epi8(__m128i __A, __m128i __B, __m128i __C)
hash.c: ^
hash.c: In file included from hash.c:8:0:
hash.c: rounds.h:15:21: error: called from here
hash.c: #define BSWAP64(x) _mm_perm_epi8((x),(x),u8to64)
hash.c: ^
hash.c: hash.c:93:15: note: in expansion of macro 'BSWAP64'
hash.c: m.u128[1] = BSWAP64(m.u128[1]);
hash.c: ^
hash.c: ...

Number of similar (compiler,implementation) pairs: 4, namely:
Compiler                                                         Implementations
gcc -march=native -mtune=native -O2 -fomit-frame-pointer -fwrapv xop-2
gcc -march=native -mtune=native -O3 -fomit-frame-pointer -fwrapv xop-2
gcc -march=native -mtune=native -O -fomit-frame-pointer -fwrapv xop-2
gcc -march=native -mtune=native -Os -fomit-frame-pointer -fwrapv xop-2
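
Note on the errors above: GCC reports "target specific option mismatch" when an always_inline intrinsic such as `_mm_perm_epi8` needs an instruction-set extension that is not enabled for the calling function. XOP is an AMD extension, so `-march=native` on this GenuineIntel machine does not enable it, and the XOP-based implementations cannot be built here. The following is only a sketch, not the code from `rounds.h` or `hash.c`: it shows one way a 64-bit lane byte swap could be guarded so that it compiles with or without XOP. The helper name `bswap64_pair` and the shuffle mask are assumptions made for illustration; the original `u8to64` constant is not shown in the log.

```c
#include <stdint.h>

#if defined(__XOP__)
#include <x86intrin.h>   /* XOP intrinsics, including _mm_perm_epi8 */
#elif defined(__SSSE3__)
#include <tmmintrin.h>   /* SSSE3 intrinsics, including _mm_shuffle_epi8 */
#endif

/* Hypothetical helper: byte-swap each of two 64-bit lanes in place. */
static void bswap64_pair(uint64_t lanes[2])
{
#if defined(__XOP__) || defined(__SSSE3__)
    /* Result byte i is taken from source byte mask[i]; this mask
       reverses the byte order inside each 64-bit half. */
    const __m128i mask = _mm_set_epi8(8, 9, 10, 11, 12, 13, 14, 15,
                                      0, 1, 2, 3, 4, 5, 6, 7);
    __m128i x = _mm_loadu_si128((const __m128i *)lanes);
#if defined(__XOP__)
    /* Only compiled when the target really has XOP (e.g. -mxop). */
    x = _mm_perm_epi8(x, x, mask);
#else
    /* SSSE3 byte shuffle gives the same result for this mask. */
    x = _mm_shuffle_epi8(x, mask);
#endif
    _mm_storeu_si128((__m128i *)lanes, x);
#else
    /* Portable fallback: swap bytes, then 16-bit pairs, then 32-bit halves. */
    for (int i = 0; i < 2; i++) {
        uint64_t v = lanes[i];
        v = ((v & 0x00ff00ff00ff00ffULL) << 8)  | ((v >> 8)  & 0x00ff00ff00ff00ffULL);
        v = ((v & 0x0000ffff0000ffffULL) << 16) | ((v >> 16) & 0x0000ffff0000ffffULL);
        lanes[i] = (v << 32) | (v >> 32);
    }
#endif
}
```

With a guard of this kind the choice of instruction is made at compile time from the predefined macros `__XOP__` and `__SSSE3__`, so the same source builds whether or not the target supports XOP.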

Compiler output

Implementation: crypto_hash/blake512/xop
Compiler: gcc -march=native -mtune=native -O2 -fomit-frame-pointer -fwrapv
hash.c: In file included from /usr/lib/gcc/x86_64-linux-gnu/4.9/include/x86intrin.h:52:0,
hash.c: from hash.c:5:
hash.c: hash.c: In function 'blake512_compress':
hash.c: /usr/lib/gcc/x86_64-linux-gnu/4.9/include/xopintrin.h:212:1: error: inlining failed in call to always_inline '_mm_perm_epi8': target specific option mismatch
hash.c: _mm_perm_epi8(__m128i __A, __m128i __B, __m128i __C)
hash.c: ^
hash.c: hash.c:81:6: error: called from here
hash.c: m0 = BSWAP64(m0);
hash.c: ^
hash.c: In file included from /usr/lib/gcc/x86_64-linux-gnu/4.9/include/x86intrin.h:52:0,
hash.c: from hash.c:5:
hash.c: /usr/lib/gcc/x86_64-linux-gnu/4.9/include/xopintrin.h:212:1: error: inlining failed in call to always_inline '_mm_perm_epi8': target specific option mismatch
hash.c: _mm_perm_epi8(__m128i __A, __m128i __B, __m128i __C)
hash.c: ^
hash.c: hash.c:82:6: error: called from here
hash.c: m1 = BSWAP64(m1);
hash.c: ^
hash.c: In file included from /usr/lib/gcc/x86_64-linux-gnu/4.9/include/x86intrin.h:52:0,
hash.c: from hash.c:5:
hash.c: /usr/lib/gcc/x86_64-linux-gnu/4.9/include/xopintrin.h:212:1: error: inlining failed in call to always_inline '_mm_perm_epi8': target specific option mismatch
hash.c: _mm_perm_epi8(__m128i __A, __m128i __B, __m128i __C)
hash.c: ^
hash.c: hash.c:83:6: error: called from here
hash.c: m2 = BSWAP64(m2);
hash.c: ^
hash.c: ...

Number of similar (compiler,implementation) pairs: 4, namely:
Compiler                                                         Implementations
gcc -march=native -mtune=native -O2 -fomit-frame-pointer -fwrapv xop
gcc -march=native -mtune=native -O3 -fomit-frame-pointer -fwrapv xop
gcc -march=native -mtune=native -O -fomit-frame-pointer -fwrapv xop
gcc -march=native -mtune=native -Os -fomit-frame-pointer -fwrapv xop

Compiler output

Implementation: crypto_hash/blake512/vect128-xop
Compiler: gcc -march=native -mtune=native -O2 -fomit-frame-pointer -fwrapv
vector.c: In file included from /usr/lib/gcc/x86_64-linux-gnu/4.9/include/x86intrin.h:52:0,
vector.c: from vector.h:29,
vector.c: from vector.c:7:
vector.c: vector.c: In function 'round512':
vector.c: /usr/lib/gcc/x86_64-linux-gnu/4.9/include/xopintrin.h:212:1: error: inlining failed in call to always_inline '_mm_perm_epi8': target specific option mismatch
vector.c: _mm_perm_epi8(__m128i __A, __m128i __B, __m128i __C)
vector.c: ^
vector.c: vector.c:646:7: error: called from here
vector.c: v64 mm0 = v64_lswap(MM[0]), mm1 = v64_lswap(MM[1]);
vector.c: ^
vector.c: In file included from /usr/lib/gcc/x86_64-linux-gnu/4.9/include/x86intrin.h:52:0,
vector.c: from vector.h:29,
vector.c: from vector.c:7:
vector.c: /usr/lib/gcc/x86_64-linux-gnu/4.9/include/xopintrin.h:212:1: error: inlining failed in call to always_inline '_mm_perm_epi8': target specific option mismatch
vector.c: _mm_perm_epi8(__m128i __A, __m128i __B, __m128i __C)
vector.c: ^
vector.c: vector.c:646:31: error: called from here
vector.c: v64 mm0 = v64_lswap(MM[0]), mm1 = v64_lswap(MM[1]);
vector.c: ^
vector.c: In file included from /usr/lib/gcc/x86_64-linux-gnu/4.9/include/x86intrin.h:52:0,
vector.c: from vector.h:29,
vector.c: from vector.c:7:
vector.c: /usr/lib/gcc/x86_64-linux-gnu/4.9/include/xopintrin.h:212:1: error: inlining failed in call to always_inline '_mm_perm_epi8': target specific option mismatch
vector.c: _mm_perm_epi8(__m128i __A, __m128i __B, __m128i __C)
vector.c: ^
vector.c: ...

Number of similar (compiler,implementation) pairs: 4, namely:
Compiler                                                         Implementations
gcc -march=native -mtune=native -O2 -fomit-frame-pointer -fwrapv vect128-xop
gcc -march=native -mtune=native -O3 -fomit-frame-pointer -fwrapv vect128-xop
gcc -march=native -mtune=native -O -fomit-frame-pointer -fwrapv vect128-xop
gcc -march=native -mtune=native -Os -fomit-frame-pointer -fwrapv vect128-xop