Implementation notes: amd64, pluton1mn, crypto_hash/blake512

Computer: pluton1mn
Architecture: amd64
CPU ID: GenuineIntel-00050671-bfebfbff
SUPERCOP version: 20160806
Operation: crypto_hash
Primitive: blake512
Time   Implementation   Compiler                                                          Benchmark date  SUPERCOP version
16184  bswap            gcc -march=native -mtune=native -Os -fomit-frame-pointer -fwrapv  20160806        20160731
16254  regs             gcc -march=native -mtune=native -Os -fomit-frame-pointer -fwrapv  20160806        20160731
16632  bswap            gcc -march=native -mtune=native -O3 -fomit-frame-pointer -fwrapv  20160806        20160731
16646  bswap            gcc -march=native -mtune=native -O2 -fomit-frame-pointer -fwrapv  20160806        20160731
16688  regs             gcc -march=native -mtune=native -O2 -fomit-frame-pointer -fwrapv  20160806        20160731
16716  regs             gcc -march=native -mtune=native -O3 -fomit-frame-pointer -fwrapv  20160806        20160731
16800  sphlib           gcc -march=native -mtune=native -O2 -fomit-frame-pointer -fwrapv  20160806        20160731
16800  sphlib           gcc -march=native -mtune=native -O3 -fomit-frame-pointer -fwrapv  20160806        20160731
16884  bswap            icc -xMIC-AVX512 -O3 -fomit-frame-pointer                         20160806        20160731
16940  bswap            icc -xMIC-AVX512 -O2 -fomit-frame-pointer                         20160806        20160731
17010  sphlib           gcc -march=native -mtune=native -Os -fomit-frame-pointer -fwrapv  20160806        20160731
17192  sphlib           icc -xMIC-AVX512 -O2 -fomit-frame-pointer                         20160806        20160731
17220  bswap            gcc -march=native -mtune=native -O -fomit-frame-pointer -fwrapv   20160806        20160731
17486  sphlib           icc -xMIC-AVX512 -O3 -fomit-frame-pointer                         20160806        20160731
17738  sphlib           gcc -march=native -mtune=native -O -fomit-frame-pointer -fwrapv   20160806        20160731
18872  regs             icc -xMIC-AVX512 -O2 -fomit-frame-pointer                         20160806        20160731
18956  regs             icc -xMIC-AVX512 -O3 -fomit-frame-pointer                         20160806        20160731
19348  regs             gcc -march=native -mtune=native -O -fomit-frame-pointer -fwrapv   20160806        20160731
21378  sphlib-small     icc -xMIC-AVX512 -O2 -fomit-frame-pointer                         20160806        20160731
21714  sphlib-small     gcc -march=native -mtune=native -Os -fomit-frame-pointer -fwrapv  20160806        20160731
21938  sphlib-small     icc -xMIC-AVX512 -O3 -fomit-frame-pointer                         20160806        20160731
22470  ref              icc -xMIC-AVX512 -O2 -fomit-frame-pointer                         20160806        20160731
22484  avxicc           gcc -march=native -mtune=native -O2 -fomit-frame-pointer -fwrapv  20160806        20160731
22484  avxicc           gcc -march=native -mtune=native -O3 -fomit-frame-pointer -fwrapv  20160806        20160731
22484  avxicc           icc -xMIC-AVX512 -O2 -fomit-frame-pointer                         20160806        20160731
22484  avxicc           icc -xMIC-AVX512 -O3 -fomit-frame-pointer                         20160806        20160731
22498  avxicc           gcc -march=native -mtune=native -O -fomit-frame-pointer -fwrapv   20160806        20160731
22498  avxicc           gcc -march=native -mtune=native -Os -fomit-frame-pointer -fwrapv  20160806        20160731
22498  sse41            icc -xMIC-AVX512 -O2 -fomit-frame-pointer                         20160806        20160731
22596  sphlib-small     gcc -march=native -mtune=native -O -fomit-frame-pointer -fwrapv   20160806        20160731
22708  sse41            icc -xMIC-AVX512 -O3 -fomit-frame-pointer                         20160806        20160731
22960  ref              icc -xMIC-AVX512 -O3 -fomit-frame-pointer                         20160806        20160731
23240  vect128          icc -xMIC-AVX512 -O2 -fomit-frame-pointer                         20160806        20160731
23394  vect128          icc -xMIC-AVX512 -O3 -fomit-frame-pointer                         20160806        20160731
23408  vect128-inplace  icc -xMIC-AVX512 -O2 -fomit-frame-pointer                         20160806        20160731
23940  vect128-inplace  icc -xMIC-AVX512 -O3 -fomit-frame-pointer                         20160806        20160731
24444  sse41            gcc -march=native -mtune=native -O3 -fomit-frame-pointer -fwrapv  20160806        20160731
24528  sse41            gcc -march=native -mtune=native -O2 -fomit-frame-pointer -fwrapv  20160806        20160731
24738  sse41            gcc -march=native -mtune=native -O -fomit-frame-pointer -fwrapv   20160806        20160731
24822  vect128-inplace  gcc -march=native -mtune=native -O3 -fomit-frame-pointer -fwrapv  20160806        20160731
24934  vect128          gcc -march=native -mtune=native -O3 -fomit-frame-pointer -fwrapv  20160806        20160731
24948  vect128-inplace  gcc -march=native -mtune=native -O2 -fomit-frame-pointer -fwrapv  20160806        20160731
24948  sse41            gcc -march=native -mtune=native -Os -fomit-frame-pointer -fwrapv  20160806        20160731
25046  vect128          gcc -march=native -mtune=native -Os -fomit-frame-pointer -fwrapv  20160806        20160731
25060  vect128          gcc -march=native -mtune=native -O2 -fomit-frame-pointer -fwrapv  20160806        20160731
25186  vect128-inplace  gcc -march=native -mtune=native -Os -fomit-frame-pointer -fwrapv  20160806        20160731
25242  sse2             gcc -march=native -mtune=native -O3 -fomit-frame-pointer -fwrapv  20160806        20160731
25312  sse2             icc -xMIC-AVX512 -O2 -fomit-frame-pointer                         20160806        20160731
25410  sse2             gcc -march=native -mtune=native -O2 -fomit-frame-pointer -fwrapv  20160806        20160731
25410  vect128          gcc -march=native -mtune=native -O -fomit-frame-pointer -fwrapv   20160806        20160731
25480  vect128-inplace  gcc -march=native -mtune=native -O -fomit-frame-pointer -fwrapv   20160806        20160731
25508  sse2             icc -xMIC-AVX512 -O3 -fomit-frame-pointer                         20160806        20160731
25844  ref              gcc -march=native -mtune=native -Os -fomit-frame-pointer -fwrapv  20160806        20160731
25872  sse2             gcc -march=native -mtune=native -Os -fomit-frame-pointer -fwrapv  20160806        20160731
25886  sse2             gcc -march=native -mtune=native -O -fomit-frame-pointer -fwrapv   20160806        20160731
26026  ref              gcc -march=native -mtune=native -O -fomit-frame-pointer -fwrapv   20160806        20160731
33978  sphlib-small     gcc -march=native -mtune=native -O2 -fomit-frame-pointer -fwrapv  20160806        20160731
34034  sphlib-small     gcc -march=native -mtune=native -O3 -fomit-frame-pointer -fwrapv  20160806        20160731
36190  sse2s            gcc -march=native -mtune=native -O -fomit-frame-pointer -fwrapv   20160806        20160731
36414  sse2s            gcc -march=native -mtune=native -O3 -fomit-frame-pointer -fwrapv  20160806        20160731
36414  sse2s            icc -xMIC-AVX512 -O2 -fomit-frame-pointer                         20160806        20160731
36442  sse2s            gcc -march=native -mtune=native -Os -fomit-frame-pointer -fwrapv  20160806        20160731
36568  sse2s            gcc -march=native -mtune=native -O2 -fomit-frame-pointer -fwrapv  20160806        20160731
37352  ref              gcc -march=native -mtune=native -O3 -fomit-frame-pointer -fwrapv  20160806        20160731
38402  sse2s            icc -xMIC-AVX512 -O3 -fomit-frame-pointer                         20160806        20160731
38444  ssse3            gcc -march=native -mtune=native -O3 -fomit-frame-pointer -fwrapv  20160806        20160731
38570  ssse3            gcc -march=native -mtune=native -O2 -fomit-frame-pointer -fwrapv  20160806        20160731
38696  ssse3            gcc -march=native -mtune=native -Os -fomit-frame-pointer -fwrapv  20160806        20160731
38808  ssse3            gcc -march=native -mtune=native -O -fomit-frame-pointer -fwrapv   20160806        20160731
39242  ssse3            icc -xMIC-AVX512 -O2 -fomit-frame-pointer                         20160806        20160731
40740  ref              gcc -march=native -mtune=native -O2 -fomit-frame-pointer -fwrapv  20160806        20160731
41958  ssse3            icc -xMIC-AVX512 -O3 -fomit-frame-pointer                         20160806        20160731
87766  sandy            gcc -march=native -mtune=native -Os -fomit-frame-pointer -fwrapv  20160806        20160731
88326  sandy            icc -xMIC-AVX512 -O3 -fomit-frame-pointer                         20160806        20160731
89026  sandy            icc -xMIC-AVX512 -O2 -fomit-frame-pointer                         20160806        20160731
90622  sandy            gcc -march=native -mtune=native -O2 -fomit-frame-pointer -fwrapv  20160806        20160731
90762  sandy            gcc -march=native -mtune=native -O3 -fomit-frame-pointer -fwrapv  20160806        20160731
90888  sandy            gcc -march=native -mtune=native -O -fomit-frame-pointer -fwrapv   20160806        20160731

Compiler output

Implementation: crypto_hash/blake512/xop-2
Compiler: gcc -march=native -mtune=native -O2 -fomit-frame-pointer -fwrapv
hash.c: In file included from /usr/local/gcc-6.1.0/lib/gcc/x86_64-pc-linux-gnu/6.1.0/include/x86intrin.h:54:0,
hash.c: from hash.c:5:
hash.c: hash.c: In function 'blake512_compress':
hash.c: /usr/local/gcc-6.1.0/lib/gcc/x86_64-pc-linux-gnu/6.1.0/include/xopintrin.h:212:1: error: inlining failed in call to always_inline '_mm_perm_epi8': target specific option mismatch
hash.c: _mm_perm_epi8(__m128i __A, __m128i __B, __m128i __C)
hash.c: ^~~~~~~~~~~~~
hash.c: In file included from hash.c:8:0:
hash.c: rounds.h:15:21: note: called from here
hash.c: #define BSWAP64(x) _mm_perm_epi8((x),(x),u8to64)
hash.c: ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~
hash.c: hash.c:99:15: note: in expansion of macro 'BSWAP64'
hash.c: m.u128[7] = BSWAP64(m.u128[7]);
hash.c: ^~~~~~~
hash.c: In file included from /usr/local/gcc-6.1.0/lib/gcc/x86_64-pc-linux-gnu/6.1.0/include/x86intrin.h:54:0,
hash.c: from hash.c:5:
hash.c: /usr/local/gcc-6.1.0/lib/gcc/x86_64-pc-linux-gnu/6.1.0/include/xopintrin.h:212:1: error: inlining failed in call to always_inline '_mm_perm_epi8': target specific option mismatch
hash.c: _mm_perm_epi8(__m128i __A, __m128i __B, __m128i __C)
hash.c: ^~~~~~~~~~~~~
hash.c: In file included from hash.c:8:0:
hash.c: rounds.h:15:21: note: called from here
hash.c: #define BSWAP64(x) _mm_perm_epi8((x),(x),u8to64)
hash.c: ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~
hash.c: hash.c:98:15: note: in expansion of macro 'BSWAP64'
hash.c: m.u128[6] = BSWAP64(m.u128[6]);
hash.c: ^~~~~~~
hash.c: ...

Number of similar (compiler,implementation) pairs: 4, namely:
Compiler Implementations
gcc -march=native -mtune=native -O2 -fomit-frame-pointer -fwrapv xop-2
gcc -march=native -mtune=native -O3 -fomit-frame-pointer -fwrapv xop-2
gcc -march=native -mtune=native -O -fomit-frame-pointer -fwrapv xop-2
gcc -march=native -mtune=native -Os -fomit-frame-pointer -fwrapv xop-2
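
Note on the errors above: "target specific option mismatch" means that '_mm_perm_epi8', an XOP intrinsic, is being called from code that was compiled without XOP enabled, which is what -march=native produces on this GenuineIntel machine. As an illustration only (this is not the xop-2 source), the 64-bit byte swap that the macro name BSWAP64 suggests can be written with SSE2 alone, which is part of the amd64 baseline. The names bswap64_pair and bswap64_pair_selftest are hypothetical, and the scalar branch exists only so that the sketch also builds on targets without SSE2.

```c
#include <assert.h>
#include <stdint.h>

#if defined(__SSE2__)
#include <emmintrin.h>
#endif

/* Hypothetical helper: byte-swap each of the two 64-bit words in v. */
void bswap64_pair(uint64_t v[2])
{
#if defined(__SSE2__)
    __m128i x = _mm_loadu_si128((const __m128i *)v);
    /* Swap the two bytes inside every 16-bit word. */
    x = _mm_or_si128(_mm_slli_epi16(x, 8), _mm_srli_epi16(x, 8));
    /* Reverse the four 16-bit words inside each 64-bit half. */
    x = _mm_shufflelo_epi16(x, _MM_SHUFFLE(0, 1, 2, 3));
    x = _mm_shufflehi_epi16(x, _MM_SHUFFLE(0, 1, 2, 3));
    _mm_storeu_si128((__m128i *)v, x);
#else
    /* Scalar fallback: rebuild each word from its bytes in reverse order. */
    for (int i = 0; i < 2; i++) {
        uint64_t x = v[i], r = 0;
        for (int b = 0; b < 8; b++) {
            r = (r << 8) | (x & 0xff);
            x >>= 8;
        }
        v[i] = r;
    }
#endif
}

/* Hypothetical self-check against hand-computed values. */
void bswap64_pair_selftest(void)
{
    uint64_t v[2] = { 0x0102030405060708ULL, 0x1122334455667788ULL };
    bswap64_pair(v);
    assert(v[0] == 0x0807060504030201ULL);
    assert(v[1] == 0x8877665544332211ULL);
}
```

Because this path uses no XOP intrinsic, it compiles without -mxop. Whether it would be competitive with the implementations in the table is not something this report measures.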

Compiler output

Implementation: crypto_hash/blake512/xop
Compiler: gcc -march=native -mtune=native -O2 -fomit-frame-pointer -fwrapv
hash.c: In file included from /usr/local/gcc-6.1.0/lib/gcc/x86_64-pc-linux-gnu/6.1.0/include/x86intrin.h:54:0,
hash.c: from hash.c:5:
hash.c: hash.c: In function 'blake512_compress':
hash.c: /usr/local/gcc-6.1.0/lib/gcc/x86_64-pc-linux-gnu/6.1.0/include/xopintrin.h:266:1: error: inlining failed in call to always_inline '_mm_roti_epi64': target specific option mismatch
hash.c: _mm_roti_epi64(__m128i __A, const int __B)
hash.c: ^~~~~~~~~~~~~~
hash.c: In file included from hash.c:8:0:
hash.c: rounds.h:825:9: note: called from here
hash.c: row2h = _mm_roti_epi64(row2h, -11); \
hash.c: ^
hash.c: rounds.h:867:3: note: in expansion of macro 'G2'
hash.c: G2(row1l,row2l,row3l,row4l,row1h,row2h,row3h,row4h,b0,b1); \
hash.c: ^~
hash.c: hash.c:132:3: note: in expansion of macro 'ROUND'
hash.c: ROUND(15);
hash.c: ^~~~~
hash.c: In file included from /usr/local/gcc-6.1.0/lib/gcc/x86_64-pc-linux-gnu/6.1.0/include/x86intrin.h:54:0,
hash.c: from hash.c:5:
hash.c: /usr/local/gcc-6.1.0/lib/gcc/x86_64-pc-linux-gnu/6.1.0/include/xopintrin.h:266:1: error: inlining failed in call to always_inline '_mm_roti_epi64': target specific option mismatch
hash.c: _mm_roti_epi64(__m128i __A, const int __B)
hash.c: ^~~~~~~~~~~~~~
hash.c: In file included from hash.c:8:0:
hash.c: rounds.h:824:9: note: called from here
hash.c: row2l = _mm_roti_epi64(row2l, -11); \
hash.c: ^
hash.c: ...

Number of similar (compiler,implementation) pairs: 4, namely:
Compiler Implementations
gcc -march=native -mtune=native -O2 -fomit-frame-pointer -fwrapv xop
gcc -march=native -mtune=native -O3 -fomit-frame-pointer -fwrapv xop
gcc -march=native -mtune=native -O -fomit-frame-pointer -fwrapv xop
gcc -march=native -mtune=native -Os -fomit-frame-pointer -fwrapv xop
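
Note on the errors above: '_mm_roti_epi64' is likewise an XOP intrinsic, and the log shows it called with a count of -11, that is, a rotate right by 11 bits. A common way to keep such code building on machines without XOP is to test the compiler's __XOP__ macro and fall back to shifts otherwise. The sketch below is a minimal illustration of that pattern under stated assumptions, not the xop implementation itself; ROTR64X2, rotr11_pair and rotr11_pair_selftest are hypothetical names.

```c
#include <assert.h>
#include <stdint.h>

#if defined(__XOP__)
#include <x86intrin.h>
/* XOP is enabled for this build: use the rotate intrinsic directly.
   A negative count rotates right, as in the log above. */
#define ROTR64X2(x, n) _mm_roti_epi64((x), -(n))
#elif defined(__SSE2__)
#include <emmintrin.h>
/* SSE2 fallback: rotate right = (x >> n) | (x << (64 - n)). */
#define ROTR64X2(x, n) \
    _mm_or_si128(_mm_srli_epi64((x), (n)), _mm_slli_epi64((x), 64 - (n)))
#endif

/* Hypothetical helper: rotate both 64-bit words of v right by 11 bits. */
void rotr11_pair(uint64_t v[2])
{
#if defined(__XOP__) || defined(__SSE2__)
    __m128i x = _mm_loadu_si128((const __m128i *)v);
    x = ROTR64X2(x, 11);
    _mm_storeu_si128((__m128i *)v, x);
#else
    /* Scalar fallback for targets without SSE2. */
    v[0] = (v[0] >> 11) | (v[0] << 53);
    v[1] = (v[1] >> 11) | (v[1] << 53);
#endif
}

/* Hypothetical self-check against hand-computed values. */
void rotr11_pair_selftest(void)
{
    uint64_t v[2] = { 1ULL, 0x800ULL };
    rotr11_pair(v);
    assert(v[0] == 0x0020000000000000ULL); /* bit 0 wraps round to bit 53 */
    assert(v[1] == 1ULL);                  /* bit 11 moves down to bit 0 */
}
```

With this guard the XOP branch is compiled only when the compiler targets XOP, so a -march=native build on a CPU without XOP takes the SSE2 branch instead of failing.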

Compiler output

Implementation: crypto_hash/blake512/vect128-xop
Compiler: gcc -march=native -mtune=native -O2 -fomit-frame-pointer -fwrapv
vector.c: In file included from /usr/local/gcc-6.1.0/lib/gcc/x86_64-pc-linux-gnu/6.1.0/include/x86intrin.h:54:0,
vector.c: from vector.h:29,
vector.c: from vector.c:7:
vector.c: vector.c: In function 'round512':
vector.c: /usr/local/gcc-6.1.0/lib/gcc/x86_64-pc-linux-gnu/6.1.0/include/xopintrin.h:266:1: error: inlining failed in call to always_inline '_mm_roti_epi64': target specific option mismatch
vector.c: _mm_roti_epi64(__m128i __A, const int __B)
vector.c: ^~~~~~~~~~~~~~
vector.c: vector.c:745:8: note: called from here
vector.c: B1 = v64_rotate(B1, 64-11); \
vector.c:
vector.c: vector.c:756:36: note: in expansion of macro 'ROUND'
vector.c: ROUND(12); ROUND(13); ROUND(14); ROUND(15);
vector.c: ^~~~~
vector.c: In file included from /usr/local/gcc-6.1.0/lib/gcc/x86_64-pc-linux-gnu/6.1.0/include/x86intrin.h:54:0,
vector.c: from vector.h:29,
vector.c: from vector.c:7:
vector.c: /usr/local/gcc-6.1.0/lib/gcc/x86_64-pc-linux-gnu/6.1.0/include/xopintrin.h:266:1: error: inlining failed in call to always_inline '_mm_roti_epi64': target specific option mismatch
vector.c: _mm_roti_epi64(__m128i __A, const int __B)
vector.c: ^~~~~~~~~~~~~~
vector.c: vector.c:744:8: note: called from here
vector.c: B0 = v64_rotate(B0, 64-11); \
vector.c:
vector.c: vector.c:756:36: note: in expansion of macro 'ROUND'
vector.c: ROUND(12); ROUND(13); ROUND(14); ROUND(15);
vector.c: ^~~~~
vector.c: ...

Number of similar (compiler,implementation) pairs: 4, namely:
Compiler Implementations
gcc -march=native -mtune=native -O2 -fomit-frame-pointer -fwrapv vect128-xop
gcc -march=native -mtune=native -O3 -fomit-frame-pointer -fwrapv vect128-xop
gcc -march=native -mtune=native -O -fomit-frame-pointer -fwrapv vect128-xop
gcc -march=native -mtune=native -Os -fomit-frame-pointer -fwrapv vect128-xop

Compiler output

Implementation: crypto_hash/blake512/sse2
Compiler: icc -xMIC-AVX512 -O2 -fomit-frame-pointer
hash.c: hash.c(314): (col. 10) warning #13200: No EMMS instruction before return from function

Number of similar (compiler,implementation) pairs: 2, namely:
Compiler Implementations
icc -xMIC-AVX512 -O2 -fomit-frame-pointer sse2
icc -xMIC-AVX512 -O3 -fomit-frame-pointer sse2

Compiler output

Implementation: crypto_hash/blake512/sse2s
Compiler: icc -xMIC-AVX512 -O2 -fomit-frame-pointer
hash.c: hash.c(326): (col. 10) warning #13200: No EMMS instruction before return from function

Number of similar (compiler,implementation) pairs: 4, namely:
Compiler Implementations
icc -xMIC-AVX512 -O2 -fomit-frame-pointer sse2s ssse3
icc -xMIC-AVX512 -O3 -fomit-frame-pointer sse2s ssse3

Compiler output

Implementation: crypto_hash/blake512/xop
Compiler: icc -xMIC-AVX512 -O2 -fomit-frame-pointer
hash.c: hash.c(81): warning #266: function "_mm_perm_epi8" declared implicitly
hash.c: m0 = BSWAP64(m0);
hash.c: ^
hash.c:
hash.c: hash.c(81): error: a value of type "int" cannot be assigned to an entity of type "__m128i"
hash.c: m0 = BSWAP64(m0);
hash.c: ^
hash.c:
hash.c: hash.c(82): error: a value of type "int" cannot be assigned to an entity of type "__m128i"
hash.c: m1 = BSWAP64(m1);
hash.c: ^
hash.c:
hash.c: hash.c(83): error: a value of type "int" cannot be assigned to an entity of type "__m128i"
hash.c: m2 = BSWAP64(m2);
hash.c: ^
hash.c:
hash.c: hash.c(84): error: a value of type "int" cannot be assigned to an entity of type "__m128i"
hash.c: m3 = BSWAP64(m3);
hash.c: ^
hash.c:
hash.c: hash.c(85): error: a value of type "int" cannot be assigned to an entity of type "__m128i"
hash.c: m4 = BSWAP64(m4);
hash.c: ^
hash.c:
hash.c: hash.c(86): error: a value of type "int" cannot be assigned to an entity of type "__m128i"
hash.c: ...

Number of similar (compiler,implementation) pairs: 2, namely:
Compiler Implementations
icc -xMIC-AVX512 -O2 -fomit-frame-pointer xop
icc -xMIC-AVX512 -O3 -fomit-frame-pointer xop

Compiler output

Implementation: crypto_hash/blake512/xop-2
Compiler: icc -xMIC-AVX512 -O2 -fomit-frame-pointer
hash.c: hash.c(92): warning #266: function "_mm_perm_epi8" declared implicitly
hash.c: m.u128[0] = BSWAP64(m.u128[0]);
hash.c: ^
hash.c:
hash.c: hash.c(92): error: a value of type "int" cannot be assigned to an entity of type "__m128i"
hash.c: m.u128[0] = BSWAP64(m.u128[0]);
hash.c: ^
hash.c:
hash.c: hash.c(93): error: a value of type "int" cannot be assigned to an entity of type "__m128i"
hash.c: m.u128[1] = BSWAP64(m.u128[1]);
hash.c: ^
hash.c:
hash.c: hash.c(94): error: a value of type "int" cannot be assigned to an entity of type "__m128i"
hash.c: m.u128[2] = BSWAP64(m.u128[2]);
hash.c: ^
hash.c:
hash.c: hash.c(95): error: a value of type "int" cannot be assigned to an entity of type "__m128i"
hash.c: m.u128[3] = BSWAP64(m.u128[3]);
hash.c: ^
hash.c:
hash.c: hash.c(96): error: a value of type "int" cannot be assigned to an entity of type "__m128i"
hash.c: m.u128[4] = BSWAP64(m.u128[4]);
hash.c: ^
hash.c:
hash.c: hash.c(97): error: a value of type "int" cannot be assigned to an entity of type "__m128i"
hash.c: ...

Number of similar (compiler,implementation) pairs: 2, namely:
Compiler Implementations
icc -xMIC-AVX512 -O2 -fomit-frame-pointer xop-2
icc -xMIC-AVX512 -O3 -fomit-frame-pointer xop-2

Compiler output

Implementation: crypto_hash/blake512/avxicc
Compiler: icc -xMIC-AVX512 -O2 -fomit-frame-pointer
try.c: ipo: remark #11035: Il version for crypto_hash_blake512.a (214006) does not match compiler's il version (349149), ignoring object file
try.c: ipo: remark #11035: Il version for crypto_hash_blake512.a (214006) does not match compiler's il version (349149), ignoring object file
measure.c: ipo: remark #11035: Il version for crypto_hash_blake512.a (214006) does not match compiler's il version (349149), ignoring object file

Number of similar (compiler,implementation) pairs: 2, namely:
Compiler Implementations
icc -xMIC-AVX512 -O2 -fomit-frame-pointer avxicc
icc -xMIC-AVX512 -O3 -fomit-frame-pointer avxicc

Compiler output

Implementation: crypto_hash/blake512/vect128-xop
Compiler: icc -xMIC-AVX512 -O2 -fomit-frame-pointer
vector.c: vector.c(646): warning #266: function "_mm_perm_epi8" declared implicitly
vector.c: v64 mm0 = v64_lswap(MM[0]), mm1 = v64_lswap(MM[1]);
vector.c: ^
vector.c:
vector.c: vector.c(646): error: a value of type "int" cannot be used to initialize an entity of type "v64"
vector.c: v64 mm0 = v64_lswap(MM[0]), mm1 = v64_lswap(MM[1]);
vector.c: ^
vector.c:
vector.c: vector.c(646): error: a value of type "int" cannot be used to initialize an entity of type "v64"
vector.c: v64 mm0 = v64_lswap(MM[0]), mm1 = v64_lswap(MM[1]);
vector.c: ^
vector.c:
vector.c: vector.c(647): error: a value of type "int" cannot be used to initialize an entity of type "v64"
vector.c: v64 mm2 = v64_lswap(MM[2]), mm3 = v64_lswap(MM[3]);
vector.c: ^
vector.c:
vector.c: vector.c(647): error: a value of type "int" cannot be used to initialize an entity of type "v64"
vector.c: v64 mm2 = v64_lswap(MM[2]), mm3 = v64_lswap(MM[3]);
vector.c: ^
vector.c:
vector.c: vector.c(648): error: a value of type "int" cannot be used to initialize an entity of type "v64"
vector.c: v64 mm4 = v64_lswap(MM[4]), mm5 = v64_lswap(MM[5]);
vector.c: ^
vector.c:
vector.c: vector.c(648): error: a value of type "int" cannot be used to initialize an entity of type "v64"
vector.c: ...

Number of similar (compiler,implementation) pairs: 2, namely:
Compiler Implementations
icc -xMIC-AVX512 -O2 -fomit-frame-pointer vect128-xop
icc -xMIC-AVX512 -O3 -fomit-frame-pointer vect128-xop