Implementation notes: amd64, h4atom, crypto_hash/blake512

Computer: h4atom
Architecture: amd64
CPU ID: GenuineIntel-000106ca-bfe9fbff
SUPERCOP version: 20160806
Operation: crypto_hash
Primitive: blake512
Time | Implementation | Compiler | Benchmark date | SUPERCOP version
21056 | bswap | gcc -march=native -mtune=native -Os -fomit-frame-pointer -fwrapv | 20160811 | 20160806
21216 | regs | gcc -march=native -mtune=native -Os -fomit-frame-pointer -fwrapv | 20160811 | 20160806
21240 | bswap | gcc -march=native -mtune=native -O3 -fomit-frame-pointer -fwrapv | 20160811 | 20160806
21368 | bswap | gcc -march=native -mtune=native -O2 -fomit-frame-pointer -fwrapv | 20160811 | 20160806
21408 | regs | gcc -march=native -mtune=native -O3 -fomit-frame-pointer -fwrapv | 20160811 | 20160806
21576 | regs | gcc -march=native -mtune=native -O2 -fomit-frame-pointer -fwrapv | 20160811 | 20160806
21912 | sphlib | gcc -march=native -mtune=native -O3 -fomit-frame-pointer -fwrapv | 20160811 | 20160806
21976 | sphlib | gcc -march=native -mtune=native -O2 -fomit-frame-pointer -fwrapv | 20160811 | 20160806
24392 | sphlib | clang -march=native -O3 -fomit-frame-pointer -fwrapv -Qunused-arguments | 20160811 | 20160806
24592 | bswap | clang -march=native -O3 -fomit-frame-pointer -fwrapv -Qunused-arguments | 20160811 | 20160806
25840 | sphlib | gcc -march=native -mtune=native -Os -fomit-frame-pointer -fwrapv | 20160811 | 20160806
28888 | regs | clang -march=native -O3 -fomit-frame-pointer -fwrapv -Qunused-arguments | 20160811 | 20160806
29440 | sphlib | gcc -march=native -mtune=native -O -fomit-frame-pointer -fwrapv | 20160811 | 20160806
29608 | bswap | gcc -march=native -mtune=native -O -fomit-frame-pointer -fwrapv | 20160811 | 20160806
34904 | regs | gcc -march=native -mtune=native -O -fomit-frame-pointer -fwrapv | 20160811 | 20160806
36504 | sphlib-small | gcc -march=native -mtune=native -Os -fomit-frame-pointer -fwrapv | 20160811 | 20160806
38144 | sphlib-small | clang -march=native -O3 -fomit-frame-pointer -fwrapv -Qunused-arguments | 20160811 | 20160806
41760 | sphlib-small | gcc -march=native -mtune=native -O -fomit-frame-pointer -fwrapv | 20160811 | 20160806
42648 | ref | clang -march=native -O3 -fomit-frame-pointer -fwrapv -Qunused-arguments | 20160811 | 20160806
45104 | sphlib-small | gcc -march=native -mtune=native -O3 -fomit-frame-pointer -fwrapv | 20160811 | 20160806
45328 | sphlib-small | gcc -march=native -mtune=native -O2 -fomit-frame-pointer -fwrapv | 20160811 | 20160806
45864 | ref | gcc -march=native -mtune=native -Os -fomit-frame-pointer -fwrapv | 20160811 | 20160806
47944 | sse2 | gcc -march=native -mtune=native -Os -fomit-frame-pointer -fwrapv | 20160811 | 20160806
48032 | sse41 | clang -march=native -O3 -fomit-frame-pointer -fwrapv -Qunused-arguments | 20160811 | 20160806
48056 | ssse3 | clang -march=native -O3 -fomit-frame-pointer -fwrapv -Qunused-arguments | 20160811 | 20160806
48432 | sse2s | clang -march=native -O3 -fomit-frame-pointer -fwrapv -Qunused-arguments | 20160811 | 20160806
49808 | sse2 | gcc -march=native -mtune=native -O3 -fomit-frame-pointer -fwrapv | 20160811 | 20160806
49928 | sse2 | gcc -march=native -mtune=native -O2 -fomit-frame-pointer -fwrapv | 20160811 | 20160806
50088 | sse2s | gcc -march=native -mtune=native -O3 -fomit-frame-pointer -fwrapv | 20160811 | 20160806
50128 | sse2s | gcc -march=native -mtune=native -O -fomit-frame-pointer -fwrapv | 20160811 | 20160806
50232 | sse2s | gcc -march=native -mtune=native -O2 -fomit-frame-pointer -fwrapv | 20160811 | 20160806
50664 | sse2 | gcc -march=native -mtune=native -O -fomit-frame-pointer -fwrapv | 20160811 | 20160806
50912 | sse2 | clang -march=native -O3 -fomit-frame-pointer -fwrapv -Qunused-arguments | 20160811 | 20160806
51544 | ref | gcc -march=native -mtune=native -O -fomit-frame-pointer -fwrapv | 20160811 | 20160806
51904 | ref | gcc -march=native -mtune=native -O3 -fomit-frame-pointer -fwrapv | 20160811 | 20160806
54512 | vect128 | gcc -march=native -mtune=native -Os -fomit-frame-pointer -fwrapv | 20160811 | 20160806
55096 | vect128-inplace | gcc -march=native -mtune=native -Os -fomit-frame-pointer -fwrapv | 20160811 | 20160806
55400 | vect128 | gcc -march=native -mtune=native -O -fomit-frame-pointer -fwrapv | 20160811 | 20160806
55472 | ref | gcc -march=native -mtune=native -O2 -fomit-frame-pointer -fwrapv | 20160811 | 20160806
55576 | ssse3 | gcc -march=native -mtune=native -O -fomit-frame-pointer -fwrapv | 20160811 | 20160806
55592 | sse2s | gcc -march=native -mtune=native -Os -fomit-frame-pointer -fwrapv | 20160811 | 20160806
56080 | vect128 | gcc -march=native -mtune=native -O3 -fomit-frame-pointer -fwrapv | 20160811 | 20160806
56152 | vect128 | gcc -march=native -mtune=native -O2 -fomit-frame-pointer -fwrapv | 20160811 | 20160806
56400 | vect128-inplace | gcc -march=native -mtune=native -O -fomit-frame-pointer -fwrapv | 20160811 | 20160806
56696 | ssse3 | gcc -march=native -mtune=native -O3 -fomit-frame-pointer -fwrapv | 20160811 | 20160806
56768 | ssse3 | gcc -march=native -mtune=native -O2 -fomit-frame-pointer -fwrapv | 20160811 | 20160806
57760 | vect128-inplace | gcc -march=native -mtune=native -O3 -fomit-frame-pointer -fwrapv | 20160811 | 20160806
57872 | vect128-inplace | gcc -march=native -mtune=native -O2 -fomit-frame-pointer -fwrapv | 20160811 | 20160806
61656 | ssse3 | gcc -march=native -mtune=native -Os -fomit-frame-pointer -fwrapv | 20160811 | 20160806
90008 | sandy | gcc -march=native -mtune=native -Os -fomit-frame-pointer -fwrapv | 20160811 | 20160806
90056 | sandy | clang -march=native -O3 -fomit-frame-pointer -fwrapv -Qunused-arguments | 20160811 | 20160806
91816 | sandy | gcc -march=native -mtune=native -O3 -fomit-frame-pointer -fwrapv | 20160811 | 20160806
91960 | sandy | gcc -march=native -mtune=native -O2 -fomit-frame-pointer -fwrapv | 20160811 | 20160806
95864 | sandy | gcc -march=native -mtune=native -O -fomit-frame-pointer -fwrapv | 20160811 | 20160806

Test failure

Implementation: crypto_hash/blake512/avxicc
Compiler: gcc -march=native -mtune=native -O2 -fomit-frame-pointer -fwrapv
error 111

Number of similar (compiler,implementation) pairs: 4, namely:
Compiler | Implementations
gcc -march=native -mtune=native -O2 -fomit-frame-pointer -fwrapv | avxicc
gcc -march=native -mtune=native -O3 -fomit-frame-pointer -fwrapv | avxicc
gcc -march=native -mtune=native -O -fomit-frame-pointer -fwrapv | avxicc
gcc -march=native -mtune=native -Os -fomit-frame-pointer -fwrapv | avxicc

Compiler output

Implementation: crypto_hash/blake512/sse41
Compiler: clang -march=native -O3 -fomit-frame-pointer -fwrapv -Qunused-arguments
hash.c: In file included from hash.c:8:
hash.c: ./rounds.h:8:10: warning: '_mm_roti_epi64' macro redefined [-Wmacro-redefined]
hash.c: #define _mm_roti_epi64(x, c) \
hash.c: ^
hash.c: /usr/lib/llvm-3.8/bin/../lib/clang/3.8.0/include/xopintrin.h:249:9: note: previous definition is here
hash.c: #define _mm_roti_epi64(A, N) __extension__ ({ \
hash.c: ^
hash.c: 1 warning generated.

Number of similar (compiler,implementation) pairs: 1, namely:
Compiler | Implementations
clang -march=native -O3 -fomit-frame-pointer -fwrapv -Qunused-arguments | sse41
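
The macro-redefinition warning above arises because rounds.h defines _mm_roti_epi64 while clang's xopintrin.h already provides a macro of that name. One way to avoid the clash, sketched below, is to give the rotation a project-local name. The names ROTR64_SSE2, rotr25_pair and rotr64 are hypothetical and do not come from the benchmarked source; the sketch assumes an x86 target with SSE2 and a constant rotation count between 1 and 63.

```c
#include <emmintrin.h>  /* SSE2 intrinsics only; assumes an x86 target */
#include <stdint.h>

/* Hypothetical project-local name: it cannot clash with the
   _mm_roti_epi64 macro that clang's xopintrin.h already defines.
   Rotates each 64-bit lane right by the constant c, 0 < c < 64. */
#define ROTR64_SSE2(x, c) \
    _mm_or_si128(_mm_srli_epi64((x), (c)), _mm_slli_epi64((x), 64 - (c)))

/* Rotate both 64-bit words of in[] right by 25 bits and store them in out[]. */
static void rotr25_pair(const uint64_t in[2], uint64_t out[2])
{
    __m128i v = _mm_loadu_si128((const __m128i *)in);
    v = ROTR64_SSE2(v, 25);
    _mm_storeu_si128((__m128i *)out, v);
}

/* Scalar reference for comparison; c must be in the range 1..63. */
static uint64_t rotr64(uint64_t w, unsigned c)
{
    return (w >> c) | (w << (64 - c));
}
```

Calling rotr25_pair on two words and comparing each result with rotr64(word, 25) is a quick way to check the macro.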

Compiler output

Implementation: crypto_hash/blake512/xop
Compiler: clang -march=native -O3 -fomit-frame-pointer -fwrapv -Qunused-arguments
hash.c: hash.c:81:8: error: always_inline function '_mm_perm_epi8' requires target feature 'sse4a', but would be inlined into function 'blake512_compress' that is compiled without support for 'sse4a'
hash.c: m0 = BSWAP64(m0);
hash.c: ^
hash.c: ./rounds.h:13:21: note: expanded from macro 'BSWAP64'
hash.c: #define BSWAP64(x) _mm_perm_epi8((x),(x),u8to64)
hash.c: ^
hash.c: hash.c:82:8: error: always_inline function '_mm_perm_epi8' requires target feature 'sse4a', but would be inlined into function 'blake512_compress' that is compiled without support for 'sse4a'
hash.c: m1 = BSWAP64(m1);
hash.c: ^
hash.c: ./rounds.h:13:21: note: expanded from macro 'BSWAP64'
hash.c: #define BSWAP64(x) _mm_perm_epi8((x),(x),u8to64)
hash.c: ^
hash.c: hash.c:83:8: error: always_inline function '_mm_perm_epi8' requires target feature 'sse4a', but would be inlined into function 'blake512_compress' that is compiled without support for 'sse4a'
hash.c: m2 = BSWAP64(m2);
hash.c: ^
hash.c: ./rounds.h:13:21: note: expanded from macro 'BSWAP64'
hash.c: #define BSWAP64(x) _mm_perm_epi8((x),(x),u8to64)
hash.c: ^
hash.c: hash.c:84:8: error: always_inline function '_mm_perm_epi8' requires target feature 'sse4a', but would be inlined into function 'blake512_compress' that is compiled without support for 'sse4a'
hash.c: m3 = BSWAP64(m3);
hash.c: ^
hash.c: ./rounds.h:13:21: note: expanded from macro 'BSWAP64'
hash.c: #define BSWAP64(x) _mm_perm_epi8((x),(x),u8to64)
hash.c: ^
hash.c: hash.c:85:8: error: always_inline function '_mm_perm_epi8' requires target feature 'sse4a', but would be inlined into function 'blake512_compress' that is compiled without support for 'sse4a'
hash.c: ...

Number of similar (compiler,implementation) pairs: 1, namely:
Compiler | Implementations
clang -march=native -O3 -fomit-frame-pointer -fwrapv -Qunused-arguments | xop
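
In the failing code above, BSWAP64 byte-swaps message words with _mm_perm_epi8, which needs a target feature this build does not have. As an illustration only, the same big-endian conversion can be written in portable scalar C that needs no target feature. The function names load_be64 and load_block_be are hypothetical; the sketch relies only on the fact that BLAKE-512 reads each 128-byte block as sixteen big-endian 64-bit words.

```c
#include <stddef.h>
#include <stdint.h>

/* Read 8 bytes as one big-endian 64-bit word, byte by byte.
   Works on any host endianness and needs no SIMD support. */
static uint64_t load_be64(const unsigned char *p)
{
    uint64_t w = 0;
    for (size_t i = 0; i < 8; i++)
        w = (w << 8) | p[i];
    return w;
}

/* Convert one 128-byte message block into sixteen 64-bit words. */
static void load_block_be(uint64_t m[16], const unsigned char block[128])
{
    for (size_t i = 0; i < 16; i++)
        m[i] = load_be64(block + 8 * i);
}
```

Compilers commonly turn load_be64 into a single load plus a byte-swap instruction when optimizing, so a scalar version like this is not necessarily slow.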

Compiler output

Implementation: crypto_hash/blake512/xop-2
Compiler: clang -march=native -O3 -fomit-frame-pointer -fwrapv -Qunused-arguments
hash.c: hash.c:92:15: error: always_inline function '_mm_perm_epi8' requires target feature 'sse4a', but would be inlined into function 'blake512_compress' that is compiled without support for 'sse4a'
hash.c: m.u128[0] = BSWAP64(m.u128[0]);
hash.c: ^
hash.c: ./rounds.h:15:21: note: expanded from macro 'BSWAP64'
hash.c: #define BSWAP64(x) _mm_perm_epi8((x),(x),u8to64)
hash.c: ^
hash.c: hash.c:93:15: error: always_inline function '_mm_perm_epi8' requires target feature 'sse4a', but would be inlined into function 'blake512_compress' that is compiled without support for 'sse4a'
hash.c: m.u128[1] = BSWAP64(m.u128[1]);
hash.c: ^
hash.c: ./rounds.h:15:21: note: expanded from macro 'BSWAP64'
hash.c: #define BSWAP64(x) _mm_perm_epi8((x),(x),u8to64)
hash.c: ^
hash.c: hash.c:94:15: error: always_inline function '_mm_perm_epi8' requires target feature 'sse4a', but would be inlined into function 'blake512_compress' that is compiled without support for 'sse4a'
hash.c: m.u128[2] = BSWAP64(m.u128[2]);
hash.c: ^
hash.c: ./rounds.h:15:21: note: expanded from macro 'BSWAP64'
hash.c: #define BSWAP64(x) _mm_perm_epi8((x),(x),u8to64)
hash.c: ^
hash.c: hash.c:95:15: error: always_inline function '_mm_perm_epi8' requires target feature 'sse4a', but would be inlined into function 'blake512_compress' that is compiled without support for 'sse4a'
hash.c: m.u128[3] = BSWAP64(m.u128[3]);
hash.c: ^
hash.c: ./rounds.h:15:21: note: expanded from macro 'BSWAP64'
hash.c: #define BSWAP64(x) _mm_perm_epi8((x),(x),u8to64)
hash.c: ^
hash.c: hash.c:96:15: error: always_inline function '_mm_perm_epi8' requires target feature 'sse4a', but would be inlined into function 'blake512_compress' that is compiled without support for 'sse4a'
hash.c: ...

Number of similar (compiler,implementation) pairs: 1, namely:
Compiler | Implementations
clang -march=native -O3 -fomit-frame-pointer -fwrapv -Qunused-arguments | xop-2

Compiler output

Implementation: crypto_hash/blake512/avxicc
Compiler: clang -march=native -O3 -fomit-frame-pointer -fwrapv -Qunused-arguments
hash.s: hash.s:395828:58: error: unexpected token in argument list
hash.s: vmovdqu xmm0, XMMWORD PTR .L_2il0floatpacket.13[rip] #244.3
hash.s: ^
hash.s: hash.s:395830:58: error: unexpected token in argument list
hash.s: vmovdqu xmm1, XMMWORD PTR .L_2il0floatpacket.14[rip] #244.3
hash.s: ^
hash.s: hash.s:395831:58: error: unexpected token in argument list
hash.s: vmovdqu xmm2, XMMWORD PTR .L_2il0floatpacket.15[rip] #244.3
hash.s: ^
hash.s: hash.s:395832:58: error: unexpected token in argument list
hash.s: vmovdqu xmm3, XMMWORD PTR .L_2il0floatpacket.16[rip] #244.3
hash.s: ^
hash.s: hash.s:395915:31: error: unexpected token in argument list
hash.s: mov esi, offset flat: padding.0 #246.3
hash.s: ^
hash.s: hash.s:395927:31: error: unexpected token in argument list
hash.s: mov esi, offset flat: padding.0 #246.3
hash.s: ^
hash.s: hash.s:395933:31: error: unexpected token in argument list
hash.s: mov esi, offset flat: padding.0+1 #246.3
hash.s: ^
hash.s: hash.s:395981:58: error: unexpected token in argument list
hash.s: vmovdqu xmm7, XMMWORD PTR .L_2il0floatpacket.17[rip] #246.3
hash.s: ^
hash.s: hash.s:396067:59: error: unexpected token in argument list
hash.s: ...

Number of similar (compiler,implementation) pairs: 1, namely:
Compiler | Implementations
clang -march=native -O3 -fomit-frame-pointer -fwrapv -Qunused-arguments | avxicc

Compiler output

Implementation: crypto_hash/blake512/vect128-xop
Compiler: clang -march=native -O3 -fomit-frame-pointer -fwrapv -Qunused-arguments
vector.c: vector.c:646:13: error: always_inline function '_mm_perm_epi8' requires target feature 'sse4a', but would be inlined into function 'round512' that is compiled without support for 'sse4a'
vector.c: v64 mm0 = v64_lswap(MM[0]), mm1 = v64_lswap(MM[1]);
vector.c: ^
vector.c: ./vector.h:153:27: note: expanded from macro 'v64_lswap'
vector.c: #define v64_lswap(x) V864(vector_shuffle(V648(x), v64_swap_endianness.v8))
vector.c: ^
vector.c: ./vector.h:64:29: note: expanded from macro 'vector_shuffle'
vector.c: #define vector_shuffle(x,s) _mm_perm_epi8(x,x, s)
vector.c: ^
vector.c: vector.c:646:37: error: always_inline function '_mm_perm_epi8' requires target feature 'sse4a', but would be inlined into function 'round512' that is compiled without support for 'sse4a'
vector.c: v64 mm0 = v64_lswap(MM[0]), mm1 = v64_lswap(MM[1]);
vector.c: ^
vector.c: ./vector.h:153:27: note: expanded from macro 'v64_lswap'
vector.c: #define v64_lswap(x) V864(vector_shuffle(V648(x), v64_swap_endianness.v8))
vector.c: ^
vector.c: ./vector.h:64:29: note: expanded from macro 'vector_shuffle'
vector.c: #define vector_shuffle(x,s) _mm_perm_epi8(x,x, s)
vector.c: ^
vector.c: vector.c:647:13: error: always_inline function '_mm_perm_epi8' requires target feature 'sse4a', but would be inlined into function 'round512' that is compiled without support for 'sse4a'
vector.c: v64 mm2 = v64_lswap(MM[2]), mm3 = v64_lswap(MM[3]);
vector.c: ^
vector.c: ./vector.h:153:27: note: expanded from macro 'v64_lswap'
vector.c: #define v64_lswap(x) V864(vector_shuffle(V648(x), v64_swap_endianness.v8))
vector.c: ^
vector.c: ./vector.h:64:29: note: expanded from macro 'vector_shuffle'
vector.c: ...

Number of similar (compiler,implementation) pairs: 1, namely:
Compiler | Implementations
clang -march=native -O3 -fomit-frame-pointer -fwrapv -Qunused-arguments | vect128-xop

Compiler output

Implementation: crypto_hash/blake512/vect128
Compiler: clang -march=native -O3 -fomit-frame-pointer -fwrapv -Qunused-arguments
vector.c: vector.c:753:3: error: use of unknown builtin '__builtin_ia32_punpcklqdq128' [-Wimplicit-function-declaration]
vector.c: ROUND( 0); ROUND( 1); ROUND( 2); ROUND( 3);
vector.c: ^
vector.c: vector.c:670:5: note: expanded from macro 'ROUND'
vector.c: PERM(i); \
vector.c: ^
vector.c: ./perm512-m.h:1:17: note: expanded from macro 'PERM'
vector.c: #define PERM(i) XCAT(PERM_512_INPLACE_,i)
vector.c: ^
vector.c: ./vector.h:6:19: note: expanded from macro 'XCAT'
vector.c: #define XCAT(x,y) CAT(x,y)
vector.c: ^
vector.c: note: (skipping 1 expansions in backtrace; use -fmacro-backtrace-limit=0 to see all)
vector.c: <scratch space>:43:1: note: expanded from here
vector.c: PERM_512_INPLACE_0
vector.c: ^
vector.c: ./perm512-m.h:4:10: note: expanded from macro 'PERM_512_INPLACE_0'
vector.c: m0 = v64_interleavel(mm0, mm1); \
vector.c: ^
vector.c: ./vector.h:97:27: note: expanded from macro 'v64_interleavel'
vector.c: #define v64_interleavel __builtin_ia32_punpcklqdq128
vector.c: ^
vector.c: vector.c:753:3: error: assigning to 'v64' (aka 'v2di') from incompatible type 'int'
vector.c: ROUND( 0); ROUND( 1); ROUND( 2); ROUND( 3);
vector.c: ^~~~~~~~~
vector.c: ...

Number of similar (compiler,implementation) pairs: 1, namely:
Compiler | Implementations
clang -march=native -O3 -fomit-frame-pointer -fwrapv -Qunused-arguments | vect128

Compiler output

Implementation: crypto_hash/blake512/vect128-inplace
Compiler: clang -march=native -O3 -fomit-frame-pointer -fwrapv -Qunused-arguments
vector.c: vector.c:753:3: error: use of unknown builtin '__builtin_ia32_punpcklqdq128' [-Wimplicit-function-declaration]
vector.c: ROUND( 0); ROUND( 1); ROUND( 2); ROUND( 3);
vector.c: ^
vector.c: vector.c:670:5: note: expanded from macro 'ROUND'
vector.c: PERM(i); \
vector.c: ^
vector.c: ./perm512.h:1:17: note: expanded from macro 'PERM'
vector.c: #define PERM(i) XCAT(PERM_512_,i)
vector.c: ^
vector.c: ./vector.h:6:19: note: expanded from macro 'XCAT'
vector.c: #define XCAT(x,y) CAT(x,y)
vector.c: ^
vector.c: note: (skipping 1 expansions in backtrace; use -fmacro-backtrace-limit=0 to see all)
vector.c: <scratch space>:43:1: note: expanded from here
vector.c: PERM_512_0
vector.c: ^
vector.c: ./perm512.h:6:10: note: expanded from macro 'PERM_512_0'
vector.c: m0 = v64_interleavel(mm0, mm1); \
vector.c: ^
vector.c: ./vector.h:97:27: note: expanded from macro 'v64_interleavel'
vector.c: #define v64_interleavel __builtin_ia32_punpcklqdq128
vector.c: ^
vector.c: vector.c:753:3: error: assigning to 'v64' (aka 'v2di') from incompatible type 'int'
vector.c: ROUND( 0); ROUND( 1); ROUND( 2); ROUND( 3);
vector.c: ^~~~~~~~~
vector.c: ...

Number of similar (compiler,implementation) pairs: 1, namely:
Compiler | Implementations
clang -march=native -O3 -fomit-frame-pointer -fwrapv -Qunused-arguments | vect128-inplace
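
The vect128 and vect128-inplace failures above come from vector.h mapping v64_interleavel to the GCC-specific builtin __builtin_ia32_punpcklqdq128, which clang does not recognise. A possible workaround, sketched here and not taken from the benchmarked source, is to use the documented SSE2 intrinsic _mm_unpacklo_epi64 instead. The typedef of v64 as __m128i and the helper interleave_low are assumptions made for this illustration (the log shows v64 as v2di in the original); the sketch assumes an x86 target with SSE2.

```c
#include <emmintrin.h>  /* SSE2 intrinsics; assumes an x86 target */
#include <stdint.h>

/* Assumption for this sketch: the original vector.h uses its own v2di type. */
typedef __m128i v64;

/* Intrinsic accepted by both gcc and clang, in place of
   __builtin_ia32_punpcklqdq128. Result: low lane = low lane of a,
   high lane = low lane of b. */
#define v64_interleavel(a, b) _mm_unpacklo_epi64((a), (b))

/* Hypothetical helper: interleave the low 64-bit words of a[] and b[]. */
static void interleave_low(const uint64_t a[2], const uint64_t b[2],
                           uint64_t out[2])
{
    v64 va = _mm_loadu_si128((const __m128i *)a);
    v64 vb = _mm_loadu_si128((const __m128i *)b);
    _mm_storeu_si128((__m128i *)out, v64_interleavel(va, vb));
}
```

With a = {1, 2} and b = {3, 4}, interleave_low stores {1, 3} in out.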

Compiler output

Implementation: crypto_hash/blake512/sse41
Compiler: gcc -march=native -mtune=native -O2 -fomit-frame-pointer -fwrapv
hash.c: In file included from /usr/lib/gcc/x86_64-linux-gnu/5/include/x86intrin.h:41:0,
hash.c: from hash.c:5:
hash.c: hash.c: In function 'blake512_compress':
hash.c: /usr/lib/gcc/x86_64-linux-gnu/5/include/smmintrin.h:166:1: error: inlining failed in call to always_inline '_mm_blend_epi16': target specific option mismatch
hash.c: _mm_blend_epi16 (__m128i __X, __m128i __Y, const int __M)
hash.c: ^
hash.c: In file included from hash.c:8:0:
hash.c: rounds.h:779:4: error: called from here
hash.c: t2 = _mm_blend_epi16(m7, m4, 0xF0); \
hash.c: ^
hash.c: rounds.h:866:3: note: in expansion of macro 'LOAD_MSG_15_4'
hash.c: LOAD_MSG_ ##r ##_4(b0, b1); \
hash.c: ^
hash.c: hash.c:132:3: note: in expansion of macro 'ROUND'
hash.c: ROUND(15);
hash.c: ^
hash.c: In file included from /usr/lib/gcc/x86_64-linux-gnu/5/include/x86intrin.h:41:0,
hash.c: from hash.c:5:
hash.c: /usr/lib/gcc/x86_64-linux-gnu/5/include/smmintrin.h:166:1: error: inlining failed in call to always_inline '_mm_blend_epi16': target specific option mismatch
hash.c: _mm_blend_epi16 (__m128i __X, __m128i __Y, const int __M)
hash.c: ^
hash.c: In file included from hash.c:8:0:
hash.c: rounds.h:764:4: error: called from here
hash.c: t0 = _mm_blend_epi16(m2, m3, 0xF0); \
hash.c: ^
hash.c: ...

Number of similar (compiler,implementation) pairs: 4, namely:
Compiler | Implementations
gcc -march=native -mtune=native -O2 -fomit-frame-pointer -fwrapv | sse41
gcc -march=native -mtune=native -O3 -fomit-frame-pointer -fwrapv | sse41
gcc -march=native -mtune=native -O -fomit-frame-pointer -fwrapv | sse41
gcc -march=native -mtune=native -Os -fomit-frame-pointer -fwrapv | sse41
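
gcc reports "target specific option mismatch" when an always_inline intrinsic is called from code compiled without the matching instruction-set option, presumably because -march=native on this CPU does not enable it. A source file can test the compiler's predefined feature macros (__SSSE3__, __SSE4_1__, __XOP__) before using such intrinsics. The sketch below is illustrative only: the BUILT_WITH_* macros and can_build are hypothetical names, and the mapping from implementation name to required feature is inferred from the names and error messages in this report.

```c
#include <string.h>

/* Record, at compile time, which instruction sets the flags enabled. */
#ifdef __SSSE3__
#define BUILT_WITH_SSSE3 1
#else
#define BUILT_WITH_SSSE3 0
#endif

#ifdef __SSE4_1__
#define BUILT_WITH_SSE41 1
#else
#define BUILT_WITH_SSE41 0
#endif

#ifdef __XOP__
#define BUILT_WITH_XOP 1
#else
#define BUILT_WITH_XOP 0
#endif

/* Hypothetical helper: returns 1 if the feature assumed to be required by
   the named implementation was enabled for this compilation, else 0.
   Names with no requirement recorded in this sketch return 1. */
static int can_build(const char *impl)
{
    if (strcmp(impl, "ssse3") == 0)
        return BUILT_WITH_SSSE3;
    if (strcmp(impl, "sse41") == 0)
        return BUILT_WITH_SSE41;
    if (strcmp(impl, "xop") == 0 || strcmp(impl, "xop-2") == 0 ||
        strcmp(impl, "vect128-xop") == 0)
        return BUILT_WITH_XOP;
    return 1;
}
```

The same macros can guard the intrinsic code itself with #if, so that an unsupported build stops with one clear #error message instead of a long list of inlining failures.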

Compiler output

Implementation: crypto_hash/blake512/xop-2
Compiler: gcc -march=native -mtune=native -O2 -fomit-frame-pointer -fwrapv
hash.c: In file included from /usr/lib/gcc/x86_64-linux-gnu/5/include/x86intrin.h:52:0,
hash.c: from hash.c:5:
hash.c: hash.c: In function 'blake512_compress':
hash.c: /usr/lib/gcc/x86_64-linux-gnu/5/include/xopintrin.h:212:1: error: inlining failed in call to always_inline '_mm_perm_epi8': target specific option mismatch
hash.c: _mm_perm_epi8(__m128i __A, __m128i __B, __m128i __C)
hash.c: ^
hash.c: In file included from hash.c:8:0:
hash.c: rounds.h:15:21: error: called from here
hash.c: #define BSWAP64(x) _mm_perm_epi8((x),(x),u8to64)
hash.c: ^
hash.c: hash.c:99:15: note: in expansion of macro 'BSWAP64'
hash.c: m.u128[7] = BSWAP64(m.u128[7]);
hash.c: ^
hash.c: In file included from /usr/lib/gcc/x86_64-linux-gnu/5/include/x86intrin.h:52:0,
hash.c: from hash.c:5:
hash.c: /usr/lib/gcc/x86_64-linux-gnu/5/include/xopintrin.h:212:1: error: inlining failed in call to always_inline '_mm_perm_epi8': target specific option mismatch
hash.c: _mm_perm_epi8(__m128i __A, __m128i __B, __m128i __C)
hash.c: ^
hash.c: In file included from hash.c:8:0:
hash.c: rounds.h:15:21: error: called from here
hash.c: #define BSWAP64(x) _mm_perm_epi8((x),(x),u8to64)
hash.c: ^
hash.c: hash.c:98:15: note: in expansion of macro 'BSWAP64'
hash.c: m.u128[6] = BSWAP64(m.u128[6]);
hash.c: ^
hash.c: ...

Number of similar (compiler,implementation) pairs: 4, namely:
Compiler | Implementations
gcc -march=native -mtune=native -O2 -fomit-frame-pointer -fwrapv | xop-2
gcc -march=native -mtune=native -O3 -fomit-frame-pointer -fwrapv | xop-2
gcc -march=native -mtune=native -O -fomit-frame-pointer -fwrapv | xop-2
gcc -march=native -mtune=native -Os -fomit-frame-pointer -fwrapv | xop-2

Compiler output

Implementation: crypto_hash/blake512/xop
Compiler: gcc -march=native -mtune=native -O2 -fomit-frame-pointer -fwrapv
hash.c: In file included from /usr/lib/gcc/x86_64-linux-gnu/5/include/x86intrin.h:52:0,
hash.c: from hash.c:5:
hash.c: hash.c: In function 'blake512_compress':
hash.c: /usr/lib/gcc/x86_64-linux-gnu/5/include/xopintrin.h:266:1: error: inlining failed in call to always_inline '_mm_roti_epi64': target specific option mismatch
hash.c: _mm_roti_epi64(__m128i __A, const int __B)
hash.c: ^
hash.c: In file included from hash.c:8:0:
hash.c: rounds.h:825:9: error: called from here
hash.c: row2h = _mm_roti_epi64(row2h, -11); \
hash.c: ^
hash.c: rounds.h:867:3: note: in expansion of macro 'G2'
hash.c: G2(row1l,row2l,row3l,row4l,row1h,row2h,row3h,row4h,b0,b1); \
hash.c: ^
hash.c: hash.c:132:3: note: in expansion of macro 'ROUND'
hash.c: ROUND(15);
hash.c: ^
hash.c: In file included from /usr/lib/gcc/x86_64-linux-gnu/5/include/x86intrin.h:52:0,
hash.c: from hash.c:5:
hash.c: /usr/lib/gcc/x86_64-linux-gnu/5/include/xopintrin.h:266:1: error: inlining failed in call to always_inline '_mm_roti_epi64': target specific option mismatch
hash.c: _mm_roti_epi64(__m128i __A, const int __B)
hash.c: ^
hash.c: In file included from hash.c:8:0:
hash.c: rounds.h:824:9: error: called from here
hash.c: row2l = _mm_roti_epi64(row2l, -11); \
hash.c: ^
hash.c: ...

Number of similar (compiler,implementation) pairs: 4, namely:
Compiler | Implementations
gcc -march=native -mtune=native -O2 -fomit-frame-pointer -fwrapv | xop
gcc -march=native -mtune=native -O3 -fomit-frame-pointer -fwrapv | xop
gcc -march=native -mtune=native -O -fomit-frame-pointer -fwrapv | xop
gcc -march=native -mtune=native -Os -fomit-frame-pointer -fwrapv | xop

Compiler output

Implementation: crypto_hash/blake512/vect128-xop
Compiler: gcc -march=native -mtune=native -O2 -fomit-frame-pointer -fwrapv
vector.c: In file included from /usr/lib/gcc/x86_64-linux-gnu/5/include/x86intrin.h:52:0,
vector.c: from vector.h:29,
vector.c: from vector.c:7:
vector.c: vector.c: In function 'round512':
vector.c: /usr/lib/gcc/x86_64-linux-gnu/5/include/xopintrin.h:266:1: error: inlining failed in call to always_inline '_mm_roti_epi64': target specific option mismatch
vector.c: _mm_roti_epi64(__m128i __A, const int __B)
vector.c: ^
vector.c: vector.c:745:8: error: called from here
vector.c: B1 = v64_rotate(B1, 64-11); \
vector.c: ^
vector.c: vector.c:756:36: note: in expansion of macro 'ROUND'
vector.c: ROUND(12); ROUND(13); ROUND(14); ROUND(15);
vector.c: ^
vector.c: In file included from /usr/lib/gcc/x86_64-linux-gnu/5/include/x86intrin.h:52:0,
vector.c: from vector.h:29,
vector.c: from vector.c:7:
vector.c: /usr/lib/gcc/x86_64-linux-gnu/5/include/xopintrin.h:266:1: error: inlining failed in call to always_inline '_mm_roti_epi64': target specific option mismatch
vector.c: _mm_roti_epi64(__m128i __A, const int __B)
vector.c: ^
vector.c: vector.c:744:8: error: called from here
vector.c: B0 = v64_rotate(B0, 64-11); \
vector.c: ^
vector.c: vector.c:756:36: note: in expansion of macro 'ROUND'
vector.c: ROUND(12); ROUND(13); ROUND(14); ROUND(15);
vector.c: ^
vector.c: ...

Number of similar (compiler,implementation) pairs: 4, namely:
Compiler | Implementations
gcc -march=native -mtune=native -O2 -fomit-frame-pointer -fwrapv | vect128-xop
gcc -march=native -mtune=native -O3 -fomit-frame-pointer -fwrapv | vect128-xop
gcc -march=native -mtune=native -O -fomit-frame-pointer -fwrapv | vect128-xop
gcc -march=native -mtune=native -Os -fomit-frame-pointer -fwrapv | vect128-xop