Implementation notes: amd64, waldorf, crypto_hash/blake512

Computer: waldorf
Architecture: amd64
CPU ID: GenuineIntel-000106e5-bfebfbff
SUPERCOP version: 20160715
Operation: crypto_hash
Primitive: blake512
Time | Implementation | Compiler | Benchmark date | SUPERCOP version
19628 | sse2s | clang -march=native -O3 -fomit-frame-pointer -fwrapv -Qunused-arguments | 20160718 | 20160715
20760 | ssse3 | clang -march=native -O3 -fomit-frame-pointer -fwrapv -Qunused-arguments | 20160718 | 20160715
21228 | vect128 | gcc -march=native -mtune=native -O3 -fomit-frame-pointer -fwrapv | 20160718 | 20160715
21920 | vect128 | gcc -march=native -mtune=native -O2 -fomit-frame-pointer -fwrapv | 20160718 | 20160715
22052 | sse41 | clang -march=native -O3 -fomit-frame-pointer -fwrapv -Qunused-arguments | 20160718 | 20160715
22268 | sse41 | gcc -march=native -mtune=native -O2 -fomit-frame-pointer -fwrapv | 20160718 | 20160715
22912 | vect128-inplace | gcc -march=native -mtune=native -O -fomit-frame-pointer -fwrapv | 20160718 | 20160715
23464 | vect128-inplace | gcc -march=native -mtune=native -O3 -fomit-frame-pointer -fwrapv | 20160718 | 20160715
23672 | sphlib | gcc -march=native -mtune=native -O2 -fomit-frame-pointer -fwrapv | 20160718 | 20160715
24272 | sphlib | gcc -march=native -mtune=native -O3 -fomit-frame-pointer -fwrapv | 20160718 | 20160715
24520 | bswap | gcc -march=native -mtune=native -O -fomit-frame-pointer -fwrapv | 20160718 | 20160715
24520 | sse41 | gcc -march=native -mtune=native -O -fomit-frame-pointer -fwrapv | 20160718 | 20160715
24628 | sse41 | gcc -march=native -mtune=native -Os -fomit-frame-pointer -fwrapv | 20160718 | 20160715
24664 | sse2 | clang -march=native -O3 -fomit-frame-pointer -fwrapv -Qunused-arguments | 20160718 | 20160715
24664 | sse41 | gcc -march=native -mtune=native -O3 -fomit-frame-pointer -fwrapv | 20160718 | 20160715
24732 | vect128 | gcc -march=native -mtune=native -Os -fomit-frame-pointer -fwrapv | 20160718 | 20160715
24912 | vect128 | gcc -march=native -mtune=native -O -fomit-frame-pointer -fwrapv | 20160718 | 20160715
25588 | ssse3 | gcc -march=native -mtune=native -O2 -fomit-frame-pointer -fwrapv | 20160718 | 20160715
25656 | sse2 | gcc -march=native -mtune=native -O -fomit-frame-pointer -fwrapv | 20160718 | 20160715
25660 | vect128-inplace | gcc -march=native -mtune=native -O2 -fomit-frame-pointer -fwrapv | 20160718 | 20160715
25848 | regs | gcc -march=native -mtune=native -Os -fomit-frame-pointer -fwrapv | 20160718 | 20160715
26132 | sse2 | gcc -march=native -mtune=native -O3 -fomit-frame-pointer -fwrapv | 20160718 | 20160715
26220 | ssse3 | gcc -march=native -mtune=native -O -fomit-frame-pointer -fwrapv | 20160718 | 20160715
26520 | vect128-inplace | gcc -march=native -mtune=native -Os -fomit-frame-pointer -fwrapv | 20160718 | 20160715
26668 | sphlib | gcc -march=native -mtune=native -O -fomit-frame-pointer -fwrapv | 20160718 | 20160715
26756 | sse2s | gcc -march=native -mtune=native -O3 -fomit-frame-pointer -fwrapv | 20160718 | 20160715
26756 | ssse3 | gcc -march=native -mtune=native -O3 -fomit-frame-pointer -fwrapv | 20160718 | 20160715
26784 | sse2s | gcc -march=native -mtune=native -O2 -fomit-frame-pointer -fwrapv | 20160718 | 20160715
26996 | bswap | gcc -march=native -mtune=native -O2 -fomit-frame-pointer -fwrapv | 20160718 | 20160715
27064 | bswap | gcc -march=native -mtune=native -Os -fomit-frame-pointer -fwrapv | 20160718 | 20160715
27180 | sse2s | gcc -march=native -mtune=native -O -fomit-frame-pointer -fwrapv | 20160718 | 20160715
27260 | sphlib-small | gcc -march=native -mtune=native -O2 -fomit-frame-pointer -fwrapv | 20160718 | 20160715
27272 | sphlib | gcc -march=native -mtune=native -Os -fomit-frame-pointer -fwrapv | 20160718 | 20160715
28164 | bswap | gcc -march=native -mtune=native -O3 -fomit-frame-pointer -fwrapv | 20160718 | 20160715
28460 | sse2 | gcc -march=native -mtune=native -O2 -fomit-frame-pointer -fwrapv | 20160718 | 20160715
28532 | sandy | gcc -march=native -mtune=native -Os -fomit-frame-pointer -fwrapv | 20160718 | 20160715
28672 | sandy | gcc -march=native -mtune=native -O3 -fomit-frame-pointer -fwrapv | 20160718 | 20160715
29688 | regs | gcc -march=native -mtune=native -O -fomit-frame-pointer -fwrapv | 20160718 | 20160715
30408 | sandy | gcc -march=native -mtune=native -O2 -fomit-frame-pointer -fwrapv | 20160718 | 20160715
30564 | sandy | gcc -march=native -mtune=native -O -fomit-frame-pointer -fwrapv | 20160718 | 20160715
30620 | regs | gcc -march=native -mtune=native -O2 -fomit-frame-pointer -fwrapv | 20160718 | 20160715
30656 | regs | gcc -march=native -mtune=native -O3 -fomit-frame-pointer -fwrapv | 20160718 | 20160715
30992 | sphlib | clang -march=native -O3 -fomit-frame-pointer -fwrapv -Qunused-arguments | 20160718 | 20160715
31028 | ssse3 | gcc -march=native -mtune=native -Os -fomit-frame-pointer -fwrapv | 20160718 | 20160715
31048 | ref | gcc -march=native -mtune=native -O3 -fomit-frame-pointer -fwrapv | 20160718 | 20160715
31052 | bswap | clang -march=native -O3 -fomit-frame-pointer -fwrapv -Qunused-arguments | 20160718 | 20160715
31096 | sse2s | gcc -march=native -mtune=native -Os -fomit-frame-pointer -fwrapv | 20160718 | 20160715
31420 | sphlib-small | gcc -march=native -mtune=native -O3 -fomit-frame-pointer -fwrapv | 20160718 | 20160715
33068 | sse2 | gcc -march=native -mtune=native -Os -fomit-frame-pointer -fwrapv | 20160718 | 20160715
33332 | sphlib-small | clang -march=native -O3 -fomit-frame-pointer -fwrapv -Qunused-arguments | 20160718 | 20160715
33448 | ref | gcc -march=native -mtune=native -O -fomit-frame-pointer -fwrapv | 20160718 | 20160715
33496 | regs | clang -march=native -O3 -fomit-frame-pointer -fwrapv -Qunused-arguments | 20160718 | 20160715
33796 | sandy | clang -march=native -O3 -fomit-frame-pointer -fwrapv -Qunused-arguments | 20160718 | 20160715
33888 | sphlib-small | gcc -march=native -mtune=native -O -fomit-frame-pointer -fwrapv | 20160718 | 20160715
33908 | ref | gcc -march=native -mtune=native -O2 -fomit-frame-pointer -fwrapv | 20160718 | 20160715
35092 | sphlib-small | gcc -march=native -mtune=native -Os -fomit-frame-pointer -fwrapv | 20160718 | 20160715
37284 | ref | gcc -march=native -mtune=native -Os -fomit-frame-pointer -fwrapv | 20160718 | 20160715
37756 | ref | clang -march=native -O3 -fomit-frame-pointer -fwrapv -Qunused-arguments | 20160718 | 20160715

Test failure

Implementation: crypto_hash/blake512/avxicc
Compiler: gcc -march=native -mtune=native -O2 -fomit-frame-pointer -fwrapv
error 111

Number of similar (compiler,implementation) pairs: 4, namely:
Compiler | Implementations
gcc -march=native -mtune=native -O2 -fomit-frame-pointer -fwrapv | avxicc
gcc -march=native -mtune=native -O3 -fomit-frame-pointer -fwrapv | avxicc
gcc -march=native -mtune=native -O -fomit-frame-pointer -fwrapv | avxicc
gcc -march=native -mtune=native -Os -fomit-frame-pointer -fwrapv | avxicc

Compiler output

Implementation: crypto_hash/blake512/xop
Compiler: clang -march=native -O3 -fomit-frame-pointer -fwrapv -Qunused-arguments
hash.c: hash.c:81:8: warning: implicit declaration of function '_mm_perm_epi8' is invalid in C99 [-Wimplicit-function-declaration]
hash.c: m0 = BSWAP64(m0);
hash.c: ^
hash.c: ./rounds.h:13:21: note: expanded from macro 'BSWAP64'
hash.c: #define BSWAP64(x) _mm_perm_epi8((x),(x),u8to64)
hash.c: ^
hash.c: hash.c:81:6: error: assigning to '__m128i' (vector of 2 'long long' values) from incompatible type 'int'
hash.c: m0 = BSWAP64(m0);
hash.c: ^ ~~~~~~~~~~~
hash.c: hash.c:82:6: error: assigning to '__m128i' (vector of 2 'long long' values) from incompatible type 'int'
hash.c: m1 = BSWAP64(m1);
hash.c: ^ ~~~~~~~~~~~
hash.c: hash.c:83:6: error: assigning to '__m128i' (vector of 2 'long long' values) from incompatible type 'int'
hash.c: m2 = BSWAP64(m2);
hash.c: ^ ~~~~~~~~~~~
hash.c: hash.c:84:6: error: assigning to '__m128i' (vector of 2 'long long' values) from incompatible type 'int'
hash.c: m3 = BSWAP64(m3);
hash.c: ^ ~~~~~~~~~~~
hash.c: hash.c:85:6: error: assigning to '__m128i' (vector of 2 'long long' values) from incompatible type 'int'
hash.c: m4 = BSWAP64(m4);
hash.c: ^ ~~~~~~~~~~~
hash.c: hash.c:86:6: error: assigning to '__m128i' (vector of 2 'long long' values) from incompatible type 'int'
hash.c: m5 = BSWAP64(m5);
hash.c: ^ ~~~~~~~~~~~
hash.c: hash.c:87:6: error: assigning to '__m128i' (vector of 2 'long long' values) from incompatible type 'int'
hash.c: ...

Number of similar (compiler,implementation) pairs: 1, namely:
Compiler | Implementations
clang -march=native -O3 -fomit-frame-pointer -fwrapv -Qunused-arguments | xop
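
Note on the failure above: `_mm_perm_epi8` is an XOP intrinsic, and XOP is an AMD extension. The CPU ID listed for this machine is GenuineIntel, so `-march=native` presumably does not enable XOP, which would explain why clang sees no declaration for it. The following is a minimal sketch, not taken from SUPERCOP, of how a 64-bit byte swap like the one `BSWAP64` appears to perform could be written with SSSE3 instead. The names `bswap64_ssse3`, `bswap64_pair` and `bswap64_selfcheck` are hypothetical, the contents of the original `u8to64` selector are not shown in the log and are assumed here to be a plain per-lane byte reversal, and a reasonably recent GCC or clang that accepts the `target` function attribute is assumed.

```c
#include <assert.h>
#include <stdint.h>
#include <tmmintrin.h>  /* SSSE3: _mm_shuffle_epi8 */

/* Hypothetical helper (not part of SUPERCOP): byte-swap each 64-bit lane
   with SSSE3 instead of XOP. The target attribute lets this compile even
   when SSSE3 is not enabled on the command line. */
__attribute__((target("ssse3")))
__m128i bswap64_ssse3(__m128i x)
{
    /* _mm_set_epi8 lists bytes from index 15 down to index 0.
       Output byte i is taken from input byte mask[i]. */
    const __m128i mask = _mm_set_epi8(8, 9, 10, 11, 12, 13, 14, 15,
                                      0, 1, 2, 3, 4, 5, 6, 7);
    return _mm_shuffle_epi8(x, mask);
}

/* Scalar wrapper so the result can be inspected without vector types. */
__attribute__((target("ssse3")))
void bswap64_pair(const uint64_t in[2], uint64_t out[2])
{
    __m128i v = _mm_loadu_si128((const __m128i *)in);
    _mm_storeu_si128((__m128i *)out, bswap64_ssse3(v));
}

/* Self-check: each 64-bit word should come back byte-reversed. */
void bswap64_selfcheck(void)
{
    const uint64_t in[2] = { 0x0102030405060708ULL, 0x1112131415161718ULL };
    uint64_t out[2];
    bswap64_pair(in, out);
    assert(out[0] == 0x0807060504030201ULL);
    assert(out[1] == 0x1817161514131211ULL);
}
```

In a real build one would more likely guard the XOP path with `#ifdef __XOP__` and fall back to a helper of this kind otherwise; the ssse3 implementation in the timing table above may already take a similar approach.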

Compiler output

Implementation: crypto_hash/blake512/xop-2
Compiler: clang -march=native -O3 -fomit-frame-pointer -fwrapv -Qunused-arguments
hash.c: hash.c:92:15: warning: implicit declaration of function '_mm_perm_epi8' is invalid in C99 [-Wimplicit-function-declaration]
hash.c: m.u128[0] = BSWAP64(m.u128[0]);
hash.c: ^
hash.c: ./rounds.h:15:21: note: expanded from macro 'BSWAP64'
hash.c: #define BSWAP64(x) _mm_perm_epi8((x),(x),u8to64)
hash.c: ^
hash.c: hash.c:92:13: error: assigning to '__m128i' (vector of 2 'long long' values) from incompatible type 'int'
hash.c: m.u128[0] = BSWAP64(m.u128[0]);
hash.c: ^ ~~~~~~~~~~~~~~~~~~
hash.c: hash.c:93:13: error: assigning to '__m128i' (vector of 2 'long long' values) from incompatible type 'int'
hash.c: m.u128[1] = BSWAP64(m.u128[1]);
hash.c: ^ ~~~~~~~~~~~~~~~~~~
hash.c: hash.c:94:13: error: assigning to '__m128i' (vector of 2 'long long' values) from incompatible type 'int'
hash.c: m.u128[2] = BSWAP64(m.u128[2]);
hash.c: ^ ~~~~~~~~~~~~~~~~~~
hash.c: hash.c:95:13: error: assigning to '__m128i' (vector of 2 'long long' values) from incompatible type 'int'
hash.c: m.u128[3] = BSWAP64(m.u128[3]);
hash.c: ^ ~~~~~~~~~~~~~~~~~~
hash.c: hash.c:96:13: error: assigning to '__m128i' (vector of 2 'long long' values) from incompatible type 'int'
hash.c: m.u128[4] = BSWAP64(m.u128[4]);
hash.c: ^ ~~~~~~~~~~~~~~~~~~
hash.c: hash.c:97:13: error: assigning to '__m128i' (vector of 2 'long long' values) from incompatible type 'int'
hash.c: m.u128[5] = BSWAP64(m.u128[5]);
hash.c: ^ ~~~~~~~~~~~~~~~~~~
hash.c: hash.c:98:13: error: assigning to '__m128i' (vector of 2 'long long' values) from incompatible type 'int'
hash.c: ...

Number of similar (compiler,implementation) pairs: 1, namely:
Compiler | Implementations
clang -march=native -O3 -fomit-frame-pointer -fwrapv -Qunused-arguments | xop-2

Compiler output

Implementation: crypto_hash/blake512/avxicc
Compiler: clang -march=native -O3 -fomit-frame-pointer -fwrapv -Qunused-arguments
hash.s: hash.s:395828:58: error: unexpected token in argument list
hash.s: vmovdqu xmm0, XMMWORD PTR .L_2il0floatpacket.13[rip] #244.3
hash.s: ^
hash.s: hash.s:395830:58: error: unexpected token in argument list
hash.s: vmovdqu xmm1, XMMWORD PTR .L_2il0floatpacket.14[rip] #244.3
hash.s: ^
hash.s: hash.s:395831:58: error: unexpected token in argument list
hash.s: vmovdqu xmm2, XMMWORD PTR .L_2il0floatpacket.15[rip] #244.3
hash.s: ^
hash.s: hash.s:395832:58: error: unexpected token in argument list
hash.s: vmovdqu xmm3, XMMWORD PTR .L_2il0floatpacket.16[rip] #244.3
hash.s: ^
hash.s: hash.s:395915:31: error: unexpected token in argument list
hash.s: mov esi, offset flat: padding.0 #246.3
hash.s: ^
hash.s: hash.s:395927:31: error: unexpected token in argument list
hash.s: mov esi, offset flat: padding.0 #246.3
hash.s: ^
hash.s: hash.s:395933:31: error: unexpected token in argument list
hash.s: mov esi, offset flat: padding.0+1 #246.3
hash.s: ^
hash.s: hash.s:395981:58: error: unexpected token in argument list
hash.s: vmovdqu xmm7, XMMWORD PTR .L_2il0floatpacket.17[rip] #246.3
hash.s: ^
hash.s: hash.s:396067:59: error: unexpected token in argument list
hash.s: ...

Number of similar (compiler,implementation) pairs: 1, namely:
Compiler | Implementations
clang -march=native -O3 -fomit-frame-pointer -fwrapv -Qunused-arguments | avxicc

Compiler output

Implementation: crypto_hash/blake512/vect128-xop
Compiler: clang -march=native -O3 -fomit-frame-pointer -fwrapv -Qunused-arguments
vector.c: vector.c:646:13: warning: implicit declaration of function '_mm_perm_epi8' is invalid in C99 [-Wimplicit-function-declaration]
vector.c: v64 mm0 = v64_lswap(MM[0]), mm1 = v64_lswap(MM[1]);
vector.c: ^
vector.c: ./vector.h:153:27: note: expanded from macro 'v64_lswap'
vector.c: #define v64_lswap(x) V864(vector_shuffle(V648(x), v64_swap_endianness.v8))
vector.c: ^
vector.c: ./vector.h:64:29: note: expanded from macro 'vector_shuffle'
vector.c: #define vector_shuffle(x,s) _mm_perm_epi8(x,x, s)
vector.c: ^
vector.c: ./vector.h:202:36: note: expanded from macro 'V864'
vector.c: #define V864(x) V3264((V1632(V816(x))))
vector.c: ^
vector.c: ./vector.h:42:19: note: expanded from macro 'V816'
vector.c: #define V816(x) (x)
vector.c: ^
vector.c: ./vector.h:40:19: note: expanded from macro 'V1632'
vector.c: #define V1632(x) (x)
vector.c: ^
vector.c: ./vector.h:38:19: note: expanded from macro 'V3264'
vector.c: #define V3264(x) (x)
vector.c: ^
vector.c: vector.c:646:7: error: initializing 'v64' (aka '__m128i') with an expression of incompatible type 'int'
vector.c: v64 mm0 = v64_lswap(MM[0]), mm1 = v64_lswap(MM[1]);
vector.c: ^ ~~~~~~~~~~~~~~~~
vector.c: vector.c:646:31: error: initializing 'v64' (aka '__m128i') with an expression of incompatible type 'int'
vector.c: ...

Number of similar (compiler,implementation) pairs: 1, namely:
Compiler | Implementations
clang -march=native -O3 -fomit-frame-pointer -fwrapv -Qunused-arguments | vect128-xop

Compiler output

Implementation: crypto_hash/blake512/vect128
Compiler: clang -march=native -O3 -fomit-frame-pointer -fwrapv -Qunused-arguments
vector.c: vector.c:753:3: error: use of unknown builtin '__builtin_ia32_punpcklqdq128' [-Wimplicit-function-declaration]
vector.c: ROUND( 0); ROUND( 1); ROUND( 2); ROUND( 3);
vector.c: ^
vector.c: vector.c:670:5: note: expanded from macro 'ROUND'
vector.c: PERM(i); \
vector.c: ^
vector.c: ./perm512-m.h:1:17: note: expanded from macro 'PERM'
vector.c: #define PERM(i) XCAT(PERM_512_INPLACE_,i)
vector.c: ^
vector.c: ./vector.h:6:19: note: expanded from macro 'XCAT'
vector.c: #define XCAT(x,y) CAT(x,y)
vector.c: ^
vector.c: note: (skipping 1 expansions in backtrace; use -fmacro-backtrace-limit=0 to see all)
vector.c: <scratch space>:43:1: note: expanded from here
vector.c: PERM_512_INPLACE_0
vector.c: ^
vector.c: ./perm512-m.h:4:10: note: expanded from macro 'PERM_512_INPLACE_0'
vector.c: m0 = v64_interleavel(mm0, mm1); \
vector.c: ^
vector.c: ./vector.h:97:27: note: expanded from macro 'v64_interleavel'
vector.c: #define v64_interleavel __builtin_ia32_punpcklqdq128
vector.c: ^
vector.c: vector.c:753:3: error: assigning to 'v64' (aka 'v2di') from incompatible type 'int'
vector.c: ROUND( 0); ROUND( 1); ROUND( 2); ROUND( 3);
vector.c: ^~~~~~~~~
vector.c: ...

Number of similar (compiler,implementation) pairs: 1, namely:
Compiler | Implementations
clang -march=native -O3 -fomit-frame-pointer -fwrapv -Qunused-arguments | vect128
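
Note on the failure above: `__builtin_ia32_punpcklqdq128` is a GCC builtin that this clang does not provide, which is why the same source builds with gcc in the timing table but not with clang. A portable spelling of the same operation is the SSE2 intrinsic `_mm_unpacklo_epi64` from `<emmintrin.h>`. The sketch below is an illustration only, not the vect128 author's code. The names `v64_interleavel_portable`, `interleave_low` and `interleave_selfcheck` are hypothetical, and it assumes the vectors are held as `__m128i`; the original code uses its own `v64` type (reported as `v2di`), so casts might be needed there.

```c
#include <assert.h>
#include <stdint.h>
#include <emmintrin.h>  /* SSE2: _mm_unpacklo_epi64 */

/* Hypothetical stand-in for the v64_interleavel macro quoted in the log. */
#define v64_interleavel_portable(a, b) _mm_unpacklo_epi64((a), (b))

/* Scalar wrapper: out = { a[0], b[0] }, the low 64-bit lanes interleaved. */
void interleave_low(const uint64_t a[2], const uint64_t b[2], uint64_t out[2])
{
    __m128i va = _mm_loadu_si128((const __m128i *)a);
    __m128i vb = _mm_loadu_si128((const __m128i *)b);
    _mm_storeu_si128((__m128i *)out, v64_interleavel_portable(va, vb));
}

/* Self-check: the low lane of each input should end up in the output. */
void interleave_selfcheck(void)
{
    const uint64_t a[2] = { 1, 2 }, b[2] = { 3, 4 };
    uint64_t out[2];
    interleave_low(a, b, out);
    assert(out[0] == 1 && out[1] == 3);
}
```

Because SSE2 is part of the amd64 baseline, this form needs no extra `-m` flags on this architecture.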

Compiler output

Implementation: crypto_hash/blake512/vect128-inplace
Compiler: clang -march=native -O3 -fomit-frame-pointer -fwrapv -Qunused-arguments
vector.c: vector.c:753:3: error: use of unknown builtin '__builtin_ia32_punpcklqdq128' [-Wimplicit-function-declaration]
vector.c: ROUND( 0); ROUND( 1); ROUND( 2); ROUND( 3);
vector.c: ^
vector.c: vector.c:670:5: note: expanded from macro 'ROUND'
vector.c: PERM(i); \
vector.c: ^
vector.c: ./perm512.h:1:17: note: expanded from macro 'PERM'
vector.c: #define PERM(i) XCAT(PERM_512_,i)
vector.c: ^
vector.c: ./vector.h:6:19: note: expanded from macro 'XCAT'
vector.c: #define XCAT(x,y) CAT(x,y)
vector.c: ^
vector.c: note: (skipping 1 expansions in backtrace; use -fmacro-backtrace-limit=0 to see all)
vector.c: <scratch space>:43:1: note: expanded from here
vector.c: PERM_512_0
vector.c: ^
vector.c: ./perm512.h:6:10: note: expanded from macro 'PERM_512_0'
vector.c: m0 = v64_interleavel(mm0, mm1); \
vector.c: ^
vector.c: ./vector.h:97:27: note: expanded from macro 'v64_interleavel'
vector.c: #define v64_interleavel __builtin_ia32_punpcklqdq128
vector.c: ^
vector.c: vector.c:753:3: error: assigning to 'v64' (aka 'v2di') from incompatible type 'int'
vector.c: ROUND( 0); ROUND( 1); ROUND( 2); ROUND( 3);
vector.c: ^~~~~~~~~
vector.c: ...

Number of similar (compiler,implementation) pairs: 1, namely:
Compiler | Implementations
clang -march=native -O3 -fomit-frame-pointer -fwrapv -Qunused-arguments | vect128-inplace

Compiler output

Implementation: crypto_hash/blake512/xop-2
Compiler: gcc -march=native -mtune=native -O2 -fomit-frame-pointer -fwrapv
hash.c: In file included from /usr/lib/gcc/x86_64-linux-gnu/4.9/include/x86intrin.h:52:0,
hash.c: from hash.c:5:
hash.c: hash.c: In function 'blake512_compress':
hash.c: /usr/lib/gcc/x86_64-linux-gnu/4.9/include/xopintrin.h:212:1: error: inlining failed in call to always_inline '_mm_perm_epi8': target specific option mismatch
hash.c: _mm_perm_epi8(__m128i __A, __m128i __B, __m128i __C)
hash.c: ^
hash.c: In file included from hash.c:8:0:
hash.c: rounds.h:15:21: error: called from here
hash.c: #define BSWAP64(x) _mm_perm_epi8((x),(x),u8to64)
hash.c: ^
hash.c: hash.c:92:15: note: in expansion of macro 'BSWAP64'
hash.c: m.u128[0] = BSWAP64(m.u128[0]);
hash.c: ^
hash.c: In file included from /usr/lib/gcc/x86_64-linux-gnu/4.9/include/x86intrin.h:52:0,
hash.c: from hash.c:5:
hash.c: /usr/lib/gcc/x86_64-linux-gnu/4.9/include/xopintrin.h:212:1: error: inlining failed in call to always_inline '_mm_perm_epi8': target specific option mismatch
hash.c: _mm_perm_epi8(__m128i __A, __m128i __B, __m128i __C)
hash.c: ^
hash.c: In file included from hash.c:8:0:
hash.c: rounds.h:15:21: error: called from here
hash.c: #define BSWAP64(x) _mm_perm_epi8((x),(x),u8to64)
hash.c: ^
hash.c: hash.c:93:15: note: in expansion of macro 'BSWAP64'
hash.c: m.u128[1] = BSWAP64(m.u128[1]);
hash.c: ^
hash.c: ...

Number of similar (compiler,implementation) pairs: 4, namely:
Compiler | Implementations
gcc -march=native -mtune=native -O2 -fomit-frame-pointer -fwrapv | xop-2
gcc -march=native -mtune=native -O3 -fomit-frame-pointer -fwrapv | xop-2
gcc -march=native -mtune=native -O -fomit-frame-pointer -fwrapv | xop-2
gcc -march=native -mtune=native -Os -fomit-frame-pointer -fwrapv | xop-2

Compiler output

Implementation: crypto_hash/blake512/xop
Compiler: gcc -march=native -mtune=native -O2 -fomit-frame-pointer -fwrapv
hash.c: In file included from /usr/lib/gcc/x86_64-linux-gnu/4.9/include/x86intrin.h:52:0,
hash.c: from hash.c:5:
hash.c: hash.c: In function 'blake512_compress':
hash.c: /usr/lib/gcc/x86_64-linux-gnu/4.9/include/xopintrin.h:212:1: error: inlining failed in call to always_inline '_mm_perm_epi8': target specific option mismatch
hash.c: _mm_perm_epi8(__m128i __A, __m128i __B, __m128i __C)
hash.c: ^
hash.c: hash.c:81:6: error: called from here
hash.c: m0 = BSWAP64(m0);
hash.c: ^
hash.c: In file included from /usr/lib/gcc/x86_64-linux-gnu/4.9/include/x86intrin.h:52:0,
hash.c: from hash.c:5:
hash.c: /usr/lib/gcc/x86_64-linux-gnu/4.9/include/xopintrin.h:212:1: error: inlining failed in call to always_inline '_mm_perm_epi8': target specific option mismatch
hash.c: _mm_perm_epi8(__m128i __A, __m128i __B, __m128i __C)
hash.c: ^
hash.c: hash.c:82:6: error: called from here
hash.c: m1 = BSWAP64(m1);
hash.c: ^
hash.c: In file included from /usr/lib/gcc/x86_64-linux-gnu/4.9/include/x86intrin.h:52:0,
hash.c: from hash.c:5:
hash.c: /usr/lib/gcc/x86_64-linux-gnu/4.9/include/xopintrin.h:212:1: error: inlining failed in call to always_inline '_mm_perm_epi8': target specific option mismatch
hash.c: _mm_perm_epi8(__m128i __A, __m128i __B, __m128i __C)
hash.c: ^
hash.c: hash.c:83:6: error: called from here
hash.c: m2 = BSWAP64(m2);
hash.c: ^
hash.c: ...

Number of similar (compiler,implementation) pairs: 4, namely:
Compiler | Implementations
gcc -march=native -mtune=native -O2 -fomit-frame-pointer -fwrapv | xop
gcc -march=native -mtune=native -O3 -fomit-frame-pointer -fwrapv | xop
gcc -march=native -mtune=native -O -fomit-frame-pointer -fwrapv | xop
gcc -march=native -mtune=native -Os -fomit-frame-pointer -fwrapv | xop

Compiler output

Implementation: crypto_hash/blake512/vect128-xop
Compiler: gcc -march=native -mtune=native -O2 -fomit-frame-pointer -fwrapv
vector.c: In file included from /usr/lib/gcc/x86_64-linux-gnu/4.9/include/x86intrin.h:52:0,
vector.c: from vector.h:29,
vector.c: from vector.c:7:
vector.c: vector.c: In function 'round512':
vector.c: /usr/lib/gcc/x86_64-linux-gnu/4.9/include/xopintrin.h:212:1: error: inlining failed in call to always_inline '_mm_perm_epi8': target specific option mismatch
vector.c: _mm_perm_epi8(__m128i __A, __m128i __B, __m128i __C)
vector.c: ^
vector.c: vector.c:646:7: error: called from here
vector.c: v64 mm0 = v64_lswap(MM[0]), mm1 = v64_lswap(MM[1]);
vector.c: ^
vector.c: In file included from /usr/lib/gcc/x86_64-linux-gnu/4.9/include/x86intrin.h:52:0,
vector.c: from vector.h:29,
vector.c: from vector.c:7:
vector.c: /usr/lib/gcc/x86_64-linux-gnu/4.9/include/xopintrin.h:212:1: error: inlining failed in call to always_inline '_mm_perm_epi8': target specific option mismatch
vector.c: _mm_perm_epi8(__m128i __A, __m128i __B, __m128i __C)
vector.c: ^
vector.c: vector.c:646:31: error: called from here
vector.c: v64 mm0 = v64_lswap(MM[0]), mm1 = v64_lswap(MM[1]);
vector.c: ^
vector.c: In file included from /usr/lib/gcc/x86_64-linux-gnu/4.9/include/x86intrin.h:52:0,
vector.c: from vector.h:29,
vector.c: from vector.c:7:
vector.c: /usr/lib/gcc/x86_64-linux-gnu/4.9/include/xopintrin.h:212:1: error: inlining failed in call to always_inline '_mm_perm_epi8': target specific option mismatch
vector.c: _mm_perm_epi8(__m128i __A, __m128i __B, __m128i __C)
vector.c: ^
vector.c: ...

Number of similar (compiler,implementation) pairs: 4, namely:
Compiler | Implementations
gcc -march=native -mtune=native -O2 -fomit-frame-pointer -fwrapv | vect128-xop
gcc -march=native -mtune=native -O3 -fomit-frame-pointer -fwrapv | vect128-xop
gcc -march=native -mtune=native -O -fomit-frame-pointer -fwrapv | vect128-xop
gcc -march=native -mtune=native -Os -fomit-frame-pointer -fwrapv | vect128-xop