Implementation notes: amd64, sand, crypto_hash/bblake512

Computer: sand
Architecture: amd64
CPU ID: GenuineIntel-000506e3-bfebfbff
SUPERCOP version: 20171218
Operation: crypto_hash
Primitive: bblake512

Compiler output

Implementation: crypto_hash/bblake512/xop
Compiler: cc
hash.c: hash.c:386:47: error: always_inline function '_mm_perm_epi8' requires target feature 'sse4a', but would be inlined into function 'blake512_final' that is compiled without support for 'sse4a'
hash.c: _mm_storeu_si128((__m128i*)(digest + 0), BSWAP64(S->h[0]));
hash.c: ^
hash.c: ./rounds.h:13:21: note: expanded from macro 'BSWAP64'
hash.c: #define BSWAP64(x) _mm_perm_epi8((x),(x),u8to64)
hash.c: ^
hash.c: hash.c:387:47: error: always_inline function '_mm_perm_epi8' requires target feature 'sse4a', but would be inlined into function 'blake512_final' that is compiled without support for 'sse4a'
hash.c: _mm_storeu_si128((__m128i*)(digest + 16), BSWAP64(S->h[1]));
hash.c: ^
hash.c: ./rounds.h:13:21: note: expanded from macro 'BSWAP64'
hash.c: #define BSWAP64(x) _mm_perm_epi8((x),(x),u8to64)
hash.c: ^
hash.c: hash.c:388:47: error: always_inline function '_mm_perm_epi8' requires target feature 'sse4a', but would be inlined into function 'blake512_final' that is compiled without support for 'sse4a'
hash.c: _mm_storeu_si128((__m128i*)(digest + 32), BSWAP64(S->h[2]));
hash.c: ^
hash.c: ./rounds.h:13:21: note: expanded from macro 'BSWAP64'
hash.c: #define BSWAP64(x) _mm_perm_epi8((x),(x),u8to64)
hash.c: ^
hash.c: hash.c:389:47: error: always_inline function '_mm_perm_epi8' requires target feature 'sse4a', but would be inlined into function 'blake512_final' that is compiled without support for 'sse4a'
hash.c: _mm_storeu_si128((__m128i*)(digest + 48), BSWAP64(S->h[3]));
hash.c: ^
hash.c: ./rounds.h:13:21: note: expanded from macro 'BSWAP64'
hash.c: #define BSWAP64(x) _mm_perm_epi8((x),(x),u8to64)
hash.c: ^
hash.c: 4 errors generated.

Number of similar (compiler,implementation) pairs: 1, namely:
Compiler Implementations
cc xop
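
The errors above come from `_mm_perm_epi8`, an XOP intrinsic, being expanded inside functions that are compiled without XOP enabled. One way to let such a file build under either setting might look like the following. This is a minimal sketch, assuming `BSWAP64` is meant to reverse the bytes within each 64-bit lane; that is inferred from the macro name, and the `u8to64` constant itself does not appear in the log. The helpers `bswap64_portable` and `bswap64_lanes` and the selector value are hypothetical and are not taken from the bblake512 sources.

```c
#include <stdint.h>

#if defined(__XOP__)
#include <x86intrin.h>  /* only pulled in when the compiler targets XOP, e.g. with -mxop */
#endif

/* Hypothetical helper: reverse the byte order of one 64-bit word in plain C. */
static uint64_t bswap64_portable(uint64_t x)
{
    /* Swap adjacent bytes, then adjacent 16-bit pairs, then the 32-bit halves. */
    x = ((x & 0x00ff00ff00ff00ffULL) << 8)  | ((x >> 8)  & 0x00ff00ff00ff00ffULL);
    x = ((x & 0x0000ffff0000ffffULL) << 16) | ((x >> 16) & 0x0000ffff0000ffffULL);
    return (x << 32) | (x >> 32);
}

/* Hypothetical helper: byte-swap both 64-bit lanes of a 128-bit value
   stored as two 64-bit words. */
static void bswap64_lanes(uint64_t v[2])
{
#if defined(__XOP__)
    /* Assumed selector: byte i of each lane is taken from byte 7-i of that lane. */
    const __m128i sel = _mm_set_epi8(8, 9, 10, 11, 12, 13, 14, 15,
                                     0, 1, 2, 3, 4, 5, 6, 7);
    __m128i x = _mm_loadu_si128((const __m128i *)v);
    x = _mm_perm_epi8(x, x, sel);
    _mm_storeu_si128((__m128i *)v, x);
#else
    /* Fallback used whenever XOP is not enabled at compile time. */
    v[0] = bswap64_portable(v[0]);
    v[1] = bswap64_portable(v[1]);
#endif
}
```

The `__XOP__` guard only reflects the compiler's target settings. It says nothing about whether the machine running the binary actually implements XOP.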

Compiler output

Implementation: crypto_hash/bblake512/xop
Compiler: clang -O3 -fomit-frame-pointer -Qunused-arguments
hash.c: hash.c:81:8: error: always_inline function '_mm_perm_epi8' requires target feature 'sse4a', but would be inlined into function 'blake512_compress' that is compiled without support for 'sse4a'
hash.c: m0 = BSWAP64(m0);
hash.c: ^
hash.c: ./rounds.h:13:21: note: expanded from macro 'BSWAP64'
hash.c: #define BSWAP64(x) _mm_perm_epi8((x),(x),u8to64)
hash.c: ^
hash.c: hash.c:82:8: error: always_inline function '_mm_perm_epi8' requires target feature 'sse4a', but would be inlined into function 'blake512_compress' that is compiled without support for 'sse4a'
hash.c: m1 = BSWAP64(m1);
hash.c: ^
hash.c: ./rounds.h:13:21: note: expanded from macro 'BSWAP64'
hash.c: #define BSWAP64(x) _mm_perm_epi8((x),(x),u8to64)
hash.c: ^
hash.c: hash.c:83:8: error: always_inline function '_mm_perm_epi8' requires target feature 'sse4a', but would be inlined into function 'blake512_compress' that is compiled without support for 'sse4a'
hash.c: m2 = BSWAP64(m2);
hash.c: ^
hash.c: ./rounds.h:13:21: note: expanded from macro 'BSWAP64'
hash.c: #define BSWAP64(x) _mm_perm_epi8((x),(x),u8to64)
hash.c: ^
hash.c: hash.c:84:8: error: always_inline function '_mm_perm_epi8' requires target feature 'sse4a', but would be inlined into function 'blake512_compress' that is compiled without support for 'sse4a'
hash.c: m3 = BSWAP64(m3);
hash.c: ^
hash.c: ./rounds.h:13:21: note: expanded from macro 'BSWAP64'
hash.c: #define BSWAP64(x) _mm_perm_epi8((x),(x),u8to64)
hash.c: ^
hash.c: hash.c:85:8: error: always_inline function '_mm_perm_epi8' requires target feature 'sse4a', but would be inlined into function 'blake512_compress' that is compiled without support for 'sse4a'
hash.c: ...

Number of similar (compiler,implementation) pairs: 4, namely:
Compiler Implementations
clang -O3 -fomit-frame-pointer -Qunused-arguments xop
clang -mcpu=cortex-a8 -mfpu=neon -O3 -fomit-frame-pointer -fwrapv -Qunused-arguments xop
clang -mcpu=cortex-a9 -mfpu=neon -O3 -fomit-frame-pointer -fwrapv -Qunused-arguments xop
clang -mcpu=native -mfpu=neon -O3 -fomit-frame-pointer -fwrapv -Qunused-arguments xop

Compiler output

Implementation: crypto_hash/bblake512/xop
Compiler: clang -O3 -fwrapv -march=native -fomit-frame-pointer -Qunused-arguments
hash.c: hash.c:81:8: error: always_inline function '_mm_perm_epi8' requires target feature 'fma4', but would be inlined into function 'blake512_compress' that is compiled without support for 'fma4'
hash.c: m0 = BSWAP64(m0);
hash.c: ^
hash.c: ./rounds.h:13:21: note: expanded from macro 'BSWAP64'
hash.c: #define BSWAP64(x) _mm_perm_epi8((x),(x),u8to64)
hash.c: ^
hash.c: hash.c:82:8: error: always_inline function '_mm_perm_epi8' requires target feature 'fma4', but would be inlined into function 'blake512_compress' that is compiled without support for 'fma4'
hash.c: m1 = BSWAP64(m1);
hash.c: ^
hash.c: ./rounds.h:13:21: note: expanded from macro 'BSWAP64'
hash.c: #define BSWAP64(x) _mm_perm_epi8((x),(x),u8to64)
hash.c: ^
hash.c: hash.c:83:8: error: always_inline function '_mm_perm_epi8' requires target feature 'fma4', but would be inlined into function 'blake512_compress' that is compiled without support for 'fma4'
hash.c: m2 = BSWAP64(m2);
hash.c: ^
hash.c: ./rounds.h:13:21: note: expanded from macro 'BSWAP64'
hash.c: #define BSWAP64(x) _mm_perm_epi8((x),(x),u8to64)
hash.c: ^
hash.c: hash.c:84:8: error: always_inline function '_mm_perm_epi8' requires target feature 'fma4', but would be inlined into function 'blake512_compress' that is compiled without support for 'fma4'
hash.c: m3 = BSWAP64(m3);
hash.c: ^
hash.c: ./rounds.h:13:21: note: expanded from macro 'BSWAP64'
hash.c: #define BSWAP64(x) _mm_perm_epi8((x),(x),u8to64)
hash.c: ^
hash.c: hash.c:85:8: error: always_inline function '_mm_perm_epi8' requires target feature 'fma4', but would be inlined into function 'blake512_compress' that is compiled without support for 'fma4'
hash.c: ...

Number of similar (compiler,implementation) pairs: 2, namely:
Compiler Implementations
clang -O3 -fwrapv -march=native -fomit-frame-pointer -Qunused-arguments xop
clang -march=native -O3 -fomit-frame-pointer -fwrapv -Qunused-arguments xop

Compiler output

Implementation: crypto_hash/bblake512/xop
Compiler: gcc
hash.c: hash.c:5:23: error: x86intrin.h: No such file or directory
hash.c: hash.c:43: error: expected specifier-qualifier-list before '__m128i'
hash.c: hash.c: In function 'blake512_compress':
hash.c: hash.c:60: error: '__m128i' undeclared (first use in this function)
hash.c: hash.c:60: error: (Each undeclared identifier is reported only once
hash.c: hash.c:60: error: for each function it appears in.)
hash.c: hash.c:60: error: expected ';' before 'row1l'
hash.c: hash.c:61: error: expected ';' before 'row2l'
hash.c: hash.c:62: error: expected ';' before 'row3l'
hash.c: hash.c:63: error: expected ';' before 'row4l'
hash.c: hash.c:66: error: expected '=', ',', ';', 'asm' or '__attribute__' before 'u8to64'
hash.c: hash.c:66: error: 'u8to64' undeclared (first use in this function)
hash.c: hash.c:68: error: expected ';' before 'm0'
hash.c: hash.c:69: error: expected ';' before 't0'
hash.c: hash.c:70: error: expected ';' before 'b0'
hash.c: hash.c:72: error: 'm0' undeclared (first use in this function)
hash.c: hash.c:72: error: expected expression before ')' token
hash.c: hash.c:73: error: 'm1' undeclared (first use in this function)
hash.c: hash.c:73: error: expected expression before ')' token
hash.c: hash.c:74: error: 'm2' undeclared (first use in this function)
hash.c: hash.c:74: error: expected expression before ')' token
hash.c: hash.c:75: error: 'm3' undeclared (first use in this function)
hash.c: hash.c:75: error: expected expression before ')' token
hash.c: hash.c:76: error: 'm4' undeclared (first use in this function)
hash.c: hash.c:76: error: expected expression before ')' token
hash.c: ...

Number of similar (compiler,implementation) pairs: 66, namely:
Compiler Implementations
gcc xop
gcc -O2 -fomit-frame-pointer xop
gcc -O3 -fomit-frame-pointer xop
gcc -O -fomit-frame-pointer xop
gcc -Os -fomit-frame-pointer xop
gcc -fno-schedule-insns -O2 -fomit-frame-pointer xop
gcc -fno-schedule-insns -O3 -fomit-frame-pointer xop
gcc -fno-schedule-insns -O -fomit-frame-pointer xop
gcc -fno-schedule-insns -Os -fomit-frame-pointer xop
gcc -funroll-loops xop
gcc -funroll-loops -O2 -fomit-frame-pointer xop
gcc -funroll-loops -O3 -fomit-frame-pointer xop
gcc -funroll-loops -O -fomit-frame-pointer xop
gcc -funroll-loops -Os -fomit-frame-pointer xop
gcc -funroll-loops -fno-schedule-insns -O2 -fomit-frame-pointer xop
gcc -funroll-loops -fno-schedule-insns -O3 -fomit-frame-pointer xop
gcc -funroll-loops -fno-schedule-insns -O -fomit-frame-pointer xop
gcc -funroll-loops -fno-schedule-insns -Os -fomit-frame-pointer xop
gcc -funroll-loops -m64 -O2 -fomit-frame-pointer xop
gcc -funroll-loops -m64 -O3 -fomit-frame-pointer xop
gcc -funroll-loops -m64 -O -fomit-frame-pointer xop
gcc -funroll-loops -m64 -Os -fomit-frame-pointer xop
gcc -funroll-loops -m64 -march=k8 -O2 -fomit-frame-pointer xop
gcc -funroll-loops -m64 -march=k8 -O3 -fomit-frame-pointer xop
gcc -funroll-loops -m64 -march=k8 -O -fomit-frame-pointer xop
gcc -funroll-loops -m64 -march=k8 -Os -fomit-frame-pointer xop
gcc -funroll-loops -m64 -march=nocona -O2 -fomit-frame-pointer xop
gcc -funroll-loops -m64 -march=nocona -O3 -fomit-frame-pointer xop
gcc -funroll-loops -m64 -march=nocona -O -fomit-frame-pointer xop
gcc -funroll-loops -m64 -march=nocona -Os -fomit-frame-pointer xop
gcc -funroll-loops -march=k8 -O2 -fomit-frame-pointer xop
gcc -funroll-loops -march=k8 -O3 -fomit-frame-pointer xop
gcc -funroll-loops -march=k8 -O -fomit-frame-pointer xop
gcc -funroll-loops -march=k8 -Os -fomit-frame-pointer xop
gcc -funroll-loops -march=nocona -O2 -fomit-frame-pointer xop
gcc -funroll-loops -march=nocona -O3 -fomit-frame-pointer xop
gcc -funroll-loops -march=nocona -O -fomit-frame-pointer xop
gcc -funroll-loops -march=nocona -Os -fomit-frame-pointer xop
gcc -m64 -O2 -fomit-frame-pointer xop
gcc -m64 -O3 -fomit-frame-pointer xop
gcc -m64 -O -fomit-frame-pointer xop
gcc -m64 -Os -fomit-frame-pointer xop
gcc -m64 -march=k8 -O2 -fomit-frame-pointer xop
gcc -m64 -march=k8 -O3 -fomit-frame-pointer xop
gcc -m64 -march=k8 -O -fomit-frame-pointer xop
gcc -m64 -march=k8 -Os -fomit-frame-pointer xop
gcc -m64 -march=native -mtune=native -O2 -fomit-frame-pointer xop
gcc -m64 -march=native -mtune=native -O3 -fomit-frame-pointer xop
gcc -m64 -march=native -mtune=native -O -fomit-frame-pointer xop
gcc -m64 -march=native -mtune=native -Os -fomit-frame-pointer xop
gcc -m64 -march=nocona -O2 -fomit-frame-pointer xop
gcc -m64 -march=nocona -O3 -fomit-frame-pointer xop
gcc -m64 -march=nocona -O -fomit-frame-pointer xop
gcc -m64 -march=nocona -Os -fomit-frame-pointer xop
gcc -march=k8 -O2 -fomit-frame-pointer xop
gcc -march=k8 -O3 -fomit-frame-pointer xop
gcc -march=k8 -O -fomit-frame-pointer xop
gcc -march=k8 -Os -fomit-frame-pointer xop
gcc -march=native -mtune=native -O2 -fomit-frame-pointer -fwrapv xop
gcc -march=native -mtune=native -O3 -fomit-frame-pointer -fwrapv xop
gcc -march=native -mtune=native -O -fomit-frame-pointer -fwrapv xop
gcc -march=native -mtune=native -Os -fomit-frame-pointer -fwrapv xop
gcc -march=nocona -O2 -fomit-frame-pointer xop
gcc -march=nocona -O3 -fomit-frame-pointer xop
gcc -march=nocona -O -fomit-frame-pointer xop
gcc -march=nocona -Os -fomit-frame-pointer xop
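
In the gcc output above, the first error is the missing `x86intrin.h`. The later `__m128i` and "undeclared" errors follow from it, since the vector type is never declared. A source file could probe for the header before including it, as in the minimal sketch below. It assumes a compiler that either provides `__has_include` or leaves it undefined. The macro names `HAVE_X86INTRIN_H` and `BBLAKE512_XOP_USABLE` and the function `xop_path_compiled` are hypothetical and are not part of SUPERCOP.

```c
/* Probe for the header only if the compiler offers __has_include;
   compilers without it skip the nested test entirely. */
#if defined(__has_include)
#  if __has_include(<x86intrin.h>)
#    define HAVE_X86INTRIN_H 1
#  endif
#endif
#ifndef HAVE_X86INTRIN_H
#  define HAVE_X86INTRIN_H 0
#endif

/* Enable the XOP code path only when the header exists and the
   compiler is actually targeting XOP (for example via -mxop). */
#if HAVE_X86INTRIN_H && defined(__XOP__)
#  include <x86intrin.h>
#  define BBLAKE512_XOP_USABLE 1
#else
#  define BBLAKE512_XOP_USABLE 0
#endif

/* Hypothetical query: 1 if the XOP path was compiled in, 0 otherwise. */
static int xop_path_compiled(void)
{
    return BBLAKE512_XOP_USABLE;
}
```

With this guard, a compiler that lacks the header or the target feature takes the `0` branch instead of stopping at the `#include`.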