Implementation notes: amd64, colossus7, crypto_hash/blake512

Computer: colossus7
Architecture: amd64
CPU ID: AuthenticAMD-00830f10-178bfbff
SUPERCOP version: 20210125
Operation: crypto_hash
Primitive: blake512
TimeObject sizeTest sizeImplementationCompilerBenchmark dateSUPERCOP version
650212208 0 024206 752 752sse2clang_-march=native_-O_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2021031120210125
652512208 0 024206 752 752sse2clang_-march=native_-O2_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2021031120210125
657012240 0 024126 752 752sse2clang_-march=native_-O3_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2021031120210125
657011112 0 020584 744 736sse2clang_-march=native_-Os_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2021031120210125
666011994 0 023998 752 752sse2sclang_-march=native_-O2_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2021031120210125
666012026 0 023918 752 752sse2sclang_-march=native_-O3_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2021031120210125
666010898 0 020376 744 736sse2sclang_-march=native_-Os_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2021031120210125
668212054 0 024062 752 752ssse3clang_-march=native_-O2_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2021031120210125
670512054 0 024046 752 752sse41clang_-march=native_-O2_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2021031120210125
670512086 0 023950 752 752sse41clang_-march=native_-O3_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2021031120210125
670512054 0 024046 752 752sse41clang_-march=native_-O_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2021031120210125
670512054 0 024062 752 752ssse3clang_-march=native_-O_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2021031120210125
670510958 0 020440 744 736ssse3clang_-march=native_-Os_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2021031120210125
672712086 0 023982 752 752ssse3clang_-march=native_-O3_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2021031120210125
672813462 0 025382 752 736sse2sclang_-mcpu=native_-O3_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2021031120210125
675011994 0 023998 752 752sse2sclang_-march=native_-O_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2021031120210125
679510920 0 020384 744 736sse41clang_-march=native_-Os_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2021031120210125
744824304 0 036342 752 736sphlibclang_-mcpu=native_-O3_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2021031120210125
751514447 0 026358 752 736bswapclang_-mcpu=native_-O3_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2021031120210125
760514575 0 026486 752 736regsclang_-mcpu=native_-O3_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2021031120210125
760514419 0 026342 752 736sse2clang_-mcpu=native_-O3_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2021031120210125
789726605 0 036192 744 736sphlibclang_-march=native_-Os_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2021031120210125
798726615 0 038622 752 752sphlibclang_-march=native_-O3_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2021031120210125
798826615 0 038750 752 752sphlibclang_-march=native_-O_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2021031120210125
803226615 0 038750 752 752sphlibclang_-march=native_-O2_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2021031120210125
88877623 0 019774 752 752sphlib-smallclang_-march=native_-O2_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2021031120210125
89107623 0 019646 752 752sphlib-smallclang_-march=native_-O3_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2021031120210125
89107623 0 019774 752 752sphlib-smallclang_-march=native_-O_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2021031120210125
89557412 0 017040 744 736sphlib-smallclang_-march=native_-Os_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2021031120210125
94058720 0 020774 752 736sphlib-smallclang_-mcpu=native_-O3_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2021031120210125
96075484 0 017374 752 752refclang_-march=native_-O3_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2021031120210125
96525452 0 017454 752 752refclang_-march=native_-O2_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2021031120210125
96525452 0 017454 752 752refclang_-march=native_-O_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2021031120210125
99454464 0 016374 752 736refclang_-mcpu=native_-O3_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2021031120210125
111838437 0 017880 744 736bswapclang_-march=native_-Os_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2021031120210125
112059542 0 021518 752 752bswapclang_-march=native_-O2_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2021031120210125
112059574 0 021438 752 752bswapclang_-march=native_-O3_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2021031120210125
112959542 0 021518 752 752bswapclang_-march=native_-O_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2021031120210125
1176810662 0 022662 752 752regsclang_-march=native_-O_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2021031120210125
1181210694 0 022582 752 752regsclang_-march=native_-O3_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2021031120210125
118359566 0 019040 744 736regsclang_-march=native_-Os_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2021031120210125
1188010662 0 022662 752 752regsclang_-march=native_-O2_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2021031120210125
1575014786 0 026798 752 752sandyclang_-march=native_-O2_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2021031120210125
1575014818 0 026718 752 752sandyclang_-march=native_-O3_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2021031120210125
1577214786 0 026798 752 752sandyclang_-march=native_-O_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2021031120210125
1577213681 0 023160 744 736sandyclang_-march=native_-Os_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2021031120210125
157953281 0 012760 744 736refclang_-march=native_-Os_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2021031120210125
1581815038 0 026950 752 736sandyclang_-mcpu=native_-O3_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2021031120210125

Compiler output

Implementation: sse41
Security model: constbranchindex
Compiler: clang -march=native -O2 -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE
hash.c: In file included from hash.c:8:
hash.c: ./rounds.h:8:10: warning: '_mm_roti_epi64' macro redefined [-Wmacro-redefined]
hash.c: #define _mm_roti_epi64(x, c) \
hash.c: ^
hash.c: /usr/lib/llvm-10/lib/clang/10.0.0/include/xopintrin.h:236:9: note: previous definition is here
hash.c: #define _mm_roti_epi64(A, N) \
hash.c: ^
hash.c: 1 warning generated.

Number of similar (compiler,implementation) pairs: 4, namely:
CompilerImplementations
clang -march=native -O2 -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE sse41
clang -march=native -O3 -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE sse41
clang -march=native -O -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE sse41
clang -march=native -Os -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE sse41

Compiler output

Implementation: sse41
Security model: constbranchindex
Compiler: clang -mcpu=native -O3 -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE
hash.c: In file included from hash.c:8:
hash.c: ./rounds.h:8:10: warning: '_mm_roti_epi64' macro redefined [-Wmacro-redefined]
hash.c: #define _mm_roti_epi64(x, c) \
hash.c: ^
hash.c: /usr/lib/llvm-10/lib/clang/10.0.0/include/xopintrin.h:236:9: note: previous definition is here
hash.c: #define _mm_roti_epi64(A, N) \
hash.c: ^
hash.c: hash.c:81:8: error: always_inline function '_mm_shuffle_epi8' requires target feature 'ssse3', but would be inlined into function 'blake512_compress' that is compiled without support for 'ssse3'
hash.c: m0 = BSWAP64(m0);
hash.c: ^
hash.c: ./rounds.h:6:21: note: expanded from macro 'BSWAP64'
hash.c: #define BSWAP64(x) _mm_shuffle_epi8((x), u8to64)
hash.c: ^
hash.c: hash.c:82:8: error: always_inline function '_mm_shuffle_epi8' requires target feature 'ssse3', but would be inlined into function 'blake512_compress' that is compiled without support for 'ssse3'
hash.c: m1 = BSWAP64(m1);
hash.c: ^
hash.c: ./rounds.h:6:21: note: expanded from macro 'BSWAP64'
hash.c: #define BSWAP64(x) _mm_shuffle_epi8((x), u8to64)
hash.c: ^
hash.c: hash.c:83:8: error: always_inline function '_mm_shuffle_epi8' requires target feature 'ssse3', but would be inlined into function 'blake512_compress' that is compiled without support for 'ssse3'
hash.c: m2 = BSWAP64(m2);
hash.c: ^
hash.c: ./rounds.h:6:21: note: expanded from macro 'BSWAP64'
hash.c: #define BSWAP64(x) _mm_shuffle_epi8((x), u8to64)
hash.c: ^
hash.c: ...

Number of similar (compiler,implementation) pairs: 1, namely:
CompilerImplementations
clang -mcpu=native -O3 -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE sse41

Compiler output

Implementation: ssse3
Security model: constbranchindex
Compiler: clang -mcpu=native -O3 -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE
hash.c: hash.c:141:15: error: always_inline function '_mm_shuffle_epi8' requires target feature 'ssse3', but would be inlined into function 'blake512_compress' that is compiled without support for 'ssse3'
hash.c: m.u128[0] = _mm_shuffle_epi8(_mm_loadu_si128((__m128i*)(datablock + 00)), u8to64);
hash.c: ^
hash.c: hash.c:142:15: error: always_inline function '_mm_shuffle_epi8' requires target feature 'ssse3', but would be inlined into function 'blake512_compress' that is compiled without support for 'ssse3'
hash.c: m.u128[1] = _mm_shuffle_epi8(_mm_loadu_si128((__m128i*)(datablock + 16)), u8to64);
hash.c: ^
hash.c: hash.c:143:15: error: always_inline function '_mm_shuffle_epi8' requires target feature 'ssse3', but would be inlined into function 'blake512_compress' that is compiled without support for 'ssse3'
hash.c: m.u128[2] = _mm_shuffle_epi8(_mm_loadu_si128((__m128i*)(datablock + 32)), u8to64);
hash.c: ^
hash.c: hash.c:144:15: error: always_inline function '_mm_shuffle_epi8' requires target feature 'ssse3', but would be inlined into function 'blake512_compress' that is compiled without support for 'ssse3'
hash.c: m.u128[3] = _mm_shuffle_epi8(_mm_loadu_si128((__m128i*)(datablock + 48)), u8to64);
hash.c: ^
hash.c: hash.c:145:15: error: always_inline function '_mm_shuffle_epi8' requires target feature 'ssse3', but would be inlined into function 'blake512_compress' that is compiled without support for 'ssse3'
hash.c: m.u128[4] = _mm_shuffle_epi8(_mm_loadu_si128((__m128i*)(datablock + 64)), u8to64);
hash.c: ^
hash.c: hash.c:146:15: error: always_inline function '_mm_shuffle_epi8' requires target feature 'ssse3', but would be inlined into function 'blake512_compress' that is compiled without support for 'ssse3'
hash.c: m.u128[5] = _mm_shuffle_epi8(_mm_loadu_si128((__m128i*)(datablock + 80)), u8to64);
hash.c: ^
hash.c: hash.c:147:15: error: always_inline function '_mm_shuffle_epi8' requires target feature 'ssse3', but would be inlined into function 'blake512_compress' that is compiled without support for 'ssse3'
hash.c: m.u128[6] = _mm_shuffle_epi8(_mm_loadu_si128((__m128i*)(datablock + 96)), u8to64);
hash.c: ^
hash.c: hash.c:148:15: error: always_inline function '_mm_shuffle_epi8' requires target feature 'ssse3', but would be inlined into function 'blake512_compress' that is compiled without support for 'ssse3'
hash.c: m.u128[7] = _mm_shuffle_epi8(_mm_loadu_si128((__m128i*)(datablock + 112)), u8to64);
hash.c: ^
hash.c: hash.c:291:3: error: always_inline function '_mm_shuffle_epi8' requires target feature 'ssse3', but would be inlined into function 'blake512_compress' that is compiled without support for 'ssse3'
hash.c: ...

Number of similar (compiler,implementation) pairs: 1, namely:
CompilerImplementations
clang -mcpu=native -O3 -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE ssse3

Compiler output

Implementation: xop
Security model: constbranchindex
Compiler: clang -march=native -O2 -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE
hash.c: hash.c:81:8: error: always_inline function '_mm_perm_epi8' requires target feature 'xop', but would be inlined into function 'blake512_compress' that is compiled without support for 'xop'
hash.c: m0 = BSWAP64(m0);
hash.c: ^
hash.c: ./rounds.h:13:21: note: expanded from macro 'BSWAP64'
hash.c: #define BSWAP64(x) _mm_perm_epi8((x),(x),u8to64)
hash.c: ^
hash.c: hash.c:82:8: error: always_inline function '_mm_perm_epi8' requires target feature 'xop', but would be inlined into function 'blake512_compress' that is compiled without support for 'xop'
hash.c: m1 = BSWAP64(m1);
hash.c: ^
hash.c: ./rounds.h:13:21: note: expanded from macro 'BSWAP64'
hash.c: #define BSWAP64(x) _mm_perm_epi8((x),(x),u8to64)
hash.c: ^
hash.c: hash.c:83:8: error: always_inline function '_mm_perm_epi8' requires target feature 'xop', but would be inlined into function 'blake512_compress' that is compiled without support for 'xop'
hash.c: m2 = BSWAP64(m2);
hash.c: ^
hash.c: ./rounds.h:13:21: note: expanded from macro 'BSWAP64'
hash.c: #define BSWAP64(x) _mm_perm_epi8((x),(x),u8to64)
hash.c: ^
hash.c: hash.c:84:8: error: always_inline function '_mm_perm_epi8' requires target feature 'xop', but would be inlined into function 'blake512_compress' that is compiled without support for 'xop'
hash.c: m3 = BSWAP64(m3);
hash.c: ^
hash.c: ./rounds.h:13:21: note: expanded from macro 'BSWAP64'
hash.c: #define BSWAP64(x) _mm_perm_epi8((x),(x),u8to64)
hash.c: ^
hash.c: hash.c:85:8: error: always_inline function '_mm_perm_epi8' requires target feature 'xop', but would be inlined into function 'blake512_compress' that is compiled without support for 'xop'
hash.c: ...

Number of similar (compiler,implementation) pairs: 5, namely:
CompilerImplementations
clang -march=native -O2 -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE xop
clang -march=native -O3 -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE xop
clang -march=native -O -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE xop
clang -march=native -Os -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE xop
clang -mcpu=native -O3 -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE xop

Compiler output

Implementation: xop-2
Security model: constbranchindex
Compiler: clang -march=native -O2 -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE
hash.c: hash.c:92:15: error: always_inline function '_mm_perm_epi8' requires target feature 'xop', but would be inlined into function 'blake512_compress' that is compiled without support for 'xop'
hash.c: m.u128[0] = BSWAP64(m.u128[0]);
hash.c: ^
hash.c: ./rounds.h:15:21: note: expanded from macro 'BSWAP64'
hash.c: #define BSWAP64(x) _mm_perm_epi8((x),(x),u8to64)
hash.c: ^
hash.c: hash.c:93:15: error: always_inline function '_mm_perm_epi8' requires target feature 'xop', but would be inlined into function 'blake512_compress' that is compiled without support for 'xop'
hash.c: m.u128[1] = BSWAP64(m.u128[1]);
hash.c: ^
hash.c: ./rounds.h:15:21: note: expanded from macro 'BSWAP64'
hash.c: #define BSWAP64(x) _mm_perm_epi8((x),(x),u8to64)
hash.c: ^
hash.c: hash.c:94:15: error: always_inline function '_mm_perm_epi8' requires target feature 'xop', but would be inlined into function 'blake512_compress' that is compiled without support for 'xop'
hash.c: m.u128[2] = BSWAP64(m.u128[2]);
hash.c: ^
hash.c: ./rounds.h:15:21: note: expanded from macro 'BSWAP64'
hash.c: #define BSWAP64(x) _mm_perm_epi8((x),(x),u8to64)
hash.c: ^
hash.c: hash.c:95:15: error: always_inline function '_mm_perm_epi8' requires target feature 'xop', but would be inlined into function 'blake512_compress' that is compiled without support for 'xop'
hash.c: m.u128[3] = BSWAP64(m.u128[3]);
hash.c: ^
hash.c: ./rounds.h:15:21: note: expanded from macro 'BSWAP64'
hash.c: #define BSWAP64(x) _mm_perm_epi8((x),(x),u8to64)
hash.c: ^
hash.c: hash.c:96:15: error: always_inline function '_mm_perm_epi8' requires target feature 'xop', but would be inlined into function 'blake512_compress' that is compiled without support for 'xop'
hash.c: ...

Number of similar (compiler,implementation) pairs: 5, namely:
CompilerImplementations
clang -march=native -O2 -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE xop-2
clang -march=native -O3 -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE xop-2
clang -march=native -O -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE xop-2
clang -march=native -Os -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE xop-2
clang -mcpu=native -O3 -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE xop-2