Implementation notes: amd64, hydra7, crypto_hashblocks/sha512

Computer: hydra7
Microarchitecture: amd64; Sandy Bridge+AES (206a7)
Architecture: amd64
CPU ID: GenuineIntel-000206a7-bfebfbff
SUPERCOP version: 20240425
Operation: crypto_hashblocks
Primitive: sha512
TimeObject sizeTest sizeImplementationCompilerBenchmark dateSUPERCOP version
201073691 0 013696 780 928wflipgcc_-march=native_-mtune=native_-Os_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE2024042620240425
201146256 0 018973 804 960compactgcc_-march=native_-mtune=native_-O3_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE2024042620240425
202744454 0 015941 804 960compactgcc_-march=native_-mtune=native_-O2_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE2024042620240425
2038916145 0 026192 780 928inplacegcc_-march=native_-mtune=native_-Os_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE2024042620240425
2042716135 0 026176 780 928refgcc_-march=native_-mtune=native_-Os_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE2024042620240425
2096916973 0 029637 804 960refgcc_-march=native_-mtune=native_-O3_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE2024042620240425
2101617065 0 029765 804 960inplacegcc_-march=native_-mtune=native_-O3_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE2024042620240425
212003720 0 015133 804 960wflipgcc_-march=native_-mtune=native_-O2_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE2024042620240425
212703794 0 016429 804 960wflipgcc_-march=native_-mtune=native_-O3_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE2024042620240425
213884461 0 014544 780 928compactgcc_-march=native_-mtune=native_-Os_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE2024042620240425
214443628 0 014684 796 960wflipgcc_-march=native_-mtune=native_-O_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE2024042620240425
2170316280 0 027725 804 960inplacegcc_-march=native_-mtune=native_-O2_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE2024042620240425
218374576 0 015732 796 960compactgcc_-march=native_-mtune=native_-O_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE2024042620240425
2191816455 0 027869 804 960refgcc_-march=native_-mtune=native_-O2_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE2024042620240425
2274316444 0 027564 796 960inplacegcc_-march=native_-mtune=native_-O_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE2024042620240425
2288016524 0 027612 796 960refgcc_-march=native_-mtune=native_-O_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE2024042620240425
235163529 0 016245 804 960compact4gcc_-march=native_-mtune=native_-O3_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE2024042620240425
237531396 0 011472 780 928compact4gcc_-march=native_-mtune=native_-Os_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE2024042620240425
242081529 0 013013 804 960compact4gcc_-march=native_-mtune=native_-O2_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE2024042620240425
266471546 0 012692 796 960compact4gcc_-march=native_-mtune=native_-O_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE2024042620240425
347833544 0 016269 804 960compact2gcc_-march=native_-mtune=native_-O3_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE2024042620240425
353131774 0 013261 804 960compact2gcc_-march=native_-mtune=native_-O2_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE2024042620240425
362911752 0 011832 780 928compact2gcc_-march=native_-mtune=native_-Os_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE2024042620240425
363091521 0 011584 780 928compact3gcc_-march=native_-mtune=native_-Os_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE2024042620240425
363213653 0 016357 804 960compact3gcc_-march=native_-mtune=native_-O3_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE2024042620240425
365901698 0 013189 804 960compact3gcc_-march=native_-mtune=native_-O2_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE2024042620240425
371571873 0 013036 796 960compact2gcc_-march=native_-mtune=native_-O_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE2024042620240425
381681693 0 012836 796 960compact3gcc_-march=native_-mtune=native_-O_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE2024042620240425

Test failure

Implementation: avx2
Security model: constbranchindex
Compiler: gcc -march=native -mtune=native -O2 -fomit-frame-pointer -fwrapv -fPIC -fPIE
error 111

Number of similar (compiler,implementation) pairs: 4, namely:
CompilerImplementations
gcc -march=native -mtune=native -O2 -fomit-frame-pointer -fwrapv -fPIC -fPIE avx2
gcc -march=native -mtune=native -O3 -fomit-frame-pointer -fwrapv -fPIC -fPIE avx2
gcc -march=native -mtune=native -O -fomit-frame-pointer -fwrapv -fPIC -fPIE avx2
gcc -march=native -mtune=native -Os -fomit-frame-pointer -fwrapv -fPIC -fPIE avx2

Compiler output

Implementation: avx
Security model: constbranchindex
Compiler: gcc -march=native -mtune=native -O2 -fomit-frame-pointer -fwrapv -fPIC -fPIE
inner.c: In file included from /usr/lib/gcc/x86_64-linux-gnu/11/include/immintrin.h:47,
inner.c: from inner.c:1:
inner.c: inner.c: In function 'crypto_hashblocks_sha512_avx_constbranchindex_inner':
inner.c: /usr/lib/gcc/x86_64-linux-gnu/11/include/avx2intrin.h:126:1: error: inlining failed in call to 'always_inline' '_mm256_add_epi64': target specific option mismatch
inner.c: 126 | _mm256_add_epi64 (__m256i __A, __m256i __B)
inner.c: | ^~~~~~~~~~~~~~~~
inner.c: inner.c:128:15: note: called from here
inner.c: 128 | D12 = _mm256_add_epi64(X12,D12);
inner.c: | ^~~~~~~~~~~~~~~~~~~~~~~~~
inner.c: In file included from /usr/lib/gcc/x86_64-linux-gnu/11/include/immintrin.h:47,
inner.c: from inner.c:1:
inner.c: /usr/lib/gcc/x86_64-linux-gnu/11/include/avx2intrin.h:588:1: error: inlining failed in call to 'always_inline' '_mm256_shuffle_epi8': target specific option mismatch
inner.c: 588 | _mm256_shuffle_epi8 (__m256i __X, __m256i __Y)
inner.c: | ^~~~~~~~~~~~~~~~~~~
inner.c: inner.c:126:15: note: called from here
inner.c: 126 | X12 = _mm256_shuffle_epi8(X12,bigendian64);
inner.c: | ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
inner.c: In file included from /usr/lib/gcc/x86_64-linux-gnu/11/include/immintrin.h:47,
inner.c: from inner.c:1:
inner.c: /usr/lib/gcc/x86_64-linux-gnu/11/include/avx2intrin.h:126:1: error: inlining failed in call to 'always_inline' '_mm256_add_epi64': target specific option mismatch
inner.c: 126 | _mm256_add_epi64 (__m256i __A, __m256i __B)
inner.c: | ^~~~~~~~~~~~~~~~
inner.c: inner.c:115:14: note: called from here
inner.c: 115 | D8 = _mm256_add_epi64(X8,D8);
inner.c: | ^~~~~~~~~~~~~~~~~~~~~~~
inner.c: ...

Number of similar (compiler,implementation) pairs: 4, namely:
CompilerImplementations
gcc -march=native -mtune=native -O2 -fomit-frame-pointer -fwrapv -fPIC -fPIE avx
gcc -march=native -mtune=native -O3 -fomit-frame-pointer -fwrapv -fPIC -fPIE avx
gcc -march=native -mtune=native -O -fomit-frame-pointer -fwrapv -fPIC -fPIE avx
gcc -march=native -mtune=native -Os -fomit-frame-pointer -fwrapv -fPIC -fPIE avx

Compiler output

Implementation: dolbeau/intelavx2rorxasm
Security model: constbranchindex
Compiler: gcc -march=native -mtune=native -O2 -fomit-frame-pointer -fwrapv -fPIC -fPIE
try.c: /usr/bin/ld: libcrypto_hashblocks_sha512.a(blocks.o): in function `crypto_hashblocks_sha512_dolbeau_intelavx2rorxasm_constbranchindex':
try.c: blocks.c:(.text+0x...): undefined reference to `sha512_rorx'
try.c: collect2: error: ld returned 1 exit status

Number of similar (compiler,implementation) pairs: 4, namely:
CompilerImplementations
gcc -march=native -mtune=native -O2 -fomit-frame-pointer -fwrapv -fPIC -fPIE dolbeau/intelavx2rorxasm
gcc -march=native -mtune=native -O3 -fomit-frame-pointer -fwrapv -fPIC -fPIE dolbeau/intelavx2rorxasm
gcc -march=native -mtune=native -O -fomit-frame-pointer -fwrapv -fPIC -fPIE dolbeau/intelavx2rorxasm
gcc -march=native -mtune=native -Os -fomit-frame-pointer -fwrapv -fPIC -fPIE dolbeau/intelavx2rorxasm

Compiler output

Implementation: dolbeau/intelavxasm
Security model: constbranchindex
Compiler: gcc -march=native -mtune=native -O2 -fomit-frame-pointer -fwrapv -fPIC -fPIE
try.c: /usr/bin/ld: libcrypto_hashblocks_sha512.a(blocks.o): in function `crypto_hashblocks_sha512_dolbeau_intelavxasm_constbranchindex':
try.c: blocks.c:(.text+0x...): undefined reference to `sha512_avx'
try.c: collect2: error: ld returned 1 exit status

Number of similar (compiler,implementation) pairs: 4, namely:
CompilerImplementations
gcc -march=native -mtune=native -O2 -fomit-frame-pointer -fwrapv -fPIC -fPIE dolbeau/intelavxasm
gcc -march=native -mtune=native -O3 -fomit-frame-pointer -fwrapv -fPIC -fPIE dolbeau/intelavxasm
gcc -march=native -mtune=native -O -fomit-frame-pointer -fwrapv -fPIC -fPIE dolbeau/intelavxasm
gcc -march=native -mtune=native -Os -fomit-frame-pointer -fwrapv -fPIC -fPIE dolbeau/intelavxasm

Compiler output

Implementation: dolbeau/intelsse4asm
Security model: constbranchindex
Compiler: gcc -march=native -mtune=native -O2 -fomit-frame-pointer -fwrapv -fPIC -fPIE
try.c: /usr/bin/ld: libcrypto_hashblocks_sha512.a(blocks.o): in function `crypto_hashblocks_sha512_dolbeau_intelsse4asm_constbranchindex':
try.c: blocks.c:(.text+0x...): undefined reference to `sha512_sse4'
try.c: collect2: error: ld returned 1 exit status

Number of similar (compiler,implementation) pairs: 4, namely:
CompilerImplementations
gcc -march=native -mtune=native -O2 -fomit-frame-pointer -fwrapv -fPIC -fPIE dolbeau/intelsse4asm
gcc -march=native -mtune=native -O3 -fomit-frame-pointer -fwrapv -fPIC -fPIE dolbeau/intelsse4asm
gcc -march=native -mtune=native -O -fomit-frame-pointer -fwrapv -fPIC -fPIE dolbeau/intelsse4asm
gcc -march=native -mtune=native -Os -fomit-frame-pointer -fwrapv -fPIC -fPIE dolbeau/intelsse4asm