Implementation notes: amd64, hertz, crypto_hash/blake3

Computer: hertz
Microarchitecture: amd64; Zen 4 (a60f12)
Architecture: amd64
CPU ID: AuthenticAMD-00a60f12-178bfbff
SUPERCOP version: 20240716
Operation: crypto_hash
Primitive: blake3
Time | Object size | Test size | Implementation | Compiler | Benchmark date | SUPERCOP version
5757 | 17145 0 0 | 27472 788 936 | T:avx512 | gcc_-march=native_-mtune=native_-Os_-fwrapv_-fPIC_-fPIE_-gdwarf-4_-Wall | 20240716 | 20240716
5761 | 17971 0 0 | 35838 836 968 | T:avx512 | clang_-march=native_-O3_-fwrapv_-Qunused-arguments_-fPIC_-fPIE_-gdwarf-4_-Wall | 20240716 | 20240716
5764 | 17971 0 0 | 35710 836 968 | T:avx512 | clang_-march=native_-O2_-fwrapv_-Qunused-arguments_-fPIC_-fPIE_-gdwarf-4_-Wall | 20240716 | 20240716
5764 | 17140 0 0 | 28768 828 968 | T:avx512 | clang_-march=native_-Os_-fwrapv_-Qunused-arguments_-fPIC_-fPIE_-gdwarf-4_-Wall | 20240716 | 20240716
5801 | 17847 0 0 | 29565 812 968 | T:avx512 | gcc_-march=native_-mtune=native_-O2_-fwrapv_-fPIC_-fPIE_-gdwarf-4_-Wall | 20240716 | 20240716
5812 | 18773 0 0 | 32549 812 1032 | T:avx512 | gcc_-march=native_-mtune=native_-O3_-fwrapv_-fPIC_-fPIE_-gdwarf-4_-Wall | 20240716 | 20240716

Compiler output


blake3.c: In file included from blake3.c:12:
blake3.c: ./blake3_static_dispatch.h:8:2: error: "there are wider implementations on this platform; fail the build"
blake3.c:     8 | #error "there are wider implementations on this platform; fail the build"
blake3.c:       |  ^
blake3.c: 1 error generated.

Number of similar (implementation,compiler) pairs: 9, namely:
Implementation | Compiler
T:avx2 | clang -march=native -O2 -fwrapv -Qunused-arguments -fPIC -fPIE -gdwarf-4 -Wall (Ubuntu_Clang_18.1.3_(1ubuntu1))
T:avx2 | clang -march=native -O3 -fwrapv -Qunused-arguments -fPIC -fPIE -gdwarf-4 -Wall (Ubuntu_Clang_18.1.3_(1ubuntu1))
T:avx2 | clang -march=native -Os -fwrapv -Qunused-arguments -fPIC -fPIE -gdwarf-4 -Wall (Ubuntu_Clang_18.1.3_(1ubuntu1))
T:portable | clang -march=native -O2 -fwrapv -Qunused-arguments -fPIC -fPIE -gdwarf-4 -Wall (Ubuntu_Clang_18.1.3_(1ubuntu1))
T:portable | clang -march=native -O3 -fwrapv -Qunused-arguments -fPIC -fPIE -gdwarf-4 -Wall (Ubuntu_Clang_18.1.3_(1ubuntu1))
T:portable | clang -march=native -Os -fwrapv -Qunused-arguments -fPIC -fPIE -gdwarf-4 -Wall (Ubuntu_Clang_18.1.3_(1ubuntu1))
T:sse41 | clang -march=native -O2 -fwrapv -Qunused-arguments -fPIC -fPIE -gdwarf-4 -Wall (Ubuntu_Clang_18.1.3_(1ubuntu1))
T:sse41 | clang -march=native -O3 -fwrapv -Qunused-arguments -fPIC -fPIE -gdwarf-4 -Wall (Ubuntu_Clang_18.1.3_(1ubuntu1))
T:sse41 | clang -march=native -Os -fwrapv -Qunused-arguments -fPIC -fPIE -gdwarf-4 -Wall (Ubuntu_Clang_18.1.3_(1ubuntu1))

Compiler output


blake3.c: In file included from blake3.c:12:
blake3.c: blake3_static_dispatch.h:8:2: error: #error "there are wider implementations on this platform; fail the build"
blake3.c:     8 | #error "there are wider implementations on this platform; fail the build"
blake3.c:       |  ^~~~~

Number of similar (implementation,compiler) pairs: 9, namely:
Implementation | Compiler
T:avx2 | gcc -march=native -mtune=native -O2 -fwrapv -fPIC -fPIE -gdwarf-4 -Wall (13.2.0)
T:avx2 | gcc -march=native -mtune=native -O3 -fwrapv -fPIC -fPIE -gdwarf-4 -Wall (13.2.0)
T:avx2 | gcc -march=native -mtune=native -Os -fwrapv -fPIC -fPIE -gdwarf-4 -Wall (13.2.0)
T:portable | gcc -march=native -mtune=native -O2 -fwrapv -fPIC -fPIE -gdwarf-4 -Wall (13.2.0)
T:portable | gcc -march=native -mtune=native -O3 -fwrapv -fPIC -fPIE -gdwarf-4 -Wall (13.2.0)
T:portable | gcc -march=native -mtune=native -Os -fwrapv -fPIC -fPIE -gdwarf-4 -Wall (13.2.0)
T:sse41 | gcc -march=native -mtune=native -O2 -fwrapv -fPIC -fPIE -gdwarf-4 -Wall (13.2.0)
T:sse41 | gcc -march=native -mtune=native -O3 -fwrapv -fPIC -fPIE -gdwarf-4 -Wall (13.2.0)
T:sse41 | gcc -march=native -mtune=native -Os -fwrapv -fPIC -fPIE -gdwarf-4 -Wall (13.2.0)

Compiler output


blake3_neon.c: In file included from blake3_neon.c:3:
blake3_neon.c: /usr/lib/llvm-18/lib/clang/18/include/arm_neon.h:28:2: error: "NEON intrinsics not available with the soft-float ABI. Please use -mfloat-abi=softfp or -mfloat-abi=hard"
blake3_neon.c:    28 | #error "NEON intrinsics not available with the soft-float ABI. Please use -mfloat-abi=softfp or -mfloat-abi=hard"
blake3_neon.c:       |  ^
blake3_neon.c: blake3_neon.c:11:8: error: unknown type name 'uint32x4_t'; did you mean 'uint32_t'?
blake3_neon.c:    11 | INLINE uint32x4_t loadu_128(const uint8_t src[16]) {
blake3_neon.c:       |        ^~~~~~~~~~
blake3_neon.c:       |        uint32_t
blake3_neon.c: /usr/include/x86_64-linux-gnu/bits/stdint-uintn.h:26:20: note: 'uint32_t' declared here
blake3_neon.c:    26 | typedef __uint32_t uint32_t;
blake3_neon.c:       |                    ^
blake3_neon.c: blake3_neon.c:13:3: error: unknown type name 'uint32x4_t'; did you mean 'uint32_t'?
blake3_neon.c:    13 |   uint32x4_t x;
blake3_neon.c:       |   ^~~~~~~~~~
blake3_neon.c:       |   uint32_t
blake3_neon.c: /usr/include/x86_64-linux-gnu/bits/stdint-uintn.h:26:20: note: 'uint32_t' declared here
blake3_neon.c:    26 | typedef __uint32_t uint32_t;
blake3_neon.c:       |                    ^
blake3_neon.c: blake3_neon.c:14:3: warning: 'memcpy' will always overflow; destination buffer has size 4, but size argument is 16 [-Wfortify-source]
blake3_neon.c:    14 |   memcpy(&x, src, 16);
blake3_neon.c:       |   ^
blake3_neon.c: blake3_neon.c:18:24: error: unknown type name 'uint32x4_t'; did you mean 'uint32_t'?
blake3_neon.c:    18 | INLINE void storeu_128(uint32x4_t src, uint8_t dest[16]) {
blake3_neon.c:       |                        ^~~~~~~~~~
blake3_neon.c:       |                        uint32_t
blake3_neon.c: ...

Number of similar (implementation,compiler) pairs: 3, namely:
Implementation | Compiler
T:neon | clang -march=native -O2 -fwrapv -Qunused-arguments -fPIC -fPIE -gdwarf-4 -Wall (Ubuntu_Clang_18.1.3_(1ubuntu1))
T:neon | clang -march=native -O3 -fwrapv -Qunused-arguments -fPIC -fPIE -gdwarf-4 -Wall (Ubuntu_Clang_18.1.3_(1ubuntu1))
T:neon | clang -march=native -Os -fwrapv -Qunused-arguments -fPIC -fPIE -gdwarf-4 -Wall (Ubuntu_Clang_18.1.3_(1ubuntu1))

Compiler output


blake3_neon.c: blake3_neon.c:3:10: fatal error: arm_neon.h: No such file or directory
blake3_neon.c:     3 | #include <arm_neon.h>
blake3_neon.c:       |          ^~~~~~~~~~~~
blake3_neon.c: compilation terminated.

Number of similar (implementation,compiler) pairs: 3, namely:
Implementation | Compiler
T:neon | gcc -march=native -mtune=native -O2 -fwrapv -fPIC -fPIE -gdwarf-4 -Wall (13.2.0)
T:neon | gcc -march=native -mtune=native -O3 -fwrapv -fPIC -fPIE -gdwarf-4 -Wall (13.2.0)
T:neon | gcc -march=native -mtune=native -Os -fwrapv -fPIC -fPIE -gdwarf-4 -Wall (13.2.0)

Namespace violations


blake3.o blake3_default_hash T
blake3_avx512_x86-64_unix.o _blake3_compress_in_place_avx512 T
blake3_avx512_x86-64_unix.o _blake3_compress_xof_avx512 T
blake3_avx512_x86-64_unix.o _blake3_hash_many_avx512 T
blake3_avx512_x86-64_unix.o blake3_compress_in_place_avx512 T
blake3_avx512_x86-64_unix.o blake3_compress_xof_avx512 T
blake3_avx512_x86-64_unix.o blake3_hash_many_avx512 T

Number of similar (implementation,compiler) pairs: 6, namely:
Implementation | Compiler
T:avx512 | clang -march=native -O2 -fwrapv -Qunused-arguments -fPIC -fPIE -gdwarf-4 -Wall (Ubuntu_Clang_18.1.3_(1ubuntu1))
T:avx512 | clang -march=native -O3 -fwrapv -Qunused-arguments -fPIC -fPIE -gdwarf-4 -Wall (Ubuntu_Clang_18.1.3_(1ubuntu1))
T:avx512 | clang -march=native -Os -fwrapv -Qunused-arguments -fPIC -fPIE -gdwarf-4 -Wall (Ubuntu_Clang_18.1.3_(1ubuntu1))
T:avx512 | gcc -march=native -mtune=native -O2 -fwrapv -fPIC -fPIE -gdwarf-4 -Wall (13.2.0)
T:avx512 | gcc -march=native -mtune=native -O3 -fwrapv -fPIC -fPIE -gdwarf-4 -Wall (13.2.0)
T:avx512 | gcc -march=native -mtune=native -Os -fwrapv -fPIC -fPIE -gdwarf-4 -Wall (13.2.0)