Implementation notes: amd64, firefly, crypto_hash/blake3

Computer: firefly
Architecture: amd64
CPU ID: AuthenticAMD-00800f12-178bfbff
SUPERCOP version: 20201130
Operation: crypto_hash
Primitive: blake3
Time   Object size  Test size      Implementation  Compiler  Benchmark date  SUPERCOP version
4646   24577 0 0    35903 792 752  T:avx2          clang_-march=native_-O2_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE  20201213  20201130
4646   14693 0 0    26887 792 752  T:sse41         clang_-march=native_-O3_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE  20201213  20201130
4738   13191 0 0    23669 808 776  T:sse41         gcc_-march=native_-mtune=native_-O_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE  20201213  20201130
4761   23340 0 0    35015 792 736  T:avx2          clang_-mcpu=native_-O3_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE  20201213  20201130
4784   22410 0 0    32197 784 736  T:avx2          clang_-march=native_-Os_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE  20201213  20201130
4784   23075 0 0    33589 808 776  T:avx2          gcc_-march=native_-mtune=native_-O_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE  20201213  20201130
4807   24577 0 0    35903 792 752  T:avx2          clang_-march=native_-O_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE  20201213  20201130
4807   13258 0 0    23957 808 776  T:sse41         gcc_-march=native_-mtune=native_-O2_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE  20201213  20201130
4899   13456 0 0    25095 792 736  T:sse41         clang_-mcpu=native_-O3_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE  20201213  20201130
4945   14693 0 0    25983 792 752  T:sse41         clang_-march=native_-O_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE  20201213  20201130
4968   12526 0 0    22277 784 736  T:sse41         clang_-march=native_-Os_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE  20201213  20201130
5037   26385 0 0    39126 816 776  T:avx2          gcc_-march=native_-mtune=native_-O3_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE  20201213  20201130
5083   22156 0 0    31773 792 776  T:avx2          gcc_-march=native_-mtune=native_-Os_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE  20201213  20201130
5083   16213 0 0    28886 816 776  T:sse41         gcc_-march=native_-mtune=native_-O3_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE  20201213  20201130
5267   23126 0 0    33813 808 776  T:avx2          gcc_-march=native_-mtune=native_-O2_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE  20201213  20201130
5359   14693 0 0    25983 792 752  T:sse41         clang_-march=native_-O2_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE  20201213  20201130
5428   24577 0 0    36807 792 752  T:avx2          clang_-march=native_-O3_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE  20201213  20201130
6072   12272 0 0    21853 792 776  T:sse41         gcc_-march=native_-mtune=native_-Os_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE  20201213  20201130
7429   10664 0 0    22263 792 736  T:portable      clang_-mcpu=native_-O3_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE  20201213  20201130
7820   13856 0 0    26486 816 776  T:portable      gcc_-march=native_-mtune=native_-O3_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE  20201213  20201130
8786   10893 0 0    21557 808 776  T:portable      gcc_-march=native_-mtune=native_-O2_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE  20201213  20201130
9591   9924 0 0     19469 792 776  T:portable      gcc_-march=native_-mtune=native_-Os_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE  20201213  20201130
9706   12017 0 0    22477 808 776  T:portable      gcc_-march=native_-mtune=native_-O_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE  20201213  20201130
10810  13197 0 0    25319 792 752  T:portable      clang_-march=native_-O3_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE  20201213  20201130
10810  13197 0 0    24415 792 752  T:portable      clang_-march=native_-O_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE  20201213  20201130
12489  10995 0 0    20645 784 736  T:portable      clang_-march=native_-Os_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE  20201213  20201130
14030  13197 0 0    24415 792 752  T:portable      clang_-march=native_-O2_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE  20201213  20201130
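
One way to read the table above is to take the fastest entry for each implementation. The following is a minimal sketch of that summary, using a handful of (implementation, compiler, time) entries copied from the table with the compiler strings abbreviated. The tuple layout and the name `best_per_implementation` are illustrative assumptions and are not part of SUPERCOP.

```python
from typing import Dict, Iterable, Tuple

# (implementation, abbreviated compiler string, time); this layout is an assumption.
Row = Tuple[str, str, int]

# A few rows copied from the table above.
ROWS = [
    ("T:avx2", "clang -march=native -O2", 4646),
    ("T:sse41", "clang -march=native -O3", 4646),
    ("T:sse41", "gcc -march=native -mtune=native -O", 4738),
    ("T:portable", "clang -mcpu=native -O3", 7429),
    ("T:portable", "gcc -march=native -mtune=native -O3", 7820),
]


def best_per_implementation(rows: Iterable[Row]) -> Dict[str, Row]:
    """Return the row with the smallest time for each implementation."""
    best: Dict[str, Row] = {}
    for row in rows:
        impl, _compiler, time = row
        if impl not in best or time < best[impl][2]:
            best[impl] = row
    return best


if __name__ == "__main__":
    for impl, (_, compiler, time) in sorted(best_per_implementation(ROWS).items()):
        print(f"{impl:12s} {time:6d}  {compiler}")
```

On the full table, the best T:portable time (7429) is roughly 1.6 times the best T:avx2 and T:sse41 time (4646).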

Test failure

Implementation: T:avx512
Security model: timingleaks
Compiler: gcc -march=native -mtune=native -O2 -fomit-frame-pointer -fwrapv -fPIC -fPIE
error 111

Number of similar (compiler,implementation) pairs: 4, namely:
Compiler  Implementations
gcc -march=native -mtune=native -O2 -fomit-frame-pointer -fwrapv -fPIC -fPIE T:avx512
gcc -march=native -mtune=native -O3 -fomit-frame-pointer -fwrapv -fPIC -fPIE T:avx512
gcc -march=native -mtune=native -O -fomit-frame-pointer -fwrapv -fPIC -fPIE T:avx512
gcc -march=native -mtune=native -Os -fomit-frame-pointer -fwrapv -fPIC -fPIE T:avx512

Compiler output

Implementation: T:avx512
Security model: timingleaks
Compiler: clang -march=native -O2 -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE
blake3_avx512_x86-64_unix.S: blake3_avx512_x86-64_unix.S:28:9: error: instruction requires: AVX-512 ISA
blake3_avx512_x86-64_unix.S: kmovw k1, r9d
blake3_avx512_x86-64_unix.S: ^
blake3_avx512_x86-64_unix.S: blake3_avx512_x86-64_unix.S:38:9: error: instruction requires: AVX-512 ISA AVX-512 VL ISA
blake3_avx512_x86-64_unix.S: vpcmpltud k2, ymm2, ymm0
blake3_avx512_x86-64_unix.S: ^
blake3_avx512_x86-64_unix.S: blake3_avx512_x86-64_unix.S:39:9: error: instruction requires: AVX-512 ISA AVX-512 VL ISA
blake3_avx512_x86-64_unix.S: vpcmpltud k3, ymm3, ymm0
blake3_avx512_x86-64_unix.S: ^
blake3_avx512_x86-64_unix.S: blake3_avx512_x86-64_unix.S:40:21: error: unexpected token in argument list
blake3_avx512_x86-64_unix.S: vpaddd ymm4 {k2}, ymm4, dword ptr [ADD1+rip] {1to8}
blake3_avx512_x86-64_unix.S: ^
blake3_avx512_x86-64_unix.S: blake3_avx512_x86-64_unix.S:41:21: error: unexpected token in argument list
blake3_avx512_x86-64_unix.S: vpaddd ymm5 {k3}, ymm5, dword ptr [ADD1+rip] {1to8}
blake3_avx512_x86-64_unix.S: ^
blake3_avx512_x86-64_unix.S: blake3_avx512_x86-64_unix.S:42:9: error: instruction requires: AVX-512 ISA
blake3_avx512_x86-64_unix.S: knotw k2, k1
blake3_avx512_x86-64_unix.S: ^
blake3_avx512_x86-64_unix.S: blake3_avx512_x86-64_unix.S:43:24: error: unexpected token in argument list
blake3_avx512_x86-64_unix.S: vmovdqa32 ymm2 {k2}, ymm0
blake3_avx512_x86-64_unix.S: ^
blake3_avx512_x86-64_unix.S: blake3_avx512_x86-64_unix.S:44:24: error: unexpected token in argument list
blake3_avx512_x86-64_unix.S: vmovdqa32 ymm3 {k2}, ymm0
blake3_avx512_x86-64_unix.S: ^
blake3_avx512_x86-64_unix.S: blake3_avx512_x86-64_unix.S:45:24: error: unexpected token in argument list
blake3_avx512_x86-64_unix.S: ...

Number of similar (compiler,implementation) pairs: 5, namely:
Compiler  Implementations
clang -march=native -O2 -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE T:avx512
clang -march=native -O3 -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE T:avx512
clang -march=native -O -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE T:avx512
clang -march=native -Os -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE T:avx512
clang -mcpu=native -O3 -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE T:avx512
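
The clang messages above ("instruction requires: AVX-512 ISA") suggest that the target selected by -march=native on this machine does not include AVX-512. A minimal sketch of checking for this before attempting T:avx512 follows. It assumes a Linux host that lists feature flags in /proc/cpuinfo. The helper names and the required-flag set (avx512f, avx512vl) are assumptions chosen for illustration, not something SUPERCOP itself is known to do.

```python
from typing import Set

# Assumed minimum feature set for this sketch; the real requirement may be larger.
REQUIRED_AVX512 = {"avx512f", "avx512vl"}


def parse_cpu_flags(cpuinfo_text: str) -> Set[str]:
    """Return the feature flags from the first 'flags' line of cpuinfo-style text."""
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "flags":
            return set(value.split())
    return set()


def supports_avx512(flags: Set[str]) -> bool:
    """True if every flag in REQUIRED_AVX512 is present."""
    return REQUIRED_AVX512 <= flags


if __name__ == "__main__":
    try:
        with open("/proc/cpuinfo") as f:
            flags = parse_cpu_flags(f.read())
    except OSError:
        flags = set()  # not Linux, or file unreadable: treat as unknown
    print("AVX-512 usable:", supports_avx512(flags))
```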

Compiler output

Implementation: T:neon
Security model: timingleaks
Compiler: clang -march=native -O2 -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE
blake3.c: In file included from blake3.c:12:
blake3.c: ./blake3_static_dispatch.h:17:9: warning: 'MAX_SIMD_DEGREE' macro redefined [-Wmacro-redefined]
blake3.c: #define MAX_SIMD_DEGREE 4
blake3.c: ^
blake3.c: ./blake3_impl.h:49:9: note: previous definition is here
blake3.c: #define MAX_SIMD_DEGREE 16
blake3.c: ^
blake3.c: In file included from blake3.c:12:
blake3.c: ./blake3_static_dispatch.h:18:9: warning: 'MAX_SIMD_DEGREE_OR_2' macro redefined [-Wmacro-redefined]
blake3.c: #define MAX_SIMD_DEGREE_OR_2 4
blake3.c: ^
blake3.c: ./blake3_impl.h:58:9: note: previous definition is here
blake3.c: #define MAX_SIMD_DEGREE_OR_2 (MAX_SIMD_DEGREE > 2 ? MAX_SIMD_DEGREE : 2)
blake3.c: ^
blake3.c: 2 warnings generated.
blake3_neon.c: In file included from blake3_neon.c:3:
blake3_neon.c: /usr/lib/llvm-3.8/bin/../lib/clang/3.8.1/include/arm_neon.h:28:2: error: "NEON support not enabled"
blake3_neon.c: #error "NEON support not enabled"
blake3_neon.c: ^
blake3_neon.c: /usr/lib/llvm-3.8/bin/../lib/clang/3.8.1/include/arm_neon.h:48:24: error: 'neon_vector_type' attribute is not supported for this target
blake3_neon.c: typedef __attribute__((neon_vector_type(8))) int8_t int8x8_t;
blake3_neon.c: ^
blake3_neon.c: /usr/lib/llvm-3.8/bin/../lib/clang/3.8.1/include/arm_neon.h:49:24: error: 'neon_vector_type' attribute is not supported for this target
blake3_neon.c: typedef __attribute__((neon_vector_type(16))) int8_t int8x16_t;
blake3_neon.c: ^
blake3_neon.c: /usr/lib/llvm-3.8/bin/../lib/clang/3.8.1/include/arm_neon.h:50:24: error: 'neon_vector_type' attribute is not supported for this target
blake3_neon.c: typedef __attribute__((neon_vector_type(4))) int16_t int16x4_t;
blake3_neon.c: ^
blake3_neon.c: /usr/lib/llvm-3.8/bin/../lib/clang/3.8.1/include/arm_neon.h:51:24: error: 'neon_vector_type' attribute is not supported for this target
blake3_neon.c: typedef __attribute__((neon_vector_type(8))) int16_t int16x8_t;
blake3_neon.c: ^
blake3_neon.c: /usr/lib/llvm-3.8/bin/../lib/clang/3.8.1/include/arm_neon.h:52:24: error: 'neon_vector_type' attribute is not supported for this target
blake3_neon.c: typedef __attribute__((neon_vector_type(2))) int32_t int32x2_t;
blake3_neon.c: ^
blake3_neon.c: /usr/lib/llvm-3.8/bin/../lib/clang/3.8.1/include/arm_neon.h:53:24: error: 'neon_vector_type' attribute is not supported for this target
blake3_neon.c: typedef __attribute__((neon_vector_type(4))) int32_t int32x4_t;
blake3_neon.c: ^
blake3_neon.c: /usr/lib/llvm-3.8/bin/../lib/clang/3.8.1/include/arm_neon.h:54:24: error: 'neon_vector_type' attribute is not supported for this target
blake3_neon.c: typedef __attribute__((neon_vector_type(1))) int64_t int64x1_t;
blake3_neon.c: ^
blake3_neon.c: ...

Number of similar (compiler,implementation) pairs: 5, namely:
Compiler  Implementations
clang -march=native -O2 -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE T:neon
clang -march=native -O3 -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE T:neon
clang -march=native -O -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE T:neon
clang -march=native -Os -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE T:neon
clang -mcpu=native -O3 -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE T:neon

Compiler output

Implementation: T:neon
Security model: timingleaks
Compiler: gcc -march=native -mtune=native -O2 -fomit-frame-pointer -fwrapv -fPIC -fPIE
blake3.c: In file included from blake3.c:12:0:
blake3.c: blake3_static_dispatch.h:17:0: warning: "MAX_SIMD_DEGREE" redefined
blake3.c: #define MAX_SIMD_DEGREE 4
blake3.c:
blake3.c: In file included from blake3.c:6:0:
blake3.c: blake3_impl.h:49:0: note: this is the location of the previous definition
blake3.c: #define MAX_SIMD_DEGREE 16
blake3.c:
blake3.c: In file included from blake3.c:12:0:
blake3.c: blake3_static_dispatch.h:18:0: warning: "MAX_SIMD_DEGREE_OR_2" redefined
blake3.c: #define MAX_SIMD_DEGREE_OR_2 4
blake3.c:
blake3.c: In file included from blake3.c:6:0:
blake3.c: blake3_impl.h:58:0: note: this is the location of the previous definition
blake3.c: #define MAX_SIMD_DEGREE_OR_2 (MAX_SIMD_DEGREE > 2 ? MAX_SIMD_DEGREE : 2)
blake3.c:
blake3_neon.c: blake3_neon.c:3:22: fatal error: arm_neon.h: No such file or directory
blake3_neon.c: #include <arm_neon.h>
blake3_neon.c: ^
blake3_neon.c: compilation terminated.

Number of similar (compiler,implementation) pairs: 4, namely:
Compiler  Implementations
gcc -march=native -mtune=native -O2 -fomit-frame-pointer -fwrapv -fPIC -fPIE T:neon
gcc -march=native -mtune=native -O3 -fomit-frame-pointer -fwrapv -fPIC -fPIE T:neon
gcc -march=native -mtune=native -O -fomit-frame-pointer -fwrapv -fPIC -fPIE T:neon
gcc -march=native -mtune=native -Os -fomit-frame-pointer -fwrapv -fPIC -fPIE T:neon
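
arm_neon.h is an ARM header, so T:neon cannot be expected to compile on this amd64 machine with either compiler. A hedged sketch of how a build script might skip implementations that do not match the host architecture follows. The `REQUIRED_ARCH` table and the function name `buildable` are hypothetical, and the machine strings are common values returned by `platform.machine()`.

```python
import platform
from typing import Dict, Optional, Set

# Hypothetical mapping from implementation name to acceptable machine strings.
# None means the implementation is expected to build on any architecture.
REQUIRED_ARCH: Dict[str, Optional[Set[str]]] = {
    "T:portable": None,
    "T:sse41": {"x86_64", "amd64"},
    "T:avx2": {"x86_64", "amd64"},
    "T:avx512": {"x86_64", "amd64"},
    "T:neon": {"aarch64", "arm64", "armv7l"},
}


def buildable(implementation: str, machine: str) -> bool:
    """True if the implementation has no architecture requirement or the machine matches.

    Implementations missing from REQUIRED_ARCH are treated as unrestricted.
    """
    allowed = REQUIRED_ARCH.get(implementation)
    return allowed is None or machine.lower() in allowed


if __name__ == "__main__":
    machine = platform.machine()
    for impl in REQUIRED_ARCH:
        status = "build" if buildable(impl, machine) else "skip"
        print(f"{impl:12s} {status} on {machine}")
```

This only filters by architecture family. It does not detect missing instruction-set extensions within an architecture, such as AVX-512 on an amd64 CPU.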

Namespace violations

Implementation: T:avx2
Security model: timingleaks
Compiler: clang -march=native -O2 -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE
blake3.o blake3_compress_subtree_wide T
blake3.o blake3_default_hash T
blake3_avx2_x86-64_unix.o _blake3_hash_many_avx2 T
blake3_avx2_x86-64_unix.o blake3_hash_many_avx2 T
blake3_sse41_x86-64_unix.o _blake3_compress_in_place_sse41 T
blake3_sse41_x86-64_unix.o _blake3_compress_xof_sse41 T
blake3_sse41_x86-64_unix.o _blake3_hash_many_sse41 T
blake3_sse41_x86-64_unix.o blake3_compress_in_place_sse41 T
blake3_sse41_x86-64_unix.o blake3_compress_xof_sse41 T
blake3_sse41_x86-64_unix.o blake3_hash_many_sse41 T

Number of similar (compiler,implementation) pairs: 9, namely:
Compiler  Implementations
clang -march=native -O2 -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE T:avx2
clang -march=native -O3 -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE T:avx2
clang -march=native -O -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE T:avx2
clang -march=native -Os -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE T:avx2
clang -mcpu=native -O3 -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE T:avx2
gcc -march=native -mtune=native -O2 -fomit-frame-pointer -fwrapv -fPIC -fPIE T:avx2
gcc -march=native -mtune=native -O3 -fomit-frame-pointer -fwrapv -fPIC -fPIE T:avx2
gcc -march=native -mtune=native -O -fomit-frame-pointer -fwrapv -fPIC -fPIE T:avx2
gcc -march=native -mtune=native -Os -fomit-frame-pointer -fwrapv -fPIC -fPIE T:avx2
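
Each entry in the symbol listing above appears to give an object file, a symbol name, and an nm-style symbol type, where T marks a global symbol in the text section. A minimal sketch of flagging such symbols follows. It assumes the expected namespace prefix is crypto_hash_blake3. That prefix, the set of symbol types treated as global, and the function names are assumptions for illustration, and SUPERCOP's actual check may differ.

```python
from typing import List, NamedTuple


class Symbol(NamedTuple):
    object_file: str
    name: str
    kind: str


EXPECTED_PREFIX = "crypto_hash_blake3"  # assumed namespace prefix
GLOBAL_DEFINED = set("TDBRC")  # nm-style types treated as global definitions here


def parse_listing(text: str) -> List[Symbol]:
    """Parse 'object symbol type' lines, ignoring lines with any other shape."""
    symbols = []
    for line in text.splitlines():
        parts = line.split()
        if len(parts) == 3:
            symbols.append(Symbol(*parts))
    return symbols


def violations(symbols: List[Symbol], prefix: str = EXPECTED_PREFIX) -> List[Symbol]:
    """Return global definitions whose name, minus leading underscores, lacks the prefix."""
    return [
        s for s in symbols
        if s.kind in GLOBAL_DEFINED and not s.name.lstrip("_").startswith(prefix)
    ]


if __name__ == "__main__":
    listing = (
        "blake3.o blake3_compress_subtree_wide T\n"
        "blake3_avx2_x86-64_unix.o _blake3_hash_many_avx2 T\n"
    )
    for s in violations(parse_listing(listing)):
        print(s.object_file, s.name, s.kind)
```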

Namespace violations

Implementation: T:portable
Security model: timingleaks
Compiler: clang -march=native -O2 -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE
blake3.o blake3_compress_subtree_wide T
blake3.o blake3_default_hash T
blake3_portable.o blake3_compress_in_place_portable T
blake3_portable.o blake3_compress_xof_portable T
blake3_portable.o blake3_hash_many_portable T

Number of similar (compiler,implementation) pairs: 9, namely:
Compiler  Implementations
clang -march=native -O2 -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE T:portable
clang -march=native -O3 -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE T:portable
clang -march=native -O -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE T:portable
clang -march=native -Os -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE T:portable
clang -mcpu=native -O3 -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE T:portable
gcc -march=native -mtune=native -O2 -fomit-frame-pointer -fwrapv -fPIC -fPIE T:portable
gcc -march=native -mtune=native -O3 -fomit-frame-pointer -fwrapv -fPIC -fPIE T:portable
gcc -march=native -mtune=native -O -fomit-frame-pointer -fwrapv -fPIC -fPIE T:portable
gcc -march=native -mtune=native -Os -fomit-frame-pointer -fwrapv -fPIC -fPIE T:portable

Namespace violations

Implementation: T:sse41
Security model: timingleaks
Compiler: clang -march=native -O2 -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE
blake3.o blake3_compress_subtree_wide T
blake3.o blake3_default_hash T
blake3_sse41_x86-64_unix.o _blake3_compress_in_place_sse41 T
blake3_sse41_x86-64_unix.o _blake3_compress_xof_sse41 T
blake3_sse41_x86-64_unix.o _blake3_hash_many_sse41 T
blake3_sse41_x86-64_unix.o blake3_compress_in_place_sse41 T
blake3_sse41_x86-64_unix.o blake3_compress_xof_sse41 T
blake3_sse41_x86-64_unix.o blake3_hash_many_sse41 T

Number of similar (compiler,implementation) pairs: 9, namely:
Compiler  Implementations
clang -march=native -O2 -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE T:sse41
clang -march=native -O3 -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE T:sse41
clang -march=native -O -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE T:sse41
clang -march=native -Os -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE T:sse41
clang -mcpu=native -O3 -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE T:sse41
gcc -march=native -mtune=native -O2 -fomit-frame-pointer -fwrapv -fPIC -fPIE T:sse41
gcc -march=native -mtune=native -O3 -fomit-frame-pointer -fwrapv -fPIC -fPIE T:sse41
gcc -march=native -mtune=native -O -fomit-frame-pointer -fwrapv -fPIC -fPIE T:sse41
gcc -march=native -mtune=native -Os -fomit-frame-pointer -fwrapv -fPIC -fPIE T:sse41