Implementation notes: aarch64, gcc185, crypto_hashblocks/sha256

Computer: gcc185
Microarchitecture: aarch64; Skylark (503f0002)
Architecture: aarch64
CPU ID: 503f0002
SUPERCOP version: 20240107
Operation: crypto_hashblocks
Primitive: sha256
Time   Object size  Test size       Implementation       Compiler                                                                             Benchmark date  SUPERCOP version
6600   676 256 0    10813 1048 736  dolbeau/armv8crypto  gcc_-march=native_-mtune=native_-Os_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE         20231213        20231212
6675   684 256 0    11661 1064 744  dolbeau/armv8crypto  gcc_-march=native_-mtune=native_-O_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE          20231213        20231212
7275   768 256 0    11869 1064 744  dolbeau/armv8crypto  gcc_-march=native_-mtune=native_-O2_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE         20231213        20231212
7275   768 256 0    13054 1072 760  dolbeau/armv8crypto  gcc_-march=native_-mtune=native_-O3_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE         20231213        20231212
27900  8128 0 0     20250 800 736   inplace              clang_-march=native_-O2_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE  20231213        20231212
27900  8128 0 0     18634 800 736   inplace              clang_-march=native_-O_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE   20231213        20231212
27900  8128 0 0     18380 792 736   inplace              clang_-march=native_-Os_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE  20231213        20231212
27900  8132 0 0     20250 800 736   ref                  clang_-march=native_-O2_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE  20231213        20231212
27900  8132 0 0     22018 800 744   ref                  clang_-march=native_-O3_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE  20231213        20231212
27900  8132 0 0     18380 792 736   ref                  clang_-march=native_-Os_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE  20231213        20231212
27900  8132 0 0     22018 800 744   ref                  clang_-mcpu=native_-O3_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE   20231213        20231212
27975  8132 0 0     18634 800 736   ref                  clang_-march=native_-O_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE   20231213        20231212
28050  8128 0 0     22018 800 744   inplace              clang_-march=native_-O3_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE  20231213        20231212
28050  8128 0 0     22018 800 744   inplace              clang_-mcpu=native_-O3_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE   20231213        20231212
28050  8368 0 0     19461 808 744   inplace              gcc_-march=native_-mtune=native_-O2_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE         20231213        20231212
28050  8364 0 0     20654 816 760   inplace              gcc_-march=native_-mtune=native_-O3_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE         20231213        20231212
28050  8304 0 0     19389 808 744   ref                  gcc_-march=native_-mtune=native_-O2_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE         20231213        20231212
28050  8304 0 0     20574 816 760   ref                  gcc_-march=native_-mtune=native_-O3_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE         20231213        20231212
28725  8236 0 0     18373 792 736   ref                  gcc_-march=native_-mtune=native_-Os_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE         20231213        20231212
28800  8236 0 0     18373 792 736   inplace              gcc_-march=native_-mtune=native_-Os_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE         20231213        20231212
30675  8500 0 0     19477 808 744   ref                  gcc_-march=native_-mtune=native_-O_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE          20231213        20231212
31950  8500 0 0     19485 808 744   inplace              gcc_-march=native_-mtune=native_-O_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE          20231213        20231212

Compiler output

Implementation: dolbeau/amd64-sha
Security model: constbranchindex
Compiler: clang -march=native -O2 -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE
blocks.c: In file included from blocks.c:37:
blocks.c: /usr/bin/../lib/clang/17/include/immintrin.h:14:2: error: "This header is only meant to be used on x86 and x64 architecture"
blocks.c: 14 | #error "This header is only meant to be used on x86 and x64 architecture"
blocks.c: | ^
blocks.c: In file included from blocks.c:37:
blocks.c: In file included from /usr/bin/../lib/clang/17/include/immintrin.h:17:
blocks.c: In file included from /usr/bin/../lib/clang/17/include/x86gprintrin.h:15:
blocks.c: /usr/bin/../lib/clang/17/include/hresetintrin.h:42:27: error: invalid input constraint 'a' in asm
blocks.c: 42 | __asm__ ("hreset $0" :: "a"(__eax));
blocks.c: | ^
blocks.c: In file included from blocks.c:37:
blocks.c: In file included from /usr/bin/../lib/clang/17/include/immintrin.h:21:
blocks.c: /usr/bin/../lib/clang/17/include/mmintrin.h:14:2: error: "This header is only meant to be used on x86 and x64 architecture"
blocks.c: 14 | #error "This header is only meant to be used on x86 and x64 architecture"
blocks.c: | ^
blocks.c: /usr/bin/../lib/clang/17/include/mmintrin.h:54:12: error: invalid conversion between vector type '__m64' (vector of 1 'long long' value) and integer type 'int' of different size
blocks.c: 54 | return (__m64)__builtin_ia32_vec_init_v2si(__i, 0);
blocks.c: | ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
blocks.c: /usr/bin/../lib/clang/17/include/mmintrin.h:133:12: error: invalid conversion between vector type '__m64' (vector of 1 'long long' value) and integer type 'int' of different size
blocks.c: 133 | return (__m64)__builtin_ia32_packsswb((__v4hi)__m1, (__v4hi)__m2);
blocks.c: | ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
blocks.c: /usr/bin/../lib/clang/17/include/mmintrin.h:163:12: error: invalid conversion between vector type '__m64' (vector of 1 'long long' value) and integer type 'int' of different size
blocks.c: 163 | return (__m64)__builtin_ia32_packssdw((__v2si)__m1, (__v2si)__m2);
blocks.c: | ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
blocks.c: /usr/bin/../lib/clang/17/include/mmintrin.h:193:12: error: invalid conversion between vector type '__m64' (vector of 1 'long long' value) and integer type 'int' of different size
blocks.c: ...

Number of similar (compiler,implementation) pairs: 5, namely:
Compiler    Implementations
clang -march=native -O2 -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE dolbeau/amd64-sha
clang -march=native -O3 -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE dolbeau/amd64-sha
clang -march=native -O -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE dolbeau/amd64-sha
clang -march=native -Os -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE dolbeau/amd64-sha
clang -mcpu=native -O3 -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE dolbeau/amd64-sha

Compiler output

Implementation: dolbeau/amd64-sha
Security model: constbranchindex
Compiler: gcc -march=native -mtune=native -O2 -fomit-frame-pointer -fwrapv -fPIC -fPIE
blocks.c: blocks.c:37:10: fatal error: immintrin.h: No such file or directory
blocks.c: #include <immintrin.h>
blocks.c: ^~~~~~~~~~~~~
blocks.c: compilation terminated.

Number of similar (compiler,implementation) pairs: 4, namely:
Compiler    Implementations
gcc -march=native -mtune=native -O2 -fomit-frame-pointer -fwrapv -fPIC -fPIE dolbeau/amd64-sha
gcc -march=native -mtune=native -O3 -fomit-frame-pointer -fwrapv -fPIC -fPIE dolbeau/amd64-sha
gcc -march=native -mtune=native -O -fomit-frame-pointer -fwrapv -fPIC -fPIE dolbeau/amd64-sha
gcc -march=native -mtune=native -Os -fomit-frame-pointer -fwrapv -fPIC -fPIE dolbeau/amd64-sha

Compiler output

Implementation: dolbeau/armv8crypto
Security model: constbranchindex
Compiler: clang -march=native -O2 -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE
blocks.c: blocks.c:134:3: error: always_inline function 'vsha256hq_u32' requires target feature 'sha2', but would be inlined into function 'crypto_hashblocks_sha256_dolbeau_armv8crypto_constbranchindex' that is compiled without support for 'sha2'
blocks.c: 134 | DO16ROUNDS(i0, i1, i2, i3, c0, c1, c2, c3);
blocks.c: | ^
blocks.c: blocks.c:108:8: note: expanded from macro 'DO16ROUNDS'
blocks.c: 108 | x0 = vsha256hq_u32(s0, s1, h0); \
blocks.c: | ^
blocks.c: blocks.c:134:3: error: always_inline function 'vsha256h2q_u32' requires target feature 'sha2', but would be inlined into function 'crypto_hashblocks_sha256_dolbeau_armv8crypto_constbranchindex' that is compiled without support for 'sha2'
blocks.c: blocks.c:109:8: note: expanded from macro 'DO16ROUNDS'
blocks.c: 109 | x1 = vsha256h2q_u32(s1, s0, h0); \
blocks.c: | ^
blocks.c: blocks.c:134:3: error: always_inline function 'vsha256hq_u32' requires target feature 'sha2', but would be inlined into function 'crypto_hashblocks_sha256_dolbeau_armv8crypto_constbranchindex' that is compiled without support for 'sha2'
blocks.c: blocks.c:111:8: note: expanded from macro 'DO16ROUNDS'
blocks.c: 111 | s0 = vsha256hq_u32(x0, x1, h1); \
blocks.c: | ^
blocks.c: blocks.c:134:3: error: always_inline function 'vsha256h2q_u32' requires target feature 'sha2', but would be inlined into function 'crypto_hashblocks_sha256_dolbeau_armv8crypto_constbranchindex' that is compiled without support for 'sha2'
blocks.c: blocks.c:112:8: note: expanded from macro 'DO16ROUNDS'
blocks.c: 112 | s1 = vsha256h2q_u32(x1, x0, h1); \
blocks.c: | ^
blocks.c: blocks.c:134:3: error: always_inline function 'vsha256hq_u32' requires target feature 'sha2', but would be inlined into function 'crypto_hashblocks_sha256_dolbeau_armv8crypto_constbranchindex' that is compiled without support for 'sha2'
blocks.c: blocks.c:114:8: note: expanded from macro 'DO16ROUNDS'
blocks.c: 114 | x0 = vsha256hq_u32(s0, s1, h0); \
blocks.c: | ^
blocks.c: blocks.c:134:3: error: always_inline function 'vsha256h2q_u32' requires target feature 'sha2', but would be inlined into function 'crypto_hashblocks_sha256_dolbeau_armv8crypto_constbranchindex' that is compiled without support for 'sha2'
blocks.c: blocks.c:115:8: note: expanded from macro 'DO16ROUNDS'
blocks.c: 115 | x1 = vsha256h2q_u32(s1, s0, h0); \
blocks.c: ...

Number of similar (compiler,implementation) pairs: 5, namely:
Compiler    Implementations
clang -march=native -O2 -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE dolbeau/armv8crypto
clang -march=native -O3 -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE dolbeau/armv8crypto
clang -march=native -O -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE dolbeau/armv8crypto
clang -march=native -Os -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE dolbeau/armv8crypto
clang -mcpu=native -O3 -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE dolbeau/armv8crypto