Implementation notes: amd64, hertz, crypto_hash/simd256

Computer: hertz
Microarchitecture: amd64; Zen 4 (a60f12)
Architecture: amd64
CPU ID: AuthenticAMD-00a60f12-178bfbff
SUPERCOP version: 20240107
Operation: crypto_hash
Primitive: simd256
TimeObject sizeTest sizeImplementationCompilerBenchmark dateSUPERCOP version
3003865112 0 079146 844 968T:sphlibclang-17_-march=native_-O3_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2023121920231217
3078755202 0 069130 844 968T:sphlibclang-17_-march=native_-O2_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2023121920231217
3198862137 0 075989 804 1032T:sphlibgcc_-march=native_-mtune=native_-O3_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE2023121920231217
3320756339 0 068141 804 968T:sphlibgcc_-march=native_-mtune=native_-O2_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE2023121920231217
3674545935 0 057604 836 968T:sphlibclang-17_-march=native_-Os_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2023121920231217
3725728779 0 040613 804 968T:sphlib-smallgcc_-march=native_-mtune=native_-O2_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE2023121920231217
3752932249 0 046133 804 1032T:sphlib-smallgcc_-march=native_-mtune=native_-O3_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE2023121920231217
4008335750 0 049675 844 968T:sphlib-smallclang-17_-march=native_-O2_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2023121920231217
4241948034 0 062058 844 968T:sphlib-smallclang-17_-march=native_-O3_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2023121920231217
4412971995 416 086006 1252 1032T:optgcc_-march=native_-mtune=native_-O3_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE2023121920231217
4478448525 0 058880 780 936T:sphlibgcc_-march=native_-mtune=native_-Os_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE2023121920231217
4518026590 0 038300 836 968T:sphlib-smallclang-17_-march=native_-Os_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2023121920231217
5640427074 0 037456 780 936T:sphlib-smallgcc_-march=native_-mtune=native_-Os_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE2023121920231217
7093230333 388 044454 1264 968T:optclang-17_-march=native_-O2_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2023121920231217
7097434943 388 048310 1264 968T:optclang-17_-march=native_-O3_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2023121920231217
7476019513 416 031550 1252 968T:optgcc_-march=native_-mtune=native_-O2_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE2023121920231217
7808616511 388 028375 1256 968T:optclang-17_-march=native_-Os_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2023121920231217
8105113443 416 023985 1228 936T:optgcc_-march=native_-mtune=native_-Os_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE2023121920231217
249032736475 388 051245 1264 968T:refclang-17_-march=native_-O3_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2023121920231217
249192228379 388 043021 1264 968T:refclang-17_-march=native_-O2_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2023121920231217
25155675324 388 017167 1256 968T:refclang-17_-march=native_-Os_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2023121920231217
312025113636 416 027646 1252 1032T:refgcc_-march=native_-mtune=native_-O3_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE2023121920231217
33326236040 416 018014 1252 968T:refgcc_-march=native_-mtune=native_-O2_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE2023121920231217
37558904914 416 015417 1228 936T:refgcc_-march=native_-mtune=native_-Os_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE2023121920231217

Compiler output

Implementation: T:ref
Security model: timingleaks
Compiler: clang-17 -march=native -O2 -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE
reference.c: reference.c:69:82: warning: expression result unused [-Wunused-value]
reference.c: 69 | state->A[j] = state->D[j] + w[j] + F(state->A[j], state->B[j], state->C[j]), s;
reference.c: | ^
reference.c: 1 warning generated.

Number of similar (compiler,implementation) pairs: 3, namely:
CompilerImplementations
clang-17 -march=native -O2 -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE T:ref
clang-17 -march=native -O3 -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE T:ref
clang-17 -march=native -Os -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE T:ref

Compiler output

Implementation: T:vect128
Security model: timingleaks
Compiler: clang-17 -march=native -O2 -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE
vector.c: vector.c:73:9: warning: 'X' macro redefined [-Wmacro-redefined]
vector.c: 73 | #define X(i) X##i
vector.c: | ^
vector.c: vector.c:68:9: note: previous definition is here
vector.c: 68 | #define X(i) A[i]
vector.c: | ^
vector.c: vector.c:129:3: error: use of unknown builtin '__builtin_ia32_pcmpgtw128' [-Wimplicit-function-declaration]
vector.c: 129 | DO_REDUCE_FULL_S(0);
vector.c: | ^
vector.c: vector.c:56:12: note: expanded from macro 'DO_REDUCE_FULL_S'
vector.c: 56 | X(i) = EXTRA_REDUCE_S(X(i)); \
vector.c: | ^
vector.c: vector.c:42:32: note: expanded from macro 'EXTRA_REDUCE_S'
vector.c: 42 | v16_sub(x, v16_and(V257.v16, v16_cmp(x, V128.v16)))
vector.c: | ^
vector.c: ./vector.h:92:22: note: expanded from macro 'v16_cmp'
vector.c: 92 | #define v16_cmp __builtin_ia32_pcmpgtw128
vector.c: | ^
vector.c: vector.c:129:3: error: cannot convert between scalar type 'int' and vector type 'v16' (aka 'v8hi') as implicit conversion would cause truncation
vector.c: vector.c:56:12: note: expanded from macro 'DO_REDUCE_FULL_S'
vector.c: 56 | X(i) = EXTRA_REDUCE_S(X(i)); \
vector.c: | ^
vector.c: vector.c:42:14: note: expanded from macro 'EXTRA_REDUCE_S'
vector.c: 42 | v16_sub(x, v16_and(V257.v16, v16_cmp(x, V128.v16)))
vector.c: | ^
vector.c: ...

Number of similar (compiler,implementation) pairs: 3, namely:
CompilerImplementations
clang-17 -march=native -O2 -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE T:vect128
clang-17 -march=native -O3 -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE T:vect128
clang-17 -march=native -Os -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE T:vect128

Compiler output

Implementation: T:vect128
Security model: timingleaks
Compiler: gcc -march=native -mtune=native -O2 -fomit-frame-pointer -fwrapv -fPIC -fPIE
vector.c: vector.c: In function 'fft64':
vector.c: vector.c:73: warning: "X" redefined
vector.c: 73 | #define X(i) X##i
vector.c: |
vector.c: vector.c:68: note: this is the location of the previous definition
vector.c: 68 | #define X(i) A[i]
vector.c: |
vector.c: vector.c: In function 'rounds512':
vector.c: vector.c:796: warning: "STEP_1" redefined
vector.c: 796 | #define STEP_1(a,b,c,d,w,fun,r,s,z) \
vector.c: |
vector.c: vector.c:542: note: this is the location of the previous definition
vector.c: 542 | #define STEP_1(a,b,c,d,w,fun,r,s,z) \
vector.c: |
vector.c: vector.c:805: warning: "STEP_2" redefined
vector.c: 805 | #define STEP_2(a,b,c,d,w,fun,r,s) \
vector.c: |
vector.c: vector.c:566: note: this is the location of the previous definition
vector.c: 566 | #define STEP_2(a,b,c,d,w,fun,r,s) \
vector.c: |
vector.c: vector.c:808: warning: "STEP" redefined
vector.c: 808 | #define STEP(a,b,c,d,w1,w2,fun,r,s,z) \
vector.c: |
vector.c: vector.c:571: note: this is the location of the previous definition
vector.c: 571 | #define STEP(a,b,c,d,w,fun,r,s,z) \
vector.c: ...

Number of similar (compiler,implementation) pairs: 3, namely:
CompilerImplementations
gcc -march=native -mtune=native -O2 -fomit-frame-pointer -fwrapv -fPIC -fPIE T:vect128
gcc -march=native -mtune=native -O3 -fomit-frame-pointer -fwrapv -fPIC -fPIE T:vect128
gcc -march=native -mtune=native -Os -fomit-frame-pointer -fwrapv -fPIC -fPIE T:vect128