Implementation notes: amd64, dali, crypto_hash/simd512
Computer: dali
Microarchitecture: amd64; Zen (820f01)
Architecture: amd64
CPU ID: AuthenticAMD-00820f01-178bfbff
SUPERCOP version: 20240625
Operation: crypto_hash
Primitive: simd512
Time | Object size | Test size | Implementation | Compiler | Benchmark date | SUPERCOP version |
12622 | 23305 416 0 | 35691 1228 952 | T:vect128 | gcc_-march=native_-mtune=native_-O3_-fwrapv_-fPIC_-fPIE_-gdwarf-4_-Wall | 20240628 | 20240625 |
13288 | 16591 416 0 | 27555 1228 952 | T:vect128 | gcc_-march=native_-mtune=native_-O_-fwrapv_-fPIC_-fPIE_-gdwarf-4_-Wall | 20240628 | 20240625 |
13322 | 16699 416 0 | 28115 1228 952 | T:vect128 | gcc_-march=native_-mtune=native_-O2_-fwrapv_-fPIC_-fPIE_-gdwarf-4_-Wall | 20240628 | 20240625 |
13683 | 16190 416 0 | 26134 1204 920 | T:vect128 | gcc_-march=native_-mtune=native_-Os_-fwrapv_-fPIC_-fPIE_-gdwarf-4_-Wall | 20240628 | 20240625 |
43654 | 46791 416 0 | 59195 1228 952 | T:opt | gcc_-march=native_-mtune=native_-O3_-fwrapv_-fPIC_-fPIE_-gdwarf-4_-Wall | 20240628 | 20240625 |
53695 | 52784 0 0 | 65008 780 952 | T:sphlib | gcc_-march=native_-mtune=native_-O3_-fwrapv_-fPIC_-fPIE_-gdwarf-4_-Wall | 20240628 | 20240625 |
64586 | 48226 0 0 | 60872 812 920 | T:sphlib | clang_-march=native_-O3_-fwrapv_-Qunused-arguments_-fPIC_-fPIE_-gdwarf-4_-Wall | 20240628 | 20240625 |
65027 | 44830 0 0 | 57336 812 920 | T:sphlib | clang_-march=native_-O2_-fwrapv_-Qunused-arguments_-fPIC_-fPIE_-gdwarf-4_-Wall | 20240628 | 20240625 |
67341 | 30288 0 0 | 42536 780 952 | T:sphlib-small | gcc_-march=native_-mtune=native_-O3_-fwrapv_-fPIC_-fPIE_-gdwarf-4_-Wall | 20240628 | 20240625 |
68611 | 55379 0 0 | 68328 812 888 | T:sphlib | clang_-mcpu=native_-O3_-fwrapv_-Qunused-arguments_-fPIC_-fPIE_-gdwarf-4_-Wall | 20240628 | 20240625 |
69740 | 44033 0 0 | 54382 804 888 | T:sphlib | clang_-march=native_-Os_-fwrapv_-Qunused-arguments_-fPIC_-fPIE_-gdwarf-4_-Wall | 20240628 | 20240625 |
69747 | 32736 388 0 | 44239 1232 920 | T:opt | clang_-march=native_-O3_-fwrapv_-Qunused-arguments_-fPIC_-fPIE_-gdwarf-4_-Wall | 20240628 | 20240625 |
69947 | 50035 0 0 | 61288 780 952 | T:sphlib | gcc_-march=native_-mtune=native_-O2_-fwrapv_-fPIC_-fPIE_-gdwarf-4_-Wall | 20240628 | 20240625 |
75168 | 31028 0 0 | 43768 812 920 | T:sphlib-small | clang_-march=native_-O3_-fwrapv_-Qunused-arguments_-fPIC_-fPIE_-gdwarf-4_-Wall | 20240628 | 20240625 |
75393 | 27102 388 0 | 39351 1232 920 | T:opt | clang_-march=native_-O2_-fwrapv_-Qunused-arguments_-fPIC_-fPIE_-gdwarf-4_-Wall | 20240628 | 20240625 |
75819 | 27700 0 0 | 40296 812 920 | T:sphlib-small | clang_-march=native_-O2_-fwrapv_-Qunused-arguments_-fPIC_-fPIE_-gdwarf-4_-Wall | 20240628 | 20240625 |
78777 | 23766 0 0 | 34206 804 888 | T:sphlib-small | clang_-march=native_-Os_-fwrapv_-Qunused-arguments_-fPIC_-fPIE_-gdwarf-4_-Wall | 20240628 | 20240625 |
79212 | 46584 0 0 | 56371 756 920 | T:sphlib | gcc_-march=native_-mtune=native_-Os_-fwrapv_-fPIC_-fPIE_-gdwarf-4_-Wall | 20240628 | 20240625 |
81315 | 31987 388 0 | 44415 1232 888 | T:opt | clang_-mcpu=native_-O3_-fwrapv_-Qunused-arguments_-fPIC_-fPIE_-gdwarf-4_-Wall | 20240628 | 20240625 |
83208 | 34839 0 0 | 47832 812 888 | T:sphlib-small | clang_-mcpu=native_-O3_-fwrapv_-Qunused-arguments_-fPIC_-fPIE_-gdwarf-4_-Wall | 20240628 | 20240625 |
84755 | 49564 0 0 | 60392 780 952 | T:sphlib | gcc_-march=native_-mtune=native_-O_-fwrapv_-fPIC_-fPIE_-gdwarf-4_-Wall | 20240628 | 20240625 |
87009 | 28091 0 0 | 39368 780 952 | T:sphlib-small | gcc_-march=native_-mtune=native_-O2_-fwrapv_-fPIC_-fPIE_-gdwarf-4_-Wall | 20240628 | 20240625 |
88767 | 27074 0 0 | 37838 804 888 | T:sphlib-small | clang_-march=native_-O_-fwrapv_-Qunused-arguments_-fPIC_-fPIE_-gdwarf-4_-Wall | 20240628 | 20240625 |
91171 | 27755 0 0 | 38600 780 952 | T:sphlib-small | gcc_-march=native_-mtune=native_-O_-fwrapv_-fPIC_-fPIE_-gdwarf-4_-Wall | 20240628 | 20240625 |
91995 | 17643 388 0 | 28037 1224 888 | T:opt | clang_-march=native_-Os_-fwrapv_-Qunused-arguments_-fPIC_-fPIE_-gdwarf-4_-Wall | 20240628 | 20240625 |
104113 | 25566 0 0 | 35379 756 920 | T:sphlib-small | gcc_-march=native_-mtune=native_-Os_-fwrapv_-fPIC_-fPIE_-gdwarf-4_-Wall | 20240628 | 20240625 |
108570 | 50870 0 0 | 61590 804 888 | T:sphlib | clang_-march=native_-O_-fwrapv_-Qunused-arguments_-fPIC_-fPIE_-gdwarf-4_-Wall | 20240628 | 20240625 |
110795 | 14576 416 0 | 26003 1228 952 | T:opt | gcc_-march=native_-mtune=native_-O2_-fwrapv_-fPIC_-fPIE_-gdwarf-4_-Wall | 20240628 | 20240625 |
118435 | 12853 416 0 | 22798 1204 920 | T:opt | gcc_-march=native_-mtune=native_-Os_-fwrapv_-fPIC_-fPIE_-gdwarf-4_-Wall | 20240628 | 20240625 |
122406 | 15555 388 0 | 26421 1224 888 | T:opt | clang_-march=native_-O_-fwrapv_-Qunused-arguments_-fPIC_-fPIE_-gdwarf-4_-Wall | 20240628 | 20240625 |
124912 | 14806 416 0 | 25795 1228 952 | T:opt | gcc_-march=native_-mtune=native_-O_-fwrapv_-fPIC_-fPIE_-gdwarf-4_-Wall | 20240628 | 20240625 |
6598023 | 12864 416 0 | 25275 1228 952 | T:ref | gcc_-march=native_-mtune=native_-O3_-fwrapv_-fPIC_-fPIE_-gdwarf-4_-Wall | 20240628 | 20240625 |
6602380 | 5176 416 0 | 16099 1228 952 | T:ref | gcc_-march=native_-mtune=native_-O_-fwrapv_-fPIC_-fPIE_-gdwarf-4_-Wall | 20240628 | 20240625 |
6632018 | 5515 416 0 | 16883 1228 952 | T:ref | gcc_-march=native_-mtune=native_-O2_-fwrapv_-fPIC_-fPIE_-gdwarf-4_-Wall | 20240628 | 20240625 |
6639477 | 16081 388 0 | 29199 1232 888 | T:ref | clang_-mcpu=native_-O3_-fwrapv_-Qunused-arguments_-fPIC_-fPIE_-gdwarf-4_-Wall | 20240628 | 20240625 |
6740606 | 13968 388 0 | 26991 1232 920 | T:ref | clang_-march=native_-O3_-fwrapv_-Qunused-arguments_-fPIC_-fPIE_-gdwarf-4_-Wall | 20240628 | 20240625 |
6783806 | 12096 388 0 | 24959 1232 920 | T:ref | clang_-march=native_-O2_-fwrapv_-Qunused-arguments_-fPIC_-fPIE_-gdwarf-4_-Wall | 20240628 | 20240625 |
6796594 | 5075 388 0 | 15893 1224 888 | T:ref | clang_-march=native_-O_-fwrapv_-Qunused-arguments_-fPIC_-fPIE_-gdwarf-4_-Wall | 20240628 | 20240625 |
6830471 | 5336 388 0 | 15789 1224 888 | T:ref | clang_-march=native_-Os_-fwrapv_-Qunused-arguments_-fPIC_-fPIE_-gdwarf-4_-Wall | 20240628 | 20240625 |
14239139 | 4534 416 0 | 14422 1204 920 | T:ref | gcc_-march=native_-mtune=native_-Os_-fwrapv_-fPIC_-fPIE_-gdwarf-4_-Wall | 20240628 | 20240625 |
Compiler output
optimized.c: optimized.c:437:9: warning: unused variable 'j' [-Wunused-variable]
optimized.c: int i,j;
optimized.c: ^
optimized.c: 1 warning generated.
Number of similar (implementation,compiler) pairs: 5, namely:
Implementation | Compiler |
T:opt | clang -march=native -O2 -fwrapv -Qunused-arguments -fPIC -fPIE -gdwarf-4 -Wall (Debian_Clang_11.0.1) |
T:opt | clang -march=native -O3 -fwrapv -Qunused-arguments -fPIC -fPIE -gdwarf-4 -Wall (Debian_Clang_11.0.1) |
T:opt | clang -march=native -O -fwrapv -Qunused-arguments -fPIC -fPIE -gdwarf-4 -Wall (Debian_Clang_11.0.1) |
T:opt | clang -march=native -Os -fwrapv -Qunused-arguments -fPIC -fPIE -gdwarf-4 -Wall (Debian_Clang_11.0.1) |
T:opt | clang -mcpu=native -O3 -fwrapv -Qunused-arguments -fPIC -fPIE -gdwarf-4 -Wall (Debian_Clang_11.0.1) |
Compiler output
optimized.c: optimized.c: In function 'SIMD_Compress':
optimized.c: optimized.c:437:9: warning: unused variable 'j' [-Wunused-variable]
optimized.c: 437 | int i,j;
optimized.c: | ^
Number of similar (implementation,compiler) pairs: 4, namely:
Implementation | Compiler |
T:opt | gcc -march=native -mtune=native -O2 -fwrapv -fPIC -fPIE -gdwarf-4 -Wall (10.2.1_20210110) |
T:opt | gcc -march=native -mtune=native -O3 -fwrapv -fPIC -fPIE -gdwarf-4 -Wall (10.2.1_20210110) |
T:opt | gcc -march=native -mtune=native -O -fwrapv -fPIC -fPIE -gdwarf-4 -Wall (10.2.1_20210110) |
T:opt | gcc -march=native -mtune=native -Os -fwrapv -fPIC -fPIE -gdwarf-4 -Wall (10.2.1_20210110) |
Compiler output
reference.c: reference.c:69:82: warning: expression result unused [-Wunused-value]
reference.c: state->A[j] = state->D[j] + w[j] + F(state->A[j], state->B[j], state->C[j]), s;
reference.c: ^
reference.c: 1 warning generated.
Number of similar (implementation,compiler) pairs: 5, namely:
Implementation | Compiler |
T:ref | clang -march=native -O2 -fwrapv -Qunused-arguments -fPIC -fPIE -gdwarf-4 -Wall (Debian_Clang_11.0.1) |
T:ref | clang -march=native -O3 -fwrapv -Qunused-arguments -fPIC -fPIE -gdwarf-4 -Wall (Debian_Clang_11.0.1) |
T:ref | clang -march=native -O -fwrapv -Qunused-arguments -fPIC -fPIE -gdwarf-4 -Wall (Debian_Clang_11.0.1) |
T:ref | clang -march=native -Os -fwrapv -Qunused-arguments -fPIC -fPIE -gdwarf-4 -Wall (Debian_Clang_11.0.1) |
T:ref | clang -mcpu=native -O3 -fwrapv -Qunused-arguments -fPIC -fPIE -gdwarf-4 -Wall (Debian_Clang_11.0.1) |
Compiler output
reference.c: reference.c: In function 'Step':
reference.c: reference.c:69:80: warning: right-hand operand of comma expression has no effect [-Wunused-value]
reference.c: 69 | state->A[j] = state->D[j] + w[j] + F(state->A[j], state->B[j], state->C[j]), s;
reference.c: | ^
Number of similar (implementation,compiler) pairs: 4, namely:
Implementation | Compiler |
T:ref | gcc -march=native -mtune=native -O2 -fwrapv -fPIC -fPIE -gdwarf-4 -Wall (10.2.1_20210110) |
T:ref | gcc -march=native -mtune=native -O3 -fwrapv -fPIC -fPIE -gdwarf-4 -Wall (10.2.1_20210110) |
T:ref | gcc -march=native -mtune=native -O -fwrapv -fPIC -fPIE -gdwarf-4 -Wall (10.2.1_20210110) |
T:ref | gcc -march=native -mtune=native -Os -fwrapv -fPIC -fPIE -gdwarf-4 -Wall (10.2.1_20210110) |
Compiler output
vector.c: vector.c:73:9: warning: 'X' macro redefined [-Wmacro-redefined]
vector.c: #define X(i) X##i
vector.c: ^
vector.c: vector.c:68:9: note: previous definition is here
vector.c: #define X(i) A[i]
vector.c: ^
vector.c: vector.c:129:3: error: use of unknown builtin '__builtin_ia32_pcmpgtw128' [-Wimplicit-function-declaration]
vector.c: DO_REDUCE_FULL_S(0);
vector.c: ^
vector.c: vector.c:56:12: note: expanded from macro 'DO_REDUCE_FULL_S'
vector.c: X(i) = EXTRA_REDUCE_S(X(i)); \
vector.c: ^
vector.c: vector.c:42:32: note: expanded from macro 'EXTRA_REDUCE_S'
vector.c: v16_sub(x, v16_and(V257.v16, v16_cmp(x, V128.v16)))
vector.c: ^
vector.c: ./vector.h:92:22: note: expanded from macro 'v16_cmp'
vector.c: #define v16_cmp __builtin_ia32_pcmpgtw128
vector.c: ^
vector.c: vector.c:129:3: error: cannot convert between scalar type 'int' and vector type 'v16' (aka 'v8hi') as implicit conversion would cause truncation
vector.c: vector.c:56:12: note: expanded from macro 'DO_REDUCE_FULL_S'
vector.c: X(i) = EXTRA_REDUCE_S(X(i)); \
vector.c: ^
vector.c: vector.c:42:14: note: expanded from macro 'EXTRA_REDUCE_S'
vector.c: v16_sub(x, v16_and(V257.v16, v16_cmp(x, V128.v16)))
vector.c: ^
vector.c: ...
Number of similar (implementation,compiler) pairs: 5, namely:
Implementation | Compiler |
T:vect128 | clang -march=native -O2 -fwrapv -Qunused-arguments -fPIC -fPIE -gdwarf-4 -Wall (Debian_Clang_11.0.1) |
T:vect128 | clang -march=native -O3 -fwrapv -Qunused-arguments -fPIC -fPIE -gdwarf-4 -Wall (Debian_Clang_11.0.1) |
T:vect128 | clang -march=native -O -fwrapv -Qunused-arguments -fPIC -fPIE -gdwarf-4 -Wall (Debian_Clang_11.0.1) |
T:vect128 | clang -march=native -Os -fwrapv -Qunused-arguments -fPIC -fPIE -gdwarf-4 -Wall (Debian_Clang_11.0.1) |
T:vect128 | clang -mcpu=native -O3 -fwrapv -Qunused-arguments -fPIC -fPIE -gdwarf-4 -Wall (Debian_Clang_11.0.1) |
Compiler output
vector.c: vector.c: In function 'fft64':
vector.c: vector.c:73: warning: "X" redefined
vector.c: 73 | #define X(i) X##i
vector.c: |
vector.c: vector.c:68: note: this is the location of the previous definition
vector.c: 68 | #define X(i) A[i]
vector.c: |
vector.c: vector.c: In function 'fft128_msg_final':
vector.c: vector.c:326:7: warning: unused variable 'i' [-Wunused-variable]
vector.c: 326 | int i;
vector.c: | ^
vector.c: vector.c: In function 'rounds512':
vector.c: vector.c:796: warning: "STEP_1" redefined
vector.c: 796 | #define STEP_1(a,b,c,d,w,fun,r,s,z) \
vector.c: |
vector.c: vector.c:542: note: this is the location of the previous definition
vector.c: 542 | #define STEP_1(a,b,c,d,w,fun,r,s,z) \
vector.c: |
vector.c: vector.c:805: warning: "STEP_2" redefined
vector.c: 805 | #define STEP_2(a,b,c,d,w,fun,r,s) \
vector.c: |
vector.c: vector.c:566: note: this is the location of the previous definition
vector.c: 566 | #define STEP_2(a,b,c,d,w,fun,r,s) \
vector.c: |
vector.c: vector.c:808: warning: "STEP" redefined
vector.c: ...
Number of similar (implementation,compiler) pairs: 4, namely:
Implementation | Compiler |
T:vect128 | gcc -march=native -mtune=native -O2 -fwrapv -fPIC -fPIE -gdwarf-4 -Wall (10.2.1_20210110) |
T:vect128 | gcc -march=native -mtune=native -O3 -fwrapv -fPIC -fPIE -gdwarf-4 -Wall (10.2.1_20210110) |
T:vect128 | gcc -march=native -mtune=native -O -fwrapv -fPIC -fPIE -gdwarf-4 -Wall (10.2.1_20210110) |
T:vect128 | gcc -march=native -mtune=native -Os -fwrapv -fPIC -fPIE -gdwarf-4 -Wall (10.2.1_20210110) |