Implementation notes: amd64, comet, crypto_hash/romulush

Computer: comet
Microarchitecture: amd64; Comet Lake (806ec)
Architecture: amd64
CPU ID: GenuineIntel-000806ec-bfebfbff
SUPERCOP version: 20240625
Operation: crypto_hash
Primitive: romulush
   Time  Object size  Test size       Implementation  Compiler  Benchmark date  SUPERCOP version
  67231  8141 0 0     22361 852 928   T:x86           clang_-march=native_-O2_-fwrapv_-Qunused-arguments_-fPIC_-fPIE_-gdwarf-4_-Wall  20240629  20240625
  68466  8039 0 0     18561 852 896   T:x86           clang_-march=native_-O_-fwrapv_-Qunused-arguments_-fPIC_-fPIE_-gdwarf-4_-Wall  20240629  20240625
  69167  9288 0 0     23817 852 960   T:x86           clang_-march=native_-O3_-fwrapv_-Qunused-arguments_-fPIC_-fPIE_-gdwarf-4_-Wall  20240629  20240625
  69215  7585 0 0     18951 844 960   T:x86           clang_-march=native_-Os_-fwrapv_-Qunused-arguments_-fPIC_-fPIE_-gdwarf-4_-Wall  20240629  20240625
  71302  8285 0 0     17607 756 928   T:x86           gcc_-march=native_-mtune=native_-Os_-fwrapv_-fPIC_-fPIE_-gdwarf-4_-Wall  20240629  20240625
  71861  9559 0 0     22300 780 960   T:x86           gcc_-march=native_-mtune=native_-O3_-fwrapv_-fPIC_-fPIE_-gdwarf-4_-Wall  20240629  20240625
  73495  8463 0 0     19252 780 960   T:x86           gcc_-march=native_-mtune=native_-O2_-fwrapv_-fPIC_-fPIE_-gdwarf-4_-Wall  20240629  20240625
  90724  8513 0 0     18859 772 960   T:x86           gcc_-march=native_-mtune=native_-O_-fwrapv_-fPIC_-fPIE_-gdwarf-4_-Wall  20240629  20240625
 130317  9045 592 0   23617 1452 960  T:opt32t        clang_-march=native_-O3_-fwrapv_-Qunused-arguments_-fPIC_-fPIE_-gdwarf-4_-Wall  20240629  20240625
 130521  9029 592 0   23305 1452 928  T:opt32t        clang_-march=native_-O2_-fwrapv_-Qunused-arguments_-fPIC_-fPIE_-gdwarf-4_-Wall  20240629  20240625
 131394  8123 592 0   18689 1452 896  T:opt32t        clang_-march=native_-O_-fwrapv_-Qunused-arguments_-fPIC_-fPIE_-gdwarf-4_-Wall  20240629  20240625
 131567  8720 592 0   21681 1452 896  T:opt32t        clang_-mcpu=native_-O3_-fwrapv_-Qunused-arguments_-fPIC_-fPIE_-gdwarf-4_-Wall  20240629  20240625
 135821  8891 608 0   19684 1396 960  T:opt32t        gcc_-march=native_-mtune=native_-O2_-fwrapv_-fPIC_-fPIE_-gdwarf-4_-Wall  20240629  20240625
 137200  10024 608 0  22764 1396 960  T:opt32t        gcc_-march=native_-mtune=native_-O3_-fwrapv_-fPIC_-fPIE_-gdwarf-4_-Wall  20240629  20240625
 139634  8501 608 0   17791 1372 928  T:opt32t        gcc_-march=native_-mtune=native_-Os_-fwrapv_-fPIC_-fPIE_-gdwarf-4_-Wall  20240629  20240625
 140935  7866 592 0   19231 1444 960  T:opt32t        clang_-march=native_-Os_-fwrapv_-Qunused-arguments_-fPIC_-fPIE_-gdwarf-4_-Wall  20240629  20240625
 141550  9051 608 0   19379 1388 960  T:opt32t        gcc_-march=native_-mtune=native_-O_-fwrapv_-fPIC_-fPIE_-gdwarf-4_-Wall  20240629  20240625
 246992  29557 640 0  43377 1596 928  T:opt32         clang_-march=native_-O2_-fwrapv_-Qunused-arguments_-fPIC_-fPIE_-gdwarf-4_-Wall  20240629  20240625
 247619  30783 640 0  44889 1596 960  T:opt32         clang_-march=native_-O3_-fwrapv_-Qunused-arguments_-fPIC_-fPIE_-gdwarf-4_-Wall  20240629  20240625
 266353  27136 640 0  39561 1500 896  T:opt32         clang_-mcpu=native_-O3_-fwrapv_-Qunused-arguments_-fPIC_-fPIE_-gdwarf-4_-Wall  20240629  20240625
 346977  19841 640 0  30553 1500 896  T:opt32         clang_-march=native_-O_-fwrapv_-Qunused-arguments_-fPIC_-fPIE_-gdwarf-4_-Wall  20240629  20240625
 372245  17114 640 0  26559 1404 928  T:opt32         gcc_-march=native_-mtune=native_-Os_-fwrapv_-fPIC_-fPIE_-gdwarf-4_-Wall  20240629  20240625
 376900  18444 640 0  28996 1428 960  T:opt32         gcc_-march=native_-mtune=native_-O_-fwrapv_-fPIC_-fPIE_-gdwarf-4_-Wall  20240629  20240625
 385620  4672 12 0    17681 864 896   T:ref           clang_-mcpu=native_-O3_-fwrapv_-Qunused-arguments_-fPIC_-fPIE_-gdwarf-4_-Wall  20240629  20240625
 400867  16629 640 0  27903 1492 960  T:opt32         clang_-march=native_-Os_-fwrapv_-Qunused-arguments_-fPIC_-fPIE_-gdwarf-4_-Wall  20240629  20240625
 406761  27892 640 0  40788 1428 960  T:opt32         gcc_-march=native_-mtune=native_-O3_-fwrapv_-fPIC_-fPIE_-gdwarf-4_-Wall  20240629  20240625
 420516  5518 12 0    19873 864 928   T:ref           clang_-march=native_-O2_-fwrapv_-Qunused-arguments_-fPIC_-fPIE_-gdwarf-4_-Wall  20240629  20240625
 421039  5484 12 0    20121 864 960   T:ref           clang_-march=native_-O3_-fwrapv_-Qunused-arguments_-fPIC_-fPIE_-gdwarf-4_-Wall  20240629  20240625
 432448  20751 640 0  31692 1428 960  T:opt32         gcc_-march=native_-mtune=native_-O2_-fwrapv_-fPIC_-fPIE_-gdwarf-4_-Wall  20240629  20240625
 532354  6032 12 0    18836 792 960   T:ref           gcc_-march=native_-mtune=native_-O3_-fwrapv_-fPIC_-fPIE_-gdwarf-4_-Wall  20240629  20240625
1357355  2440 12 0    13276 792 960   T:ref           gcc_-march=native_-mtune=native_-O2_-fwrapv_-fPIC_-fPIE_-gdwarf-4_-Wall  20240629  20240625
1370347  2253 12 0    13695 856 960   T:ref           clang_-march=native_-Os_-fwrapv_-Qunused-arguments_-fPIC_-fPIE_-gdwarf-4_-Wall  20240629  20240625
1618895  2790 12 0    13417 864 896   T:ref           clang_-march=native_-O_-fwrapv_-Qunused-arguments_-fPIC_-fPIE_-gdwarf-4_-Wall  20240629  20240625
1653848  1972 12 0    11351 768 928   T:ref           gcc_-march=native_-mtune=native_-Os_-fwrapv_-fPIC_-fPIE_-gdwarf-4_-Wall  20240629  20240625
1720840  2157 12 0    12571 784 960   T:ref           gcc_-march=native_-mtune=native_-O_-fwrapv_-fPIC_-fPIE_-gdwarf-4_-Wall  20240629  20240625

Compiler output


tk_schedule.c: tk_schedule.c:377:14: warning: argument 1 of type 'uint32_t[64]' {aka 'unsigned int[64]'} with mismatched bound [-Warray-parameter=]
tk_schedule.c:   377 |     uint32_t rtk_1[TKPERMORDER*BLOCKBYTES/4],
tk_schedule.c:       |     ~~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
tk_schedule.c: In file included from tk_schedule.c:17:
tk_schedule.c: tk_schedule.h:40:31: note: previously declared as 'uint32_t *' {aka 'unsigned int *'}
tk_schedule.c:    40 | void tk_schedule_13(uint32_t *rtk_1, uint32_t *rtk_3,
tk_schedule.c:       |                     ~~~~~~~~~~^~~~~
tk_schedule.c: tk_schedule.c:378:14: warning: argument 2 of type 'uint32_t[160]' {aka 'unsigned int[160]'} with mismatched bound [-Warray-parameter=]
tk_schedule.c:   378 |     uint32_t rtk_3[SKINNY128_384_ROUNDS*BLOCKBYTES/4],
tk_schedule.c:       |     ~~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
tk_schedule.c: tk_schedule.h:40:48: note: previously declared as 'uint32_t *' {aka 'unsigned int *'}
tk_schedule.c:    40 | void tk_schedule_13(uint32_t *rtk_1, uint32_t *rtk_3,
tk_schedule.c:       |                                      ~~~~~~~~~~^~~~~
tk_schedule.c: tk_schedule.c:379:19: warning: argument 3 of type 'const uint8_t[16]' {aka 'const unsigned char[16]'} with mismatched bound [-Warray-parameter=]
tk_schedule.c:   379 |     const uint8_t tk_1[TWEAKEYBYTES],
tk_schedule.c:       |     ~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~~~
tk_schedule.c: tk_schedule.h:41:20: note: previously declared as 'const uint8_t *' {aka 'const unsigned char *'}
tk_schedule.c:    41 |     const uint8_t *tk_1,
tk_schedule.c:       |     ~~~~~~~~~~~~~~~^~~~
tk_schedule.c: tk_schedule.c:380:19: warning: argument 4 of type 'const uint8_t[16]' {aka 'const unsigned char[16]'} with mismatched bound [-Warray-parameter=]
tk_schedule.c:   380 |     const uint8_t tk_3[TWEAKEYBYTES])
tk_schedule.c:       |     ~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~~~
tk_schedule.c: tk_schedule.h:42:20: note: previously declared as 'const uint8_t *' {aka 'const unsigned char *'}
tk_schedule.c:    42 |     const uint8_t *tk_3);
tk_schedule.c:       |     ~~~~~~~~~~~~~~~^~~~
tk_schedule.c: ...

Number of similar (implementation,compiler) pairs: 4, namely:
Implementation  Compiler
T:opt32         gcc -march=native -mtune=native -O2 -fwrapv -fPIC -fPIE -gdwarf-4 -Wall (12.2.0)
T:opt32         gcc -march=native -mtune=native -O3 -fwrapv -fPIC -fPIE -gdwarf-4 -Wall (12.2.0)
T:opt32         gcc -march=native -mtune=native -O -fwrapv -fPIC -fPIE -gdwarf-4 -Wall (12.2.0)
T:opt32         gcc -march=native -mtune=native -Os -fwrapv -fPIC -fPIE -gdwarf-4 -Wall (12.2.0)
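
The -Warray-parameter warnings above report that tk_schedule.h declares the parameters as plain pointers while tk_schedule.c defines them with array bounds. Both spellings denote the same pointer type in C, so the mismatch does not change the generated code; GCC only flags the inconsistency. The following is a minimal sketch of how a declaration and definition can be kept in agreement. The function load_tk_words is hypothetical and is not part of the romulush sources; the value 16 for TWEAKEYBYTES is taken from the 'const uint8_t[16]' type shown in the diagnostic.

```c
#include <stdint.h>

#define TWEAKEYBYTES 16  /* bound shown in the diagnostic above */

/* Prototype, as it would appear in a header. It uses the same
 * array-bound spelling as the definition below, so -Warray-parameter
 * has nothing to compare unfavourably. */
void load_tk_words(uint32_t out[TWEAKEYBYTES / 4],
                   const uint8_t tk[TWEAKEYBYTES]);

/* Definition (hypothetical helper): packs the tweakey bytes into
 * 32-bit words, least significant byte first. */
void load_tk_words(uint32_t out[TWEAKEYBYTES / 4],
                   const uint8_t tk[TWEAKEYBYTES])
{
    for (int i = 0; i < TWEAKEYBYTES / 4; i++) {
        out[i] = (uint32_t)tk[4 * i]
               | ((uint32_t)tk[4 * i + 1] << 8)
               | ((uint32_t)tk[4 * i + 2] << 16)
               | ((uint32_t)tk[4 * i + 3] << 24);
    }
}
```

Writing both the prototype and the definition with pointer parameters would silence the warning equally well; what matters is that the two agree.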

Compiler output


skinny128.c: skinny128.c:94:12: error: always_inline function '_mm_shuffle_epi8' requires target feature 'ssse3', but would be inlined into function 'skinny128_384_plus' that is compiled without support for 'ssse3'
skinny128.c:     tk_1 = _mm_shuffle_epi8(tk_1, _mm_set_epi32(0x03040602, 0x05000701, 0x0b0c0e0a, 0x0d080f09));
skinny128.c:            ^
skinny128.c: skinny128.c:96:12: error: always_inline function '_mm_shuffle_epi8' requires target feature 'ssse3', but would be inlined into function 'skinny128_384_plus' that is compiled without support for 'ssse3'
skinny128.c:     tk_1 = _mm_shuffle_epi8(tk_1, perm_tk);
skinny128.c:            ^
skinny128.c: skinny128.c:98:12: error: always_inline function '_mm_shuffle_epi8' requires target feature 'ssse3', but would be inlined into function 'skinny128_384_plus' that is compiled without support for 'ssse3'
skinny128.c:     tk_1 = _mm_shuffle_epi8(tk_1, perm_tk);
skinny128.c:            ^
skinny128.c: skinny128.c:100:12: error: always_inline function '_mm_shuffle_epi8' requires target feature 'ssse3', but would be inlined into function 'skinny128_384_plus' that is compiled without support for 'ssse3'
skinny128.c:     tk_1 = _mm_shuffle_epi8(tk_1, perm_tk);
skinny128.c:            ^
skinny128.c: skinny128.c:102:12: error: always_inline function '_mm_shuffle_epi8' requires target feature 'ssse3', but would be inlined into function 'skinny128_384_plus' that is compiled without support for 'ssse3'
skinny128.c:     tk_1 = _mm_shuffle_epi8(tk_1, perm_tk);
skinny128.c:            ^
skinny128.c: skinny128.c:104:12: error: always_inline function '_mm_shuffle_epi8' requires target feature 'ssse3', but would be inlined into function 'skinny128_384_plus' that is compiled without support for 'ssse3'
skinny128.c:     tk_1 = _mm_shuffle_epi8(tk_1, perm_tk);
skinny128.c:            ^
skinny128.c: skinny128.c:106:12: error: always_inline function '_mm_shuffle_epi8' requires target feature 'ssse3', but would be inlined into function 'skinny128_384_plus' that is compiled without support for 'ssse3'
skinny128.c:     tk_1 = _mm_shuffle_epi8(tk_1, perm_tk);
skinny128.c:            ^
skinny128.c: skinny128.c:108:12: error: always_inline function '_mm_shuffle_epi8' requires target feature 'ssse3', but would be inlined into function 'skinny128_384_plus' that is compiled without support for 'ssse3'
skinny128.c:     tk_1 = _mm_shuffle_epi8(tk_1, perm_tk);
skinny128.c:            ^
skinny128.c: skinny128.c:112:5: error: always_inline function '_mm_shuffle_epi8' requires target feature 'ssse3', but would be inlined into function 'skinny128_384_plus' that is compiled without support for 'ssse3'
skinny128.c: ...

Number of similar (implementation,compiler) pairs: 1, namely:
Implementation  Compiler
T:x86           clang -mcpu=native -O3 -fwrapv -Qunused-arguments -fPIC -fPIE -gdwarf-4 -Wall (Debian_Clang_14.0.6)
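
The errors above state that skinny128_384_plus is compiled without SSSE3 support under these flags, so the always_inline intrinsic _mm_shuffle_epi8 cannot be inlined into it. One common way to make SSSE3 code build independently of the command-line target flags is a per-function target attribute combined with a runtime CPU check. The following is a sketch under that assumption, not the approach taken by the T:x86 implementation; reverse16 and its two helpers are hypothetical names, and the byte-reversal shuffle merely stands in for the real permutation.

```c
#include <stdint.h>

#if defined(__x86_64__) || defined(__i386__)
#include <tmmintrin.h>

/* SSSE3 is enabled for this one function, whatever -march/-mcpu says. */
__attribute__((target("ssse3")))
static void reverse16_ssse3(uint8_t out[16], const uint8_t in[16])
{
    /* Index byte i selects input byte 15 - i. */
    const __m128i idx = _mm_set_epi8(0, 1, 2, 3, 4, 5, 6, 7,
                                     8, 9, 10, 11, 12, 13, 14, 15);
    __m128i v = _mm_loadu_si128((const __m128i *)in);
    _mm_storeu_si128((__m128i *)out, _mm_shuffle_epi8(v, idx));
}
#endif

/* Plain C fallback; in and out must not overlap. */
static void reverse16_portable(uint8_t out[16], const uint8_t in[16])
{
    for (int i = 0; i < 16; i++)
        out[i] = in[15 - i];
}

/* Dispatcher: uses the SSSE3 path only when the CPU reports support. */
void reverse16(uint8_t out[16], const uint8_t in[16])
{
#if defined(__x86_64__) || defined(__i386__)
    if (__builtin_cpu_supports("ssse3")) {
        reverse16_ssse3(out, in);
        return;
    }
#endif
    reverse16_portable(out, in);
}
```

Passing -mssse3 or -march=native on the command line would also enable the intrinsic for the whole translation unit; the attribute form only limits the requirement to the functions that need it.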

Compiler output


skinny128.c: skinny128.c:66:20: warning: argument 1 of type 'unsigned char *' declared as a pointer [-Warray-parameter=]
skinny128.c:    66 |     unsigned char *out,
skinny128.c:       |     ~~~~~~~~~~~~~~~^~~
skinny128.c: In file included from skinny128.c:10:
skinny128.c: skinny128.h:13:17: note: previously declared as an array 'uint8_t[16]' {aka 'unsigned char[16]'}
skinny128.c:    13 |         uint8_t in[BLOCKBYTES], const uint8_t out[BLOCKBYTES],
skinny128.c:       |         ~~~~~~~~^~~~~~~~~~~~~~
skinny128.c: skinny128.c:67:26: warning: argument 2 of type 'const unsigned char *' declared as a pointer [-Warray-parameter=]
skinny128.c:    67 |     const unsigned char *in,
skinny128.c:       |     ~~~~~~~~~~~~~~~~~~~~~^~
skinny128.c: skinny128.h:13:47: note: previously declared as an array 'const uint8_t[16]' {aka 'const unsigned char[16]'}
skinny128.c:    13 |         uint8_t in[BLOCKBYTES], const uint8_t out[BLOCKBYTES],
skinny128.c:       |                                 ~~~~~~~~~~~~~~^~~~~~~~~~~~~~~
skinny128.c: skinny128.c:68:26: warning: argument 3 of type 'const unsigned char *' declared as a pointer [-Warray-parameter=]
skinny128.c:    68 |     const unsigned char *tk1,
skinny128.c:       |     ~~~~~~~~~~~~~~~~~~~~~^~~
skinny128.c: skinny128.h:14:23: note: previously declared as an array 'const uint8_t[16]' {aka 'const unsigned char[16]'}
skinny128.c:    14 |         const uint8_t tk1[TWEAKEYBYTES],
skinny128.c:       |         ~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~~
skinny128.c: skinny128.c:69:26: warning: argument 4 of type 'const unsigned char *' declared as a pointer [-Warray-parameter=]
skinny128.c:    69 |     const unsigned char *rtk_23)
skinny128.c:       |     ~~~~~~~~~~~~~~~~~~~~~^~~~~~
skinny128.c: skinny128.h:15:23: note: previously declared as an array 'const uint8_t[320]' {aka 'const unsigned char[320]'}
skinny128.c:    15 |         const uint8_t rtk_23[SKINNY128_384_ROUNDS*BLOCKBYTES/2]);
skinny128.c:       |         ~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
skinny128.c: ...

Number of similar (implementation,compiler) pairs: 4, namely:
Implementation  Compiler
T:x86           gcc -march=native -mtune=native -O2 -fwrapv -fPIC -fPIE -gdwarf-4 -Wall (12.2.0)
T:x86           gcc -march=native -mtune=native -O3 -fwrapv -fPIC -fPIE -gdwarf-4 -Wall (12.2.0)
T:x86           gcc -march=native -mtune=native -O -fwrapv -fPIC -fPIE -gdwarf-4 -Wall (12.2.0)
T:x86           gcc -march=native -mtune=native -Os -fwrapv -fPIC -fPIE -gdwarf-4 -Wall (12.2.0)