Implementation notes: amd64, comet, crypto_aead/romulusm

Computer: comet
Microarchitecture: amd64; Comet Lake (806ec)
Architecture: amd64
CPU ID: GenuineIntel-000806ec-bfebfbff
SUPERCOP version: 20240425
Operation: crypto_aead
Primitive: romulusm
TimeObject sizeTest sizeImplementationCompilerBenchmark dateSUPERCOP version
26491010949 0 023551 756 1056aadomn/x86gcc_-march=native_-mtune=native_-Os_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE2024042820240425
26689714143 0 032185 852 1088aadomn/x86clang_-march=native_-O3_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2024042820240425
26702510958 0 024913 852 1024aadomn/x86clang_-march=native_-O_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2024042820240425
26751110452 0 025175 844 1088aadomn/x86clang_-march=native_-Os_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2024042820240425
26754413967 0 031713 852 1056aadomn/x86clang_-march=native_-O2_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2024042820240425
26996511965 0 026260 780 1088aadomn/x86gcc_-march=native_-mtune=native_-O2_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE2024042820240425
26999913357 0 029540 780 1088aadomn/x86gcc_-march=native_-mtune=native_-O3_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE2024042820240425
27755311541 0 025363 772 1088aadomn/x86gcc_-march=native_-mtune=native_-O_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE2024042820240425
101297635857 640 053465 1596 1088aadomn/opt32clang_-march=native_-O3_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2024042820240425
101391035441 640 052753 1596 1056aadomn/opt32clang_-march=native_-O2_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2024042820240425
108288331638 640 047561 1500 1024aadomn/opt32clang_-mcpu=native_-O3_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2024042820240425
139536923438 640 037537 1500 1024aadomn/opt32clang_-march=native_-O_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2024042820240425
149573921986 640 036052 1428 1088aadomn/opt32gcc_-march=native_-mtune=native_-O_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE2024042820240425
156451320211 640 032951 1404 1056aadomn/opt32gcc_-march=native_-mtune=native_-Os_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE2024042820240425
156941619391 12 035777 864 1024T:refclang_-mcpu=native_-O3_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2024042820240425
160193319899 640 034503 1492 1088aadomn/opt32clang_-march=native_-Os_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2024042820240425
164129332090 640 048460 1428 1088aadomn/opt32gcc_-march=native_-mtune=native_-O3_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE2024042820240425
172666617443 12 035065 864 1056T:refclang_-march=native_-O2_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2024042820240425
172701121449 12 039345 864 1088T:refclang_-march=native_-O3_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2024042820240425
175111024941 640 039420 1428 1088aadomn/opt32gcc_-march=native_-mtune=native_-O2_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE2024042820240425
211283228031 12 044380 792 1088T:refgcc_-march=native_-mtune=native_-O3_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE2024042820240425
57927598150 12 022580 792 1088T:refgcc_-march=native_-mtune=native_-O2_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE2024042820240425
61748995336 12 018047 768 1056T:refgcc_-march=native_-mtune=native_-Os_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE2024042820240425
619853710494 12 024569 864 1024T:refclang_-march=native_-O_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2024042820240425
63363666854 12 021711 856 1088T:refclang_-march=native_-Os_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2024042820240425
75973076082 12 020027 784 1088T:refgcc_-march=native_-mtune=native_-O_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE2024042820240425

Test failure

Implementation: T:fixslice_opt32
Security model: timingleaks
Compiler: clang -march=native -O2 -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE
error 111
crypto_aead_decrypt returns nonzero

Number of similar (compiler,implementation) pairs: 9, namely:
CompilerImplementations
clang -march=native -O2 -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE T:fixslice_opt32
clang -march=native -O3 -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE T:fixslice_opt32
clang -march=native -O -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE T:fixslice_opt32
clang -march=native -Os -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE T:fixslice_opt32
clang -mcpu=native -O3 -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE T:fixslice_opt32
gcc -march=native -mtune=native -O2 -fomit-frame-pointer -fwrapv -fPIC -fPIE T:fixslice_opt32
gcc -march=native -mtune=native -O3 -fomit-frame-pointer -fwrapv -fPIC -fPIE T:fixslice_opt32
gcc -march=native -mtune=native -O -fomit-frame-pointer -fwrapv -fPIC -fPIE T:fixslice_opt32
gcc -march=native -mtune=native -Os -fomit-frame-pointer -fwrapv -fPIC -fPIE T:fixslice_opt32

Test failure

Implementation: T:opt32t
Security model: timingleaks
Compiler: clang -march=native -O2 -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE
error 111
crypto_aead_decrypt allows trivial forgeries

Number of similar (compiler,implementation) pairs: 9, namely:
CompilerImplementations
clang -march=native -O2 -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE T:opt32t
clang -march=native -O3 -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE T:opt32t
clang -march=native -O -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE T:opt32t
clang -march=native -Os -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE T:opt32t
clang -mcpu=native -O3 -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE T:opt32t
gcc -march=native -mtune=native -O2 -fomit-frame-pointer -fwrapv -fPIC -fPIE T:opt32t
gcc -march=native -mtune=native -O3 -fomit-frame-pointer -fwrapv -fPIC -fPIE T:opt32t
gcc -march=native -mtune=native -O -fomit-frame-pointer -fwrapv -fPIC -fPIE T:opt32t
gcc -march=native -mtune=native -Os -fomit-frame-pointer -fwrapv -fPIC -fPIE T:opt32t

Compiler output

Implementation: aadomn/x86
Security model: constbranchindex
Compiler: clang -mcpu=native -O3 -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE
skinny128.c: skinny128.c:115:5: error: always_inline function '_mm_shuffle_epi8' requires target feature 'ssse3', but would be inlined into function 'skinny128_384_plus' that is compiled without support for 'ssse3'
skinny128.c: DOUBLE_ROUND(rtk_23);
skinny128.c: ^
skinny128.c: skinny128.c:78:5: note: expanded from macro 'DOUBLE_ROUND'
skinny128.c: SBOX_ARK_EVEN(rtk_23); \
skinny128.c: ^
skinny128.c: skinny128.c:23:13: note: expanded from macro 'SBOX_ARK_EVEN'
skinny128.c: state = _mm_shuffle_epi8(s1, state); /* apply inner S-box S1 */ \
skinny128.c: ^
skinny128.c: skinny128.c:115:5: error: always_inline function '_mm_shuffle_epi8' requires target feature 'ssse3', but would be inlined into function 'skinny128_384_plus' that is compiled without support for 'ssse3'
skinny128.c: skinny128.c:78:5: note: expanded from macro 'DOUBLE_ROUND'
skinny128.c: SBOX_ARK_EVEN(rtk_23); \
skinny128.c: ^
skinny128.c: skinny128.c:24:13: note: expanded from macro 'SBOX_ARK_EVEN'
skinny128.c: tmp0 = _mm_shuffle_epi8(s0, tmp0); /* apply inner S-box S0 */ \
skinny128.c: ^
skinny128.c: skinny128.c:115:5: error: always_inline function '_mm_shuffle_epi8' requires target feature 'ssse3', but would be inlined into function 'skinny128_384_plus' that is compiled without support for 'ssse3'
skinny128.c: skinny128.c:78:5: note: expanded from macro 'DOUBLE_ROUND'
skinny128.c: SBOX_ARK_EVEN(rtk_23); \
skinny128.c: ^
skinny128.c: skinny128.c:32:13: note: expanded from macro 'SBOX_ARK_EVEN'
skinny128.c: tmp0 = _mm_shuffle_epi8(s3, tmp0); /* apply inner S-box S3 */ \
skinny128.c: ^
skinny128.c: skinny128.c:115:5: error: always_inline function '_mm_shuffle_epi8' requires target feature 'ssse3', but would be inlined into function 'skinny128_384_plus' that is compiled without support for 'ssse3'
skinny128.c: skinny128.c:78:5: note: expanded from macro 'DOUBLE_ROUND'
skinny128.c: ...

Number of similar (compiler,implementation) pairs: 1, namely:
CompilerImplementations
clang -mcpu=native -O3 -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE aadomn/x86

Compiler output

Implementation: T:fixslice_opt32
Security model: timingleaks
Compiler: gcc -march=native -mtune=native -O3 -fomit-frame-pointer -fwrapv -fPIC -fPIE
decrypt.c: decrypt.c: In function 'crypto_aead_romulusm_fixslice_opt32_timingleaks_decrypt':
decrypt.c: decrypt.c:56:22: warning: writing 1 byte into a region of size 0 [-Wstringop-overflow=]
decrypt.c: 56 | state[i] ^= m[i];
decrypt.c: | ~~~~~~~~~^~~~~~~
decrypt.c: decrypt.c:24:8: note: at offset 16 into destination object 'state' of size 16
decrypt.c: 24 | u8 state[BLOCKBYTES], pad[BLOCKBYTES];
decrypt.c: | ^~~~~

Number of similar (compiler,implementation) pairs: 1, namely:
CompilerImplementations
gcc -march=native -mtune=native -O3 -fomit-frame-pointer -fwrapv -fPIC -fPIE T:fixslice_opt32