Implementation notes: amd64, rumba7, crypto_aead/aezv3

Computer: rumba7
Microarchitecture: amd64; Zen (800f11)
Architecture: amd64
CPU ID: AuthenticAMD-00800f11-178bfbff
SUPERCOP version: 20240625
Operation: crypto_aead
Primitive: aezv3
Time     Object size  Test size       Implementation  Compiler                                                                        Benchmark date  SUPERCOP version
2372     10708 0 0    33379 844 1056  T:aesni         clang_-march=native_-O2_-fwrapv_-Qunused-arguments_-fPIC_-fPIE_-gdwarf-4_-Wall  20240626        20240625
2386     10708 0 0    33755 844 1056  T:aesni         clang_-march=native_-O3_-fwrapv_-Qunused-arguments_-fPIC_-fPIE_-gdwarf-4_-Wall  20240626        20240625
2427     9614 0 0     30117 836 1088  T:aesni         clang_-march=native_-Os_-fwrapv_-Qunused-arguments_-fPIC_-fPIE_-gdwarf-4_-Wall  20240626        20240625
2472     10412 0 0    29931 844 1024  T:aesni         clang_-march=native_-O_-fwrapv_-Qunused-arguments_-fPIC_-fPIE_-gdwarf-4_-Wall   20240626        20240625
2487     17372 0 0    40693 804 1088  T:aesni         gcc_-march=native_-mtune=native_-O3_-fwrapv_-fPIC_-fPIE_-gdwarf-4_-Wall         20240626        20240625
2496     10390 0 0    32485 804 1088  T:aesni         gcc_-march=native_-mtune=native_-O2_-fwrapv_-fPIC_-fPIE_-gdwarf-4_-Wall         20240626        20240625
2619     9137 0 0     28672 780 1056  T:aesni         gcc_-march=native_-mtune=native_-Os_-fwrapv_-fPIC_-fPIE_-gdwarf-4_-Wall         20240626        20240625
2782     10673 0 0    31796 796 1088  T:aesni         gcc_-march=native_-mtune=native_-O_-fwrapv_-fPIC_-fPIE_-gdwarf-4_-Wall          20240626        20240625
791683   32375 0 0    55883 860 1056  T:ref           clang_-march=native_-O3_-fwrapv_-Qunused-arguments_-fPIC_-fPIE_-gdwarf-4_-Wall  20240626        20240625
805862   30031 0 0    53163 860 1056  T:ref           clang_-march=native_-O2_-fwrapv_-Qunused-arguments_-fPIC_-fPIE_-gdwarf-4_-Wall  20240626        20240625
857645   38063 0 0    60603 860 1024  T:ref           clang_-mcpu=native_-O3_-fwrapv_-Qunused-arguments_-fPIC_-fPIE_-gdwarf-4_-Wall   20240626        20240625
1032545  54694 0 0    78348 828 1088  T:ref           gcc_-march=native_-mtune=native_-O3_-fwrapv_-fPIC_-fPIE_-gdwarf-4_-Wall         20240626        20240625
1082682  22094 0 0    42909 852 1088  T:ref           clang_-march=native_-Os_-fwrapv_-Qunused-arguments_-fPIC_-fPIE_-gdwarf-4_-Wall  20240626        20240625
1090960  23264 0 0    43243 860 1024  T:ref           clang_-march=native_-O_-fwrapv_-Qunused-arguments_-fPIC_-fPIE_-gdwarf-4_-Wall   20240626        20240625
1164117  26774 0 0    49140 828 1088  T:ref           gcc_-march=native_-mtune=native_-O2_-fwrapv_-fPIC_-fPIE_-gdwarf-4_-Wall         20240626        20240625
1263361  25890 0 0    47364 828 1088  T:ref           gcc_-march=native_-mtune=native_-O_-fwrapv_-fPIC_-fPIE_-gdwarf-4_-Wall          20240626        20240625
1412673  22868 0 0    42615 804 1056  T:ref           gcc_-march=native_-mtune=native_-Os_-fwrapv_-fPIC_-fPIE_-gdwarf-4_-Wall         20240626        20240625

Compiler output


aez_ni.c: aez_ni.c:146:22: error: '__builtin_ia32_vec_set_v16qi' needs target feature sse4.1
aez_ni.c:         __m128i i1 = _mm_insert_epi8(zero, 1, 7);
aez_ni.c:                      ^
aez_ni.c: /usr/lib/llvm-14/lib/clang/14.0.0/include/smmintrin.h:930:13: note: expanded from macro '_mm_insert_epi8'
aez_ni.c:   ((__m128i)__builtin_ia32_vec_set_v16qi((__v16qi)(__m128i)(X), \
aez_ni.c:             ^
aez_ni.c: aez_ni.c:147:22: error: '__builtin_ia32_vec_set_v16qi' needs target feature sse4.1
aez_ni.c:         __m128i i2 = _mm_insert_epi8(zero, 2, 7);
aez_ni.c:                      ^
aez_ni.c: /usr/lib/llvm-14/lib/clang/14.0.0/include/smmintrin.h:930:13: note: expanded from macro '_mm_insert_epi8'
aez_ni.c:   ((__m128i)__builtin_ia32_vec_set_v16qi((__v16qi)(__m128i)(X), \
aez_ni.c:             ^
aez_ni.c: aez_ni.c:148:22: error: '__builtin_ia32_vec_set_v16qi' needs target feature sse4.1
aez_ni.c:         __m128i i3 = _mm_insert_epi8(zero, 3, 7);
aez_ni.c:                      ^
aez_ni.c: /usr/lib/llvm-14/lib/clang/14.0.0/include/smmintrin.h:930:13: note: expanded from macro '_mm_insert_epi8'
aez_ni.c:   ((__m128i)__builtin_ia32_vec_set_v16qi((__v16qi)(__m128i)(X), \
aez_ni.c:             ^
aez_ni.c: aez_ni.c:149:26: error: '__builtin_ia32_vec_set_v16qi' needs target feature sse4.1
aez_ni.c:         __m128i j, one = _mm_insert_epi8(zero, 1, 15);
aez_ni.c:                          ^
aez_ni.c: /usr/lib/llvm-14/lib/clang/14.0.0/include/smmintrin.h:930:13: note: expanded from macro '_mm_insert_epi8'
aez_ni.c:   ((__m128i)__builtin_ia32_vec_set_v16qi((__v16qi)(__m128i)(X), \
aez_ni.c:             ^
aez_ni.c: 4 errors generated.

Number of similar (implementation,compiler) pairs: 1, namely:
Implementation  Compiler
T:aesni         clang -mcpu=native -O3 -fwrapv -Qunused-arguments -fPIC -fPIE -gdwarf-4 -Wall (Ubuntu_Clang_14.0.0)
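
Note on the errors above: the diagnostics say that `_mm_insert_epi8` needs the sse4.1 target feature, which was not enabled in this compilation. The four flagged lines only build constants (a zero vector with one byte set), so they could in principle be produced with SSE2 alone. The following is a minimal sketch under that assumption, not the implementation's actual code; `byte_const_sse2` is a hypothetical helper name. Passing `-msse4.1` explicitly may be another way to get the original lines to compile.

```c
#include <emmintrin.h>  /* SSE2 intrinsics only */
#include <stdint.h>

/* Hypothetical helper (not part of aez_ni.c): return a 128-bit vector
   that is all zero except for byte `pos` (0..15), which holds `val`.
   Intended to give the same result as _mm_insert_epi8(zero, val, pos)
   without requiring SSE4.1. */
static __m128i byte_const_sse2(unsigned pos, uint8_t val)
{
    uint8_t buf[16] = {0};          /* start from an all-zero block */
    buf[pos & 15] = val;            /* set the single requested byte */
    return _mm_loadu_si128((const __m128i *)buf);  /* unaligned load */
}
```

Under this sketch, the constant on line 146 would correspond to `byte_const_sse2(7, 1)` and the one on line 149 to `byte_const_sse2(15, 1)`.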

Compiler output


aez_ni.c: In function 'load_partial',
aez_ni.c:     inlined from 'load_partial' at aez_ni.c:119:16,
aez_ni.c:     inlined from 'cipher_aez_tiny' at aez_ni.c:498:18,
aez_ni.c:     inlined from 'aez_encrypt' at aez_ni.c:588:9,
aez_ni.c:     inlined from 'crypto_aead_aezv3_aesni_timingleaks_encrypt' at aez_ni.c:637:5:
aez_ni.c: aez_ni.c:123:46: warning: '__builtin_memcpy' forming offset [16, 4294967263] is out of the bounds [0, 16] of object 'tmp' with type '__m128i' [-Warray-bounds]
aez_ni.c:   123 |         for (i=0; i<n; i++) ((char*)&tmp)[i] = ((char*)p)[i];
aez_ni.c:       |                             ~~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~
aez_ni.c: aez_ni.c: In function 'crypto_aead_aezv3_aesni_timingleaks_encrypt':
aez_ni.c: aez_ni.c:122:17: note: 'tmp' declared here
aez_ni.c:   122 |         __m128i tmp; unsigned i;
aez_ni.c:       |                 ^~~

Number of similar (implementation,compiler) pairs: 1, namely:
Implementation  Compiler
T:aesni         gcc -march=native -mtune=native -O3 -fwrapv -fPIC -fPIE -gdwarf-4 -Wall (11.4.0)
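
Note on the warning above: the byte-copy loop on line 123 writes `n` bytes into the 16-byte object `tmp`, and the warning suggests that GCC could not establish that `n` stays within 16 after inlining. The following is a minimal sketch of a partial load with an explicit bound, assuming the caller wants at most 16 bytes; `load_partial_clamped` is a hypothetical name, and zero-filling the unused bytes is a choice made here, not something the log shows. Whether this form silences the warning with GCC 11.4.0 has not been checked.

```c
#include <emmintrin.h>  /* SSE2: __m128i, _mm_loadu_si128 */
#include <string.h>     /* memcpy */

/* Hypothetical variant (not the code in aez_ni.c): copy at most 16 bytes
   from p into a zero-filled buffer, then load the buffer as one vector. */
static __m128i load_partial_clamped(const void *p, unsigned n)
{
    unsigned char buf[16] = {0};           /* unused tail bytes stay zero */
    if (n > sizeof buf) n = sizeof buf;    /* explicit upper bound on the copy */
    memcpy(buf, p, n);
    return _mm_loadu_si128((const __m128i *)buf);
}
```

With the bound written out, the copy length is at most 16 on every path, and reads past the end of a short input are avoided as long as `n` does not exceed the input length.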