Implementation notes: amd64, wolfdale, crypto_aead/aeadaes256ocbtaglen128v1

Computer: wolfdale
Microarchitecture: amd64; Core 2 45nm (1067a)
Architecture: amd64
CPU ID: GenuineIntel-0001067a-bfebfbff
SUPERCOP version: 20240625
Operation: crypto_aead
Primitive: aeadaes256ocbtaglen128v1
Time | Object size | Test size | Implementation | Compiler | Benchmark date | SUPERCOP version
118841 | 6085 18 0 | 26483 912 1016 | T:opt | clang_-march=native_-Os_-fwrapv_-Qunused-arguments_-fPIC_-fPIE_-gdwarf-4_-Wall | 20240628 | 20240625
118903 | 7905 18 0 | 30726 904 1048 | T:opt | gcc_-march=native_-mtune=native_-O3_-fwrapv_-fPIC_-fPIE_-gdwarf-4_-Wall | 20240628 | 20240625
118908 | 7134 18 0 | 28485 896 1048 | T:opt | gcc_-march=native_-mtune=native_-O_-fwrapv_-fPIC_-fPIE_-gdwarf-4_-Wall | 20240628 | 20240625
118927 | 7121 18 0 | 29845 920 1016 | T:opt | clang_-march=native_-O2_-fwrapv_-Qunused-arguments_-fPIC_-fPIE_-gdwarf-4_-Wall | 20240628 | 20240625
118952 | 7137 18 0 | 30677 920 1016 | T:opt | clang_-march=native_-O3_-fwrapv_-Qunused-arguments_-fPIC_-fPIE_-gdwarf-4_-Wall | 20240628 | 20240625
119070 | 6720 18 0 | 30069 920 1016 | T:opt | clang_-mcpu=native_-O3_-fwrapv_-Qunused-arguments_-fPIC_-fPIE_-gdwarf-4_-Wall | 20240628 | 20240625
119082 | 7192 18 0 | 29086 904 1048 | T:opt | gcc_-march=native_-mtune=native_-O2_-fwrapv_-fPIC_-fPIE_-gdwarf-4_-Wall | 20240628 | 20240625
119248 | 6465 18 0 | 27553 920 1016 | T:opt | clang_-march=native_-O_-fwrapv_-Qunused-arguments_-fPIC_-fPIE_-gdwarf-4_-Wall | 20240628 | 20240625
121820 | 5993 18 0 | 25681 880 1048 | T:opt | gcc_-march=native_-mtune=native_-Os_-fwrapv_-fPIC_-fPIE_-gdwarf-4_-Wall | 20240628 | 20240625
201714 | 5628 0 0 | 28550 844 1016 | T:ref | clang_-march=native_-O3_-fwrapv_-Qunused-arguments_-fPIC_-fPIE_-gdwarf-4_-Wall | 20240628 | 20240625
202288 | 5388 0 0 | 27494 844 1016 | T:ref | clang_-march=native_-O2_-fwrapv_-Qunused-arguments_-fPIC_-fPIE_-gdwarf-4_-Wall | 20240628 | 20240625
252850 | 9363 0 0 | 31615 828 1048 | T:ref | gcc_-march=native_-mtune=native_-O3_-fwrapv_-fPIC_-fPIE_-gdwarf-4_-Wall | 20240628 | 20240625
256014 | 9772 0 0 | 32502 844 1016 | T:ref | clang_-mcpu=native_-O3_-fwrapv_-Qunused-arguments_-fPIC_-fPIE_-gdwarf-4_-Wall | 20240628 | 20240625
261182 | 2227 0 0 | 22100 836 1016 | T:ref | clang_-march=native_-Os_-fwrapv_-Qunused-arguments_-fPIC_-fPIE_-gdwarf-4_-Wall | 20240628 | 20240625
290127 | 2337 0 0 | 22684 836 1016 | T:ref | clang_-march=native_-O_-fwrapv_-Qunused-arguments_-fPIC_-fPIE_-gdwarf-4_-Wall | 20240628 | 20240625
296614 | 2002 0 0 | 21170 804 1048 | T:ref | gcc_-march=native_-mtune=native_-Os_-fwrapv_-fPIC_-fPIE_-gdwarf-4_-Wall | 20240628 | 20240625
296927 | 3507 0 0 | 24863 828 1048 | T:ref | gcc_-march=native_-mtune=native_-O2_-fwrapv_-fPIC_-fPIE_-gdwarf-4_-Wall | 20240628 | 20240625
302187 | 3054 0 0 | 23870 820 1048 | T:ref | gcc_-march=native_-mtune=native_-O_-fwrapv_-fPIC_-fPIE_-gdwarf-4_-Wall | 20240628 | 20240625

Compiler output


encrypt.c: encrypt.c:74:34: warning: incompatible pointer types passing 'const unsigned int *' to parameter of type 'const __m128i_u *' [-Wincompatible-pointer-types]
encrypt.c:   __m128i key0 = _mm_loadu_si128((const unsigned int *)(key+0));
encrypt.c:                                  ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~
encrypt.c: /usr/lib/llvm-11/lib/clang/11.0.1/include/emmintrin.h:3548:34: note: passing argument to parameter '__p' here
encrypt.c: _mm_loadu_si128(__m128i_u const *__p)
encrypt.c:                                  ^
encrypt.c: encrypt.c:75:34: warning: incompatible pointer types passing 'const unsigned int *' to parameter of type 'const __m128i_u *' [-Wincompatible-pointer-types]
encrypt.c:   __m128i key1 = _mm_loadu_si128((const unsigned int *)(key+16));
encrypt.c:                                  ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
encrypt.c: /usr/lib/llvm-11/lib/clang/11.0.1/include/emmintrin.h:3548:34: note: passing argument to parameter '__p' here
encrypt.c: _mm_loadu_si128(__m128i_u const *__p)
encrypt.c:                                  ^
encrypt.c: encrypt.c:470:2: warning: misleading indentation; statement is not part of the previous 'for' [-Wmisleading-indentation]
encrypt.c:         break;
encrypt.c:         ^
encrypt.c: encrypt.c:468:7: note: previous statement is here
encrypt.c:       for (i = 5; i < ntz ; i++)
encrypt.c:       ^
encrypt.c: encrypt.c:639:25: warning: variable 'sum' is uninitialized when used here [-Wuninitialized]
encrypt.c:     sum = _mm_xor_si128(sum,sum);
encrypt.c:                         ^~~
encrypt.c: encrypt.c:581:5: note: variable 'sum' is declared here
encrypt.c:     __m128i lstar, ldollar, sum, offset, ktop, pad, nonce, tag, tmp, outv;
encrypt.c:     ^
encrypt.c: encrypt.c:125:20: warning: unused function 'aes256ni_setkey_decrypt' [-Wunused-function]
encrypt.c: ...

Number of similar (implementation,compiler) pairs: 5, namely:
Implementation | Compiler
T:dolbeau/aesenc-int | clang -march=native -O2 -fwrapv -Qunused-arguments -fPIC -fPIE -gdwarf-4 -Wall (Debian_Clang_11.0.1)
T:dolbeau/aesenc-int | clang -march=native -O3 -fwrapv -Qunused-arguments -fPIC -fPIE -gdwarf-4 -Wall (Debian_Clang_11.0.1)
T:dolbeau/aesenc-int | clang -march=native -O -fwrapv -Qunused-arguments -fPIC -fPIE -gdwarf-4 -Wall (Debian_Clang_11.0.1)
T:dolbeau/aesenc-int | clang -march=native -Os -fwrapv -Qunused-arguments -fPIC -fPIE -gdwarf-4 -Wall (Debian_Clang_11.0.1)
T:dolbeau/aesenc-int | clang -mcpu=native -O3 -fwrapv -Qunused-arguments -fPIC -fPIE -gdwarf-4 -Wall (Debian_Clang_11.0.1)
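
Note on the encrypt.c warnings above: the -Wincompatible-pointer-types diagnostics come from casting the key pointer to 'const unsigned int *' before calling _mm_loadu_si128, and the -Wuninitialized diagnostic comes from zeroing 'sum' by XORing it with itself. A minimal sketch of the usual warning-free idiom follows. It is not the code in encrypt.c; load_block and xor_blocks are hypothetical helper names chosen for illustration, and only SSE2 is assumed.

```c
#include <emmintrin.h>  /* SSE2: __m128i, _mm_loadu_si128, _mm_setzero_si128 */

/* Hypothetical helper: load 16 bytes from an arbitrarily aligned buffer.
   The cast to (const __m128i *) is the conventional argument form for the
   unaligned-load intrinsic, instead of 'const unsigned int *'. */
static __m128i load_block(const unsigned char *p)
{
    return _mm_loadu_si128((const __m128i *)p);
}

/* Hypothetical helper: XOR two 16-byte blocks into an accumulator that
   starts from a well-defined zero, then store the result in out. */
static void xor_blocks(unsigned char out[16],
                       const unsigned char a[16],
                       const unsigned char b[16])
{
    /* _mm_setzero_si128() yields zero without reading an uninitialized
       variable, unlike sum = _mm_xor_si128(sum, sum). */
    __m128i sum = _mm_setzero_si128();
    sum = _mm_xor_si128(sum, load_block(a));
    sum = _mm_xor_si128(sum, load_block(b));
    _mm_storeu_si128((__m128i *)out, sum);
}
```

Calling xor_blocks(out, a, b) leaves out[i] equal to a[i] ^ b[i] for each of the 16 bytes.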

Compiler output


encrypt.c: encrypt.c: In function 'aes256ni_setkey_encrypt':
encrypt.c: encrypt.c:74:34: warning: passing argument 1 of '_mm_loadu_si128' from incompatible pointer type [-Wincompatible-pointer-types]
encrypt.c:    74 |   __m128i key0 = _mm_loadu_si128((const unsigned int *)(key+0));
encrypt.c:       |                                  ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~
encrypt.c:       |                                  |
encrypt.c:       |                                  const unsigned int *
encrypt.c: In file included from /usr/lib/gcc/x86_64-linux-gnu/10/include/xmmintrin.h:1316,
encrypt.c:                  from /usr/lib/gcc/x86_64-linux-gnu/10/include/immintrin.h:29,
encrypt.c:                  from encrypt.c:45:
encrypt.c: /usr/lib/gcc/x86_64-linux-gnu/10/include/emmintrin.h:701:35: note: expected 'const __m128i_u *' but argument is of type 'const unsigned int *'
encrypt.c:   701 | _mm_loadu_si128 (__m128i_u const *__P)
encrypt.c:       |                  ~~~~~~~~~~~~~~~~~^~~
encrypt.c: encrypt.c:75:34: warning: passing argument 1 of '_mm_loadu_si128' from incompatible pointer type [-Wincompatible-pointer-types]
encrypt.c:    75 |   __m128i key1 = _mm_loadu_si128((const unsigned int *)(key+16));
encrypt.c:       |                                  ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
encrypt.c:       |                                  |
encrypt.c:       |                                  const unsigned int *
encrypt.c: In file included from /usr/lib/gcc/x86_64-linux-gnu/10/include/xmmintrin.h:1316,
encrypt.c:                  from /usr/lib/gcc/x86_64-linux-gnu/10/include/immintrin.h:29,
encrypt.c:                  from encrypt.c:45:
encrypt.c: /usr/lib/gcc/x86_64-linux-gnu/10/include/emmintrin.h:701:35: note: expected 'const __m128i_u *' but argument is of type 'const unsigned int *'
encrypt.c:   701 | _mm_loadu_si128 (__m128i_u const *__P)
encrypt.c:       |                  ~~~~~~~~~~~~~~~~~^~~
encrypt.c: encrypt.c: In function 'aes256ni_setkey_decrypt':
encrypt.c: encrypt.c:130: warning: ignoring '#pragma unroll ' [-Wunknown-pragmas]
encrypt.c: ...

Number of similar (implementation,compiler) pairs: 4, namely:
Implementation | Compiler
T:dolbeau/aesenc-int | gcc -march=native -mtune=native -O2 -fwrapv -fPIC -fPIE -gdwarf-4 -Wall (10.2.1_20210110)
T:dolbeau/aesenc-int | gcc -march=native -mtune=native -O3 -fwrapv -fPIC -fPIE -gdwarf-4 -Wall (10.2.1_20210110)
T:dolbeau/aesenc-int | gcc -march=native -mtune=native -O -fwrapv -fPIC -fPIE -gdwarf-4 -Wall (10.2.1_20210110)
T:dolbeau/aesenc-int | gcc -march=native -mtune=native -Os -fwrapv -fPIC -fPIE -gdwarf-4 -Wall (10.2.1_20210110)

Compiler output


ocb_vaes.c: ocb_vaes.c:668:19: error: always_inline function '_mm256_broadcastsi128_si256' requires target feature 'avx2', but would be inlined into function 'ae_encrypt' that is compiled without support for 'avx2'
ocb_vaes.c:         k256[i] = _mm256_broadcastsi128_si256(load128(ctx->encrypt_key.rd_key+i));
ocb_vaes.c:                   ^
ocb_vaes.c: ocb_vaes.c:668:19: error: AVX vector return of type '__m256i' (vector of 4 'long long' values) without 'avx' enabled changes the ABI
ocb_vaes.c: ocb_vaes.c:669:14: error: always_inline function '_mm256_broadcastsi128_si256' requires target feature 'avx2', but would be inlined into function 'ae_encrypt' that is compiled without support for 'avx2'
ocb_vaes.c:     m[M01] = _mm256_broadcastsi128_si256(xor128(load128(ctx->L+0), load128(ctx->L+1)));
ocb_vaes.c:              ^
ocb_vaes.c: ocb_vaes.c:669:14: error: AVX vector return of type '__m256i' (vector of 4 'long long' values) without 'avx' enabled changes the ABI
ocb_vaes.c: ocb_vaes.c:670:14: error: always_inline function '_mm256_broadcastsi128_si256' requires target feature 'avx2', but would be inlined into function 'ae_encrypt' that is compiled without support for 'avx2'
ocb_vaes.c:     m[M02] = _mm256_broadcastsi128_si256(xor128(load128(ctx->L+0), load128(ctx->L+2)));
ocb_vaes.c:              ^
ocb_vaes.c: ocb_vaes.c:670:14: error: AVX vector return of type '__m256i' (vector of 4 'long long' values) without 'avx' enabled changes the ABI
ocb_vaes.c: ocb_vaes.c:671:14: error: always_inline function '_mm256_broadcastsi128_si256' requires target feature 'avx2', but would be inlined into function 'ae_encrypt' that is compiled without support for 'avx2'
ocb_vaes.c:     m[M03] = _mm256_broadcastsi128_si256(xor128(load128(ctx->L+0), load128(ctx->L+3)));
ocb_vaes.c:              ^
ocb_vaes.c: ocb_vaes.c:671:14: error: AVX vector return of type '__m256i' (vector of 4 'long long' values) without 'avx' enabled changes the ABI
ocb_vaes.c: ocb_vaes.c:672:16: error: always_inline function '_mm256_set_m128i' requires target feature 'avx', but would be inlined into function 'ae_encrypt' that is compiled without support for 'avx'
ocb_vaes.c:     m[M0_01] = _mm256_set_m128i(xor128(load128(ctx->L+0), load128(ctx->L+1)), load128(ctx->L+0));
ocb_vaes.c:                ^
ocb_vaes.c: ocb_vaes.c:672:16: error: AVX vector return of type '__m256i' (vector of 4 'long long' values) without 'avx' enabled changes the ABI
ocb_vaes.c: ocb_vaes.c:677:20: error: always_inline function '_mm256_setzero_si256' requires target feature 'avx', but would be inlined into function 'ae_encrypt' that is compiled without support for 'avx'
ocb_vaes.c:         checksum = zero256();
ocb_vaes.c:                    ^
ocb_vaes.c: ocb_vaes.c:150:27: note: expanded from macro 'zero256'
ocb_vaes.c: #define zero256           _mm256_setzero_si256
ocb_vaes.c: ...

Number of similar (implementation,compiler) pairs: 4, namely:
Implementation | Compiler
T:vaes | clang -march=native -O2 -fwrapv -Qunused-arguments -fPIC -fPIE -gdwarf-4 -Wall (Debian_Clang_11.0.1)
T:vaes | clang -march=native -O3 -fwrapv -Qunused-arguments -fPIC -fPIE -gdwarf-4 -Wall (Debian_Clang_11.0.1)
T:vaes | clang -march=native -O -fwrapv -Qunused-arguments -fPIC -fPIE -gdwarf-4 -Wall (Debian_Clang_11.0.1)
T:vaes | clang -march=native -Os -fwrapv -Qunused-arguments -fPIC -fPIE -gdwarf-4 -Wall (Debian_Clang_11.0.1)
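
Note on the ocb_vaes.c errors above: the compiler reports that ae_encrypt is built without 'avx' and 'avx2', which is consistent with -march=native on a Core 2 45nm host, since that microarchitecture predates AVX. The sketch below shows one way to check at run time which of the features named in these diagnostics the host reports. It assumes GCC or Clang on x86 (the __builtin_cpu_supports builtin); the struct and function names are hypothetical, and passing the check is a necessary condition for the vaes implementation, not a sufficient one.

```c
/* Hypothetical record of the three features named in the diagnostics. */
struct x86_features {
    int ssse3;
    int avx;
    int avx2;
};

/* Query the running CPU. On non-x86 targets this sketch conservatively
   reports every feature as absent. */
static struct x86_features probe_features(void)
{
    struct x86_features f = {0, 0, 0};
#if defined(__x86_64__) || defined(__i386__)
    __builtin_cpu_init();  /* harmless if the runtime already initialized it */
    f.ssse3 = __builtin_cpu_supports("ssse3") != 0;
    f.avx   = __builtin_cpu_supports("avx") != 0;
    f.avx2  = __builtin_cpu_supports("avx2") != 0;
#endif
    return f;
}

/* Returns 1 only when all three features are reported as present. */
static int has_vaes_prereqs(const struct x86_features *f)
{
    return f->ssse3 && f->avx && f->avx2;
}
```

A build script or dispatcher could call probe_features() once and skip AVX2-dependent code paths when has_vaes_prereqs() returns 0.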

Compiler output


ocb_vaes.c: ocb_vaes.c:476:15: error: always_inline function '_mm_shuffle_epi8' requires target feature 'ssse3', but would be inlined into function 'ae_init' that is compiled without support for 'ssse3'
ocb_vaes.c:     tmp_blk = reverse_bytes(load128(&ctx->Lstar));
ocb_vaes.c:               ^
ocb_vaes.c: ocb_vaes.c:155:5: note: expanded from macro 'reverse_bytes'
ocb_vaes.c:     _mm_shuffle_epi8(b,_mm_set_epi8(0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15))
ocb_vaes.c:     ^
ocb_vaes.c: ocb_vaes.c:478:29: error: always_inline function '_mm_shuffle_epi8' requires target feature 'ssse3', but would be inlined into function 'ae_init' that is compiled without support for 'ssse3'
ocb_vaes.c:     store128(&ctx->Ldollar, reverse_bytes(tmp_blk));
ocb_vaes.c:                             ^
ocb_vaes.c: ocb_vaes.c:155:5: note: expanded from macro 'reverse_bytes'
ocb_vaes.c:     _mm_shuffle_epi8(b,_mm_set_epi8(0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15))
ocb_vaes.c:     ^
ocb_vaes.c: ocb_vaes.c:480:24: error: always_inline function '_mm_shuffle_epi8' requires target feature 'ssse3', but would be inlined into function 'ae_init' that is compiled without support for 'ssse3'
ocb_vaes.c:     store128(ctx->L+0, reverse_bytes(tmp_blk));
ocb_vaes.c:                        ^
ocb_vaes.c: ocb_vaes.c:155:5: note: expanded from macro 'reverse_bytes'
ocb_vaes.c:     _mm_shuffle_epi8(b,_mm_set_epi8(0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15))
ocb_vaes.c:     ^
ocb_vaes.c: ocb_vaes.c:483:25: error: always_inline function '_mm_shuffle_epi8' requires target feature 'ssse3', but would be inlined into function 'ae_init' that is compiled without support for 'ssse3'
ocb_vaes.c:         store128(ctx->L+i, reverse_bytes(tmp_blk));
ocb_vaes.c:                            ^
ocb_vaes.c: ocb_vaes.c:155:5: note: expanded from macro 'reverse_bytes'
ocb_vaes.c:     _mm_shuffle_epi8(b,_mm_set_epi8(0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15))
ocb_vaes.c:     ^
ocb_vaes.c: 4 errors generated.

Number of similar (implementation,compiler) pairs: 1, namely:
Implementation | Compiler
T:vaes | clang -mcpu=native -O3 -fwrapv -Qunused-arguments -fPIC -fPIE -gdwarf-4 -Wall (Debian_Clang_11.0.1)
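
Note on the 'ssse3' errors above: with -mcpu=native this clang invocation compiles ae_init without SSSE3, so the reverse_bytes macro, which expands to _mm_shuffle_epi8, cannot be inlined. For illustration only, the same 16-byte reversal can be expressed with SSE2 instructions alone. The function name reverse_bytes_sse2 is hypothetical and is not part of ocb_vaes.c.

```c
#include <emmintrin.h>  /* SSE2 only; _MM_SHUFFLE is provided via xmmintrin.h */

/* Hypothetical SSE2-only stand-in for reverse_bytes: reverses the order
   of all 16 bytes without using _mm_shuffle_epi8 (SSSE3). */
static __m128i reverse_bytes_sse2(__m128i b)
{
    /* Swap the two bytes inside every 16-bit lane. */
    b = _mm_or_si128(_mm_slli_epi16(b, 8), _mm_srli_epi16(b, 8));
    /* Reverse the four 16-bit lanes within the low and high 64-bit halves. */
    b = _mm_shufflelo_epi16(b, _MM_SHUFFLE(0, 1, 2, 3));
    b = _mm_shufflehi_epi16(b, _MM_SHUFFLE(0, 1, 2, 3));
    /* Swap the two 64-bit halves. */
    return _mm_shuffle_epi32(b, _MM_SHUFFLE(1, 0, 3, 2));
}
```

Applied to a block holding the bytes 0, 1, ..., 15 in memory order, the function returns a block holding 15, 14, ..., 0.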

Compiler output


ocb_vaes.c: ocb_vaes.c: In function 'ae_encrypt':
ocb_vaes.c: ocb_vaes.c:668:17: warning: AVX vector return without AVX enabled changes the ABI [-Wpsabi]
ocb_vaes.c:   668 |         k256[i] = _mm256_broadcastsi128_si256(load128(ctx->encrypt_key.rd_key+i));
ocb_vaes.c:       |         ~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
ocb_vaes.c: In file included from /usr/lib/gcc/x86_64-linux-gnu/10/include/immintrin.h:39,
ocb_vaes.c:                  from ocb_vaes.c:71:
ocb_vaes.c: ocb_vaes.c: In function 'AES_128_Key_Expansion':
ocb_vaes.c: /usr/lib/gcc/x86_64-linux-gnu/10/include/wmmintrin.h:87:1: error: inlining failed in call to 'always_inline' '_mm_aeskeygenassist_si128': target specific option mismatch
ocb_vaes.c:    87 | _mm_aeskeygenassist_si128 (__m128i __X, const int __C)
ocb_vaes.c:       | ^~~~~~~~~~~~~~~~~~~~~~~~~
ocb_vaes.c: ocb_vaes.c:181:10: note: called from here
ocb_vaes.c:   181 |     v2 = _mm_aeskeygenassist_si128(v4,aes_const);                           \
ocb_vaes.c:       |          ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
ocb_vaes.c: ocb_vaes.c:204:5: note: in expansion of macro 'EXPAND_ASSIST'
ocb_vaes.c:   204 |     EXPAND_ASSIST(x0,x1,x2,x0,255,54);  store128((__m128i *)key+10, x0);
ocb_vaes.c:       |     ^~~~~~~~~~~~~
ocb_vaes.c: In file included from /usr/lib/gcc/x86_64-linux-gnu/10/include/immintrin.h:39,
ocb_vaes.c:                  from ocb_vaes.c:71:
ocb_vaes.c: /usr/lib/gcc/x86_64-linux-gnu/10/include/wmmintrin.h:87:1: error: inlining failed in call to 'always_inline' '_mm_aeskeygenassist_si128': target specific option mismatch
ocb_vaes.c:    87 | _mm_aeskeygenassist_si128 (__m128i __X, const int __C)
ocb_vaes.c:       | ^~~~~~~~~~~~~~~~~~~~~~~~~
ocb_vaes.c: ocb_vaes.c:181:10: note: called from here
ocb_vaes.c:   181 |     v2 = _mm_aeskeygenassist_si128(v4,aes_const);                           \
ocb_vaes.c:       |          ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
ocb_vaes.c: ocb_vaes.c:203:5: note: in expansion of macro 'EXPAND_ASSIST'
ocb_vaes.c: ...

Number of similar (implementation,compiler) pairs: 4, namely:
Implementation | Compiler
T:vaes | gcc -march=native -mtune=native -O2 -fwrapv -fPIC -fPIE -gdwarf-4 -Wall (10.2.1_20210110)
T:vaes | gcc -march=native -mtune=native -O3 -fwrapv -fPIC -fPIE -gdwarf-4 -Wall (10.2.1_20210110)
T:vaes | gcc -march=native -mtune=native -O -fwrapv -fPIC -fPIE -gdwarf-4 -Wall (10.2.1_20210110)
T:vaes | gcc -march=native -mtune=native -Os -fwrapv -fPIC -fPIE -gdwarf-4 -Wall (10.2.1_20210110)