Implementation notes: amd64, bolero, crypto_aead/deoxysii256v141

Computer: bolero
Microarchitecture: amd64; Broadwell+AES (406f1)
Architecture: amd64
CPU ID: GenuineIntel-000406f1-1fc9cbf5
SUPERCOP version: 20240625
Operation: crypto_aead
Primitive: deoxysii256v141
Time | Object size | Test size | Implementation | Compiler | Benchmark date | SUPERCOP version
11632 | 47367 0 0 | 64181 784 928 | T:aesni | gcc_-march=native_-mtune=native_-O3_-fwrapv_-fPIC_-fPIE_-gdwarf-4_-Wall | 20240626 | 20240625
11756 | 37451 0 0 | 50608 760 896 | T:aesni | gcc_-march=native_-mtune=native_-Os_-fwrapv_-fPIC_-fPIE_-gdwarf-4_-Wall | 20240626 | 20240625
11932 | 44633 0 0 | 61300 816 872 | T:aesni | clang_-march=native_-O3_-fwrapv_-Qunused-arguments_-fPIC_-fPIE_-gdwarf-4_-Wall | 20240626 | 20240625
12000 | 44805 0 0 | 59605 784 928 | T:aesni | gcc_-march=native_-mtune=native_-O2_-fwrapv_-fPIC_-fPIE_-gdwarf-4_-Wall | 20240626 | 20240625
12368 | 44587 0 0 | 58334 808 920 | T:aesni | clang_-march=native_-Os_-fwrapv_-Qunused-arguments_-fPIC_-fPIE_-gdwarf-4_-Wall | 20240626 | 20240625
12388 | 44762 0 0 | 57748 816 856 | T:aesni | clang_-march=native_-O_-fwrapv_-Qunused-arguments_-fPIC_-fPIE_-gdwarf-4_-Wall | 20240626 | 20240625
13804 | 43988 0 0 | 58404 776 928 | T:aesni | gcc_-march=native_-mtune=native_-O_-fwrapv_-fPIC_-fPIE_-gdwarf-4_-Wall | 20240626 | 20240625
16721 | 44601 0 0 | 60956 816 872 | T:aesni | clang_-march=native_-O2_-fwrapv_-Qunused-arguments_-fPIC_-fPIE_-gdwarf-4_-Wall | 20240626 | 20240625
117568 | 148110 0 624 | 164877 784 1552 | T:bitslice | gcc_-march=native_-mtune=native_-O3_-fwrapv_-fPIC_-fPIE_-gdwarf-4_-Wall | 20240626 | 20240625
129600 | 126727 0 624 | 141429 784 1552 | T:bitslice | gcc_-march=native_-mtune=native_-O2_-fwrapv_-fPIC_-fPIE_-gdwarf-4_-Wall | 20240626 | 20240625
134272 | 213378 0 624 | 227805 784 1552 | T:bitslice | gcc_-march=native_-mtune=native_-O_-fwrapv_-fPIC_-fPIE_-gdwarf-4_-Wall | 20240626 | 20240625
145124 | 121146 0 624 | 134224 760 1520 | T:bitslice | gcc_-march=native_-mtune=native_-Os_-fwrapv_-fPIC_-fPIE_-gdwarf-4_-Wall | 20240626 | 20240625
669000 | 33511 0 592 | 47630 808 1520 | T:table | clang_-march=native_-Os_-fwrapv_-Qunused-arguments_-fPIC_-fPIE_-gdwarf-4_-Wall | 20240626 | 20240625
690444 | 35906 0 592 | 53140 816 1472 | T:table | clang_-march=native_-O2_-fwrapv_-Qunused-arguments_-fPIC_-fPIE_-gdwarf-4_-Wall | 20240626 | 20240625
700504 | 35906 0 592 | 53452 816 1472 | T:table | clang_-march=native_-O3_-fwrapv_-Qunused-arguments_-fPIC_-fPIE_-gdwarf-4_-Wall | 20240626 | 20240625
709972 | 35631 0 592 | 51628 816 1456 | T:table | clang_-mcpu=native_-O3_-fwrapv_-Qunused-arguments_-fPIC_-fPIE_-gdwarf-4_-Wall | 20240626 | 20240625
829656 | 38593 0 624 | 55421 784 1552 | T:table | gcc_-march=native_-mtune=native_-O3_-fwrapv_-fPIC_-fPIE_-gdwarf-4_-Wall | 20240626 | 20240625
881404 | 36354 0 624 | 51157 784 1552 | T:table | gcc_-march=native_-mtune=native_-O2_-fwrapv_-fPIC_-fPIE_-gdwarf-4_-Wall | 20240626 | 20240625
974196 | 25585 0 0 | 42828 816 872 | T:ref | clang_-march=native_-O2_-fwrapv_-Qunused-arguments_-fPIC_-fPIE_-gdwarf-4_-Wall | 20240626 | 20240625
1041980 | 34615 0 592 | 48100 816 1456 | T:table | clang_-march=native_-O_-fwrapv_-Qunused-arguments_-fPIC_-fPIE_-gdwarf-4_-Wall | 20240626 | 20240625
1093732 | 28529 0 0 | 45333 784 928 | T:ref | gcc_-march=native_-mtune=native_-O3_-fwrapv_-fPIC_-fPIE_-gdwarf-4_-Wall | 20240626 | 20240625
1358064 | 27249 0 0 | 44820 816 872 | T:ref | clang_-march=native_-O3_-fwrapv_-Qunused-arguments_-fPIC_-fPIE_-gdwarf-4_-Wall | 20240626 | 20240625
1360204 | 26963 0 0 | 42980 816 856 | T:ref | clang_-mcpu=native_-O3_-fwrapv_-Qunused-arguments_-fPIC_-fPIE_-gdwarf-4_-Wall | 20240626 | 20240625
1664100 | 35970 0 624 | 50517 784 1552 | T:table | gcc_-march=native_-mtune=native_-O_-fwrapv_-fPIC_-fPIE_-gdwarf-4_-Wall | 20240626 | 20240625
1977139 | 34453 0 624 | 47624 760 1520 | T:table | gcc_-march=native_-mtune=native_-Os_-fwrapv_-fPIC_-fPIE_-gdwarf-4_-Wall | 20240626 | 20240625
2277552 | 24235 0 0 | 37740 816 856 | T:ref | clang_-march=native_-O_-fwrapv_-Qunused-arguments_-fPIC_-fPIE_-gdwarf-4_-Wall | 20240626 | 20240625
2453420 | 23714 0 0 | 37822 808 920 | T:ref | clang_-march=native_-Os_-fwrapv_-Qunused-arguments_-fPIC_-fPIE_-gdwarf-4_-Wall | 20240626 | 20240625
2897444 | 26058 0 0 | 40829 784 928 | T:ref | gcc_-march=native_-mtune=native_-O2_-fwrapv_-fPIC_-fPIE_-gdwarf-4_-Wall | 20240626 | 20240625
4830444 | 25466 0 0 | 39973 784 928 | T:ref | gcc_-march=native_-mtune=native_-O_-fwrapv_-fPIC_-fPIE_-gdwarf-4_-Wall | 20240626 | 20240625
5448716 | 24028 0 0 | 37192 760 896 | T:ref | gcc_-march=native_-mtune=native_-Os_-fwrapv_-fPIC_-fPIE_-gdwarf-4_-Wall | 20240626 | 20240625

Compiler output


deoxys.c: deoxys.c:104:11: error: always_inline function '_mm_shuffle_epi8' requires target feature 'ssse3', but would be inlined into function 'deoxys_aead_encrypt' that is compiled without support for 'ssse3'
deoxys.c:     tmp = permute( tmp, H_PERMUTATION );
deoxys.c:           ^
deoxys.c: ./tweakable-cipher.macros:7:22: note: expanded from macro 'permute'
deoxys.c: #define permute(a,b) _mm_shuffle_epi8(a,b)
deoxys.c:                      ^
deoxys.c: deoxys.c:112:3: error: always_inline function '_mm_shuffle_epi8' requires target feature 'ssse3', but would be inlined into function 'deoxys_aead_encrypt' that is compiled without support for 'ssse3'
deoxys.c:   TWEAKEY_SCHEDULE3( tsubkeys1,tsubkeys2,subkeys, key);
deoxys.c:   ^
deoxys.c: ./tweakable-cipher.macros:43:3: note: expanded from macro 'TWEAKEY_SCHEDULE3'
deoxys.c:   ONE_KEY_ROUND( subkeys1[ 0], subkeys1[ 1], subkeys2[ 0], subkeys2[ 1] );      ts[ 1] = xor( xor(subkeys1[ 1],subkeys2[ 1]), RCONS[ 1] ); \
deoxys.c:   ^
deoxys.c: ./tweakable-cipher.macros:34:16: note: expanded from macro 'ONE_KEY_ROUND'
deoxys.c:     new_key1 = permute( new_key1, H_PERMUTATION);\
deoxys.c:                ^
deoxys.c: ./tweakable-cipher.macros:7:22: note: expanded from macro 'permute'
deoxys.c: #define permute(a,b) _mm_shuffle_epi8(a,b)
deoxys.c:                      ^
deoxys.c: deoxys.c:112:3: error: always_inline function '_mm_shuffle_epi8' requires target feature 'ssse3', but would be inlined into function 'deoxys_aead_encrypt' that is compiled without support for 'ssse3'
deoxys.c: ./tweakable-cipher.macros:43:3: note: expanded from macro 'TWEAKEY_SCHEDULE3'
deoxys.c:   ONE_KEY_ROUND( subkeys1[ 0], subkeys1[ 1], subkeys2[ 0], subkeys2[ 1] );      ts[ 1] = xor( xor(subkeys1[ 1],subkeys2[ 1]), RCONS[ 1] ); \
deoxys.c:   ^
deoxys.c: ./tweakable-cipher.macros:35:16: note: expanded from macro 'ONE_KEY_ROUND'
deoxys.c:     new_key2 = permute( new_key2, H_PERMUTATION);
deoxys.c:                ^
deoxys.c: ...

Number of similar (implementation,compiler) pairs: 1, namely:
Implementation | Compiler
T:aesni | clang -mcpu=native -O3 -fwrapv -Qunused-arguments -fPIC -fPIE -gdwarf-4 -Wall (Ubuntu_Clang_14.0.0)
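
Note: the errors above say that `_mm_shuffle_epi8` cannot be inlined into a function compiled without SSSE3. The same implementation built with `clang -march=native` does appear in the results table, so the failure seems tied to the `-mcpu=native` flag set rather than to the machine. One possible way to make such a call site independent of the command-line flags is a per-function target attribute. The following is a minimal sketch under that assumption; `permute_bytes` is a hypothetical helper and is not taken from deoxys.c.

```c
#include <immintrin.h>
#include <stdint.h>

/* Hypothetical helper, not part of the benchmarked sources.
   The target attribute enables SSSE3 for this one function, so the
   always_inline intrinsic _mm_shuffle_epi8 may be inlined here even when
   the rest of the translation unit is built without SSSE3 support. */
__attribute__((target("ssse3")))
void permute_bytes(uint8_t out[16], const uint8_t in[16], const uint8_t idx[16])
{
    __m128i v = _mm_loadu_si128((const __m128i *)in);   /* data bytes    */
    __m128i m = _mm_loadu_si128((const __m128i *)idx);  /* byte selector */
    /* For selector bytes with the top bit clear: out[i] = in[idx[i] & 15]. */
    _mm_storeu_si128((__m128i *)out, _mm_shuffle_epi8(v, m));
}
```

The attribute only affects code generation; the caller is still responsible for running this function on a CPU that supports SSSE3.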

Compiler output


deoxys.c: deoxys.c:94:16: warning: variable 'Auth' is uninitialized when used here [-Wuninitialized]
deoxys.c:     Auth = xor(Auth, Auth);
deoxys.c:                ^~~~
deoxys.c: ./tweakable-cipher.macros:5:32: note: expanded from macro 'xor'
deoxys.c: #define xor(a,b) _mm_xor_si128(a,b)
deoxys.c:                                ^
deoxys.c: deoxys.c:73:5: note: variable 'Auth' is declared here
deoxys.c:     __m128i Auth;
deoxys.c:     ^
deoxys.c: deoxys.c:95:18: warning: variable 'Tweak' is uninitialized when used here [-Wuninitialized]
deoxys.c:     Tweak = xor( Tweak, Tweak );
deoxys.c:                  ^~~~~
deoxys.c: ./tweakable-cipher.macros:5:32: note: expanded from macro 'xor'
deoxys.c: #define xor(a,b) _mm_xor_si128(a,b)
deoxys.c:                                ^
deoxys.c: deoxys.c:71:5: note: variable 'Tweak' is declared here
deoxys.c:     __m128i Tweak;
deoxys.c:     ^
deoxys.c: deoxys.c:1014:16: warning: variable 'Auth' is uninitialized when used here [-Wuninitialized]
deoxys.c:     Auth = xor(Auth, Auth);
deoxys.c:                ^~~~
deoxys.c: ./tweakable-cipher.macros:5:32: note: expanded from macro 'xor'
deoxys.c: #define xor(a,b) _mm_xor_si128(a,b)
deoxys.c:                                ^
deoxys.c: deoxys.c:992:5: note: variable 'Auth' is declared here
deoxys.c: ...

Number of similar (implementation,compiler) pairs: 4, namely:
Implementation | Compiler
T:aesnis | clang -march=native -O2 -fwrapv -Qunused-arguments -fPIC -fPIE -gdwarf-4 -Wall (Ubuntu_Clang_14.0.0)
T:aesnis | clang -march=native -O3 -fwrapv -Qunused-arguments -fPIC -fPIE -gdwarf-4 -Wall (Ubuntu_Clang_14.0.0)
T:aesnis | clang -march=native -O -fwrapv -Qunused-arguments -fPIC -fPIE -gdwarf-4 -Wall (Ubuntu_Clang_14.0.0)
T:aesnis | clang -march=native -Os -fwrapv -Qunused-arguments -fPIC -fPIE -gdwarf-4 -Wall (Ubuntu_Clang_14.0.0)
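
Note: the `-Wuninitialized` warnings above are triggered by statements such as `Auth = xor(Auth, Auth);`, which zero a vector by XORing it with itself before it has been assigned. A possible warning-free alternative is to obtain the zero value from `_mm_setzero_si128()`, which produces the same all-zero 128-bit value without reading the variable first. The following is a minimal sketch; `zero_block` is a hypothetical helper that is not part of the benchmarked sources.

```c
#include <emmintrin.h>
#include <stdint.h>

/* Hypothetical helper, not part of the benchmarked sources.
   Writes 16 zero bytes to out via an explicitly zero-initialized vector,
   so no uninitialized __m128i is ever read. */
void zero_block(uint8_t out[16])
{
    __m128i Auth = _mm_setzero_si128();      /* replaces Auth = xor(Auth, Auth) */
    _mm_storeu_si128((__m128i *)out, Auth);  /* unaligned store of the 16 bytes */
}
```

This sketch only shows the initialization pattern; whether the maintainers of the implementation would adopt it is not stated in these notes.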

Compiler output


deoxys.c: deoxys.c:94:16: warning: variable 'Auth' is uninitialized when used here [-Wuninitialized]
deoxys.c:     Auth = xor(Auth, Auth);
deoxys.c:                ^~~~
deoxys.c: ./tweakable-cipher.macros:5:32: note: expanded from macro 'xor'
deoxys.c: #define xor(a,b) _mm_xor_si128(a,b)
deoxys.c:                                ^
deoxys.c: deoxys.c:73:5: note: variable 'Auth' is declared here
deoxys.c:     __m128i Auth;
deoxys.c:     ^
deoxys.c: deoxys.c:95:18: warning: variable 'Tweak' is uninitialized when used here [-Wuninitialized]
deoxys.c:     Tweak = xor( Tweak, Tweak );
deoxys.c:                  ^~~~~
deoxys.c: ./tweakable-cipher.macros:5:32: note: expanded from macro 'xor'
deoxys.c: #define xor(a,b) _mm_xor_si128(a,b)
deoxys.c:                                ^
deoxys.c: deoxys.c:71:5: note: variable 'Tweak' is declared here
deoxys.c:     __m128i Tweak;
deoxys.c:     ^
deoxys.c: deoxys.c:84:5: error: always_inline function '_mm_shuffle_epi8' requires target feature 'ssse3', but would be inlined into function 'deoxys_aead_encrypt' that is compiled without support for 'ssse3'
deoxys.c:     TWEAKEY_SCHEDULE3( subkeys, key, tmp,tmp2,tmp3,tmp4 );
deoxys.c:     ^
deoxys.c: ./tweakable-cipher.macros:40:3: note: expanded from macro 'TWEAKEY_SCHEDULE3'
deoxys.c:   ONE_KEY_ROUND( tmp1, tmp2, tmp3, tmp4 );      subkeys[ 1] = xor( xor(tmp2,tmp4), RCONST( 1) ); \
deoxys.c:   ^
deoxys.c: ./tweakable-cipher.macros:33:16: note: expanded from macro 'ONE_KEY_ROUND'
deoxys.c: ...

Number of similar (implementation,compiler) pairs: 1, namely:
Implementation | Compiler
T:aesnis | clang -mcpu=native -O3 -fwrapv -Qunused-arguments -fPIC -fPIE -gdwarf-4 -Wall (Ubuntu_Clang_14.0.0)

Compiler output


deoxys.c: In file included from deoxys.c:30:
deoxys.c: deoxys.c: In function 'deoxys_aead_encrypt':
deoxys.c: tweakable-cipher.macros:5:18: warning: 'Auth' is used uninitialized [-Wuninitialized]
deoxys.c:     5 | #define xor(a,b) _mm_xor_si128(a,b)
deoxys.c:       |                  ^~~~~~~~~~~~~
deoxys.c: deoxys.c:73:13: note: 'Auth' was declared here
deoxys.c:    73 |     __m128i Auth;
deoxys.c:       |             ^~~~
deoxys.c: In file included from deoxys.c:30:
deoxys.c: tweakable-cipher.macros:5:18: warning: 'Tweak' is used uninitialized [-Wuninitialized]
deoxys.c:     5 | #define xor(a,b) _mm_xor_si128(a,b)
deoxys.c:       |                  ^~~~~~~~~~~~~
deoxys.c: deoxys.c:71:13: note: 'Tweak' was declared here
deoxys.c:    71 |     __m128i Tweak;
deoxys.c:       |             ^~~~~
deoxys.c: In file included from deoxys.c:30:
deoxys.c: deoxys.c: In function 'deoxys_aead_decrypt':
deoxys.c: tweakable-cipher.macros:5:18: warning: 'Auth' is used uninitialized [-Wuninitialized]
deoxys.c:     5 | #define xor(a,b) _mm_xor_si128(a,b)
deoxys.c:       |                  ^~~~~~~~~~~~~
deoxys.c: deoxys.c:992:13: note: 'Auth' was declared here
deoxys.c:   992 |     __m128i Auth;
deoxys.c:       |             ^~~~
deoxys.c: In file included from deoxys.c:30:
deoxys.c: tweakable-cipher.macros:5:18: warning: 'Tweak' is used uninitialized [-Wuninitialized]
deoxys.c: ...

Number of similar (implementation,compiler) pairs: 4, namely:
Implementation | Compiler
T:aesnis | gcc -march=native -mtune=native -O2 -fwrapv -fPIC -fPIE -gdwarf-4 -Wall (11.4.0)
T:aesnis | gcc -march=native -mtune=native -O3 -fwrapv -fPIC -fPIE -gdwarf-4 -Wall (11.4.0)
T:aesnis | gcc -march=native -mtune=native -O -fwrapv -fPIC -fPIE -gdwarf-4 -Wall (11.4.0)
T:aesnis | gcc -march=native -mtune=native -Os -fwrapv -fPIC -fPIE -gdwarf-4 -Wall (11.4.0)

Compiler output


deoxysii256.c: deoxysii256.c:312:14: warning: variable 'TEMP' is uninitialized when used here [-Wuninitialized]
deoxysii256.c:     TEMP=XOR(TEMP,TEMP);
deoxysii256.c:              ^~~~
deoxysii256.c: ./deoxysii256.macros:38:39: note: expanded from macro 'XOR'
deoxysii256.c: #define XOR(a,b)        _mm_xor_si128(a,b)
deoxysii256.c:                                       ^
deoxysii256.c: deoxysii256.c:132:5: note: variable 'TEMP' is declared here
deoxysii256.c:     __m128i Tweak, Tweak1, TEMP;
deoxysii256.c:     ^
deoxysii256.c: deoxysii256.c:149:14: warning: variable 'AUTH' is uninitialized when used here [-Wuninitialized]
deoxysii256.c:     AUTH=XOR(AUTH,AUTH);
deoxysii256.c:              ^~~~
deoxysii256.c: ./deoxysii256.macros:38:39: note: expanded from macro 'XOR'
deoxysii256.c: #define XOR(a,b)        _mm_xor_si128(a,b)
deoxysii256.c:                                       ^
deoxysii256.c: deoxysii256.c:134:5: note: variable 'AUTH' is declared here
deoxysii256.c:     __m128i AUTH;
deoxysii256.c:     ^
deoxysii256.c: deoxysii256.c:516:14: warning: variable 'TEMP' is uninitialized when used here [-Wuninitialized]
deoxysii256.c:     TEMP=XOR(TEMP,TEMP);
deoxysii256.c:              ^~~~
deoxysii256.c: ./deoxysii256.macros:38:39: note: expanded from macro 'XOR'
deoxysii256.c: #define XOR(a,b)        _mm_xor_si128(a,b)
deoxysii256.c:                                       ^
deoxysii256.c: deoxysii256.c:408:5: note: variable 'TEMP' is declared here
deoxysii256.c: ...

Number of similar (implementation,compiler) pairs: 4, namely:
Implementation | Compiler
T:bitslice | clang -march=native -O2 -fwrapv -Qunused-arguments -fPIC -fPIE -gdwarf-4 -Wall (Ubuntu_Clang_14.0.0)
T:bitslice | clang -march=native -O3 -fwrapv -Qunused-arguments -fPIC -fPIE -gdwarf-4 -Wall (Ubuntu_Clang_14.0.0)
T:bitslice | clang -march=native -O -fwrapv -Qunused-arguments -fPIC -fPIE -gdwarf-4 -Wall (Ubuntu_Clang_14.0.0)
T:bitslice | clang -march=native -Os -fwrapv -Qunused-arguments -fPIC -fPIE -gdwarf-4 -Wall (Ubuntu_Clang_14.0.0)

Compiler output


deoxysBCii256.c: deoxysBCii256.c:237:5: error: always_inline function '_mm_shuffle_epi8' requires target feature 'ssse3', but would be inlined into function 'DeoxysEncrypt_Auth' that is compiled without support for 'ssse3'
deoxysBCii256.c:     packing(a);
deoxysBCii256.c:     ^
deoxysBCii256.c: ./deoxysii256.macros:473:14: note: expanded from macro 'packing'
deoxysBCii256.c:     (x)[0] = shuffle_pack((x)[0]);\
deoxysBCii256.c:              ^
deoxysBCii256.c: ./deoxysii256.macros:32:25: note: expanded from macro 'shuffle_pack'
deoxysBCii256.c: #define shuffle_pack(a) permute(a, SET8(15,11,7,3,14,10,6,2,13,9,5,1,12,8,4,0) )
deoxysBCii256.c:                         ^
deoxysBCii256.c: ./deoxysii256.macros:31:25: note: expanded from macro 'permute'
deoxysBCii256.c: #define permute(a,b)    _mm_shuffle_epi8(a,b)
deoxysBCii256.c:                         ^
deoxysBCii256.c: deoxysBCii256.c:237:5: error: always_inline function '_mm_shuffle_epi8' requires target feature 'ssse3', but would be inlined into function 'DeoxysEncrypt_Auth' that is compiled without support for 'ssse3'
deoxysBCii256.c: ./deoxysii256.macros:474:14: note: expanded from macro 'packing'
deoxysBCii256.c:     (x)[1] = shuffle_pack((x)[1]);\
deoxysBCii256.c:              ^
deoxysBCii256.c: ./deoxysii256.macros:32:25: note: expanded from macro 'shuffle_pack'
deoxysBCii256.c: #define shuffle_pack(a) permute(a, SET8(15,11,7,3,14,10,6,2,13,9,5,1,12,8,4,0) )
deoxysBCii256.c:                         ^
deoxysBCii256.c: ./deoxysii256.macros:31:25: note: expanded from macro 'permute'
deoxysBCii256.c: #define permute(a,b)    _mm_shuffle_epi8(a,b)
deoxysBCii256.c:                         ^
deoxysBCii256.c: deoxysBCii256.c:237:5: error: always_inline function '_mm_shuffle_epi8' requires target feature 'ssse3', but would be inlined into function 'DeoxysEncrypt_Auth' that is compiled without support for 'ssse3'
deoxysBCii256.c: ./deoxysii256.macros:475:14: note: expanded from macro 'packing'
deoxysBCii256.c:     (x)[2] = shuffle_pack((x)[2]);\
deoxysBCii256.c: ...

Number of similar (implementation,compiler) pairs: 1, namely:
Implementation | Compiler
T:bitslice | clang -mcpu=native -O3 -fwrapv -Qunused-arguments -fPIC -fPIE -gdwarf-4 -Wall (Ubuntu_Clang_14.0.0)

Compiler output


deoxysii256.c: In file included from deoxysii256.c:30:
deoxysii256.c: deoxysii256.c: In function 'deoxys_aead_encrypt_8':
deoxysii256.c: deoxysii256.macros:38:25: warning: 'TEMP' is used uninitialized [-Wuninitialized]
deoxysii256.c:    38 | #define XOR(a,b)        _mm_xor_si128(a,b)
deoxysii256.c:       |                         ^~~~~~~~~~~~~
deoxysii256.c: deoxysii256.c:132:28: note: 'TEMP' was declared here
deoxysii256.c:   132 |     __m128i Tweak, Tweak1, TEMP;
deoxysii256.c:       |                            ^~~~
deoxysii256.c: In file included from deoxysii256.c:30:
deoxysii256.c: deoxysii256.c: In function 'deoxys_aead_decrypt_8':
deoxysii256.c: deoxysii256.macros:38:25: warning: 'TEMP' is used uninitialized [-Wuninitialized]
deoxysii256.c:    38 | #define XOR(a,b)        _mm_xor_si128(a,b)
deoxysii256.c:       |                         ^~~~~~~~~~~~~
deoxysii256.c: deoxysii256.c:408:28: note: 'TEMP' was declared here
deoxysii256.c:   408 |     __m128i Tweak, Tweak1, TEMP;
deoxysii256.c:       |                            ^~~~

Number of similar (implementation,compiler) pairs: 4, namely:
Implementation | Compiler
T:bitslice | gcc -march=native -mtune=native -O2 -fwrapv -fPIC -fPIE -gdwarf-4 -Wall (11.4.0)
T:bitslice | gcc -march=native -mtune=native -O3 -fwrapv -fPIC -fPIE -gdwarf-4 -Wall (11.4.0)
T:bitslice | gcc -march=native -mtune=native -O -fwrapv -fPIC -fPIE -gdwarf-4 -Wall (11.4.0)
T:bitslice | gcc -march=native -mtune=native -Os -fwrapv -fPIC -fPIE -gdwarf-4 -Wall (11.4.0)