Implementation notes: amd64, dali, crypto_aead/deoxysi128v141

Computer: dali
Microarchitecture: amd64; Zen (820f01)
Architecture: amd64
CPU ID: AuthenticAMD-00820f01-178bfbff
SUPERCOP version: 20240625
Operation: crypto_aead
Primitive: deoxysi128v141
TimeObject sizeTest sizeImplementationCompilerBenchmark dateSUPERCOP version
656728610 0 044096 780 1080T:aesnigcc_-march=native_-mtune=native_-O3_-fwrapv_-fPIC_-fPIE_-gdwarf-4_-Wall2024062820240625
684825013 0 037931 756 1048T:aesnigcc_-march=native_-mtune=native_-Os_-fwrapv_-fPIC_-fPIE_-gdwarf-4_-Wall2024062820240625
688027511 0 041672 780 1080T:aesnigcc_-march=native_-mtune=native_-O_-fwrapv_-fPIC_-fPIE_-gdwarf-4_-Wall2024062820240625
706329322 0 044472 812 1048T:aesniclang_-march=native_-O2_-fwrapv_-Qunused-arguments_-fPIC_-fPIE_-gdwarf-4_-Wall2024062820240625
706629434 0 044728 812 1048T:aesniclang_-march=native_-O3_-fwrapv_-Qunused-arguments_-fPIC_-fPIE_-gdwarf-4_-Wall2024062820240625
716328264 0 041670 804 1016T:aesniclang_-march=native_-O_-fwrapv_-Qunused-arguments_-fPIC_-fPIE_-gdwarf-4_-Wall2024062820240625
726327872 0 041070 804 1016T:aesniclang_-march=native_-Os_-fwrapv_-Qunused-arguments_-fPIC_-fPIE_-gdwarf-4_-Wall2024062820240625
8331123809 0 0139184 812 1048T:aesnisclang_-march=native_-O2_-fwrapv_-Qunused-arguments_-fPIC_-fPIE_-gdwarf-4_-Wall2024062820240625
8459123823 0 0137150 804 1016T:aesnisclang_-march=native_-O_-fwrapv_-Qunused-arguments_-fPIC_-fPIE_-gdwarf-4_-Wall2024062820240625
8494123664 0 0136686 804 1016T:aesnisclang_-march=native_-Os_-fwrapv_-Qunused-arguments_-fPIC_-fPIE_-gdwarf-4_-Wall2024062820240625
8530123818 0 0139328 812 1048T:aesnisclang_-march=native_-O3_-fwrapv_-Qunused-arguments_-fPIC_-fPIE_-gdwarf-4_-Wall2024062820240625
8760120002 0 0135552 780 1080T:aesnisgcc_-march=native_-mtune=native_-O3_-fwrapv_-fPIC_-fPIE_-gdwarf-4_-Wall2024062820240625
8801120069 0 0134624 780 1080T:aesnisgcc_-march=native_-mtune=native_-O2_-fwrapv_-fPIC_-fPIE_-gdwarf-4_-Wall2024062820240625
890427690 0 042208 780 1080T:aesnigcc_-march=native_-mtune=native_-O2_-fwrapv_-fPIC_-fPIE_-gdwarf-4_-Wall2024062820240625
8980115988 0 0130184 780 1080T:aesnisgcc_-march=native_-mtune=native_-O_-fwrapv_-fPIC_-fPIE_-gdwarf-4_-Wall2024062820240625
9037109668 0 0122587 756 1048T:aesnisgcc_-march=native_-mtune=native_-Os_-fwrapv_-fPIC_-fPIE_-gdwarf-4_-Wall2024062820240625
6010796095 0 547110784 812 1592T:bitsliceclang_-march=native_-O3_-fwrapv_-Qunused-arguments_-fPIC_-fPIE_-gdwarf-4_-Wall2024062820240625
6052496095 0 547110656 812 1592T:bitsliceclang_-march=native_-O2_-fwrapv_-Qunused-arguments_-fPIC_-fPIE_-gdwarf-4_-Wall2024062820240625
68853114978 0 592130488 780 1688T:bitslicegcc_-march=native_-mtune=native_-O3_-fwrapv_-fPIC_-fPIE_-gdwarf-4_-Wall2024062820240625
7176895191 0 547107486 804 1592T:bitsliceclang_-march=native_-Os_-fwrapv_-Qunused-arguments_-fPIC_-fPIE_-gdwarf-4_-Wall2024062820240625
7367394288 0 547107198 804 1592T:bitsliceclang_-march=native_-O_-fwrapv_-Qunused-arguments_-fPIC_-fPIE_-gdwarf-4_-Wall2024062820240625
7467295956 0 592110520 780 1688T:bitslicegcc_-march=native_-mtune=native_-O2_-fwrapv_-fPIC_-fPIE_-gdwarf-4_-Wall2024062820240625
77826148879 0 592163088 780 1688T:bitslicegcc_-march=native_-mtune=native_-O_-fwrapv_-fPIC_-fPIE_-gdwarf-4_-Wall2024062820240625
8622490868 0 592103787 756 1656T:bitslicegcc_-march=native_-mtune=native_-Os_-fwrapv_-fPIC_-fPIE_-gdwarf-4_-Wall2024062820240625
12364634053 0 54747622 804 1592T:tableclang_-march=native_-Os_-fwrapv_-Qunused-arguments_-fPIC_-fPIE_-gdwarf-4_-Wall2024062820240625
12865534539 0 54750488 812 1592T:tableclang_-march=native_-O2_-fwrapv_-Qunused-arguments_-fPIC_-fPIE_-gdwarf-4_-Wall2024062820240625
12879734571 0 54750648 812 1592T:tableclang_-march=native_-O3_-fwrapv_-Qunused-arguments_-fPIC_-fPIE_-gdwarf-4_-Wall2024062820240625
12898432967 0 59245947 756 1656T:tablegcc_-march=native_-mtune=native_-Os_-fwrapv_-fPIC_-fPIE_-gdwarf-4_-Wall2024062820240625
13029735957 0 59250536 780 1688T:tablegcc_-march=native_-mtune=native_-O2_-fwrapv_-fPIC_-fPIE_-gdwarf-4_-Wall2024062820240625
13080035004 0 54751368 812 1592T:tableclang_-mcpu=native_-O3_-fwrapv_-Qunused-arguments_-fPIC_-fPIE_-gdwarf-4_-Wall2024062820240625
13382537511 0 59253064 780 1688T:tablegcc_-march=native_-mtune=native_-O3_-fwrapv_-fPIC_-fPIE_-gdwarf-4_-Wall2024062820240625
13472134916 0 59249152 780 1688T:tablegcc_-march=native_-mtune=native_-O_-fwrapv_-fPIC_-fPIE_-gdwarf-4_-Wall2024062820240625
13838133332 0 54747270 804 1592T:tableclang_-march=native_-O_-fwrapv_-Qunused-arguments_-fPIC_-fPIE_-gdwarf-4_-Wall2024062820240625
51862224302 0 040248 812 1048T:refclang_-march=native_-O2_-fwrapv_-Qunused-arguments_-fPIC_-fPIE_-gdwarf-4_-Wall2024062820240625
67553929829 0 045392 780 1080T:refgcc_-march=native_-mtune=native_-O3_-fwrapv_-fPIC_-fPIE_-gdwarf-4_-Wall2024062820240625
80073426254 0 042344 812 1048T:refclang_-march=native_-O3_-fwrapv_-Qunused-arguments_-fPIC_-fPIE_-gdwarf-4_-Wall2024062820240625
81325726664 0 043064 812 1016T:refclang_-mcpu=native_-O3_-fwrapv_-Qunused-arguments_-fPIC_-fPIE_-gdwarf-4_-Wall2024062820240625
119081826055 0 040656 780 1080T:refgcc_-march=native_-mtune=native_-O2_-fwrapv_-fPIC_-fPIE_-gdwarf-4_-Wall2024062820240625
193512124015 0 037606 804 1016T:refclang_-march=native_-Os_-fwrapv_-Qunused-arguments_-fPIC_-fPIE_-gdwarf-4_-Wall2024062820240625
213322022850 0 036790 804 1016T:refclang_-march=native_-O_-fwrapv_-Qunused-arguments_-fPIC_-fPIE_-gdwarf-4_-Wall2024062820240625
233062322462 0 035427 756 1048T:refgcc_-march=native_-mtune=native_-Os_-fwrapv_-fPIC_-fPIE_-gdwarf-4_-Wall2024062820240625
235397924303 0 038552 780 1080T:refgcc_-march=native_-mtune=native_-O_-fwrapv_-fPIC_-fPIE_-gdwarf-4_-Wall2024062820240625

Compiler output


deoxys.c: deoxys.c:98:13: error: always_inline function '_mm_shuffle_epi8' requires target feature 'ssse3', but would be inlined into function 'deoxys_aead_encrypt' that is compiled without support for 'ssse3'
deoxys.c:       tmp = permute( tmp, H_PERMUTATION );
deoxys.c:             ^
deoxys.c: ./tweakable-cipher.macros:7:22: note: expanded from macro 'permute'
deoxys.c: #define permute(a,b) _mm_shuffle_epi8(a,b)
deoxys.c:                      ^
deoxys.c: deoxys.c:105:5: error: always_inline function '_mm_shuffle_epi8' requires target feature 'ssse3', but would be inlined into function 'deoxys_aead_encrypt' that is compiled without support for 'ssse3'
deoxys.c:     TWEAKEY_SCHEDULE2( tsubkeys,subkeys, key);
deoxys.c:     ^
deoxys.c: ./tweakable-cipher.macros:39:3: note: expanded from macro 'TWEAKEY_SCHEDULE2'
deoxys.c:   ONE_KEY_ROUND( subkeys[ 0], subkeys[ 1] );    ts[ 1] = xor( subkeys[ 1], RCONS[ 1] ); \
deoxys.c:   ^
deoxys.c: ./tweakable-cipher.macros:35:13: note: expanded from macro 'ONE_KEY_ROUND'
deoxys.c:   new_key = permute( new_key, H_PERMUTATION);
deoxys.c:             ^
deoxys.c: ./tweakable-cipher.macros:7:22: note: expanded from macro 'permute'
deoxys.c: #define permute(a,b) _mm_shuffle_epi8(a,b)
deoxys.c:                      ^
deoxys.c: deoxys.c:105:5: error: always_inline function '_mm_shuffle_epi8' requires target feature 'ssse3', but would be inlined into function 'deoxys_aead_encrypt' that is compiled without support for 'ssse3'
deoxys.c: ./tweakable-cipher.macros:40:3: note: expanded from macro 'TWEAKEY_SCHEDULE2'
deoxys.c:   ONE_KEY_ROUND( subkeys[ 1], subkeys[ 2] );    ts[ 2] = xor( subkeys[ 2], RCONS[ 2] ); \
deoxys.c:   ^
deoxys.c: ./tweakable-cipher.macros:35:13: note: expanded from macro 'ONE_KEY_ROUND'
deoxys.c:   new_key = permute( new_key, H_PERMUTATION);
deoxys.c:             ^
deoxys.c: ...

Number of similar (implementation,compiler) pairs: 1, namely:
ImplementationCompiler
T:aesniclang -mcpu=native -O3 -fwrapv -Qunused-arguments -fPIC -fPIE -gdwarf-4 -Wall (Debian_Clang_11.0.1)

Compiler output


deoxys.c: deoxys.c:346:20: warning: variable 'Checksum' is uninitialized when used here [-Wuninitialized]
deoxys.c:     Checksum = xor(Checksum, Checksum);
deoxys.c:                    ^~~~~~~~
deoxys.c: ./tweakable-cipher.macros:5:32: note: expanded from macro 'xor'
deoxys.c: #define xor(a,b) _mm_xor_si128(a,b)
deoxys.c:                                ^
deoxys.c: deoxys.c:72:5: note: variable 'Checksum' is declared here
deoxys.c:     __m128i Checksum;
deoxys.c:     ^
deoxys.c: deoxys.c:90:16: warning: variable 'Auth' is uninitialized when used here [-Wuninitialized]
deoxys.c:     Auth = xor(Auth, Auth);
deoxys.c:                ^~~~
deoxys.c: ./tweakable-cipher.macros:5:32: note: expanded from macro 'xor'
deoxys.c: #define xor(a,b) _mm_xor_si128(a,b)
deoxys.c:                                ^
deoxys.c: deoxys.c:71:5: note: variable 'Auth' is declared here
deoxys.c:     __m128i Auth;
deoxys.c:     ^
deoxys.c: deoxys.c:91:17: warning: variable 'Tweak' is uninitialized when used here [-Wuninitialized]
deoxys.c:     Tweak = xor(Tweak, Tweak);
deoxys.c:                 ^~~~~
deoxys.c: ./tweakable-cipher.macros:5:32: note: expanded from macro 'xor'
deoxys.c: #define xor(a,b) _mm_xor_si128(a,b)
deoxys.c:                                ^
deoxys.c: deoxys.c:69:5: note: variable 'Tweak' is declared here
deoxys.c: ...

Number of similar (implementation,compiler) pairs: 5, namely:
ImplementationCompiler
T:aesnisclang -march=native -O2 -fwrapv -Qunused-arguments -fPIC -fPIE -gdwarf-4 -Wall (Debian_Clang_11.0.1)
T:aesnisclang -march=native -O3 -fwrapv -Qunused-arguments -fPIC -fPIE -gdwarf-4 -Wall (Debian_Clang_11.0.1)
T:aesnisclang -march=native -O -fwrapv -Qunused-arguments -fPIC -fPIE -gdwarf-4 -Wall (Debian_Clang_11.0.1)
T:aesnisclang -march=native -Os -fwrapv -Qunused-arguments -fPIC -fPIE -gdwarf-4 -Wall (Debian_Clang_11.0.1)
T:aesnisclang -mcpu=native -O3 -fwrapv -Qunused-arguments -fPIC -fPIE -gdwarf-4 -Wall (Debian_Clang_11.0.1)

Compiler output


deoxys.c: In file included from deoxys.c:30:
deoxys.c: deoxys.c: In function 'deoxys_aead_encrypt':
deoxys.c: tweakable-cipher.macros:5:18: warning: 'Auth' is used uninitialized in this function [-Wuninitialized]
deoxys.c:     5 | #define xor(a,b) _mm_xor_si128(a,b)
deoxys.c:       |                  ^~~~~~~~~~~~~
deoxys.c: deoxys.c:71:13: note: 'Auth' was declared here
deoxys.c:    71 |     __m128i Auth;
deoxys.c:       |             ^~~~
deoxys.c: In file included from deoxys.c:30:
deoxys.c: tweakable-cipher.macros:5:18: warning: 'Tweak' is used uninitialized in this function [-Wuninitialized]
deoxys.c:     5 | #define xor(a,b) _mm_xor_si128(a,b)
deoxys.c:       |                  ^~~~~~~~~~~~~
deoxys.c: deoxys.c:69:13: note: 'Tweak' was declared here
deoxys.c:    69 |     __m128i Tweak;
deoxys.c:       |             ^~~~~
deoxys.c: In file included from deoxys.c:30:
deoxys.c: tweakable-cipher.macros:5:18: warning: 'Checksum' is used uninitialized in this function [-Wuninitialized]
deoxys.c:     5 | #define xor(a,b) _mm_xor_si128(a,b)
deoxys.c:       |                  ^~~~~~~~~~~~~
deoxys.c: deoxys.c:72:13: note: 'Checksum' was declared here
deoxys.c:    72 |     __m128i Checksum;
deoxys.c:       |             ^~~~~~~~
deoxys.c: In file included from deoxys.c:30:
deoxys.c: deoxys.c: In function 'deoxys_aead_decrypt':
deoxys.c: tweakable-cipher.macros:5:18: warning: 'Auth' is used uninitialized in this function [-Wuninitialized]
deoxys.c: ...

Number of similar (implementation,compiler) pairs: 4, namely:
ImplementationCompiler
T:aesnisgcc -march=native -mtune=native -O2 -fwrapv -fPIC -fPIE -gdwarf-4 -Wall (10.2.1_20210110)
T:aesnisgcc -march=native -mtune=native -O3 -fwrapv -fPIC -fPIE -gdwarf-4 -Wall (10.2.1_20210110)
T:aesnisgcc -march=native -mtune=native -O -fwrapv -fPIC -fPIE -gdwarf-4 -Wall (10.2.1_20210110)
T:aesnisgcc -march=native -mtune=native -Os -fwrapv -fPIC -fPIE -gdwarf-4 -Wall (10.2.1_20210110)

Compiler output


deoxys_8.c: deoxys_8.c:248:18: warning: variable 'CHECKSUM' is uninitialized when used here [-Wuninitialized]
deoxys_8.c:     CHECKSUM=XOR(CHECKSUM,CHECKSUM);
deoxys_8.c:                  ^~~~~~~~
deoxys_8.c: ./deoxys.macros:38:39: note: expanded from macro 'XOR'
deoxys_8.c: #define XOR(a,b)        _mm_xor_si128(a,b)
deoxys_8.c:                                       ^
deoxys_8.c: deoxys_8.c:159:5: note: variable 'CHECKSUM' is declared here
deoxys_8.c:     __m128i CHECKSUM;
deoxys_8.c:     ^
deoxys_8.c: deoxys_8.c:174:14: warning: variable 'AUTH' is uninitialized when used here [-Wuninitialized]
deoxys_8.c:     AUTH=XOR(AUTH,AUTH);
deoxys_8.c:              ^~~~
deoxys_8.c: ./deoxys.macros:38:39: note: expanded from macro 'XOR'
deoxys_8.c: #define XOR(a,b)        _mm_xor_si128(a,b)
deoxys_8.c:                                       ^
deoxys_8.c: deoxys_8.c:158:5: note: variable 'AUTH' is declared here
deoxys_8.c:     __m128i AUTH;
deoxys_8.c:     ^
deoxys_8.c: deoxys_8.c:460:18: warning: variable 'CHECKSUM' is uninitialized when used here [-Wuninitialized]
deoxys_8.c:     CHECKSUM=XOR(CHECKSUM,CHECKSUM);
deoxys_8.c:                  ^~~~~~~~
deoxys_8.c: ./deoxys.macros:38:39: note: expanded from macro 'XOR'
deoxys_8.c: #define XOR(a,b)        _mm_xor_si128(a,b)
deoxys_8.c:                                       ^
deoxys_8.c: deoxys_8.c:366:5: note: variable 'CHECKSUM' is declared here
deoxys_8.c: ...

Number of similar (implementation,compiler) pairs: 4, namely:
ImplementationCompiler
T:bitsliceclang -march=native -O2 -fwrapv -Qunused-arguments -fPIC -fPIE -gdwarf-4 -Wall (Debian_Clang_11.0.1)
T:bitsliceclang -march=native -O3 -fwrapv -Qunused-arguments -fPIC -fPIE -gdwarf-4 -Wall (Debian_Clang_11.0.1)
T:bitsliceclang -march=native -O -fwrapv -Qunused-arguments -fPIC -fPIE -gdwarf-4 -Wall (Debian_Clang_11.0.1)
T:bitsliceclang -march=native -Os -fwrapv -Qunused-arguments -fPIC -fPIE -gdwarf-4 -Wall (Debian_Clang_11.0.1)

Compiler output


deoxys_8.c: deoxys_8.c:248:18: warning: variable 'CHECKSUM' is uninitialized when used here [-Wuninitialized]
deoxys_8.c:     CHECKSUM=XOR(CHECKSUM,CHECKSUM);
deoxys_8.c:                  ^~~~~~~~
deoxys_8.c: ./deoxys.macros:38:39: note: expanded from macro 'XOR'
deoxys_8.c: #define XOR(a,b)        _mm_xor_si128(a,b)
deoxys_8.c:                                       ^
deoxys_8.c: deoxys_8.c:159:5: note: variable 'CHECKSUM' is declared here
deoxys_8.c:     __m128i CHECKSUM;
deoxys_8.c:     ^
deoxys_8.c: deoxys_8.c:174:14: warning: variable 'AUTH' is uninitialized when used here [-Wuninitialized]
deoxys_8.c:     AUTH=XOR(AUTH,AUTH);
deoxys_8.c:              ^~~~
deoxys_8.c: ./deoxys.macros:38:39: note: expanded from macro 'XOR'
deoxys_8.c: #define XOR(a,b)        _mm_xor_si128(a,b)
deoxys_8.c:                                       ^
deoxys_8.c: deoxys_8.c:158:5: note: variable 'AUTH' is declared here
deoxys_8.c:     __m128i AUTH;
deoxys_8.c:     ^
deoxys_8.c: deoxys_8.c:178:5: error: always_inline function '_mm_shuffle_epi8' requires target feature 'ssse3', but would be inlined into function 'deoxys_aead_encrypt_8' that is compiled without support for 'ssse3'
deoxys_8.c:     KEY_SCHEDULE(key, subkey);
deoxys_8.c:     ^
deoxys_8.c: ./deoxys.macros:76:5: note: expanded from macro 'KEY_SCHEDULE'
deoxys_8.c:     packing(subkey[0]);\
deoxys_8.c:     ^
deoxys_8.c: ./deoxys.macros:401:14: note: expanded from macro 'packing'
deoxys_8.c: ...

Number of similar (implementation,compiler) pairs: 1, namely:
ImplementationCompiler
T:bitsliceclang -mcpu=native -O3 -fwrapv -Qunused-arguments -fPIC -fPIE -gdwarf-4 -Wall (Debian_Clang_11.0.1)