Implementation notes: amd64, hertz, crypto_stream/aes256ctr

Computer: hertz
Microarchitecture: amd64; Zen 4 (a60f12)
Architecture: amd64
CPU ID: AuthenticAMD-00a60f12-178bfbff
SUPERCOP version: 20240716
Operation: crypto_stream
Primitive: aes256ctr
Time  Object size   Test size             Implementation       Compiler                                                                          Benchmark date  SUPERCOP version
 444  5255 0 0      25263 828 1032        dolbeau/vaesenc-int  clang_-march=native_-O3_-fwrapv_-Qunused-arguments_-fPIC_-fPIE_-gdwarf-4_-Wall    20240716        20240716
 446  5255 0 0      25151 828 1032        dolbeau/vaesenc-int  clang_-march=native_-O2_-fwrapv_-Qunused-arguments_-fPIC_-fPIE_-gdwarf-4_-Wall    20240716        20240716
 515  7026 0 0      23237 804 1096        dolbeau/vaesenc-int  gcc_-march=native_-mtune=native_-O3_-fwrapv_-fPIC_-fPIE_-gdwarf-4_-Wall           20240716        20240716
 803  4663 0 0      20869 804 1096        dolbeau/aesenc-int   gcc_-march=native_-mtune=native_-O3_-fwrapv_-fPIC_-fPIE_-gdwarf-4_-Wall           20240716        20240716
1725  24142 2800 0  1777397 145356 10504  T:cryptopp           g++_-march=native_-mtune=native_-O3_-fwrapv_-fPIC_-fPIE_-gdwarf-4_-Wall           20240716        20240716
1726  13850 2064 0  1775483 145412 10440  T:cryptopp           clang++_-march=native_-O3_-fwrapv_-Qunused-arguments_-fPIC_-fPIE_-gdwarf-4_-Wall  20240716        20240716
1732  11789 2064 0  1773291 145412 10440  T:cryptopp           clang++_-march=native_-O2_-fwrapv_-Qunused-arguments_-fPIC_-fPIE_-gdwarf-4_-Wall  20240716        20240716
1737  23883 2800 0  1774949 145356 10440  T:cryptopp           g++_-march=native_-mtune=native_-O2_-fwrapv_-fPIC_-fPIE_-gdwarf-4_-Wall           20240716        20240716
1795  9872 3272 0   1763653 146420 10440  T:cryptopp           g++_-march=native_-mtune=native_-Os_-fwrapv_-fPIC_-fPIE_-gdwarf-4_-Wall           20240716        20240716
1818  9210 1480 0   1764880 145404 10440  T:cryptopp           clang++_-march=native_-Os_-fwrapv_-Qunused-arguments_-fPIC_-fPIE_-gdwarf-4_-Wall  20240716        20240716
2312  424 0 0       21070 884 1032        T:openssl            clang_-march=native_-O3_-fwrapv_-Qunused-arguments_-fPIC_-fPIE_-gdwarf-4_-Wall    20240716        20240716
2324  419 0 0       14799 876 1032        T:openssl            clang_-march=native_-Os_-fwrapv_-Qunused-arguments_-fPIC_-fPIE_-gdwarf-4_-Wall    20240716        20240716
2333  605 0 0       15594 884 1000        T:openssl            gcc_-march=native_-mtune=native_-O2_-fwrapv_-fPIC_-fPIE_-gdwarf-4_-Wall           20240716        20240716
2335  739 0 0       17826 884 1032        T:openssl            gcc_-march=native_-mtune=native_-O3_-fwrapv_-fPIC_-fPIE_-gdwarf-4_-Wall           20240716        20240716
2344  424 0 0       20958 884 1032        T:openssl            clang_-march=native_-O2_-fwrapv_-Qunused-arguments_-fPIC_-fPIE_-gdwarf-4_-Wall    20240716        20240716
2628  3746 0 0      17901 804 1032        dolbeau/vaesenc-int  gcc_-march=native_-mtune=native_-O2_-fwrapv_-fPIC_-fPIE_-gdwarf-4_-Wall           20240716        20240716
3062  571 0 0       13869 860 1000        T:openssl            gcc_-march=native_-mtune=native_-Os_-fwrapv_-fPIC_-fPIE_-gdwarf-4_-Wall           20240716        20240716
3122  2078 0 0      16213 804 1032        dolbeau/aesenc-int   gcc_-march=native_-mtune=native_-O2_-fwrapv_-fPIC_-fPIE_-gdwarf-4_-Wall           20240716        20240716
3644  3492 0 0      15936 780 1000        dolbeau/vaesenc-int  gcc_-march=native_-mtune=native_-Os_-fwrapv_-fPIC_-fPIE_-gdwarf-4_-Wall           20240716        20240716
4100  1929 0 0      14376 780 1000        dolbeau/aesenc-int   gcc_-march=native_-mtune=native_-Os_-fwrapv_-fPIC_-fPIE_-gdwarf-4_-Wall           20240716        20240716

Compiler output


aesenc-int.c: aesenc-int.c:81:20: warning: unused function 'aesni_encrypt1' [-Wunused-function]
aesenc-int.c:    81 | static inline void aesni_encrypt1(unsigned char *out, unsigned char *n, __m128i rkeys[16]) {
aesenc-int.c:       |                    ^~~~~~~~~~~~~~
aesenc-int.c: aesenc-int.c:97:20: warning: unused function 'incle' [-Wunused-function]
aesenc-int.c:    97 | static inline void incle(unsigned char n[16]) {
aesenc-int.c:       |                    ^~~~~
aesenc-int.c: aesenc-int.c:195:1: warning: unused function 'aesni_encrypt4' [-Wunused-function]
aesenc-int.c:   195 | FUNC(4, MAKE4)
aesenc-int.c:       | ^~~~~~~~~~~~~~
aesenc-int.c: aesenc-int.c:172:22: note: expanded from macro 'FUNC'
aesenc-int.c:   172 |   static inline void aesni_encrypt##N(unsigned char *out, unsigned char *n, __m128i rkeys[16]) { \
aesenc-int.c:       |                      ^~~~~~~~~~~~~~~~
aesenc-int.c: <scratch space>:204:1: note: expanded from here
aesenc-int.c:   204 | aesni_encrypt4
aesenc-int.c:       | ^~~~~~~~~~~~~~
aesenc-int.c: aesenc-int.c:196:1: warning: unused function 'aesni_encrypt6' [-Wunused-function]
aesenc-int.c:   196 | FUNC(6, MAKE6)
aesenc-int.c:       | ^~~~~~~~~~~~~~
aesenc-int.c: aesenc-int.c:172:22: note: expanded from macro 'FUNC'
aesenc-int.c:   172 |   static inline void aesni_encrypt##N(unsigned char *out, unsigned char *n, __m128i rkeys[16]) { \
aesenc-int.c:       |                      ^~~~~~~~~~~~~~~~
aesenc-int.c: <scratch space>:253:1: note: expanded from here
aesenc-int.c:   253 | aesni_encrypt6
aesenc-int.c:       | ^~~~~~~~~~~~~~
aesenc-int.c: aesenc-int.c:197:1: warning: unused function 'aesni_encrypt7' [-Wunused-function]
aesenc-int.c: ...

Number of similar (implementation,compiler) pairs: 3, namely:
Implementation       Compiler
dolbeau/aesenc-int   clang -march=native -O2 -fwrapv -Qunused-arguments -fPIC -fPIE -gdwarf-4 -Wall (Ubuntu_Clang_18.1.3_(1ubuntu1))
dolbeau/aesenc-int   clang -march=native -O3 -fwrapv -Qunused-arguments -fPIC -fPIE -gdwarf-4 -Wall (Ubuntu_Clang_18.1.3_(1ubuntu1))
dolbeau/aesenc-int   clang -march=native -Os -fwrapv -Qunused-arguments -fPIC -fPIE -gdwarf-4 -Wall (Ubuntu_Clang_18.1.3_(1ubuntu1))
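
The unused-function warnings above concern static inline helpers that this build never calls. One common way to keep such helpers without the warning is an unused attribute. The sketch below assumes GCC or Clang; the names MAYBE_UNUSED, encrypt_blocks_1, encrypt_blocks_8 and advance_counter are hypothetical and are not taken from aesenc-int.c.

```c
#include <assert.h>
#include <limits.h>

/* GCC and Clang accept this attribute; other compilers get an empty macro. */
#if defined(__GNUC__) || defined(__clang__)
#define MAYBE_UNUSED __attribute__((unused))
#else
#define MAYBE_UNUSED
#endif

/* Hypothetical stand-in for a generated variant that the build never calls.
   The attribute tells the compiler that leaving it unused is intentional. */
MAYBE_UNUSED static inline unsigned encrypt_blocks_1(unsigned counter) {
    return counter + 1u;
}

/* Hypothetical stand-in for the variant that is actually called. */
static inline unsigned encrypt_blocks_8(unsigned counter) {
    return counter + 8u;
}

/* Public entry point, so the used helper has a caller. */
unsigned advance_counter(unsigned counter) {
    assert(counter <= UINT_MAX - 8u); /* refuse to wrap the counter */
    return encrypt_blocks_8(counter);
}
```

Calling advance_counter(34u) returns 42u, and the unused helper no longer triggers -Wunused-function.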

Compiler output


aesenc-int.c: aesenc-int.c:23: warning: "_bswap64" redefined
aesenc-int.c:    23 | #define _bswap64(a) __builtin_bswap64(a)
aesenc-int.c:       |
aesenc-int.c: In file included from /usr/lib/gcc/x86_64-linux-gnu/13/include/x86gprintrin.h:33,
aesenc-int.c:                  from /usr/lib/gcc/x86_64-linux-gnu/13/include/immintrin.h:27,
aesenc-int.c:                  from aesenc-int.c:12:
aesenc-int.c: /usr/lib/gcc/x86_64-linux-gnu/13/include/ia32intrin.h:273: note: this is the location of the previous definition
aesenc-int.c:   273 | #define _bswap64(a)             __bswapq(a)
aesenc-int.c:       |
aesenc-int.c: aesenc-int.c:24: warning: "_bswap" redefined
aesenc-int.c:    24 | #define _bswap(a) __builtin_bswap(a)
aesenc-int.c:       |
aesenc-int.c: /usr/lib/gcc/x86_64-linux-gnu/13/include/ia32intrin.h:307: note: this is the location of the previous definition
aesenc-int.c:   307 | #define _bswap(a)               __bswapd(a)
aesenc-int.c:       |
aesenc-int.c: aesenc-int.c: In function 'aesni_encrypt1':
aesenc-int.c: aesenc-int.c:85: warning: ignoring '#pragma unroll ' [-Wunknown-pragmas]
aesenc-int.c:    85 | #pragma unroll(13)
aesenc-int.c:       |

Number of similar (implementation,compiler) pairs: 3, namely:
Implementation       Compiler
dolbeau/aesenc-int   gcc -march=native -mtune=native -O2 -fwrapv -fPIC -fPIE -gdwarf-4 -Wall (13.2.0)
dolbeau/aesenc-int   gcc -march=native -mtune=native -O3 -fwrapv -fPIC -fPIE -gdwarf-4 -Wall (13.2.0)
dolbeau/aesenc-int   gcc -march=native -mtune=native -Os -fwrapv -fPIC -fPIE -gdwarf-4 -Wall (13.2.0)
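
The redefinition warnings above arise because the ia32intrin.h shipped with this GCC already defines _bswap and _bswap64 as macros. A minimal sketch of one way to avoid the clash is to use private macro names instead. The names CTR_BSWAP32, CTR_BSWAP64 and ctr_swap64 are hypothetical, and the builtin branch assumes GCC or Clang.

```c
#include <assert.h>
#include <limits.h>
#include <stdint.h>

static_assert(CHAR_BIT == 8, "byte swapping below assumes 8-bit bytes");

#if defined(__GNUC__) || defined(__clang__)
/* Private names, so nothing collides with macros from the intrinsic headers. */
#define CTR_BSWAP32(a) __builtin_bswap32(a)
#define CTR_BSWAP64(a) __builtin_bswap64(a)
#else
/* Portable fallbacks for compilers without the builtins. */
#define CTR_BSWAP32(a) \
    ((((uint32_t)(a) & 0x000000FFu) << 24) | (((uint32_t)(a) & 0x0000FF00u) << 8) | \
     (((uint32_t)(a) & 0x00FF0000u) >> 8)  | (((uint32_t)(a) & 0xFF000000u) >> 24))
#define CTR_BSWAP64(a) \
    (((uint64_t)CTR_BSWAP32((uint32_t)(a)) << 32) | \
     (uint64_t)CTR_BSWAP32((uint32_t)((uint64_t)(a) >> 32)))
#endif

/* Reverse the byte order of a 64-bit counter value. */
uint64_t ctr_swap64(uint64_t counter) {
    return CTR_BSWAP64(counter);
}
```

An alternative with the same effect is to guard the original definitions with #ifndef, at the cost of silently accepting whatever the system header defined.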

Compiler output


vaesenc-int.c: vaesenc-int.c:119:20: warning: unused function 'aesni_encrypt1' [-Wunused-function]
vaesenc-int.c:   119 | static inline void aesni_encrypt1(unsigned char *out, unsigned char *n, __mAESi rkeys[16]) {
vaesenc-int.c:       |                    ^~~~~~~~~~~~~~
vaesenc-int.c: vaesenc-int.c:135:20: warning: unused function 'incle' [-Wunused-function]
vaesenc-int.c:   135 | static inline void incle(unsigned char n[16]) {
vaesenc-int.c:       |                    ^~~~~
vaesenc-int.c: vaesenc-int.c:279:1: warning: unused function 'aesni_encrypt2' [-Wunused-function]
vaesenc-int.c:   279 | FUNC(2, MAKE2)
vaesenc-int.c:       | ^~~~~~~~~~~~~~
vaesenc-int.c: vaesenc-int.c:256:22: note: expanded from macro 'FUNC'
vaesenc-int.c:   256 |   static inline void aesni_encrypt##N(unsigned char *out, unsigned int *n, const __mAESi rkeys[15]) { \
vaesenc-int.c:       |                      ^~~~~~~~~~~~~~~~
vaesenc-int.c: <scratch space>:204:1: note: expanded from here
vaesenc-int.c:   204 | aesni_encrypt2
vaesenc-int.c:       | ^~~~~~~~~~~~~~
vaesenc-int.c: vaesenc-int.c:280:1: warning: unused function 'aesni_encrypt4' [-Wunused-function]
vaesenc-int.c:   280 | FUNC(4, MAKE4)
vaesenc-int.c:       | ^~~~~~~~~~~~~~
vaesenc-int.c: vaesenc-int.c:256:22: note: expanded from macro 'FUNC'
vaesenc-int.c:   256 |   static inline void aesni_encrypt##N(unsigned char *out, unsigned int *n, const __mAESi rkeys[15]) { \
vaesenc-int.c:       |                      ^~~~~~~~~~~~~~~~
vaesenc-int.c: <scratch space>:229:1: note: expanded from here
vaesenc-int.c:   229 | aesni_encrypt4
vaesenc-int.c:       | ^~~~~~~~~~~~~~
vaesenc-int.c: 4 warnings generated.

Number of similar (implementation,compiler) pairs: 3, namely:
Implementation       Compiler
dolbeau/vaesenc-int  clang -march=native -O2 -fwrapv -Qunused-arguments -fPIC -fPIE -gdwarf-4 -Wall (Ubuntu_Clang_18.1.3_(1ubuntu1))
dolbeau/vaesenc-int  clang -march=native -O3 -fwrapv -Qunused-arguments -fPIC -fPIE -gdwarf-4 -Wall (Ubuntu_Clang_18.1.3_(1ubuntu1))
dolbeau/vaesenc-int  clang -march=native -Os -fwrapv -Qunused-arguments -fPIC -fPIE -gdwarf-4 -Wall (Ubuntu_Clang_18.1.3_(1ubuntu1))

Compiler output


vaesenc-int.c: vaesenc-int.c: In function 'aesni_encrypt1':
vaesenc-int.c: vaesenc-int.c:123: warning: ignoring '#pragma unroll ' [-Wunknown-pragmas]
vaesenc-int.c:   123 | #pragma unroll(13)
vaesenc-int.c:       |

Number of similar (implementation,compiler) pairs: 2, namely:
Implementation       Compiler
dolbeau/vaesenc-int  gcc -march=native -mtune=native -O2 -fwrapv -fPIC -fPIE -gdwarf-4 -Wall (13.2.0)
dolbeau/vaesenc-int  gcc -march=native -mtune=native -Os -fwrapv -fPIC -fPIE -gdwarf-4 -Wall (13.2.0)
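
GCC ignores the bare `#pragma unroll(13)` reported above, so the unroll hint has no effect in these builds. GCC 8 and later accept `#pragma GCC unroll N`, and Clang accepts `#pragma clang loop unroll_count(N)`. A hedged sketch of a wrapper macro that picks whichever form the compiler understands follows; UNROLL, DO_PRAGMA, ROUNDS and mix_rounds are hypothetical names, and the loop body is a stand-in rather than real AES rounds.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* _Pragma needs a string literal, so stringify the argument first. */
#define DO_PRAGMA(x) _Pragma(#x)

#if defined(__clang__)
#define UNROLL(n) DO_PRAGMA(clang loop unroll_count(n))
#elif defined(__GNUC__) && __GNUC__ >= 8
#define UNROLL(n) DO_PRAGMA(GCC unroll n)
#else
#define UNROLL(n) /* no unroll hint available */
#endif

enum { ROUNDS = 13 };

/* Stand-in for a round loop: rotate the state left by one bit and
   XOR in one "round key" per iteration. */
uint32_t mix_rounds(uint32_t state, const uint32_t keys[ROUNDS]) {
    assert(keys != NULL);
    UNROLL(13)
    for (int i = 0; i < ROUNDS; i++) {
        state = ((state << 1) | (state >> 31)) ^ keys[i];
    }
    return state;
}
```

The pragma is only a hint: the result of mix_rounds is the same whether or not the compiler unrolls the loop.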

Compiler output


vaesenc-int.c: vaesenc-int.c: In function 'aesni_encrypt1':
vaesenc-int.c: vaesenc-int.c:123: warning: ignoring '#pragma unroll ' [-Wunknown-pragmas]
vaesenc-int.c:   123 | #pragma unroll(13)
vaesenc-int.c:       |
vaesenc-int.c: In function 'aesni_encrypt8',
vaesenc-int.c:     inlined from 'crypto_stream_aes256ctr_dolbeau_vaesenc_int_constbranchindex' at vaesenc-int.c:313:3:
vaesenc-int.c: vaesenc-int.c:258:15: warning: array subscript 4 is outside array bounds of 'unsigned char[16]' [-Warray-bounds=]
vaesenc-int.c:   258 |     long long nl = *(long long*)&n[8];                                  \
vaesenc-int.c:       |               ^~
vaesenc-int.c: vaesenc-int.c:281:1: note: in expansion of macro 'FUNC'
vaesenc-int.c:   281 | FUNC(8, MAKE8)
vaesenc-int.c:       | ^~~~
vaesenc-int.c: vaesenc-int.c: In function 'crypto_stream_aes256ctr_dolbeau_vaesenc_int_constbranchindex':
vaesenc-int.c: vaesenc-int.c:293:25: note: at offset 32 into object 'n2' of size 16
vaesenc-int.c:   293 |   ALIGN16 unsigned char n2[16];
vaesenc-int.c:       |                         ^~
vaesenc-int.c: In function 'aesni_encrypt8',
vaesenc-int.c:     inlined from 'crypto_stream_aes256ctr_dolbeau_vaesenc_int_constbranchindex_xor' at vaesenc-int.c:347:3:
vaesenc-int.c: vaesenc-int.c:258:15: warning: array subscript 4 is outside array bounds of 'unsigned char[16]' [-Warray-bounds=]
vaesenc-int.c:   258 |     long long nl = *(long long*)&n[8];                                  \
vaesenc-int.c:       |               ^~
vaesenc-int.c: vaesenc-int.c:281:1: note: in expansion of macro 'FUNC'
vaesenc-int.c:   281 | FUNC(8, MAKE8)
vaesenc-int.c:       | ^~~~
vaesenc-int.c: vaesenc-int.c: In function 'crypto_stream_aes256ctr_dolbeau_vaesenc_int_constbranchindex_xor':
vaesenc-int.c: ...

Number of similar (implementation,compiler) pairs: 1, namely:
Implementation       Compiler
dolbeau/vaesenc-int  gcc -march=native -mtune=native -O3 -fwrapv -fPIC -fPIE -gdwarf-4 -Wall (13.2.0)
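
The -Warray-bounds warning above reports a read at byte offset 32 of the 16-byte object n2, through the expression `*(long long*)&n[8]`. If the intent was to read the 64-bit word at byte offset 8 (an assumption; the log does not show the surrounding source), a memcpy-based load with an explicit byte offset avoids both the pointer cast and any dependence on the pointer's element type. The function name load64_at is hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Read 8 bytes starting at byte offset `off` of a buffer of `len` bytes.
   The offset is always in bytes, whatever type the caller's data has. */
uint64_t load64_at(const unsigned char *buf, size_t len, size_t off) {
    uint64_t v;
    assert(buf != NULL);
    assert(off <= len && len - off >= sizeof v); /* stay inside the object */
    memcpy(&v, buf + off, sizeof v);
    return v; /* native byte order (little-endian on amd64) */
}
```

For a 16-byte nonce buffer, load64_at(n2, sizeof n2, 8) reads the upper half; compilers typically reduce the memcpy to a single load at -O2 and above.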

TIMECOP error (can be valgrind bug)


error 111

Process terminating with default action of signal 4 (SIGILL)
 Illegal opcode at address 0x10A2D7
   at 0x...: core (try-anything.c:68)
   by 0x...: salsa20.part.0 (try-anything.c:101)
   by 0x...: salsa20 (try-anything.c:85)
   by 0x...: testvector (try-anything.c:124)
   by 0x...: myrandom (try-anything.c:132)
   by 0x...: test (try.c:114)
   by 0x...: main (try-anything.c:345)

Number of similar (implementation,compiler) pairs: 2, namely:
Implementation       Compiler
dolbeau/aesenc-int   gcc -march=native -mtune=native -O2 -fwrapv -fPIC -fPIE -gdwarf-4 -Wall (13.2.0)
dolbeau/vaesenc-int  gcc -march=native -mtune=native -O2 -fwrapv -fPIC -fPIE -gdwarf-4 -Wall (13.2.0)

TIMECOP error (can be valgrind bug)


error 111

Process terminating with default action of signal 4 (SIGILL)
 Illegal opcode at address 0x10A36C
   at 0x...: core (try-anything.c:64)
   by 0x...: salsa20 (try-anything.c:101)
   by 0x...: salsa20 (try-anything.c:81)
   by 0x...: testvector (try-anything.c:124)
   by 0x...: myrandom (try-anything.c:132)
   by 0x...: test (try.c:114)
   by 0x...: main (try-anything.c:345)

Number of similar (implementation,compiler) pairs: 2, namely:
Implementation       Compiler
dolbeau/aesenc-int   gcc -march=native -mtune=native -O3 -fwrapv -fPIC -fPIE -gdwarf-4 -Wall (13.2.0)
dolbeau/vaesenc-int  gcc -march=native -mtune=native -O3 -fwrapv -fPIC -fPIE -gdwarf-4 -Wall (13.2.0)

TIMECOP error (can be valgrind bug)


error 111

Process terminating with default action of signal 4 (SIGILL)
 Illegal opcode at address 0x10A0A7
   at 0x...: core (try-anything.c:64)
   by 0x...: salsa20.part.0 (try-anything.c:101)
   by 0x...: salsa20 (try-anything.c:85)
   by 0x...: testvector (try-anything.c:124)
   by 0x...: myrandom (try-anything.c:132)
   by 0x...: test (try.c:114)
   by 0x...: main (try-anything.c:345)

Number of similar (implementation,compiler) pairs: 2, namely:
Implementation       Compiler
dolbeau/aesenc-int   gcc -march=native -mtune=native -Os -fwrapv -fPIC -fPIE -gdwarf-4 -Wall (13.2.0)
dolbeau/vaesenc-int  gcc -march=native -mtune=native -Os -fwrapv -fPIC -fPIE -gdwarf-4 -Wall (13.2.0)

TIMECOP error (can be valgrind bug)


error 111

Process terminating with default action of signal 4 (SIGILL)
 Illegal opcode at address 0x10AFB6
   at 0x...: core (try-anything.c:61)
   by 0x...: salsa20 (try-anything.c:101)
   by 0x...: testvector (try-anything.c:124)
   by 0x...: myrandom (try-anything.c:132)
   by 0x...: test (try.c:114)
   by 0x...: main (try-anything.c:345)

Number of similar (implementation,compiler) pairs: 1, namely:
Implementation       Compiler
dolbeau/vaesenc-int  clang -march=native -O2 -fwrapv -Qunused-arguments -fPIC -fPIE -gdwarf-4 -Wall (Ubuntu_Clang_18.1.3_(1ubuntu1))

TIMECOP error (can be valgrind bug)


error 111

Process terminating with default action of signal 4 (SIGILL)
 Illegal opcode at address 0x10BAD4
   at 0x...: salsa20 (try-anything.c:90)
   by 0x...: canary (try-anything.c:148)
   by 0x...: output_prepare (try-anything.c:178)
   by 0x...: test (try.c:118)
   by 0x...: main (try-anything.c:345)

Number of similar (implementation,compiler) pairs: 1, namely:
Implementation       Compiler
dolbeau/vaesenc-int  clang -march=native -O3 -fwrapv -Qunused-arguments -fPIC -fPIE -gdwarf-4 -Wall (Ubuntu_Clang_18.1.3_(1ubuntu1))