Implementation notes: amd64, panther, crypto_decode/761x4591
Computer: panther
Microarchitecture: amd64; Tiger Lake (806c1)
Architecture: amd64
CPU ID: GenuineIntel-000806c1-00-bfebfbff
SUPERCOP version: 20240625
Operation: crypto_decode
Primitive: 761x4591
Time (cycles) | Object size (text data bss, bytes) | Test size (text data bss, bytes) | Implementation | Compiler | Benchmark date | SUPERCOP version |
1002 | 4320 0 0 | 17245 828 920 | avx | clang_-march=native_-O2_-fwrapv_-Qunused-arguments_-fPIC_-fPIE_-gdwarf-4_-Wall | 20240625 | 20240625 |
1007 | 5024 0 0 | 17965 828 920 | avx | clang_-march=native_-O3_-fwrapv_-Qunused-arguments_-fPIC_-fPIE_-gdwarf-4_-Wall | 20240625 | 20240625 |
1011 | 2719 0 0 | 12547 820 888 | avx | clang_-march=native_-Os_-fwrapv_-Qunused-arguments_-fPIC_-fPIE_-gdwarf-4_-Wall | 20240625 | 20240625 |
1069 | 3832 0 0 | 14152 780 952 | avx | gcc_-march=native_-mtune=native_-O2_-fwrapv_-fPIC_-fPIE_-gdwarf-4_-Wall | 20240625 | 20240625 |
1090 | 3860 0 0 | 13783 772 952 | avx | gcc_-march=native_-mtune=native_-O_-fwrapv_-fPIC_-fPIE_-gdwarf-4_-Wall | 20240625 | 20240625 |
1228 | 5352 0 0 | 17744 780 952 | avx | gcc_-march=native_-mtune=native_-O3_-fwrapv_-fPIC_-fPIE_-gdwarf-4_-Wall | 20240625 | 20240625 |
1296 | 3849 0 0 | 12843 756 920 | avx | gcc_-march=native_-mtune=native_-Os_-fwrapv_-fPIC_-fPIE_-gdwarf-4_-Wall | 20240625 | 20240625 |
4111 | 4102 0 0 | 14171 820 888 | avx | clang_-march=native_-O_-fwrapv_-Qunused-arguments_-fPIC_-fPIE_-gdwarf-4_-Wall | 20240625 | 20240625 |
6335 | 9283 0 0 | 22253 828 920 | portable | clang_-march=native_-O3_-fwrapv_-Qunused-arguments_-fPIC_-fPIE_-gdwarf-4_-Wall | 20240625 | 20240625 |
6353 | 7323 0 0 | 20293 828 920 | portable | clang_-march=native_-O2_-fwrapv_-Qunused-arguments_-fPIC_-fPIE_-gdwarf-4_-Wall | 20240625 | 20240625 |
9568 | 2865 0 0 | 15781 828 920 | int16 | clang_-march=native_-O2_-fwrapv_-Qunused-arguments_-fPIC_-fPIE_-gdwarf-4_-Wall | 20240625 | 20240625 |
9568 | 2881 0 0 | 15189 828 888 | int16 | clang_-mcpu=native_-O3_-fwrapv_-Qunused-arguments_-fPIC_-fPIE_-gdwarf-4_-Wall | 20240625 | 20240625 |
9575 | 2865 0 0 | 15797 828 920 | int16 | clang_-march=native_-O3_-fwrapv_-Qunused-arguments_-fPIC_-fPIE_-gdwarf-4_-Wall | 20240625 | 20240625 |
9584 | 2267 0 0 | 12075 820 888 | int16 | clang_-march=native_-Os_-fwrapv_-Qunused-arguments_-fPIC_-fPIE_-gdwarf-4_-Wall | 20240625 | 20240625 |
11891 | 2945 0 0 | 15504 780 952 | int16 | gcc_-march=native_-mtune=native_-O3_-fwrapv_-fPIC_-fPIE_-gdwarf-4_-Wall | 20240625 | 20240625 |
11943 | 2311 0 0 | 12760 780 952 | int16 | gcc_-march=native_-mtune=native_-O2_-fwrapv_-fPIC_-fPIE_-gdwarf-4_-Wall | 20240625 | 20240625 |
12224 | 2913 0 0 | 12715 820 888 | portable | clang_-march=native_-Os_-fwrapv_-Qunused-arguments_-fPIC_-fPIE_-gdwarf-4_-Wall | 20240625 | 20240625 |
12323 | 3814 0 0 | 16360 780 952 | portable | gcc_-march=native_-mtune=native_-O3_-fwrapv_-fPIC_-fPIE_-gdwarf-4_-Wall | 20240625 | 20240625 |
12451 | 2966 0 0 | 13408 780 952 | portable | gcc_-march=native_-mtune=native_-O2_-fwrapv_-fPIC_-fPIE_-gdwarf-4_-Wall | 20240625 | 20240625 |
12591 | 3950 0 0 | 16269 828 888 | portable | clang_-mcpu=native_-O3_-fwrapv_-Qunused-arguments_-fPIC_-fPIE_-gdwarf-4_-Wall | 20240625 | 20240625 |
13287 | 2355 0 0 | 12399 772 952 | int16 | gcc_-march=native_-mtune=native_-O_-fwrapv_-fPIC_-fPIE_-gdwarf-4_-Wall | 20240625 | 20240625 |
16790 | 3257 0 0 | 13243 820 888 | int16 | clang_-march=native_-O_-fwrapv_-Qunused-arguments_-fPIC_-fPIE_-gdwarf-4_-Wall | 20240625 | 20240625 |
17171 | 2299 0 0 | 11411 756 920 | int16 | gcc_-march=native_-mtune=native_-Os_-fwrapv_-fPIC_-fPIE_-gdwarf-4_-Wall | 20240625 | 20240625 |
20529 | 1413 0 0 | 13968 780 952 | ref | gcc_-march=native_-mtune=native_-O3_-fwrapv_-fPIC_-fPIE_-gdwarf-4_-Wall | 20240625 | 20240625 |
21900 | 3422 0 0 | 16381 828 920 | ref | clang_-march=native_-O3_-fwrapv_-Qunused-arguments_-fPIC_-fPIE_-gdwarf-4_-Wall | 20240625 | 20240625 |
21902 | 2416 0 0 | 15357 828 920 | ref | clang_-march=native_-O2_-fwrapv_-Qunused-arguments_-fPIC_-fPIE_-gdwarf-4_-Wall | 20240625 | 20240625 |
21962 | 2149 0 0 | 14477 828 888 | ref | clang_-mcpu=native_-O3_-fwrapv_-Qunused-arguments_-fPIC_-fPIE_-gdwarf-4_-Wall | 20240625 | 20240625 |
22453 | 1242 0 0 | 11696 780 952 | ref | gcc_-march=native_-mtune=native_-O2_-fwrapv_-fPIC_-fPIE_-gdwarf-4_-Wall | 20240625 | 20240625 |
22515 | 1618 0 0 | 10779 756 920 | portable | gcc_-march=native_-mtune=native_-Os_-fwrapv_-fPIC_-fPIE_-gdwarf-4_-Wall | 20240625 | 20240625 |
22704 | 1484 0 0 | 11259 820 888 | ref | clang_-march=native_-Os_-fwrapv_-Qunused-arguments_-fPIC_-fPIE_-gdwarf-4_-Wall | 20240625 | 20240625 |
22967 | 1850 0 0 | 11851 820 888 | portable | clang_-march=native_-O_-fwrapv_-Qunused-arguments_-fPIC_-fPIE_-gdwarf-4_-Wall | 20240625 | 20240625 |
23529 | 1801 0 0 | 11871 772 952 | portable | gcc_-march=native_-mtune=native_-O_-fwrapv_-fPIC_-fPIE_-gdwarf-4_-Wall | 20240625 | 20240625 |
28363 | 1116 0 0 | 11091 820 888 | ref | clang_-march=native_-O_-fwrapv_-Qunused-arguments_-fPIC_-fPIE_-gdwarf-4_-Wall | 20240625 | 20240625 |
28477 | 1176 0 0 | 11207 772 952 | ref | gcc_-march=native_-mtune=native_-O_-fwrapv_-fPIC_-fPIE_-gdwarf-4_-Wall | 20240625 | 20240625 |
28998 | 1048 0 0 | 10155 756 920 | ref | gcc_-march=native_-mtune=native_-Os_-fwrapv_-fPIC_-fPIE_-gdwarf-4_-Wall | 20240625 | 20240625 |
Compiler output
decode.c: decode.c:40:23: warning: unused function 'addconst' [-Wunused-function]
decode.c: static inline __m256i addconst(__m256i x,int16 y)
decode.c: ^
decode.c: 1 warning generated.
Number of similar (implementation,compiler) pairs: 4, namely:
Implementation | Compiler |
avx | clang -march=native -O2 -fwrapv -Qunused-arguments -fPIC -fPIE -gdwarf-4 -Wall (Debian_Clang_11.0.1) |
avx | clang -march=native -O3 -fwrapv -Qunused-arguments -fPIC -fPIE -gdwarf-4 -Wall (Debian_Clang_11.0.1) |
avx | clang -march=native -O -fwrapv -Qunused-arguments -fPIC -fPIE -gdwarf-4 -Wall (Debian_Clang_11.0.1) |
avx | clang -march=native -Os -fwrapv -Qunused-arguments -fPIC -fPIE -gdwarf-4 -Wall (Debian_Clang_11.0.1) |
Compiler output
decode.c: decode.c:259:15: error: always_inline function '_mm256_loadu_si256' requires target feature 'avx', but would be inlined into function 'crypto_decode_761x4591_avx_constbranchindex' that is compiled without support for 'avx'
decode.c: A2 = A0 = _mm256_loadu_si256((__m256i *) &R5[i]);
decode.c: ^
decode.c: decode.c:259:15: error: AVX vector return of type '__m256i' (vector of 4 'long long' values) without 'avx' enabled changes the ABI
decode.c: decode.c:260:10: error: always_inline function '_mm256_cvtepu8_epi16' requires target feature 'avx2', but would be inlined into function 'crypto_decode_761x4591_avx_constbranchindex' that is compiled without support for 'avx2'
decode.c: S0 = _mm256_cvtepu8_epi16(_mm_loadu_si128((__m128i *) (s+i)));
decode.c: ^
decode.c: decode.c:260:10: error: AVX vector return of type '__m256i' (vector of 4 'long long' values) without 'avx' enabled changes the ABI
decode.c: decode.c:261:14: warning: AVX vector argument of type '__m256i' (vector of 4 'long long' values) without 'avx' enabled changes the ABI [-Wpsabi]
decode.c: A0 = sub(mulhiconst(A0,-134),mulhiconst(mulloconst(A0,-10350),1621)); /* -844...810 */
decode.c: ^
decode.c: decode.c:261:45: warning: AVX vector argument of type '__m256i' (vector of 4 'long long' values) without 'avx' enabled changes the ABI [-Wpsabi]
decode.c: A0 = sub(mulhiconst(A0,-134),mulhiconst(mulloconst(A0,-10350),1621)); /* -844...810 */
decode.c: ^
decode.c: decode.c:261:34: warning: AVX vector argument of type '__m256i' (vector of 4 'long long' values) without 'avx' enabled changes the ABI [-Wpsabi]
decode.c: A0 = sub(mulhiconst(A0,-134),mulhiconst(mulloconst(A0,-10350),1621)); /* -844...810 */
decode.c: ^
decode.c: decode.c:261:10: warning: AVX vector argument of type '__m256i' (vector of 4 'long long' values) without 'avx' enabled changes the ABI [-Wpsabi]
decode.c: A0 = sub(mulhiconst(A0,-134),mulhiconst(mulloconst(A0,-10350),1621)); /* -844...810 */
decode.c: ^
decode.c: decode.c:262:10: warning: AVX vector argument of type '__m256i' (vector of 4 'long long' values) without 'avx' enabled changes the ABI [-Wpsabi]
decode.c: A0 = add(A0,S0); /* -844...1065 */
decode.c: ^
decode.c: decode.c:263:10: warning: AVX vector argument of type '__m256i' (vector of 4 'long long' values) without 'avx' enabled changes the ABI [-Wpsabi]
decode.c: A0 = ifnegaddconst(A0,1621); /* 0...1620 */
decode.c: ...
Number of similar (implementation,compiler) pairs: 1, namely:
Implementation | Compiler |
avx | clang -mcpu=native -O3 -fwrapv -Qunused-arguments -fPIC -fPIE -gdwarf-4 -Wall (Debian_Clang_11.0.1) |
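The expression quoted at decode.c:261 in the output above, sub(mulhiconst(A0,-134),mulhiconst(mulloconst(A0,-10350),1621)), can be read as a Montgomery-style reduction modulo 1621. The scalar sketch below is not taken from decode.c: the helper names mullo16, mulhi16, reduce_step and self_check are hypothetical, and it assumes that mulloconst and mulhiconst behave per 16-bit lane like _mm256_mullo_epi16 and _mm256_mulhi_epi16 (low and high halves of the signed 32-bit product). Under that assumption, since 10350*1621 = 134 (mod 2^16) and 256*2^16 = -134 (mod 1621), the result is congruent to 256*A0 modulo 1621.

```c
#include <assert.h>
#include <stdint.h>

/* Low 16 bits of x*y, reinterpreted as signed.
   Assumed to match one lane of mulloconst; not taken from decode.c. */
static int16_t mullo16(int16_t x, int16_t y) {
    int32_t p = (int32_t)x * y;
    int32_t lo = ((p % 65536) + 65536) % 65536;   /* p mod 2^16, in 0..65535 */
    return (int16_t)(lo >= 32768 ? lo - 65536 : lo);
}

/* floor(x*y / 2^16), computed without shifting a negative value.
   Assumed to match one lane of mulhiconst; not taken from decode.c. */
static int16_t mulhi16(int16_t x, int16_t y) {
    int32_t p = (int32_t)x * y;
    int32_t lo = ((p % 65536) + 65536) % 65536;
    return (int16_t)((p - lo) / 65536);
}

/* Scalar analogue of the quoted expression. The low halves of a*(-134)
   and t*1621 agree, so the difference of the high halves is exactly
   (a*(-134) - t*1621) / 2^16. */
static int16_t reduce_step(int16_t a) {
    int16_t t = mullo16(a, -10350);
    return (int16_t)(mulhi16(a, -134) - mulhi16(t, 1621));
}

/* Exhaustive check over all 16-bit inputs: the result is congruent
   to 256*a modulo 1621. */
static void self_check(void) {
    for (int32_t a = -32768; a <= 32767; a++) {
        int32_t r = reduce_step((int16_t)a);
        assert((r - 256 * a) % 1621 == 0);
    }
}
```

Calling self_check() exercises every int16 input. The sketch checks only the congruence; the range annotation in the quoted source comment (-844...810) depends on the range of the values actually stored in R5, which this report does not show.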
Compiler output
decode.c: decode.c:23:21: warning: unused variable 'hi' [-Wunused-variable]
decode.c: int16 a0,a1,ri,lo,hi,s0,s1;
decode.c: ^
decode.c: 1 warning generated.
Number of similar (implementation,compiler) pairs: 5, namely:
Implementation | Compiler |
int16 | clang -march=native -O2 -fwrapv -Qunused-arguments -fPIC -fPIE -gdwarf-4 -Wall (Debian_Clang_11.0.1) |
int16 | clang -march=native -O3 -fwrapv -Qunused-arguments -fPIC -fPIE -gdwarf-4 -Wall (Debian_Clang_11.0.1) |
int16 | clang -march=native -O -fwrapv -Qunused-arguments -fPIC -fPIE -gdwarf-4 -Wall (Debian_Clang_11.0.1) |
int16 | clang -march=native -Os -fwrapv -Qunused-arguments -fPIC -fPIE -gdwarf-4 -Wall (Debian_Clang_11.0.1) |
int16 | clang -mcpu=native -O3 -fwrapv -Qunused-arguments -fPIC -fPIE -gdwarf-4 -Wall (Debian_Clang_11.0.1) |
Compiler output
decode.c: decode.c: In function 'crypto_decode_761x4591_int16_constbranchindex':
decode.c: decode.c:23:21: warning: unused variable 'hi' [-Wunused-variable]
decode.c: 23 | int16 a0,a1,ri,lo,hi,s0,s1;
decode.c: | ^~
Number of similar (implementation,compiler) pairs: 4, namely:
Implementation | Compiler |
int16 | gcc -march=native -mtune=native -O2 -fwrapv -fPIC -fPIE -gdwarf-4 -Wall (10.2.1_20210110) |
int16 | gcc -march=native -mtune=native -O3 -fwrapv -fPIC -fPIE -gdwarf-4 -Wall (10.2.1_20210110) |
int16 | gcc -march=native -mtune=native -O -fwrapv -fPIC -fPIE -gdwarf-4 -Wall (10.2.1_20210110) |
int16 | gcc -march=native -mtune=native -Os -fwrapv -fPIC -fPIE -gdwarf-4 -Wall (10.2.1_20210110) |
TIMECOP error (can be valgrind bug)
Process terminating with default action of signal 4 (SIGILL)
Illegal opcode at address 0x401ED0
at 0x...: st32 (try-anything.c:47)
by 0x...: core (try-anything.c:78)
by 0x...: salsa20 (try-anything.c:89)
by 0x...: testvector (try-anything.c:124)
by 0x...: input_prepare (try-anything.c:162)
by 0x...: test (try.c:95)
by 0x...: main (try-anything.c:345)
Number of similar (implementation,compiler) pairs: 4, namely:
Implementation | Compiler |
avx | clang -march=native -O2 -fwrapv -Qunused-arguments -fPIC -fPIE -gdwarf-4 -Wall (Debian_Clang_11.0.1) |
int16 | clang -march=native -O2 -fwrapv -Qunused-arguments -fPIC -fPIE -gdwarf-4 -Wall (Debian_Clang_11.0.1) |
portable | clang -march=native -O2 -fwrapv -Qunused-arguments -fPIC -fPIE -gdwarf-4 -Wall (Debian_Clang_11.0.1) |
ref | clang -march=native -O2 -fwrapv -Qunused-arguments -fPIC -fPIE -gdwarf-4 -Wall (Debian_Clang_11.0.1) |
TIMECOP error (can be valgrind bug)
Process terminating with default action of signal 4 (SIGILL)
Illegal opcode at address 0x403018
at 0x...: sub (decode.c:27)
by 0x...: crypto_decode_761x4591_avx_constbranchindex (decode.c:264)
by 0x...: test (try.c:99)
by 0x...: main (try-anything.c:345)
Number of similar (implementation,compiler) pairs: 1, namely:
Implementation | Compiler |
avx | clang -march=native -O3 -fwrapv -Qunused-arguments -fPIC -fPIE -gdwarf-4 -Wall (Debian_Clang_11.0.1) |
TIMECOP error (can be valgrind bug)
Process terminating with default action of signal 4 (SIGILL)
Illegal opcode at address 0x402D10
at 0x...: mulhiconst (decode.c:57)
by 0x...: crypto_decode_761x4591_avx_constbranchindex (decode.c:261)
by 0x...: test (try.c:99)
by 0x...: main (try-anything.c:345)
Number of similar (implementation,compiler) pairs: 1, namely:
Implementation | Compiler |
avx | clang -march=native -O -fwrapv -Qunused-arguments -fPIC -fPIE -gdwarf-4 -Wall (Debian_Clang_11.0.1) |
TIMECOP error (can be valgrind bug)
Process terminating with default action of signal 4 (SIGILL)
Illegal opcode at address 0x40191A
at 0x...: core (try-anything.c:73)
by 0x...: salsa20 (try-anything.c:89)
by 0x...: testvector (try-anything.c:124)
by 0x...: input_prepare (try-anything.c:162)
by 0x...: test (try.c:95)
by 0x...: main (try-anything.c:345)
Number of similar (implementation,compiler) pairs: 4, namely:
Implementation | Compiler |
avx | clang -march=native -Os -fwrapv -Qunused-arguments -fPIC -fPIE -gdwarf-4 -Wall (Debian_Clang_11.0.1) |
int16 | clang -march=native -Os -fwrapv -Qunused-arguments -fPIC -fPIE -gdwarf-4 -Wall (Debian_Clang_11.0.1) |
portable | clang -march=native -Os -fwrapv -Qunused-arguments -fPIC -fPIE -gdwarf-4 -Wall (Debian_Clang_11.0.1) |
ref | clang -march=native -Os -fwrapv -Qunused-arguments -fPIC -fPIE -gdwarf-4 -Wall (Debian_Clang_11.0.1) |
TIMECOP error (can be valgrind bug)
Process terminating with default action of signal 4 (SIGILL)
Illegal opcode at address 0x109AC9
at 0x...: salsa20.part.0 (try-anything.c:102)
by 0x...: salsa20 (try-anything.c:85)
by 0x...: canary (try-anything.c:148)
by 0x...: input_prepare (try-anything.c:163)
by 0x...: test (try.c:95)
by 0x...: main (try-anything.c:345)
Number of similar (implementation,compiler) pairs: 4, namely:
Implementation | Compiler |
avx | gcc -march=native -mtune=native -O2 -fwrapv -fPIC -fPIE -gdwarf-4 -Wall (10.2.1_20210110) |
int16 | gcc -march=native -mtune=native -O2 -fwrapv -fPIC -fPIE -gdwarf-4 -Wall (10.2.1_20210110) |
portable | gcc -march=native -mtune=native -O2 -fwrapv -fPIC -fPIE -gdwarf-4 -Wall (10.2.1_20210110) |
ref | gcc -march=native -mtune=native -O2 -fwrapv -fPIC -fPIE -gdwarf-4 -Wall (10.2.1_20210110) |
TIMECOP error (can be valgrind bug)
Process terminating with default action of signal 4 (SIGILL)
Illegal opcode at address 0x10996A
at 0x...: st32 (try-anything.c:47)
by 0x...: core (try-anything.c:78)
by 0x...: salsa20.part.0 (try-anything.c:89)
by 0x...: salsa20 (try-anything.c:85)
by 0x...: testvector (try-anything.c:124)
by 0x...: input_prepare (try-anything.c:162)
by 0x...: test (try.c:95)
by 0x...: main (try-anything.c:345)
Number of similar (implementation,compiler) pairs: 4, namely:
Implementation | Compiler |
avx | gcc -march=native -mtune=native -O3 -fwrapv -fPIC -fPIE -gdwarf-4 -Wall (10.2.1_20210110) |
int16 | gcc -march=native -mtune=native -O3 -fwrapv -fPIC -fPIE -gdwarf-4 -Wall (10.2.1_20210110) |
portable | gcc -march=native -mtune=native -O3 -fwrapv -fPIC -fPIE -gdwarf-4 -Wall (10.2.1_20210110) |
ref | gcc -march=native -mtune=native -O3 -fwrapv -fPIC -fPIE -gdwarf-4 -Wall (10.2.1_20210110) |
TIMECOP error (can be valgrind bug)
Process terminating with default action of signal 4 (SIGILL)
Illegal opcode at address 0x10A56B
at 0x...: _mm256_cmpgt_epi16 (avx2intrin.h:268)
by 0x...: ifgesubconst (decode.c:64)
by 0x...: crypto_decode_761x4591_avx_constbranchindex (decode.c:268)
by 0x...: test (try.c:99)
by 0x...: main (try-anything.c:345)
Number of similar (implementation,compiler) pairs: 1, namely:
Implementation | Compiler |
avx | gcc -march=native -mtune=native -O -fwrapv -fPIC -fPIE -gdwarf-4 -Wall (10.2.1_20210110) |
TIMECOP error (can be valgrind bug)
Process terminating with default action of signal 4 (SIGILL)
Illegal opcode at address 0x10A2B4
at 0x...: _mm256_slli_epi16 (avx2intrin.h:670)
by 0x...: shiftleftconst (decode.c:32)
by 0x...: crypto_decode_761x4591_avx_constbranchindex (decode.c:264)
by 0x...: test (try.c:99)
by 0x...: main (try-anything.c:345)
Number of similar (implementation,compiler) pairs: 1, namely:
Implementation | Compiler |
avx | gcc -march=native -mtune=native -Os -fwrapv -fPIC -fPIE -gdwarf-4 -Wall (10.2.1_20210110) |
TIMECOP error (can be valgrind bug)
Process terminating with default action of signal 4 (SIGILL)
Illegal opcode at address 0x402FC3
at 0x...: uint32_divmod_uint14 (decode.c:59)
by 0x...: crypto_decode_761x4591_portable_constbranchindex (decode.c:130)
by 0x...: test (try.c:99)
by 0x...: main (try-anything.c:345)
Number of similar (implementation,compiler) pairs: 1, namely:
Implementation | Compiler |
portable | clang -march=native -O3 -fwrapv -Qunused-arguments -fPIC -fPIE -gdwarf-4 -Wall (Debian_Clang_11.0.1) |
Passed TIMECOP
TIMECOP iterations: 10
Number of similar (implementation,compiler) pairs: 14, namely:
Implementation | Compiler |
int16 | clang -march=native -O3 -fwrapv -Qunused-arguments -fPIC -fPIE -gdwarf-4 -Wall (Debian_Clang_11.0.1) |
int16 | clang -march=native -O -fwrapv -Qunused-arguments -fPIC -fPIE -gdwarf-4 -Wall (Debian_Clang_11.0.1) |
int16 | clang -mcpu=native -O3 -fwrapv -Qunused-arguments -fPIC -fPIE -gdwarf-4 -Wall (Debian_Clang_11.0.1) |
int16 | gcc -march=native -mtune=native -O -fwrapv -fPIC -fPIE -gdwarf-4 -Wall (10.2.1_20210110) |
int16 | gcc -march=native -mtune=native -Os -fwrapv -fPIC -fPIE -gdwarf-4 -Wall (10.2.1_20210110) |
portable | clang -march=native -O -fwrapv -Qunused-arguments -fPIC -fPIE -gdwarf-4 -Wall (Debian_Clang_11.0.1) |
portable | clang -mcpu=native -O3 -fwrapv -Qunused-arguments -fPIC -fPIE -gdwarf-4 -Wall (Debian_Clang_11.0.1) |
portable | gcc -march=native -mtune=native -O -fwrapv -fPIC -fPIE -gdwarf-4 -Wall (10.2.1_20210110) |
portable | gcc -march=native -mtune=native -Os -fwrapv -fPIC -fPIE -gdwarf-4 -Wall (10.2.1_20210110) |
ref | clang -march=native -O3 -fwrapv -Qunused-arguments -fPIC -fPIE -gdwarf-4 -Wall (Debian_Clang_11.0.1) |
ref | clang -march=native -O -fwrapv -Qunused-arguments -fPIC -fPIE -gdwarf-4 -Wall (Debian_Clang_11.0.1) |
ref | clang -mcpu=native -O3 -fwrapv -Qunused-arguments -fPIC -fPIE -gdwarf-4 -Wall (Debian_Clang_11.0.1) |
ref | gcc -march=native -mtune=native -O -fwrapv -fPIC -fPIE -gdwarf-4 -Wall (10.2.1_20210110) |
ref | gcc -march=native -mtune=native -Os -fwrapv -fPIC -fPIE -gdwarf-4 -Wall (10.2.1_20210110) |