Implementation notes: amd64, kizomba, crypto_decode/1013x2393

Computer: kizomba
Microarchitecture: amd64; Kaby Lake (906e9)
Architecture: amd64
CPU ID: GenuineIntel-000906e9-1fc9cbf5
SUPERCOP version: 20240716
Operation: crypto_decode
Primitive: 1013x2393
TimeObject sizeTest sizeImplementationCompilerBenchmark dateSUPERCOP version
15102770 0 013190 808 792avxclang_-march=native_-Os_-fwrapv_-Qunused-arguments_-fPIC_-fPIE_-gdwarf-4_-Wall2024071620240716
15143615 0 013220 816 728avxclang_-march=native_-O_-fwrapv_-Qunused-arguments_-fPIC_-fPIE_-gdwarf-4_-Wall2024071620240716
15183937 0 014646 776 800avxgcc_-march=native_-mtune=native_-O2_-fwrapv_-fPIC_-fPIE_-gdwarf-4_-Wall2024071620240716
15474779 0 018076 816 744avxclang_-march=native_-O2_-fwrapv_-Qunused-arguments_-fPIC_-fPIE_-gdwarf-4_-Wall2024071620240716
15514779 0 018372 816 792avxclang_-march=native_-O3_-fwrapv_-Qunused-arguments_-fPIC_-fPIE_-gdwarf-4_-Wall2024071620240716
17013947 0 014277 768 800avxgcc_-march=native_-mtune=native_-O_-fwrapv_-fPIC_-fPIE_-gdwarf-4_-Wall2024071620240716
17265187 0 018038 776 800avxgcc_-march=native_-mtune=native_-O3_-fwrapv_-fPIC_-fPIE_-gdwarf-4_-Wall2024071620240716
17363893 0 013233 752 768avxgcc_-march=native_-mtune=native_-Os_-fwrapv_-fPIC_-fPIE_-gdwarf-4_-Wall2024071620240716
895511520 0 024868 816 744portableclang_-march=native_-O2_-fwrapv_-Qunused-arguments_-fPIC_-fPIE_-gdwarf-4_-Wall2024071620240716
895812209 0 025852 816 792portableclang_-march=native_-O3_-fwrapv_-Qunused-arguments_-fPIC_-fPIE_-gdwarf-4_-Wall2024071620240716
129042984 0 014940 816 728int16clang_-mcpu=native_-O3_-fwrapv_-Qunused-arguments_-fPIC_-fPIE_-gdwarf-4_-Wall2024071620240716
129692939 0 016260 816 744int16clang_-march=native_-O2_-fwrapv_-Qunused-arguments_-fPIC_-fPIE_-gdwarf-4_-Wall2024071620240716
130222939 0 016556 816 792int16clang_-march=native_-O3_-fwrapv_-Qunused-arguments_-fPIC_-fPIE_-gdwarf-4_-Wall2024071620240716
131702223 0 012662 808 792int16clang_-march=native_-Os_-fwrapv_-Qunused-arguments_-fPIC_-fPIE_-gdwarf-4_-Wall2024071620240716
134252416 0 012012 816 728int16clang_-march=native_-O_-fwrapv_-Qunused-arguments_-fPIC_-fPIE_-gdwarf-4_-Wall2024071620240716
158044322 0 017166 776 800portablegcc_-march=native_-mtune=native_-O3_-fwrapv_-fPIC_-fPIE_-gdwarf-4_-Wall2024071620240716
159253140 0 013814 776 800portablegcc_-march=native_-mtune=native_-O2_-fwrapv_-fPIC_-fPIE_-gdwarf-4_-Wall2024071620240716
170312631 0 012933 768 800int16gcc_-march=native_-mtune=native_-O_-fwrapv_-fPIC_-fPIE_-gdwarf-4_-Wall2024071620240716
170786237 0 018212 816 728portableclang_-mcpu=native_-O3_-fwrapv_-Qunused-arguments_-fPIC_-fPIE_-gdwarf-4_-Wall2024071620240716
171742542 0 013214 776 800int16gcc_-march=native_-mtune=native_-O2_-fwrapv_-fPIC_-fPIE_-gdwarf-4_-Wall2024071620240716
177522511 0 011857 752 768int16gcc_-march=native_-mtune=native_-Os_-fwrapv_-fPIC_-fPIE_-gdwarf-4_-Wall2024071620240716
200183019 0 013454 808 792portableclang_-march=native_-Os_-fwrapv_-Qunused-arguments_-fPIC_-fPIE_-gdwarf-4_-Wall2024071620240716
208453208 0 012804 816 728portableclang_-march=native_-O_-fwrapv_-Qunused-arguments_-fPIC_-fPIE_-gdwarf-4_-Wall2024071620240716
216393483 0 016318 776 800int16gcc_-march=native_-mtune=native_-O3_-fwrapv_-fPIC_-fPIE_-gdwarf-4_-Wall2024071620240716
393831810 0 014590 776 800refgcc_-march=native_-mtune=native_-O3_-fwrapv_-fPIC_-fPIE_-gdwarf-4_-Wall2024071620240716
397722358 0 014316 816 728refclang_-mcpu=native_-O3_-fwrapv_-Qunused-arguments_-fPIC_-fPIE_-gdwarf-4_-Wall2024071620240716
397843344 0 016956 816 792refclang_-march=native_-O3_-fwrapv_-Qunused-arguments_-fPIC_-fPIE_-gdwarf-4_-Wall2024071620240716
404802286 0 015604 816 744refclang_-march=native_-O2_-fwrapv_-Qunused-arguments_-fPIC_-fPIE_-gdwarf-4_-Wall2024071620240716
414101633 0 012246 776 800refgcc_-march=native_-mtune=native_-O2_-fwrapv_-fPIC_-fPIE_-gdwarf-4_-Wall2024071620240716
428671972 0 012285 768 800portablegcc_-march=native_-mtune=native_-O_-fwrapv_-fPIC_-fPIE_-gdwarf-4_-Wall2024071620240716
430381897 0 011265 752 768portablegcc_-march=native_-mtune=native_-Os_-fwrapv_-fPIC_-fPIE_-gdwarf-4_-Wall2024071620240716
457341165 0 010764 816 728refclang_-march=native_-O_-fwrapv_-Qunused-arguments_-fPIC_-fPIE_-gdwarf-4_-Wall2024071620240716
471591899 0 012270 808 792refclang_-march=native_-Os_-fwrapv_-Qunused-arguments_-fPIC_-fPIE_-gdwarf-4_-Wall2024071620240716
488771648 0 011861 768 800refgcc_-march=native_-mtune=native_-O_-fwrapv_-fPIC_-fPIE_-gdwarf-4_-Wall2024071620240716
553331523 0 010753 752 768refgcc_-march=native_-mtune=native_-Os_-fwrapv_-fPIC_-fPIE_-gdwarf-4_-Wall2024071620240716

Compiler output


decode.c: decode.c:40:23: warning: unused function 'addconst' [-Wunused-function]
decode.c: static inline __m256i addconst(__m256i x,int16 y)
decode.c:                       ^
decode.c: 1 warning generated.

Number of similar (implementation,compiler) pairs: 4, namely:
ImplementationCompiler
avxclang -march=native -O2 -fwrapv -Qunused-arguments -fPIC -fPIE -gdwarf-4 -Wall (Ubuntu_Clang_14.0.0)
avxclang -march=native -O3 -fwrapv -Qunused-arguments -fPIC -fPIE -gdwarf-4 -Wall (Ubuntu_Clang_14.0.0)
avxclang -march=native -O -fwrapv -Qunused-arguments -fPIC -fPIE -gdwarf-4 -Wall (Ubuntu_Clang_14.0.0)
avxclang -march=native -Os -fwrapv -Qunused-arguments -fPIC -fPIE -gdwarf-4 -Wall (Ubuntu_Clang_14.0.0)

Compiler output


decode.c: decode.c:273:15: error: always_inline function '_mm256_loadu_si256' requires target feature 'avx', but would be inlined into function 'crypto_decode_1013x2393_avx_constbranchindex' that is compiled without support for 'avx'
decode.c:     A2 = A0 = _mm256_loadu_si256((__m256i *) &R5[i]);
decode.c:               ^
decode.c: decode.c:273:15: error: AVX vector return of type '__m256i' (vector of 4 'long long' values) without 'avx' enabled changes the ABI
decode.c: decode.c:274:10: error: always_inline function '_mm256_loadu_si256' requires target feature 'avx', but would be inlined into function 'crypto_decode_1013x2393_avx_constbranchindex' that is compiled without support for 'avx'
decode.c:     S0 = _mm256_loadu_si256((__m256i *) (s+2*i));
decode.c:          ^
decode.c: decode.c:274:10: error: AVX vector return of type '__m256i' (vector of 4 'long long' values) without 'avx' enabled changes the ABI
decode.c: decode.c:275:10: error: always_inline function '_mm256_srli_epi16' requires target feature 'avx2', but would be inlined into function 'crypto_decode_1013x2393_avx_constbranchindex' that is compiled without support for 'avx2'
decode.c:     S1 = _mm256_srli_epi16(S0,8);
decode.c:          ^
decode.c: decode.c:275:10: error: AVX vector argument of type '__m256i' (vector of 4 'long long' values) without 'avx' enabled changes the ABI
decode.c: decode.c:276:11: error: always_inline function '_mm256_set1_epi16' requires target feature 'avx', but would be inlined into function 'crypto_decode_1013x2393_avx_constbranchindex' that is compiled without support for 'avx'
decode.c:     S0 &= _mm256_set1_epi16(255);
decode.c:           ^
decode.c: decode.c:276:11: error: AVX vector return of type '__m256i' (vector of 4 'long long' values) without 'avx' enabled changes the ABI
decode.c: decode.c:277:14: warning: AVX vector argument of type '__m256i' (vector of 4 'long long' values) without 'avx' enabled changes the ABI [-Wpsabi]
decode.c:     A0 = sub(mulhiconst(A0,412),mulhiconst(mulloconst(A0,-5118),3278)); /* -1639...1742 */
decode.c:              ^
decode.c: decode.c:277:44: warning: AVX vector argument of type '__m256i' (vector of 4 'long long' values) without 'avx' enabled changes the ABI [-Wpsabi]
decode.c:     A0 = sub(mulhiconst(A0,412),mulhiconst(mulloconst(A0,-5118),3278)); /* -1639...1742 */
decode.c:                                            ^
decode.c: decode.c:277:33: warning: AVX vector argument of type '__m256i' (vector of 4 'long long' values) without 'avx' enabled changes the ABI [-Wpsabi]
decode.c:     A0 = sub(mulhiconst(A0,412),mulhiconst(mulloconst(A0,-5118),3278)); /* -1639...1742 */
decode.c:                                 ^
decode.c: ...

Number of similar (implementation,compiler) pairs: 1, namely:
ImplementationCompiler
avxclang -mcpu=native -O3 -fwrapv -Qunused-arguments -fPIC -fPIE -gdwarf-4 -Wall (Ubuntu_Clang_14.0.0)

Compiler output


decode.c: decode.c:23:21: warning: unused variable 'hi' [-Wunused-variable]
decode.c:   int16 a0,a1,ri,lo,hi,s0,s1;
decode.c:                     ^
decode.c: 1 warning generated.

Number of similar (implementation,compiler) pairs: 5, namely:
ImplementationCompiler
int16clang -march=native -O2 -fwrapv -Qunused-arguments -fPIC -fPIE -gdwarf-4 -Wall (Ubuntu_Clang_14.0.0)
int16clang -march=native -O3 -fwrapv -Qunused-arguments -fPIC -fPIE -gdwarf-4 -Wall (Ubuntu_Clang_14.0.0)
int16clang -march=native -O -fwrapv -Qunused-arguments -fPIC -fPIE -gdwarf-4 -Wall (Ubuntu_Clang_14.0.0)
int16clang -march=native -Os -fwrapv -Qunused-arguments -fPIC -fPIE -gdwarf-4 -Wall (Ubuntu_Clang_14.0.0)
int16clang -mcpu=native -O3 -fwrapv -Qunused-arguments -fPIC -fPIE -gdwarf-4 -Wall (Ubuntu_Clang_14.0.0)

Compiler output


decode.c: decode.c: In function 'crypto_decode_1013x2393_int16_constbranchindex':
decode.c: decode.c:23:21: warning: unused variable 'hi' [-Wunused-variable]
decode.c:    23 |   int16 a0,a1,ri,lo,hi,s0,s1;
decode.c:       |                     ^~

Number of similar (implementation,compiler) pairs: 4, namely:
ImplementationCompiler
int16gcc -march=native -mtune=native -O2 -fwrapv -fPIC -fPIE -gdwarf-4 -Wall (11.4.0)
int16gcc -march=native -mtune=native -O3 -fwrapv -fPIC -fPIE -gdwarf-4 -Wall (11.4.0)
int16gcc -march=native -mtune=native -O -fwrapv -fPIC -fPIE -gdwarf-4 -Wall (11.4.0)
int16gcc -march=native -mtune=native -Os -fwrapv -fPIC -fPIE -gdwarf-4 -Wall (11.4.0)

Passed TIMECOP


TIMECOP iterations: 10

Number of similar (implementation,compiler) pairs: 35, namely:
ImplementationCompiler
avxclang -march=native -O2 -fwrapv -Qunused-arguments -fPIC -fPIE -gdwarf-4 -Wall (Ubuntu_Clang_14.0.0)
avxclang -march=native -O3 -fwrapv -Qunused-arguments -fPIC -fPIE -gdwarf-4 -Wall (Ubuntu_Clang_14.0.0)
avxclang -march=native -O -fwrapv -Qunused-arguments -fPIC -fPIE -gdwarf-4 -Wall (Ubuntu_Clang_14.0.0)
avxclang -march=native -Os -fwrapv -Qunused-arguments -fPIC -fPIE -gdwarf-4 -Wall (Ubuntu_Clang_14.0.0)
avxgcc -march=native -mtune=native -O2 -fwrapv -fPIC -fPIE -gdwarf-4 -Wall (11.4.0)
avxgcc -march=native -mtune=native -O3 -fwrapv -fPIC -fPIE -gdwarf-4 -Wall (11.4.0)
avxgcc -march=native -mtune=native -O -fwrapv -fPIC -fPIE -gdwarf-4 -Wall (11.4.0)
avxgcc -march=native -mtune=native -Os -fwrapv -fPIC -fPIE -gdwarf-4 -Wall (11.4.0)
int16clang -march=native -O2 -fwrapv -Qunused-arguments -fPIC -fPIE -gdwarf-4 -Wall (Ubuntu_Clang_14.0.0)
int16clang -march=native -O3 -fwrapv -Qunused-arguments -fPIC -fPIE -gdwarf-4 -Wall (Ubuntu_Clang_14.0.0)
int16clang -march=native -O -fwrapv -Qunused-arguments -fPIC -fPIE -gdwarf-4 -Wall (Ubuntu_Clang_14.0.0)
int16clang -march=native -Os -fwrapv -Qunused-arguments -fPIC -fPIE -gdwarf-4 -Wall (Ubuntu_Clang_14.0.0)
int16clang -mcpu=native -O3 -fwrapv -Qunused-arguments -fPIC -fPIE -gdwarf-4 -Wall (Ubuntu_Clang_14.0.0)
int16gcc -march=native -mtune=native -O2 -fwrapv -fPIC -fPIE -gdwarf-4 -Wall (11.4.0)
int16gcc -march=native -mtune=native -O3 -fwrapv -fPIC -fPIE -gdwarf-4 -Wall (11.4.0)
int16gcc -march=native -mtune=native -O -fwrapv -fPIC -fPIE -gdwarf-4 -Wall (11.4.0)
int16gcc -march=native -mtune=native -Os -fwrapv -fPIC -fPIE -gdwarf-4 -Wall (11.4.0)
portableclang -march=native -O2 -fwrapv -Qunused-arguments -fPIC -fPIE -gdwarf-4 -Wall (Ubuntu_Clang_14.0.0)
portableclang -march=native -O3 -fwrapv -Qunused-arguments -fPIC -fPIE -gdwarf-4 -Wall (Ubuntu_Clang_14.0.0)
portableclang -march=native -O -fwrapv -Qunused-arguments -fPIC -fPIE -gdwarf-4 -Wall (Ubuntu_Clang_14.0.0)
portableclang -march=native -Os -fwrapv -Qunused-arguments -fPIC -fPIE -gdwarf-4 -Wall (Ubuntu_Clang_14.0.0)
portableclang -mcpu=native -O3 -fwrapv -Qunused-arguments -fPIC -fPIE -gdwarf-4 -Wall (Ubuntu_Clang_14.0.0)
portablegcc -march=native -mtune=native -O2 -fwrapv -fPIC -fPIE -gdwarf-4 -Wall (11.4.0)
portablegcc -march=native -mtune=native -O3 -fwrapv -fPIC -fPIE -gdwarf-4 -Wall (11.4.0)
portablegcc -march=native -mtune=native -O -fwrapv -fPIC -fPIE -gdwarf-4 -Wall (11.4.0)
portablegcc -march=native -mtune=native -Os -fwrapv -fPIC -fPIE -gdwarf-4 -Wall (11.4.0)
refclang -march=native -O2 -fwrapv -Qunused-arguments -fPIC -fPIE -gdwarf-4 -Wall (Ubuntu_Clang_14.0.0)
refclang -march=native -O3 -fwrapv -Qunused-arguments -fPIC -fPIE -gdwarf-4 -Wall (Ubuntu_Clang_14.0.0)
refclang -march=native -O -fwrapv -Qunused-arguments -fPIC -fPIE -gdwarf-4 -Wall (Ubuntu_Clang_14.0.0)
refclang -march=native -Os -fwrapv -Qunused-arguments -fPIC -fPIE -gdwarf-4 -Wall (Ubuntu_Clang_14.0.0)
refclang -mcpu=native -O3 -fwrapv -Qunused-arguments -fPIC -fPIE -gdwarf-4 -Wall (Ubuntu_Clang_14.0.0)
refgcc -march=native -mtune=native -O2 -fwrapv -fPIC -fPIE -gdwarf-4 -Wall (11.4.0)
refgcc -march=native -mtune=native -O3 -fwrapv -fPIC -fPIE -gdwarf-4 -Wall (11.4.0)
refgcc -march=native -mtune=native -O -fwrapv -fPIC -fPIE -gdwarf-4 -Wall (11.4.0)
refgcc -march=native -mtune=native -Os -fwrapv -fPIC -fPIE -gdwarf-4 -Wall (11.4.0)