Implementation notes: amd64, nucnuc, crypto_decode/1013x7177

Computer: nucnuc
Microarchitecture: amd64; Airmont (406c3)
Architecture: amd64
CPU ID: GenuineIntel-000406c3-bfebfbff
SUPERCOP version: 20240107
Operation: crypto_decode
Primitive: 1013x7177
Time    Object size  Test size      Implementation  Compiler                                                                             Benchmark date  SUPERCOP version
30261   3577 0 0     16312 812 888  int16           clang_-march=native_-O3_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE  20231221        20231217
30304   2683 0 0     13776 812 888  int16           clang_-march=native_-O2_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE  20231221        20231217
30630   2345 0 0     11926 804 888  int16           clang_-march=native_-Os_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE  20231221        20231217
31605   3595 0 0     15920 812 888  int16           clang_-mcpu=native_-O3_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE   20231221        20231217
32783   3878 0 0     15576 780 952  int16           gcc_-march=native_-mtune=native_-O3_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE         20231221        20231217
32798   2476 0 0     13000 780 952  int16           gcc_-march=native_-mtune=native_-O2_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE         20231221        20231217
33921   2532 0 0     12639 772 952  int16           gcc_-march=native_-mtune=native_-O_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE          20231221        20231217
33934   2477 0 0     11643 756 920  int16           gcc_-march=native_-mtune=native_-Os_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE         20231221        20231217
44436   4449 0 0     17184 812 888  portable        clang_-march=native_-O3_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE  20231221        20231217
44768   2975 0 0     12558 804 888  portable        clang_-march=native_-Os_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE  20231221        20231217
45091   6241 0 0     17920 780 952  portable        gcc_-march=native_-mtune=native_-O3_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE         20231221        20231217
45180   3425 0 0     14512 812 888  portable        clang_-march=native_-O2_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE  20231221        20231217
45753   4646 0 0     16968 812 888  portable        clang_-mcpu=native_-O3_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE   20231221        20231217
46077   3113 0 0     13648 780 952  portable        gcc_-march=native_-mtune=native_-O2_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE         20231221        20231217
52286   3593 0 0     13694 804 888  int16           clang_-march=native_-O_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE   20231221        20231217
138395  1361 0 0     13056 780 952  ref             gcc_-march=native_-mtune=native_-O3_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE         20231221        20231217
139839  2383 0 0     14720 812 888  ref             clang_-mcpu=native_-O3_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE   20231221        20231217
142786  1249 0 0     11776 780 952  ref             gcc_-march=native_-mtune=native_-O2_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE         20231221        20231217
143688  1360 0 0     12456 812 888  ref             clang_-march=native_-O2_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE  20231221        20231217
143764  2308 0 0     15048 812 888  ref             clang_-march=native_-O3_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE  20231221        20231217
144872  1661 0 0     10843 756 920  portable        gcc_-march=native_-mtune=native_-Os_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE         20231221        20231217
147084  1805 0 0     11919 772 952  portable        gcc_-march=native_-mtune=native_-O_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE          20231221        20231217
148759  1898 0 0     11998 804 888  portable        clang_-march=native_-O_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE   20231221        20231217
150015  1100 0 0     10654 804 888  ref             clang_-march=native_-Os_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE  20231221        20231217
160236  1176 0 0     11271 772 952  ref             gcc_-march=native_-mtune=native_-O_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE          20231221        20231217
167969  1056 0 0     10171 756 920  ref             gcc_-march=native_-mtune=native_-Os_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE         20231221        20231217
168171  1131 0 0     11206 804 888  ref             clang_-march=native_-O_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE   20231221        20231217

Compiler output

Implementation: avx
Security model: constbranchindex
Compiler: clang -march=native -O2 -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE
decode.c: decode.c:276:15: error: always_inline function '_mm256_loadu_si256' requires target feature 'avx', but would be inlined into function 'crypto_decode_1013x7177_avx_constbranchindex' that is compiled without support for 'avx'
decode.c: A2 = A0 = _mm256_loadu_si256((__m256i *) &R5[i]);
decode.c: ^
decode.c: decode.c:276:15: error: AVX vector return of type '__m256i' (vector of 4 'long long' values) without 'avx' enabled changes the ABI
decode.c: decode.c:277:10: error: always_inline function '_mm256_loadu_si256' requires target feature 'avx', but would be inlined into function 'crypto_decode_1013x7177_avx_constbranchindex' that is compiled without support for 'avx'
decode.c: S0 = _mm256_loadu_si256((__m256i *) (s+2*i));
decode.c: ^
decode.c: decode.c:277:10: error: AVX vector return of type '__m256i' (vector of 4 'long long' values) without 'avx' enabled changes the ABI
decode.c: decode.c:278:10: error: always_inline function '_mm256_srli_epi16' requires target feature 'avx2', but would be inlined into function 'crypto_decode_1013x7177_avx_constbranchindex' that is compiled without support for 'avx2'
decode.c: S1 = _mm256_srli_epi16(S0,8);
decode.c: ^
decode.c: decode.c:278:10: error: AVX vector argument of type '__m256i' (vector of 4 'long long' values) without 'avx' enabled changes the ABI
decode.c: decode.c:279:11: error: always_inline function '_mm256_set1_epi16' requires target feature 'avx', but would be inlined into function 'crypto_decode_1013x7177_avx_constbranchindex' that is compiled without support for 'avx'
decode.c: S0 &= _mm256_set1_epi16(255);
decode.c: ^
decode.c: decode.c:279:11: error: AVX vector return of type '__m256i' (vector of 4 'long long' values) without 'avx' enabled changes the ABI
decode.c: decode.c:280:14: warning: AVX vector argument of type '__m256i' (vector of 4 'long long' values) without 'avx' enabled changes the ABI [-Wpsabi]
decode.c: A0 = sub(mulhiconst(A0,538),mulhiconst(mulloconst(A0,-2118),7921)); /* -3961...4095 */
decode.c: ^
decode.c: decode.c:280:44: warning: AVX vector argument of type '__m256i' (vector of 4 'long long' values) without 'avx' enabled changes the ABI [-Wpsabi]
decode.c: A0 = sub(mulhiconst(A0,538),mulhiconst(mulloconst(A0,-2118),7921)); /* -3961...4095 */
decode.c: ^
decode.c: decode.c:280:33: warning: AVX vector argument of type '__m256i' (vector of 4 'long long' values) without 'avx' enabled changes the ABI [-Wpsabi]
decode.c: A0 = sub(mulhiconst(A0,538),mulhiconst(mulloconst(A0,-2118),7921)); /* -3961...4095 */
decode.c: ^
decode.c: ...

Number of similar (compiler,implementation) pairs: 5, namely:
Compiler                                                                             Implementations
clang -march=native -O2 -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE  avx
clang -march=native -O3 -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE  avx
clang -march=native -O -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE   avx
clang -march=native -Os -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE  avx
clang -mcpu=native -O3 -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE   avx
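
Note on the clang diagnostics: they report that the avx implementation calls AVX and AVX2 intrinsics from a function compiled without those target features, which suggests that -march=native does not enable AVX on this machine. The following is a minimal sketch, not taken from decode.c, of one way such code can be kept compilable under these flags: a per-function target attribute plus a runtime check. The names add16_avx2, add16_portable and add16 are hypothetical, and the sketch assumes a gcc- or clang-compatible compiler on x86.

```c
#include <immintrin.h>
#include <stdint.h>

/* Hypothetical helper, not part of decode.c: adds 16 int16 lanes.
   The target attribute enables AVX2 code generation for this one
   function, so the file still compiles when the command-line flags
   do not enable AVX. No __m256i value crosses the function boundary,
   which avoids the ABI warnings seen in the log. */
__attribute__((target("avx2")))
static void add16_avx2(int16_t *out, const int16_t *a, const int16_t *b)
{
  __m256i x = _mm256_loadu_si256((const __m256i *) a);
  __m256i y = _mm256_loadu_si256((const __m256i *) b);
  _mm256_storeu_si256((__m256i *) out, _mm256_add_epi16(x, y));
}

/* Scalar fallback for CPUs without AVX2. */
static void add16_portable(int16_t *out, const int16_t *a, const int16_t *b)
{
  for (int i = 0; i < 16; ++i)
    out[i] = (int16_t) (a[i] + b[i]);
}

/* Takes the AVX2 path only when the running CPU reports AVX2 support. */
static void add16(int16_t *out, const int16_t *a, const int16_t *b)
{
  if (__builtin_cpu_supports("avx2"))
    add16_avx2(out, a, b);
  else
    add16_portable(out, a, b);
}
```

On a CPU without AVX2 the call add16(out, a, b) runs the scalar loop. Whether runtime dispatch of this kind fits the benchmarked package is a separate question; the table above reports timings only for the int16, portable and ref implementations.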

Compiler output

Implementation: avx
Security model: constbranchindex
Compiler: gcc -march=native -mtune=native -O2 -fomit-frame-pointer -fwrapv -fPIC -fPIE
decode.c: decode.c: In function 'add':
decode.c: decode.c:21:1: warning: AVX vector return without AVX enabled changes the ABI [-Wpsabi]
decode.c: 21 | {
decode.c: | ^
decode.c: decode.c: In function 'signedshiftrightconst':
decode.c: decode.c:35:23: note: the ABI for passing parameters with 32-byte alignment has changed in GCC 4.6
decode.c: 35 | static inline __m256i signedshiftrightconst(__m256i x,int16 y)
decode.c: | ^~~~~~~~~~~~~~~~~~~~~
decode.c: In file included from /usr/lib/gcc/x86_64-linux-gnu/10/include/immintrin.h:53,
decode.c: from decode.c:3:
decode.c: decode.c: In function 'add':
decode.c: /usr/lib/gcc/x86_64-linux-gnu/10/include/avx2intrin.h:112:1: error: inlining failed in call to 'always_inline' '_mm256_add_epi16': target specific option mismatch
decode.c: 112 | _mm256_add_epi16 (__m256i __A, __m256i __B)
decode.c: | ^~~~~~~~~~~~~~~~
decode.c: decode.c:22:10: note: called from here
decode.c: 22 | return _mm256_add_epi16(x,y);
decode.c: | ^~~~~~~~~~~~~~~~~~~~~

Number of similar (compiler,implementation) pairs: 4, namely:
Compiler                                                                      Implementations
gcc -march=native -mtune=native -O2 -fomit-frame-pointer -fwrapv -fPIC -fPIE  avx
gcc -march=native -mtune=native -O3 -fomit-frame-pointer -fwrapv -fPIC -fPIE  avx
gcc -march=native -mtune=native -O -fomit-frame-pointer -fwrapv -fPIC -fPIE   avx
gcc -march=native -mtune=native -Os -fomit-frame-pointer -fwrapv -fPIC -fPIE  avx
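
Note on the gcc diagnostics: "target specific option mismatch" is reported when an always_inline intrinsic such as _mm256_add_epi16 is called from a function built without the matching target option. A compile-time alternative, sketched here under the assumption that the compiler defines __AVX2__ only when AVX2 code generation is enabled, is to guard the vector path with the preprocessor. The helper mulhi16 is hypothetical and only mirrors the kind of high-half multiplication that the mulhiconst calls in the log suggest.

```c
#include <stdint.h>

#ifdef __AVX2__
#include <immintrin.h>
#endif

/* Hypothetical helper: for 16 lanes, stores the high 16 bits of the
   signed 32-bit product a[i] * c. */
static void mulhi16(int16_t *out, const int16_t *a, int16_t c)
{
#ifdef __AVX2__
  /* Vector path, compiled only when the flags enable AVX2. */
  __m256i x = _mm256_loadu_si256((const __m256i *) a);
  __m256i y = _mm256_mulhi_epi16(x, _mm256_set1_epi16(c));
  _mm256_storeu_si256((__m256i *) out, y);
#else
  /* Scalar path; relies on arithmetic right shift of negative values,
     which gcc and clang provide. */
  for (int i = 0; i < 16; ++i)
    out[i] = (int16_t) (((int32_t) a[i] * c) >> 16);
#endif
}
```

With the flags shown above on this machine, only the scalar branch would be compiled, so the intrinsic headers are never pulled into a function that lacks the AVX2 target option.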