Implementation notes: amd64, comet, crypto_decode/256x2

Computer: comet
Microarchitecture: amd64; Comet Lake (806ec)
Architecture: amd64
CPU ID: GenuineIntel-000806ec-bfebfbff
SUPERCOP version: 20240107
Operation: crypto_decode
Primitive: 256x2
Time | Object size | Test size | Implementation | Compiler | Benchmark date | SUPERCOP version
64 | 370 0 0 | 12540 780 960 | avx | gcc_-march=native_-mtune=native_-O3_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE | 20240104 | 20231222
68 | 400 0 0 | 14193 852 928 | avx | clang_-march=native_-O2_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE | 20240104 | 20231222
68 | 400 0 0 | 14937 852 960 | avx | clang_-march=native_-O3_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE | 20240104 | 20231222
68 | 176 0 0 | 10059 772 960 | avx | gcc_-march=native_-mtune=native_-O_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE | 20240104 | 20231222
69 | 186 0 0 | 10444 780 960 | avx | gcc_-march=native_-mtune=native_-O2_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE | 20240104 | 20231222
69 | 173 0 0 | 9071 756 928 | avx | gcc_-march=native_-mtune=native_-Os_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE | 20240104 | 20231222
70 | 170 0 0 | 11063 844 960 | avx | clang_-march=native_-Os_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE | 20240104 | 20231222
72 | 174 0 0 | 10265 852 896 | avx | clang_-march=native_-O_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE | 20240104 | 20231222
624 | 162 0 0 | 13937 852 928 | ref | clang_-march=native_-O2_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE | 20240104 | 20231222
685 | 162 0 0 | 14681 852 960 | ref | clang_-march=native_-O3_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE | 20240104 | 20231222
712 | 122 0 0 | 13001 852 896 | ref | clang_-mcpu=native_-O3_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE | 20240104 | 20231222
775 | 93 0 0 | 9947 772 960 | ref | gcc_-march=native_-mtune=native_-O_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE | 20240104 | 20231222
776 | 96 0 0 | 10348 780 960 | ref | gcc_-march=native_-mtune=native_-O2_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE | 20240104 | 20231222
834 | 96 0 0 | 12236 780 960 | ref | gcc_-march=native_-mtune=native_-O3_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE | 20240104 | 20231222
845 | 83 0 0 | 10967 844 960 | ref | clang_-march=native_-Os_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE | 20240104 | 20231222
1088 | 90 0 0 | 8959 756 928 | ref | gcc_-march=native_-mtune=native_-Os_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE | 20240104 | 20231222
1343 | 101 0 0 | 10185 852 896 | ref | clang_-march=native_-O_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE | 20240104 | 20231222

Compiler output

Implementation: avx
Security model: constbranchindex
Compiler: clang -mcpu=native -O3 -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE
decode.c: decode.c:16:17: error: always_inline function '_mm256_set1_epi32' requires target feature 'avx', but would be inlined into function 'crypto_decode_256x2_avx_constbranchindex' that is compiled without support for 'avx'
decode.c: __m256i x = _mm256_set1_epi32(*(int32_t *) s);
decode.c: ^
decode.c: decode.c:16:17: error: AVX vector return of type '__m256i' (vector of 4 'long long' values) without 'avx' enabled changes the ABI
decode.c: decode.c:18:31: error: always_inline function '_mm256_set_epi64x' requires target feature 'avx', but would be inlined into function 'crypto_decode_256x2_avx_constbranchindex' that is compiled without support for 'avx'
decode.c: x = _mm256_shuffle_epi8(x,COPY);
decode.c: ^
decode.c: decode.c:5:14: note: expanded from macro 'COPY'
decode.c: #define COPY _mm256_set_epi64x(0x0303030303030303,0x0202020202020202,0x0101010101010101,0x0000000000000000)
decode.c: ^
decode.c: decode.c:18:31: error: AVX vector return of type '__m256i' (vector of 4 'long long' values) without 'avx' enabled changes the ABI
decode.c: decode.c:5:14: note: expanded from macro 'COPY'
decode.c: #define COPY _mm256_set_epi64x(0x0303030303030303,0x0202020202020202,0x0101010101010101,0x0000000000000000)
decode.c: ^
decode.c: decode.c:18:9: error: always_inline function '_mm256_shuffle_epi8' requires target feature 'avx2', but would be inlined into function 'crypto_decode_256x2_avx_constbranchindex' that is compiled without support for 'avx2'
decode.c: x = _mm256_shuffle_epi8(x,COPY);
decode.c: ^
decode.c: decode.c:18:9: error: AVX vector argument of type '__m256i' (vector of 4 'long long' values) without 'avx' enabled changes the ABI
decode.c: decode.c:20:31: error: always_inline function '_mm256_set1_epi64x' requires target feature 'avx', but would be inlined into function 'crypto_decode_256x2_avx_constbranchindex' that is compiled without support for 'avx'
decode.c: x = _mm256_andnot_si256(x,MASK);
decode.c: ^
decode.c: decode.c:6:14: note: expanded from macro 'MASK'
decode.c: #define MASK _mm256_set1_epi64x(0x8040201008040201)
decode.c: ^
decode.c: decode.c:20:31: error: AVX vector return of type '__m256i' (vector of 4 'long long' values) without 'avx' enabled changes the ABI
decode.c: ...

Number of similar (compiler,implementation) pairs: 1, namely:
Compiler | Implementations
clang -mcpu=native -O3 -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE | avx
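
Note on the errors above: every diagnostic says that an always_inline AVX or AVX2 intrinsic would be inlined into a function compiled without that target feature, which suggests that this clang invocation with -mcpu=native did not enable AVX2 for decode.c. One common way to make such code independent of the command-line flags is a per-function target attribute (supported by GCC and clang). The following is a minimal sketch of that idea, not the SUPERCOP source. The function names decode_256x2_portable, decode_256x2_avx2 and decode_256x2 are hypothetical. The output format (32 input bytes unpacked into 256 bytes, each 0 or 1, low bit first) is an assumption. The two constants and the first three intrinsic calls are taken from the log; the steps after _mm256_andnot_si256 are guesses, since the log is truncated there.

```c
#include <stdint.h>
#include <string.h>

#if (defined(__x86_64__) || defined(__i386__)) && defined(__GNUC__)
#include <immintrin.h>
#define HAVE_X86_TARGET_ATTR 1
#else
#define HAVE_X86_TARGET_ATTR 0
#endif

/* Assumed semantics (not confirmed by the log): unpack 32 bytes into
   256 bytes, each 0 or 1, least significant bit of each input byte first. */
static void decode_256x2_portable(unsigned char *r, const unsigned char *s)
{
  int i;
  for (i = 0; i < 256; ++i) r[i] = 1 & (s[i >> 3] >> (i & 7));
}

#if HAVE_X86_TARGET_ATTR
/* The target attribute enables AVX2 for this one function only, so the
   always_inline intrinsics can be inlined even when the command line
   does not turn AVX2 on. */
__attribute__((target("avx2")))
static void decode_256x2_avx2(unsigned char *r, const unsigned char *s)
{
  /* COPY and MASK constants as they appear in the log. */
  const __m256i copy = _mm256_set_epi64x(0x0303030303030303, 0x0202020202020202,
                                         0x0101010101010101, 0x0000000000000000);
  const __m256i mask = _mm256_set1_epi64x((long long) 0x8040201008040201ULL);
  int i;
  for (i = 0; i < 8; ++i) {
    int32_t w;
    __m256i x;
    memcpy(&w, s + 4 * i, 4);              /* 4 input bytes, no aliasing issue */
    x = _mm256_set1_epi32(w);              /* broadcast them to every 32-bit slot */
    x = _mm256_shuffle_epi8(x, copy);      /* repeat each input byte 8 times */
    x = _mm256_andnot_si256(x, mask);      /* zero exactly where the bit is set */
    /* Everything below is an assumption; the log ends before this point. */
    x = _mm256_cmpeq_epi8(x, _mm256_setzero_si256()); /* 0xFF where bit set */
    x = _mm256_and_si256(x, _mm256_set1_epi8(1));     /* reduce to 0 or 1 */
    _mm256_storeu_si256((__m256i *) (r + 32 * i), x);
  }
}
#endif

/* Run-time dispatch: use the AVX2 path only if the CPU reports AVX2. */
void decode_256x2(unsigned char *r, const unsigned char *s)
{
#if HAVE_X86_TARGET_ATTR
  if (__builtin_cpu_supports("avx2")) {
    decode_256x2_avx2(r, s);
    return;
  }
#endif
  decode_256x2_portable(r, s);
}
```

With this layout the file builds without any AVX flag on the command line, and the run-time check keeps it safe on CPUs without AVX2. Whether that trade-off suits a benchmark that selects implementations per compiler line is a separate question.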