Implementation notes: amd64, margaux, crypto_aead/norx3261v1

Computer: margaux
Microarchitecture: amd64; Core 2 65nm (6fb)
Architecture: amd64
CPU ID: GenuineIntel-000006fb-bfebfbff
SUPERCOP version: 20240107
Operation: crypto_aead
Primitive: norx3261v1
Time  Object size  Test size  Implementation  Compiler  Benchmark date  SUPERCOP version
74694  7807 0 0  30051 844 1024  T:xmm  clang_-mcpu=native_-O3_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE  20231215  20231212
105289  4850 8 0  27299 852 1024  T:ref  clang_-march=native_-O3_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE  20231215  20231212
105295  4834 8 0  26171 852 1024  T:ref  clang_-march=native_-O2_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE  20231215  20231212
106086  2999 8 0  22277 844 1024  T:ref  clang_-march=native_-Os_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE  20231215  20231212
109772  4728 8 0  27067 852 1024  T:ref  clang_-mcpu=native_-O3_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE  20231215  20231212
127694  3670 8 0  23153 788 1056  T:ref  gcc_-march=native_-mtune=native_-Os_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE  20231215  20231212
131353  4664 8 0  24651 852 1024  T:ref  clang_-march=native_-O_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE  20231215  20231212
135036  9154 8 0  32148 820 1088  T:ref  gcc_-march=native_-mtune=native_-O3_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE  20231215  20231212
135468  4706 8 0  26556 820 1088  T:ref  gcc_-march=native_-mtune=native_-O2_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE  20231215  20231212
144313  3885 8 0  25076 812 1088  T:ref  gcc_-march=native_-mtune=native_-O_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE  20231215  20231212

Compiler output

Implementation: T:xmm
Security model: timingleaks
Compiler: clang -march=native -O2 -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE
norx.c: norx.c:430:13: error: '__builtin_ia32_pblendw128' needs target feature sse4.1
norx.c: DECRYPT_BLOCK(A, B, C, D, c, m);
norx.c: ^
norx.c: norx.c:249:60: note: expanded from macro 'DECRYPT_BLOCK'
norx.c: W2 = LOADL(IN + 32); STOREL(OUT + 32, XOR(C, W2)); C = BLEND(C, W2); \
norx.c: ^
norx.c: norx.c:55:21: note: expanded from macro 'BLEND'
norx.c: #define BLEND(A, B) _mm_blend_epi16((A), (B), 0x0F)
norx.c: ^
norx.c: /usr/lib/llvm-14/lib/clang/14.0.0/include/smmintrin.h:520:14: note: expanded from macro '_mm_blend_epi16'
norx.c: ((__m128i) __builtin_ia32_pblendw128 ((__v8hi)(__m128i)(V1), \
norx.c: ^
norx.c: norx.c:435:9: error: '__builtin_ia32_pblendw128' needs target feature sse4.1
norx.c: DECRYPT_LASTBLOCK(A, B, C, D, c, clen, m);
norx.c: ^
norx.c: norx.c:266:73: note: expanded from macro 'DECRYPT_LASTBLOCK'
norx.c: W2 = LOADL(lastblock + 32); STOREL(lastblock + 32, XOR(C, W2)); C = BLEND(C, W2); \
norx.c: ^
norx.c: norx.c:55:21: note: expanded from macro 'BLEND'
norx.c: #define BLEND(A, B) _mm_blend_epi16((A), (B), 0x0F)
norx.c: ^
norx.c: /usr/lib/llvm-14/lib/clang/14.0.0/include/smmintrin.h:520:14: note: expanded from macro '_mm_blend_epi16'
norx.c: ((__m128i) __builtin_ia32_pblendw128 ((__v8hi)(__m128i)(V1), \
norx.c: ^
norx.c: 2 errors generated.

Number of similar (compiler,implementation) pairs: 4, namely:
Compiler  Implementations
clang -march=native -O2 -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE  T:xmm
clang -march=native -O3 -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE  T:xmm
clang -march=native -O -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE  T:xmm
clang -march=native -Os -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE  T:xmm
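
The errors above come from the BLEND macro, which expands to _mm_blend_epi16, an SSE4.1 intrinsic that the compiler does not enable for this CPU under -march=native. The following is a minimal sketch of how the same selection (mask 0x0F: low 64 bits from the second operand, high 64 bits from the first) could be expressed with SSE2 only. The names blend_low64 and blend_low64_bytes are hypothetical and are not part of norx.c; this is an illustration under those assumptions, not the implementation's actual fix.

```c
#include <emmintrin.h>   /* SSE2 intrinsics */
#include <stdint.h>
#ifdef __SSE4_1__
#include <smmintrin.h>   /* SSE4.1: _mm_blend_epi16 */
#endif

/* Hypothetical stand-in for BLEND(A, B) with mask 0x0F:
 * result = low 64 bits of b, high 64 bits of a. */
static inline __m128i blend_low64(__m128i a, __m128i b)
{
#ifdef __SSE4_1__
    /* Same call the original macro makes, used only when SSE4.1 is enabled. */
    return _mm_blend_epi16(a, b, 0x0F);
#else
    /* SSE2 fallback: _mm_move_sd takes the low double from its second
     * argument and the high double from its first argument. */
    return _mm_castpd_si128(
        _mm_move_sd(_mm_castsi128_pd(a), _mm_castsi128_pd(b)));
#endif
}

/* Helper for experimenting on plain byte buffers (hypothetical). */
static void blend_low64_bytes(uint8_t out[16],
                              const uint8_t a[16],
                              const uint8_t b[16])
{
    __m128i va = _mm_loadu_si128((const __m128i *)a);
    __m128i vb = _mm_loadu_si128((const __m128i *)b);
    _mm_storeu_si128((__m128i *)out, blend_low64(va, vb));
}
```

With a fallback of this kind, the translation unit would compile whether or not SSE4.1 is enabled; whether it matches the performance of the blend instruction on newer CPUs is not measured here.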

Compiler output

Implementation: T:xmm
Security model: timingleaks
Compiler: gcc -march=native -mtune=native -O2 -fomit-frame-pointer -fwrapv -fPIC -fPIE
norx.c: In file included from /usr/lib/gcc/x86_64-linux-gnu/11/include/immintrin.h:39,
norx.c: from /usr/lib/gcc/x86_64-linux-gnu/11/include/x86intrin.h:32,
norx.c: from norx.c:27:
norx.c: norx.c: In function 'crypto_aead_norx3261v1_xmm_timingleaks_decrypt':
norx.c: /usr/lib/gcc/x86_64-linux-gnu/11/include/smmintrin.h:166:1: error: inlining failed in call to 'always_inline' '_mm_blend_epi16': target specific option mismatch
norx.c: 166 | _mm_blend_epi16 (__m128i __X, __m128i __Y, const int __M)
norx.c: | ^~~~~~~~~~~~~~~
norx.c: norx.c:55:21: note: called from here
norx.c: 55 | #define BLEND(A, B) _mm_blend_epi16((A), (B), 0x0F)
norx.c: | ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
norx.c: norx.c:249:60: note: in expansion of macro 'BLEND'
norx.c: 249 | W2 = LOADL(IN + 32); STOREL(OUT + 32, XOR(C, W2)); C = BLEND(C, W2); \
norx.c: | ^~~~~
norx.c: norx.c:430:13: note: in expansion of macro 'DECRYPT_BLOCK'
norx.c: 430 | DECRYPT_BLOCK(A, B, C, D, c, m);
norx.c: | ^~~~~~~~~~~~~
norx.c: In file included from /usr/lib/gcc/x86_64-linux-gnu/11/include/immintrin.h:39,
norx.c: from /usr/lib/gcc/x86_64-linux-gnu/11/include/x86intrin.h:32,
norx.c: from norx.c:27:
norx.c: /usr/lib/gcc/x86_64-linux-gnu/11/include/smmintrin.h:166:1: error: inlining failed in call to 'always_inline' '_mm_blend_epi16': target specific option mismatch
norx.c: 166 | _mm_blend_epi16 (__m128i __X, __m128i __Y, const int __M)
norx.c: | ^~~~~~~~~~~~~~~
norx.c: norx.c:55:21: note: called from here
norx.c: 55 | #define BLEND(A, B) _mm_blend_epi16((A), (B), 0x0F)
norx.c: | ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
norx.c: ...

Number of similar (compiler,implementation) pairs: 4, namely:
Compiler  Implementations
gcc -march=native -mtune=native -O2 -fomit-frame-pointer -fwrapv -fPIC -fPIE  T:xmm
gcc -march=native -mtune=native -O3 -fomit-frame-pointer -fwrapv -fPIC -fPIE  T:xmm
gcc -march=native -mtune=native -O -fomit-frame-pointer -fwrapv -fPIC -fPIE  T:xmm
gcc -march=native -mtune=native -Os -fomit-frame-pointer -fwrapv -fPIC -fPIE  T:xmm
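
The "target specific option mismatch" reported by gcc means the SSE4.1 intrinsic is being used in a function compiled without SSE4.1 enabled, which is what -march=native yields on a host whose CPU lacks that feature. As a rough way to check a host, the sketch below queries the CPU at run time using the GCC/Clang x86 builtin __builtin_cpu_supports. The function names host_has_sse41 and xmm_build_expectation are hypothetical helpers introduced only for this illustration.

```c
#include <stdio.h>

/* Returns 1 if the running CPU reports SSE4.1, 0 otherwise.
 * Relies on GCC/Clang builtins that are available on x86 targets only. */
static int host_has_sse41(void)
{
    __builtin_cpu_init();
    return __builtin_cpu_supports("sse4.1") ? 1 : 0;
}

/* Hypothetical helper: describes what to expect from a build that calls
 * _mm_blend_epi16 when compiled with -march=native on this host. */
static const char *xmm_build_expectation(int has_sse41)
{
    return has_sse41
        ? "SSE4.1 present: _mm_blend_epi16 should compile with -march=native"
        : "SSE4.1 absent: _mm_blend_epi16 is expected to be rejected";
}

/* Prints the expectation for the current host. */
static void report_host(void)
{
    puts(xmm_build_expectation(host_has_sse41()));
}
```

Calling report_host() on the benchmark machine would be one way to confirm that the failures listed above stem from the CPU's feature set rather than from the compiler installation.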