Implementation notes: amd64, h8atom, crypto_aead/norx3261v1

Computer: h8atom
Microarchitecture: amd64; Bonnell (30661)
Architecture: amd64
CPU ID: GenuineIntel-00030661-bfebfbff
SUPERCOP version: 20240107
Operation: crypto_aead
Primitive: norx3261v1
Time | Object size | Test size | Implementation | Compiler | Benchmark date | SUPERCOP version
73304 | 7807 0 0 | 29612 816 856 | T:xmm | clang_-mcpu=native_-O3_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE | 20231215 | 20231212
101227 | 4728 8 0 | 26628 824 856 | T:ref | clang_-mcpu=native_-O3_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE | 20231215 | 20231212
124474 | 4281 8 0 | 25220 824 856 | T:ref | clang_-march=native_-O2_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE | 20231215 | 20231212
124509 | 4297 8 0 | 26412 824 856 | T:ref | clang_-march=native_-O3_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE | 20231215 | 20231212
134358 | 3060 8 0 | 22118 816 856 | T:ref | clang_-march=native_-Os_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE | 20231215 | 20231212
167573 | 4875 8 0 | 24596 824 856 | T:ref | clang_-march=native_-O_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE | 20231215 | 20231212
184625 | 3670 8 0 | 22650 760 896 | T:ref | gcc_-march=native_-mtune=native_-Os_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE | 20231215 | 20231212
197652 | 9093 8 0 | 31725 792 928 | T:ref | gcc_-march=native_-mtune=native_-O3_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE | 20231215 | 20231212
204323 | 4668 8 0 | 25981 792 928 | T:ref | gcc_-march=native_-mtune=native_-O2_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE | 20231215 | 20231212
229341 | 3909 8 0 | 24605 784 928 | T:ref | gcc_-march=native_-mtune=native_-O_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE | 20231215 | 20231212

Compiler output

Implementation: T:xmm
Security model: timingleaks
Compiler: clang -march=native -O2 -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE
norx.c: norx.c:430:13: error: '__builtin_ia32_pblendw128' needs target feature sse4.1
norx.c: DECRYPT_BLOCK(A, B, C, D, c, m);
norx.c: ^
norx.c: norx.c:249:60: note: expanded from macro 'DECRYPT_BLOCK'
norx.c: W2 = LOADL(IN + 32); STOREL(OUT + 32, XOR(C, W2)); C = BLEND(C, W2); \
norx.c: ^
norx.c: norx.c:55:21: note: expanded from macro 'BLEND'
norx.c: #define BLEND(A, B) _mm_blend_epi16((A), (B), 0x0F)
norx.c: ^
norx.c: /usr/lib/llvm-14/lib/clang/14.0.0/include/smmintrin.h:520:14: note: expanded from macro '_mm_blend_epi16'
norx.c: ((__m128i) __builtin_ia32_pblendw128 ((__v8hi)(__m128i)(V1), \
norx.c: ^
norx.c: norx.c:435:9: error: '__builtin_ia32_pblendw128' needs target feature sse4.1
norx.c: DECRYPT_LASTBLOCK(A, B, C, D, c, clen, m);
norx.c: ^
norx.c: norx.c:266:73: note: expanded from macro 'DECRYPT_LASTBLOCK'
norx.c: W2 = LOADL(lastblock + 32); STOREL(lastblock + 32, XOR(C, W2)); C = BLEND(C, W2); \
norx.c: ^
norx.c: norx.c:55:21: note: expanded from macro 'BLEND'
norx.c: #define BLEND(A, B) _mm_blend_epi16((A), (B), 0x0F)
norx.c: ^
norx.c: /usr/lib/llvm-14/lib/clang/14.0.0/include/smmintrin.h:520:14: note: expanded from macro '_mm_blend_epi16'
norx.c: ((__m128i) __builtin_ia32_pblendw128 ((__v8hi)(__m128i)(V1), \
norx.c: ^
norx.c: 2 errors generated.

Number of similar (compiler,implementation) pairs: 4, namely:
Compiler | Implementations
clang -march=native -O2 -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE | T:xmm
clang -march=native -O3 -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE | T:xmm
clang -march=native -O -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE | T:xmm
clang -march=native -Os -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE | T:xmm
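
The clang errors above say that _mm_blend_epi16 needs the sse4.1 target feature, which -march=native evidently does not enable on this machine. As an illustration only, the lane selection performed by BLEND(A, B) with mask 0x0F (16-bit lanes 0..3 from B, lanes 4..7 from A) can be expressed with SSE2 operations alone. The sketch below is not taken from norx.c: the names blend_lo64_sse2, blend_lo64_ref and blend_lo64_selfcheck are hypothetical, and whether such a substitution is appropriate for this implementation is an assumption.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>
#include <emmintrin.h> /* SSE2 intrinsics only; no SSE4.1 header */

/* Hypothetical SSE2-only stand-in for _mm_blend_epi16(a, b, 0x0F):
   the low 64 bits (16-bit lanes 0..3) come from b, the rest from a. */
static __m128i blend_lo64_sse2(__m128i a, __m128i b)
{
    /* _mm_set_epi32 lists elements high to low: ones in the low 64 bits. */
    const __m128i lo_mask = _mm_set_epi32(0, 0, -1, -1);
    return _mm_or_si128(_mm_and_si128(lo_mask, b),     /* mask & b    */
                        _mm_andnot_si128(lo_mask, a)); /* (~mask) & a */
}

/* Scalar model of the same selection: mask bit i is set for i < 4. */
static void blend_lo64_ref(uint16_t out[8], const uint16_t a[8],
                           const uint16_t b[8])
{
    for (int i = 0; i < 8; i++)
        out[i] = (i < 4) ? b[i] : a[i];
}

/* Compares the SSE2 version against the scalar model on one input. */
static void blend_lo64_selfcheck(void)
{
    const uint16_t a[8] = {0xA0, 0xA1, 0xA2, 0xA3, 0xA4, 0xA5, 0xA6, 0xA7};
    const uint16_t b[8] = {0xB0, 0xB1, 0xB2, 0xB3, 0xB4, 0xB5, 0xB6, 0xB7};
    uint16_t want[8], got[8];

    __m128i va = _mm_loadu_si128((const __m128i *)a);
    __m128i vb = _mm_loadu_si128((const __m128i *)b);
    _mm_storeu_si128((__m128i *)got, blend_lo64_sse2(va, vb));

    blend_lo64_ref(want, a, b);
    assert(memcmp(want, got, sizeof want) == 0);
}
```

Calling blend_lo64_selfcheck() from a test program checks the SSE2 version against the scalar model. The and/andnot/or form was chosen here because it needs nothing beyond emmintrin.h.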

Compiler output

Implementation: T:xmm
Security model: timingleaks
Compiler: gcc -march=native -mtune=native -O2 -fomit-frame-pointer -fwrapv -fPIC -fPIE
norx.c: In file included from /usr/lib/gcc/x86_64-linux-gnu/11/include/immintrin.h:39,
norx.c: from /usr/lib/gcc/x86_64-linux-gnu/11/include/x86intrin.h:32,
norx.c: from norx.c:27:
norx.c: norx.c: In function 'crypto_aead_norx3261v1_xmm_timingleaks_decrypt':
norx.c: /usr/lib/gcc/x86_64-linux-gnu/11/include/smmintrin.h:166:1: error: inlining failed in call to 'always_inline' '_mm_blend_epi16': target specific option mismatch
norx.c: 166 | _mm_blend_epi16 (__m128i __X, __m128i __Y, const int __M)
norx.c: | ^~~~~~~~~~~~~~~
norx.c: norx.c:55:21: note: called from here
norx.c: 55 | #define BLEND(A, B) _mm_blend_epi16((A), (B), 0x0F)
norx.c: | ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
norx.c: norx.c:249:60: note: in expansion of macro 'BLEND'
norx.c: 249 | W2 = LOADL(IN + 32); STOREL(OUT + 32, XOR(C, W2)); C = BLEND(C, W2); \
norx.c: | ^~~~~
norx.c: norx.c:430:13: note: in expansion of macro 'DECRYPT_BLOCK'
norx.c: 430 | DECRYPT_BLOCK(A, B, C, D, c, m);
norx.c: | ^~~~~~~~~~~~~
norx.c: In file included from /usr/lib/gcc/x86_64-linux-gnu/11/include/immintrin.h:39,
norx.c: from /usr/lib/gcc/x86_64-linux-gnu/11/include/x86intrin.h:32,
norx.c: from norx.c:27:
norx.c: /usr/lib/gcc/x86_64-linux-gnu/11/include/smmintrin.h:166:1: error: inlining failed in call to 'always_inline' '_mm_blend_epi16': target specific option mismatch
norx.c: 166 | _mm_blend_epi16 (__m128i __X, __m128i __Y, const int __M)
norx.c: | ^~~~~~~~~~~~~~~
norx.c: norx.c:55:21: note: called from here
norx.c: 55 | #define BLEND(A, B) _mm_blend_epi16((A), (B), 0x0F)
norx.c: | ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
norx.c: ...

Number of similar (compiler,implementation) pairs: 4, namely:
Compiler | Implementations
gcc -march=native -mtune=native -O2 -fomit-frame-pointer -fwrapv -fPIC -fPIE | T:xmm
gcc -march=native -mtune=native -O3 -fomit-frame-pointer -fwrapv -fPIC -fPIE | T:xmm
gcc -march=native -mtune=native -O -fomit-frame-pointer -fwrapv -fPIC -fPIE | T:xmm
gcc -march=native -mtune=native -Os -fomit-frame-pointer -fwrapv -fPIC -fPIE | T:xmm
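
gcc's "target specific option mismatch" is the same problem seen from the other side: the always_inline intrinsic _mm_blend_epi16 requires SSE4.1, but the calling function is compiled without it. One common way to avoid this class of failure is a compile-time guard on the predefined macro __SSE4_1__. The following is a minimal sketch under that assumption. The macro BLEND_LO64 and the functions blend_words and blend_words_demo are hypothetical names and do not come from norx.c.

```c
#include <assert.h>
#include <stdint.h>
#include <emmintrin.h>  /* SSE2, part of the amd64 baseline */

#if defined(__SSE4_1__)
#include <smmintrin.h>  /* included only when SSE4.1 is enabled */
#define BLEND_LO64(A, B) _mm_blend_epi16((A), (B), 0x0F)
#else
/* SSE2 path: _mm_move_sd takes the low 64 bits from its second
   argument and the high 64 bits from its first. */
#define BLEND_LO64(A, B) \
    _mm_castpd_si128(_mm_move_sd(_mm_castsi128_pd(A), _mm_castsi128_pd(B)))
#endif

/* out = { low 64 bits of b, high 64 bits of a } */
static void blend_words(uint64_t out[2], const uint64_t a[2],
                        const uint64_t b[2])
{
    __m128i va = _mm_loadu_si128((const __m128i *)a);
    __m128i vb = _mm_loadu_si128((const __m128i *)b);
    _mm_storeu_si128((__m128i *)out, BLEND_LO64(va, vb));
}

/* Small demonstration: the result mixes one word from each input. */
static void blend_words_demo(void)
{
    const uint64_t a[2] = {0x1111111111111111ULL, 0x2222222222222222ULL};
    const uint64_t b[2] = {0x3333333333333333ULL, 0x4444444444444444ULL};
    uint64_t out[2];

    blend_words(out, a, b);
    assert(out[0] == b[0]); /* low half from b  */
    assert(out[1] == a[1]); /* high half from a */
}
```

With a guard of this kind the same source builds both with and without -msse4.1, because the SSE4.1 intrinsic is never referenced when the feature is disabled.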