Implementation notes: aarch64, pi4b, crypto_kem/ntskem1380

Computer: pi4b
Architecture: aarch64
CPU ID: 410fd083
SUPERCOP version: 20221019
Operation: crypto_kem
Primitive: ntskem1380
TimeObject sizeTest sizeImplementationCompilerBenchmark dateSUPERCOP version
223621269744 84 1687934 952 1584T:optgcc_-march=native_-mtune=native_-O3_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE2022101220221005
226010857640 84 1674918 944 1568T:optgcc_-march=native_-mtune=native_-O2_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE2022101220221005
244460656676 84 1673758 944 1568T:optgcc_-march=native_-mtune=native_-O_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE2022101220221005
257751657997 84 1673941 928 1552T:optgcc_-march=native_-mtune=native_-Os_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE2022101220221005
4487931110085 84 16128788 968 1568T:optclang_-mcpu=native_-O3_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2022101220221005
1467687930616 76 1648814 936 1584T:refgcc_-march=native_-mtune=native_-O3_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE2022101220221005
1661720824652 76 1641902 936 1568T:refgcc_-march=native_-mtune=native_-O2_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE2022101220221005
1716896233593 76 1652436 960 1568T:refclang_-mcpu=native_-O3_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2022101220221005
1740897223744 76 1640798 936 1568T:refgcc_-march=native_-mtune=native_-O_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE2022101220221005
1848900020857 76 1636765 920 1552T:refgcc_-march=native_-mtune=native_-Os_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE2022101220221005

Compiler output

Implementation: T:avx2
Security model: timingleaks
Compiler: clang -mcpu=native -O3 -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE
bitslice_bma_128.c: In file included from bitslice_bma_128.c:17:
bitslice_bma_128.c: In file included from ./bitslice_bma_128.h:18:
bitslice_bma_128.c: /usr/lib/llvm-14/lib/clang/14.0.0/include/immintrin.h:14:2: error: "This header is only meant to be used on x86 and x64 architecture"
bitslice_bma_128.c: #error "This header is only meant to be used on x86 and x64 architecture"
bitslice_bma_128.c: ^
bitslice_bma_128.c: In file included from bitslice_bma_128.c:17:
bitslice_bma_128.c: In file included from ./bitslice_bma_128.h:18:
bitslice_bma_128.c: In file included from /usr/lib/llvm-14/lib/clang/14.0.0/include/immintrin.h:17:
bitslice_bma_128.c: In file included from /usr/lib/llvm-14/lib/clang/14.0.0/include/x86gprintrin.h:15:
bitslice_bma_128.c: /usr/lib/llvm-14/lib/clang/14.0.0/include/hresetintrin.h:42:27: error: invalid input constraint 'a' in asm
bitslice_bma_128.c: __asm__ ("hreset $0" :: "a"(__eax));
bitslice_bma_128.c: ^
bitslice_bma_128.c: In file included from bitslice_bma_128.c:17:
bitslice_bma_128.c: In file included from ./bitslice_bma_128.h:18:
bitslice_bma_128.c: In file included from /usr/lib/llvm-14/lib/clang/14.0.0/include/immintrin.h:21:
bitslice_bma_128.c: /usr/lib/llvm-14/lib/clang/14.0.0/include/mmintrin.h:14:2: error: "This header is only meant to be used on x86 and x64 architecture"
bitslice_bma_128.c: #error "This header is only meant to be used on x86 and x64 architecture"
bitslice_bma_128.c: ^
bitslice_bma_128.c: /usr/lib/llvm-14/lib/clang/14.0.0/include/mmintrin.h:54:12: error: invalid conversion between vector type '__m64' (vector of 1 'long long' value) and integer type 'int' of different size
bitslice_bma_128.c: return (__m64)__builtin_ia32_vec_init_v2si(__i, 0);
bitslice_bma_128.c: ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
bitslice_bma_128.c: /usr/lib/llvm-14/lib/clang/14.0.0/include/mmintrin.h:133:12: error: invalid conversion between vector type '__m64' (vector of 1 'long long' value) and integer type 'int' of different size
bitslice_bma_128.c: return (__m64)__builtin_ia32_packsswb((__v4hi)__m1, (__v4hi)__m2);
bitslice_bma_128.c: ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
bitslice_bma_128.c: /usr/lib/llvm-14/lib/clang/14.0.0/include/mmintrin.h:163:12: error: invalid conversion between vector type '__m64' (vector of 1 'long long' value) and integer type 'int' of different size
bitslice_bma_128.c: ...

Number of similar (compiler,implementation) pairs: 1, namely:
CompilerImplementations
clang -mcpu=native -O3 -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE T:avx2

Compiler output

Implementation: T:avx2
Security model: timingleaks
Compiler: gcc -march=native -mtune=native -O2 -fomit-frame-pointer -fwrapv -fPIC -fPIE
bitslice_bma_128.c: In file included from bitslice_bma_128.c:17:
bitslice_bma_128.c: bitslice_bma_128.h:18:10: fatal error: immintrin.h: No such file or directory
bitslice_bma_128.c: 18 | #include <immintrin.h>
bitslice_bma_128.c: | ^~~~~~~~~~~~~
bitslice_bma_128.c: compilation terminated.

Number of similar (compiler,implementation) pairs: 4, namely:
CompilerImplementations
gcc -march=native -mtune=native -O2 -fomit-frame-pointer -fwrapv -fPIC -fPIE T:avx2
gcc -march=native -mtune=native -O3 -fomit-frame-pointer -fwrapv -fPIC -fPIE T:avx2
gcc -march=native -mtune=native -O -fomit-frame-pointer -fwrapv -fPIC -fPIE T:avx2
gcc -march=native -mtune=native -Os -fomit-frame-pointer -fwrapv -fPIC -fPIE T:avx2

Compiler output

Implementation: T:sse2
Security model: timingleaks
Compiler: clang -mcpu=native -O3 -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE
bitslice_bma_128.c: In file included from bitslice_bma_128.c:17:
bitslice_bma_128.c: ./bits.h:47:9: error: unknown type name '__m128i'
bitslice_bma_128.c: typedef __m128i vector;
bitslice_bma_128.c: ^
bitslice_bma_128.c: ./bits.h:98:11: error: unknown type name '__m128i'
bitslice_bma_128.c: const __m128i a_hi = _mm_unpackhi_epi64(a, a);
bitslice_bma_128.c: ^
bitslice_bma_128.c: ./bits.h:98:26: warning: implicit declaration of function '_mm_unpackhi_epi64' is invalid in C99 [-Wimplicit-function-declaration]
bitslice_bma_128.c: const __m128i a_hi = _mm_unpackhi_epi64(a, a);
bitslice_bma_128.c: ^
bitslice_bma_128.c: ./bits.h:99:21: warning: implicit declaration of function '_mm_cvtsi128_si64' is invalid in C99 [-Wimplicit-function-declaration]
bitslice_bma_128.c: return popcount(_mm_cvtsi128_si64(a_hi)) + popcount(_mm_cvtsi128_si64(a));
bitslice_bma_128.c: ^
bitslice_bma_128.c: In file included from bitslice_bma_128.c:18:
bitslice_bma_128.c: In file included from ./bitslice_bma_128.h:18:
bitslice_bma_128.c: /usr/lib/llvm-14/lib/clang/14.0.0/include/immintrin.h:14:2: error: "This header is only meant to be used on x86 and x64 architecture"
bitslice_bma_128.c: #error "This header is only meant to be used on x86 and x64 architecture"
bitslice_bma_128.c: ^
bitslice_bma_128.c: In file included from bitslice_bma_128.c:18:
bitslice_bma_128.c: In file included from ./bitslice_bma_128.h:18:
bitslice_bma_128.c: In file included from /usr/lib/llvm-14/lib/clang/14.0.0/include/immintrin.h:17:
bitslice_bma_128.c: In file included from /usr/lib/llvm-14/lib/clang/14.0.0/include/x86gprintrin.h:15:
bitslice_bma_128.c: /usr/lib/llvm-14/lib/clang/14.0.0/include/hresetintrin.h:42:27: error: invalid input constraint 'a' in asm
bitslice_bma_128.c: __asm__ ("hreset $0" :: "a"(__eax));
bitslice_bma_128.c: ^
bitslice_bma_128.c: ...

Number of similar (compiler,implementation) pairs: 1, namely:
CompilerImplementations
clang -mcpu=native -O3 -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE T:sse2

Compiler output

Implementation: T:sse2
Security model: timingleaks
Compiler: gcc -march=native -mtune=native -O2 -fomit-frame-pointer -fwrapv -fPIC -fPIE
bitslice_bma_128.c: In file included from bitslice_bma_128.c:17:
bitslice_bma_128.c: bits.h:47:9: error: unknown type name '__m128i'
bitslice_bma_128.c: 47 | typedef __m128i vector;
bitslice_bma_128.c: | ^~~~~~~
bitslice_bma_128.c: bits.h: In function 'vector_popcount':
bitslice_bma_128.c: bits.h:98:11: error: unknown type name '__m128i'
bitslice_bma_128.c: 98 | const __m128i a_hi = _mm_unpackhi_epi64(a, a);
bitslice_bma_128.c: | ^~~~~~~
bitslice_bma_128.c: bits.h:98:26: warning: implicit declaration of function '_mm_unpackhi_epi64' [-Wimplicit-function-declaration]
bitslice_bma_128.c: 98 | const __m128i a_hi = _mm_unpackhi_epi64(a, a);
bitslice_bma_128.c: | ^~~~~~~~~~~~~~~~~~
bitslice_bma_128.c: bits.h:99:21: warning: implicit declaration of function '_mm_cvtsi128_si64' [-Wimplicit-function-declaration]
bitslice_bma_128.c: 99 | return popcount(_mm_cvtsi128_si64(a_hi)) + popcount(_mm_cvtsi128_si64(a));
bitslice_bma_128.c: | ^~~~~~~~~~~~~~~~~
bitslice_bma_128.c: bits.h:93:42: note: in definition of macro 'popcount'
bitslice_bma_128.c: 93 | #define popcount(x) __builtin_popcountll(x)
bitslice_bma_128.c: | ^
bitslice_bma_128.c: In file included from bitslice_bma_128.c:18:
bitslice_bma_128.c: bitslice_bma_128.h: At top level:
bitslice_bma_128.c: bitslice_bma_128.h:18:10: fatal error: immintrin.h: No such file or directory
bitslice_bma_128.c: 18 | #include <immintrin.h>
bitslice_bma_128.c: | ^~~~~~~~~~~~~
bitslice_bma_128.c: compilation terminated.

Number of similar (compiler,implementation) pairs: 4, namely:
CompilerImplementations
gcc -march=native -mtune=native -O2 -fomit-frame-pointer -fwrapv -fPIC -fPIE T:sse2
gcc -march=native -mtune=native -O3 -fomit-frame-pointer -fwrapv -fPIC -fPIE T:sse2
gcc -march=native -mtune=native -O -fomit-frame-pointer -fwrapv -fPIC -fPIE T:sse2
gcc -march=native -mtune=native -Os -fomit-frame-pointer -fwrapv -fPIC -fPIE T:sse2