Implementation notes: aarch64, supercoplxc, crypto_kem/ntskem13136

Computer: supercoplxc
Architecture: aarch64
CPU ID: 410fd034
SUPERCOP version: 20190816
Operation: crypto_kem
Primitive: ntskem13136
TimeObject sizeTest sizeImplementationCompilerBenchmark dateSUPERCOP version
935584083304 84 16102443 1016 1576optgcc_-funroll-loops_-O2_-fomit-frame-pointer2019113020190816
9829440100092 84 16119723 1024 1600optgcc_-funroll-loops_-O3_-fomit-frame-pointer2019113020190816
984864062876 84 1679963 1016 1576optgcc_-O2_-fomit-frame-pointer2019113020190816
1078264088673 84 16107518 928 1600optclang_-O3_-fwrapv_-mavx_-fomit-frame-pointer_-Qunused-arguments2019113020190816
1101000087412 84 16105739 1024 1600optgcc_-O3_-fomit-frame-pointer2019113020190816
1115928094509 84 16113334 928 1600optclang_-mcpu=native_-mfpu=neon_-O3_-fomit-frame-pointer_-fwrapv_-Qunused-arguments2019113020190816
1129272082488 84 16101707 1016 1576optgcc_-funroll-loops_-fno-schedule-insns_-O2_-fomit-frame-pointer2019113020190816
1136448099180 84 16118795 1024 1600optgcc_-funroll-loops_-fno-schedule-insns_-O3_-fomit-frame-pointer2019113020190816
1146568088741 84 16107534 928 1600optclang_-O3_-fomit-frame-pointer_-Qunused-arguments2019113020190816
1201712088673 84 16107518 928 1600optclang_-O3_-fwrapv_-mavx2_-fomit-frame-pointer_-Qunused-arguments2019113020190816
1209088088673 84 16107518 928 1600optclang_-O3_-fwrapv_-mavx_-maes_-mpclmul_-fomit-frame-pointer_-Qunused-arguments2019113020190816
1221848061948 84 1678987 1016 1576optgcc_-fno-schedule-insns_-O2_-fomit-frame-pointer2019113020190816
1237840086592 84 16104891 1024 1600optgcc_-fno-schedule-insns_-O3_-fomit-frame-pointer2019113020190816
1285576058809 84 1675056 1000 1560optgcc_-fno-schedule-insns_-Os_-fomit-frame-pointer2019113020190816
1291008058873 84 1675168 1000 1560optgcc_-funroll-loops_-Os_-fomit-frame-pointer2019113020190816
1292144058873 84 1675168 1000 1560optgcc_-funroll-loops_-fno-schedule-insns_-Os_-fomit-frame-pointer2019113020190816
1386176058809 84 1675056 1000 1560optgcc_-Os_-fomit-frame-pointer2019113020190816
1469680064076 84 1681179 1016 1576optgcc_-O_-fomit-frame-pointer2019113020190816
1550808064076 84 1681179 1016 1576optgcc_-fno-schedule-insns_-O_-fomit-frame-pointer2019113020190816
1554888083792 84 16104123 1016 1576optgcc_-funroll-loops_-O_-fomit-frame-pointer2019113020190816
1563096083792 84 16104123 1016 1576optgcc_-funroll-loops_-fno-schedule-insns_-O_-fomit-frame-pointer2019113020190816
5852104049172 76 1668771 1008 1600refgcc_-funroll-loops_-O3_-fomit-frame-pointer2019113020190816
5855760030389 76 1649326 920 1600refclang_-mcpu=native_-mfpu=neon_-O3_-fomit-frame-pointer_-fwrapv_-Qunused-arguments2019113020190816
5875984030693 76 1649598 920 1600refclang_-O3_-fomit-frame-pointer_-Qunused-arguments2019113020190816
5917000030533 76 1649494 920 1600refclang_-O3_-fwrapv_-mavx2_-fomit-frame-pointer_-Qunused-arguments2019113020190816
5936112030533 76 1649494 920 1600refclang_-O3_-fwrapv_-mavx_-maes_-mpclmul_-fomit-frame-pointer_-Qunused-arguments2019113020190816
5967424037092 76 1656179 1008 1576refgcc_-funroll-loops_-O2_-fomit-frame-pointer2019113020190816
5974208030533 76 1649494 920 1600refclang_-O3_-fwrapv_-mavx_-fomit-frame-pointer_-Qunused-arguments2019113020190816
6046096036164 76 1654427 1008 1600refgcc_-O3_-fomit-frame-pointer2019113020190816
6118888048736 76 1668323 1008 1600refgcc_-funroll-loops_-fno-schedule-insns_-O3_-fomit-frame-pointer2019113020190816
6179760036884 76 1656043 1008 1576refgcc_-funroll-loops_-fno-schedule-insns_-O2_-fomit-frame-pointer2019113020190816
6550408023692 76 1640739 1008 1576refgcc_-O2_-fomit-frame-pointer2019113020190816
7161224035904 76 1654123 1008 1600refgcc_-fno-schedule-insns_-O3_-fomit-frame-pointer2019113020190816
7293368023620 76 1640611 1008 1576refgcc_-fno-schedule-insns_-O2_-fomit-frame-pointer2019113020190816
7466640020617 76 1636840 992 1560refgcc_-funroll-loops_-fno-schedule-insns_-Os_-fomit-frame-pointer2019113020190816
7496704020617 76 1636840 992 1560refgcc_-funroll-loops_-Os_-fomit-frame-pointer2019113020190816
7822992020501 76 1636696 992 1560refgcc_-fno-schedule-insns_-Os_-fomit-frame-pointer2019113020190816
7934896020501 76 1636696 992 1560refgcc_-Os_-fomit-frame-pointer2019113020190816
7955848035112 76 1655419 1008 1576refgcc_-funroll-loops_-O_-fomit-frame-pointer2019113020190816
8102520035112 76 1655419 1008 1576refgcc_-funroll-loops_-fno-schedule-insns_-O_-fomit-frame-pointer2019113020190816
8650272023816 76 1640883 1008 1576refgcc_-O_-fomit-frame-pointer2019113020190816
8691032023816 76 1640883 1008 1576refgcc_-fno-schedule-insns_-O_-fomit-frame-pointer2019113020190816
33237056041972 76 1662147 992 1576refcc2019113020190816
33512328041972 76 1662147 992 1576refgcc_-funroll-loops2019113020190816
33568232041972 76 1662147 992 1576refgcc2019113020190816

Compiler output

Implementation: avx2
Security model: unknown
Compiler: cc
bitslice_bma_128.c: In file included from bitslice_bma_128.c:18:
bitslice_bma_128.c: bitslice_bma_128.h:18:10: fatal error: immintrin.h: No such file or directory
bitslice_bma_128.c: #include <immintrin.h>
bitslice_bma_128.c: ^~~~~~~~~~~~~
bitslice_bma_128.c: compilation terminated.

Number of similar (compiler,implementation) pairs: 19, namely:
CompilerImplementations
cc avx2
gcc avx2
gcc -O2 -fomit-frame-pointer avx2
gcc -O3 -fomit-frame-pointer avx2
gcc -O -fomit-frame-pointer avx2
gcc -Os -fomit-frame-pointer avx2
gcc -fno-schedule-insns -O2 -fomit-frame-pointer avx2
gcc -fno-schedule-insns -O3 -fomit-frame-pointer avx2
gcc -fno-schedule-insns -O -fomit-frame-pointer avx2
gcc -fno-schedule-insns -Os -fomit-frame-pointer avx2
gcc -funroll-loops avx2
gcc -funroll-loops -O2 -fomit-frame-pointer avx2
gcc -funroll-loops -O3 -fomit-frame-pointer avx2
gcc -funroll-loops -O -fomit-frame-pointer avx2
gcc -funroll-loops -Os -fomit-frame-pointer avx2
gcc -funroll-loops -fno-schedule-insns -O2 -fomit-frame-pointer avx2
gcc -funroll-loops -fno-schedule-insns -O3 -fomit-frame-pointer avx2
gcc -funroll-loops -fno-schedule-insns -O -fomit-frame-pointer avx2
gcc -funroll-loops -fno-schedule-insns -Os -fomit-frame-pointer avx2

Compiler output

Implementation: avx2
Security model: unknown
Compiler: clang -O3 -fomit-frame-pointer -Qunused-arguments
bitslice_bma_128.c: In file included from bitslice_bma_128.c:18:
bitslice_bma_128.c: In file included from ./bitslice_bma_128.h:18:
bitslice_bma_128.c: In file included from /usr/lib/llvm-7/lib/clang/7.0.1/include/immintrin.h:28:
bitslice_bma_128.c: /usr/lib/llvm-7/lib/clang/7.0.1/include/mmintrin.h:64:12: error: invalid conversion between vector type '__m64' (vector of 1 'long long' value) and integer type 'int' of different size
bitslice_bma_128.c: return (__m64)__builtin_ia32_vec_init_v2si(__i, 0);
bitslice_bma_128.c: ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
bitslice_bma_128.c: /usr/lib/llvm-7/lib/clang/7.0.1/include/mmintrin.h:143:12: error: invalid conversion between vector type '__m64' (vector of 1 'long long' value) and integer type 'int' of different size
bitslice_bma_128.c: return (__m64)__builtin_ia32_packsswb((__v4hi)__m1, (__v4hi)__m2);
bitslice_bma_128.c: ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
bitslice_bma_128.c: /usr/lib/llvm-7/lib/clang/7.0.1/include/mmintrin.h:173:12: error: invalid conversion between vector type '__m64' (vector of 1 'long long' value) and integer type 'int' of different size
bitslice_bma_128.c: return (__m64)__builtin_ia32_packssdw((__v2si)__m1, (__v2si)__m2);
bitslice_bma_128.c: ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
bitslice_bma_128.c: /usr/lib/llvm-7/lib/clang/7.0.1/include/mmintrin.h:203:12: error: invalid conversion between vector type '__m64' (vector of 1 'long long' value) and integer type 'int' of different size
bitslice_bma_128.c: return (__m64)__builtin_ia32_packuswb((__v4hi)__m1, (__v4hi)__m2);
bitslice_bma_128.c: ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
bitslice_bma_128.c: /usr/lib/llvm-7/lib/clang/7.0.1/include/mmintrin.h:230:12: error: invalid conversion between vector type '__m64' (vector of 1 'long long' value) and integer type 'int' of different size
bitslice_bma_128.c: return (__m64)__builtin_ia32_punpckhbw((__v8qi)__m1, (__v8qi)__m2);
bitslice_bma_128.c: ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
bitslice_bma_128.c: /usr/lib/llvm-7/lib/clang/7.0.1/include/mmintrin.h:253:12: error: invalid conversion between vector type '__m64' (vector of 1 'long long' value) and integer type 'int' of different size
bitslice_bma_128.c: return (__m64)__builtin_ia32_punpckhwd((__v4hi)__m1, (__v4hi)__m2);
bitslice_bma_128.c: ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
bitslice_bma_128.c: /usr/lib/llvm-7/lib/clang/7.0.1/include/mmintrin.h:274:12: error: invalid conversion between vector type '__m64' (vector of 1 'long long' value) and integer type 'int' of different size
bitslice_bma_128.c: return (__m64)__builtin_ia32_punpckhdq((__v2si)__m1, (__v2si)__m2);
bitslice_bma_128.c: ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
bitslice_bma_128.c: /usr/lib/llvm-7/lib/clang/7.0.1/include/mmintrin.h:301:12: error: invalid conversion between vector type '__m64' (vector of 1 'long long' value) and integer type 'int' of different size
bitslice_bma_128.c: ...

Number of similar (compiler,implementation) pairs: 5, namely:
CompilerImplementations
clang -O3 -fomit-frame-pointer -Qunused-arguments avx2
clang -O3 -fwrapv -mavx2 -fomit-frame-pointer -Qunused-arguments avx2
clang -O3 -fwrapv -mavx -fomit-frame-pointer -Qunused-arguments avx2
clang -O3 -fwrapv -mavx -maes -mpclmul -fomit-frame-pointer -Qunused-arguments avx2
clang -mcpu=native -mfpu=neon -O3 -fomit-frame-pointer -fwrapv -Qunused-arguments avx2

Compiler output

Implementation: opt
Security model: unknown
Compiler: cc
keccak.c: cc: fatal error: Killed signal terminated program cc1
keccak.c: compilation terminated.

Number of similar (compiler,implementation) pairs: 1, namely:
CompilerImplementations
cc opt

Compiler output

Implementation: opt
Security model: unknown
Compiler: gcc
keccak.c: gcc: fatal error: Killed signal terminated program cc1
keccak.c: compilation terminated.

Number of similar (compiler,implementation) pairs: 2, namely:
CompilerImplementations
gcc opt
gcc -funroll-loops opt

Compiler output

Implementation: sse2
Security model: unknown
Compiler: cc
bitslice_bma_128.c: In file included from bitslice_bma_128.c:17:
bitslice_bma_128.c: bits.h:47:9: error: unknown type name '__m128i'
bitslice_bma_128.c: typedef __m128i vector;
bitslice_bma_128.c: ^~~~~~~
bitslice_bma_128.c: bits.h: In function 'vector_popcount':
bitslice_bma_128.c: bits.h:98:11: error: unknown type name '__m128i'
bitslice_bma_128.c: const __m128i a_hi = _mm_unpackhi_epi64(a, a);
bitslice_bma_128.c: ^~~~~~~
bitslice_bma_128.c: bits.h:98:26: warning: implicit declaration of function '_mm_unpackhi_epi64' [-Wimplicit-function-declaration]
bitslice_bma_128.c: const __m128i a_hi = _mm_unpackhi_epi64(a, a);
bitslice_bma_128.c: ^~~~~~~~~~~~~~~~~~
bitslice_bma_128.c: bits.h:99:21: warning: implicit declaration of function '_mm_cvtsi128_si64' [-Wimplicit-function-declaration]
bitslice_bma_128.c: return popcount(_mm_cvtsi128_si64(a_hi)) + popcount(_mm_cvtsi128_si64(a));
bitslice_bma_128.c: ^~~~~~~~~~~~~~~~~
bitslice_bma_128.c: bits.h:93:42: note: in definition of macro 'popcount'
bitslice_bma_128.c: #define popcount(x) __builtin_popcountll(x)
bitslice_bma_128.c: ^
bitslice_bma_128.c: In file included from bitslice_bma_128.c:18:
bitslice_bma_128.c: bitslice_bma_128.h: At top level:
bitslice_bma_128.c: bitslice_bma_128.h:18:10: fatal error: immintrin.h: No such file or directory
bitslice_bma_128.c: #include <immintrin.h>
bitslice_bma_128.c: ^~~~~~~~~~~~~
bitslice_bma_128.c: compilation terminated.

Number of similar (compiler,implementation) pairs: 19, namely:
CompilerImplementations
cc sse2
gcc sse2
gcc -O2 -fomit-frame-pointer sse2
gcc -O3 -fomit-frame-pointer sse2
gcc -O -fomit-frame-pointer sse2
gcc -Os -fomit-frame-pointer sse2
gcc -fno-schedule-insns -O2 -fomit-frame-pointer sse2
gcc -fno-schedule-insns -O3 -fomit-frame-pointer sse2
gcc -fno-schedule-insns -O -fomit-frame-pointer sse2
gcc -fno-schedule-insns -Os -fomit-frame-pointer sse2
gcc -funroll-loops sse2
gcc -funroll-loops -O2 -fomit-frame-pointer sse2
gcc -funroll-loops -O3 -fomit-frame-pointer sse2
gcc -funroll-loops -O -fomit-frame-pointer sse2
gcc -funroll-loops -Os -fomit-frame-pointer sse2
gcc -funroll-loops -fno-schedule-insns -O2 -fomit-frame-pointer sse2
gcc -funroll-loops -fno-schedule-insns -O3 -fomit-frame-pointer sse2
gcc -funroll-loops -fno-schedule-insns -O -fomit-frame-pointer sse2
gcc -funroll-loops -fno-schedule-insns -Os -fomit-frame-pointer sse2

Compiler output

Implementation: sse2
Security model: unknown
Compiler: clang -O3 -fomit-frame-pointer -Qunused-arguments
bitslice_bma_128.c: In file included from bitslice_bma_128.c:17:
bitslice_bma_128.c: ./bits.h:47:9: error: unknown type name '__m128i'
bitslice_bma_128.c: typedef __m128i vector;
bitslice_bma_128.c: ^
bitslice_bma_128.c: ./bits.h:98:11: error: unknown type name '__m128i'
bitslice_bma_128.c: const __m128i a_hi = _mm_unpackhi_epi64(a, a);
bitslice_bma_128.c: ^
bitslice_bma_128.c: ./bits.h:98:26: warning: implicit declaration of function '_mm_unpackhi_epi64' is invalid in C99 [-Wimplicit-function-declaration]
bitslice_bma_128.c: const __m128i a_hi = _mm_unpackhi_epi64(a, a);
bitslice_bma_128.c: ^
bitslice_bma_128.c: ./bits.h:99:21: warning: implicit declaration of function '_mm_cvtsi128_si64' is invalid in C99 [-Wimplicit-function-declaration]
bitslice_bma_128.c: return popcount(_mm_cvtsi128_si64(a_hi)) + popcount(_mm_cvtsi128_si64(a));
bitslice_bma_128.c: ^
bitslice_bma_128.c: In file included from bitslice_bma_128.c:18:
bitslice_bma_128.c: In file included from ./bitslice_bma_128.h:18:
bitslice_bma_128.c: In file included from /usr/lib/llvm-7/lib/clang/7.0.1/include/immintrin.h:28:
bitslice_bma_128.c: /usr/lib/llvm-7/lib/clang/7.0.1/include/mmintrin.h:64:12: error: invalid conversion between vector type '__m64' (vector of 1 'long long' value) and integer type 'int' of different size
bitslice_bma_128.c: return (__m64)__builtin_ia32_vec_init_v2si(__i, 0);
bitslice_bma_128.c: ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
bitslice_bma_128.c: /usr/lib/llvm-7/lib/clang/7.0.1/include/mmintrin.h:143:12: error: invalid conversion between vector type '__m64' (vector of 1 'long long' value) and integer type 'int' of different size
bitslice_bma_128.c: return (__m64)__builtin_ia32_packsswb((__v4hi)__m1, (__v4hi)__m2);
bitslice_bma_128.c: ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
bitslice_bma_128.c: /usr/lib/llvm-7/lib/clang/7.0.1/include/mmintrin.h:173:12: error: invalid conversion between vector type '__m64' (vector of 1 'long long' value) and integer type 'int' of different size
bitslice_bma_128.c: return (__m64)__builtin_ia32_packssdw((__v2si)__m1, (__v2si)__m2);
bitslice_bma_128.c: ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
bitslice_bma_128.c: ...

Number of similar (compiler,implementation) pairs: 5, namely:
CompilerImplementations
clang -O3 -fomit-frame-pointer -Qunused-arguments sse2
clang -O3 -fwrapv -mavx2 -fomit-frame-pointer -Qunused-arguments sse2
clang -O3 -fwrapv -mavx -fomit-frame-pointer -Qunused-arguments sse2
clang -O3 -fwrapv -mavx -maes -mpclmul -fomit-frame-pointer -Qunused-arguments sse2
clang -mcpu=native -mfpu=neon -O3 -fomit-frame-pointer -fwrapv -Qunused-arguments sse2