Implementation notes: aarch64, pi4b, crypto_kem/saber2

Computer: pi4b
Microarchitecture: aarch64; Cortex-A72 (410fd083)
Architecture: aarch64
CPU ID: 410fd083
SUPERCOP version: 20240716
Operation: crypto_kem
Primitive: saber2
Time     Object size    Test size         Implementation  Compiler                                                                       Benchmark date  SUPERCOP version
412322   52465 0 0      72722 920 1568    T:neon2         clang_-mcpu=native_-O3_-fwrapv_-Qunused-arguments_-fPIC_-fPIE_-gdwarf-4_-Wall  20240707        20240625
483412   50125 0 2464   70275 912 4032    T:neon          clang_-mcpu=native_-O3_-fwrapv_-Qunused-arguments_-fPIC_-fPIE_-gdwarf-4_-Wall  20240707        20240625
497630   53129 0 2464   77235 880 4048    T:neon          gcc_-march=native_-mtune=native_-O3_-fwrapv_-fPIC_-fPIE_-gdwarf-4_-Wall        20240707        20240625
554502   59055 0 0      79656 912 1568    T:ref           clang_-mcpu=native_-O3_-fwrapv_-Qunused-arguments_-fPIC_-fPIE_-gdwarf-4_-Wall  20240707        20240625
559181   23361 0 2464   42227 880 4032    T:neon          gcc_-march=native_-mtune=native_-O_-fwrapv_-fPIC_-fPIE_-gdwarf-4_-Wall         20240707        20240625
567182   35865 0 0      56370 888 1584    T:ref           gcc_-march=native_-mtune=native_-O3_-fwrapv_-fPIC_-fPIE_-gdwarf-4_-Wall        20240707        20240625
568071   19113 0 2464   36859 864 4016    T:neon          gcc_-march=native_-mtune=native_-Os_-fwrapv_-fPIC_-fPIE_-gdwarf-4_-Wall        20240707        20240625
568720   22957 0 2464   41923 880 4032    T:neon          gcc_-march=native_-mtune=native_-O2_-fwrapv_-fPIC_-fPIE_-gdwarf-4_-Wall        20240707        20240625
2189572  13593 0 0      33042 888 1568    T:ref           gcc_-march=native_-mtune=native_-O2_-fwrapv_-fPIC_-fPIE_-gdwarf-4_-Wall        20240707        20240625
2218008  12909 0 0      32258 888 1568    T:ref           gcc_-march=native_-mtune=native_-O_-fwrapv_-fPIC_-fPIE_-gdwarf-4_-Wall         20240707        20240625
2246444  12007 0 0      30202 872 1552    T:ref           gcc_-march=native_-mtune=native_-Os_-fwrapv_-fPIC_-fPIE_-gdwarf-4_-Wall        20240707        20240625

Compiler output


SABER_indcpa.c: In file included from SABER_indcpa.c:5:
SABER_indcpa.c: In file included from ./SABER_indcpa.h:4:
SABER_indcpa.c: /usr/lib/llvm-14/lib/clang/14.0.0/include/immintrin.h:14:2: error: "This header is only meant to be used on x86 and x64 architecture"
SABER_indcpa.c: #error "This header is only meant to be used on x86 and x64 architecture"
SABER_indcpa.c:  ^
SABER_indcpa.c: In file included from SABER_indcpa.c:5:
SABER_indcpa.c: In file included from ./SABER_indcpa.h:4:
SABER_indcpa.c: In file included from /usr/lib/llvm-14/lib/clang/14.0.0/include/immintrin.h:17:
SABER_indcpa.c: In file included from /usr/lib/llvm-14/lib/clang/14.0.0/include/x86gprintrin.h:15:
SABER_indcpa.c: /usr/lib/llvm-14/lib/clang/14.0.0/include/hresetintrin.h:42:27: error: invalid input constraint 'a' in asm
SABER_indcpa.c:   __asm__ ("hreset $0" :: "a"(__eax));
SABER_indcpa.c:                           ^
SABER_indcpa.c: In file included from SABER_indcpa.c:5:
SABER_indcpa.c: In file included from ./SABER_indcpa.h:4:
SABER_indcpa.c: In file included from /usr/lib/llvm-14/lib/clang/14.0.0/include/immintrin.h:21:
SABER_indcpa.c: /usr/lib/llvm-14/lib/clang/14.0.0/include/mmintrin.h:14:2: error: "This header is only meant to be used on x86 and x64 architecture"
SABER_indcpa.c: #error "This header is only meant to be used on x86 and x64 architecture"
SABER_indcpa.c:  ^
SABER_indcpa.c: /usr/lib/llvm-14/lib/clang/14.0.0/include/mmintrin.h:54:12: error: invalid conversion between vector type '__m64' (vector of 1 'long long' value) and integer type 'int' of different size
SABER_indcpa.c:     return (__m64)__builtin_ia32_vec_init_v2si(__i, 0);
SABER_indcpa.c:            ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
SABER_indcpa.c: /usr/lib/llvm-14/lib/clang/14.0.0/include/mmintrin.h:133:12: error: invalid conversion between vector type '__m64' (vector of 1 'long long' value) and integer type 'int' of different size
SABER_indcpa.c:     return (__m64)__builtin_ia32_packsswb((__v4hi)__m1, (__v4hi)__m2);
SABER_indcpa.c:            ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
SABER_indcpa.c: /usr/lib/llvm-14/lib/clang/14.0.0/include/mmintrin.h:163:12: error: invalid conversion between vector type '__m64' (vector of 1 'long long' value) and integer type 'int' of different size
SABER_indcpa.c: ...

Number of similar (implementation,compiler) pairs: 1, namely:
Implementation  Compiler
T:avx2          clang -mcpu=native -O3 -fwrapv -Qunused-arguments -fPIC -fPIE -gdwarf-4 -Wall (Ubuntu_Clang_14.0.0)

Compiler output


SABER_indcpa.c: In file included from SABER_indcpa.c:5:
SABER_indcpa.c: SABER_indcpa.h:4:10: fatal error: immintrin.h: No such file or directory
SABER_indcpa.c:     4 | #include <immintrin.h>
SABER_indcpa.c:       |          ^~~~~~~~~~~~~
SABER_indcpa.c: compilation terminated.

Number of similar (implementation,compiler) pairs: 4, namely:
Implementation  Compiler
T:avx2          gcc -march=native -mtune=native -O2 -fwrapv -fPIC -fPIE -gdwarf-4 -Wall (11.4.0)
T:avx2          gcc -march=native -mtune=native -O3 -fwrapv -fPIC -fPIE -gdwarf-4 -Wall (11.4.0)
T:avx2          gcc -march=native -mtune=native -O -fwrapv -fPIC -fPIE -gdwarf-4 -Wall (11.4.0)
T:avx2          gcc -march=native -mtune=native -Os -fwrapv -fPIC -fPIE -gdwarf-4 -Wall (11.4.0)

Compiler output


SABER_indcpa.c: In file included from SABER_indcpa.c:20:
SABER_indcpa.c: In file included from ./SABER_indcpa.h:4:
SABER_indcpa.c: /usr/lib/llvm-14/lib/clang/14.0.0/include/immintrin.h:14:2: error: "This header is only meant to be used on x86 and x64 architecture"
SABER_indcpa.c: #error "This header is only meant to be used on x86 and x64 architecture"
SABER_indcpa.c:  ^
SABER_indcpa.c: In file included from SABER_indcpa.c:20:
SABER_indcpa.c: In file included from ./SABER_indcpa.h:4:
SABER_indcpa.c: In file included from /usr/lib/llvm-14/lib/clang/14.0.0/include/immintrin.h:17:
SABER_indcpa.c: In file included from /usr/lib/llvm-14/lib/clang/14.0.0/include/x86gprintrin.h:15:
SABER_indcpa.c: /usr/lib/llvm-14/lib/clang/14.0.0/include/hresetintrin.h:42:27: error: invalid input constraint 'a' in asm
SABER_indcpa.c:   __asm__ ("hreset $0" :: "a"(__eax));
SABER_indcpa.c:                           ^
SABER_indcpa.c: In file included from SABER_indcpa.c:20:
SABER_indcpa.c: In file included from ./SABER_indcpa.h:4:
SABER_indcpa.c: In file included from /usr/lib/llvm-14/lib/clang/14.0.0/include/immintrin.h:21:
SABER_indcpa.c: /usr/lib/llvm-14/lib/clang/14.0.0/include/mmintrin.h:14:2: error: "This header is only meant to be used on x86 and x64 architecture"
SABER_indcpa.c: #error "This header is only meant to be used on x86 and x64 architecture"
SABER_indcpa.c:  ^
SABER_indcpa.c: /usr/lib/llvm-14/lib/clang/14.0.0/include/mmintrin.h:54:12: error: invalid conversion between vector type '__m64' (vector of 1 'long long' value) and integer type 'int' of different size
SABER_indcpa.c:     return (__m64)__builtin_ia32_vec_init_v2si(__i, 0);
SABER_indcpa.c:            ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
SABER_indcpa.c: /usr/lib/llvm-14/lib/clang/14.0.0/include/mmintrin.h:133:12: error: invalid conversion between vector type '__m64' (vector of 1 'long long' value) and integer type 'int' of different size
SABER_indcpa.c:     return (__m64)__builtin_ia32_packsswb((__v4hi)__m1, (__v4hi)__m2);
SABER_indcpa.c:            ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
SABER_indcpa.c: /usr/lib/llvm-14/lib/clang/14.0.0/include/mmintrin.h:163:12: error: invalid conversion between vector type '__m64' (vector of 1 'long long' value) and integer type 'int' of different size
SABER_indcpa.c: ...

Number of similar (implementation,compiler) pairs: 1, namely:
Implementation  Compiler
T:avx2_nttmul   clang -mcpu=native -O3 -fwrapv -Qunused-arguments -fPIC -fPIE -gdwarf-4 -Wall (Ubuntu_Clang_14.0.0)

Compiler output


SABER_indcpa.c: In file included from SABER_indcpa.c:20:
SABER_indcpa.c: SABER_indcpa.h:4:10: fatal error: immintrin.h: No such file or directory
SABER_indcpa.c:     4 | #include <immintrin.h>
SABER_indcpa.c:       |          ^~~~~~~~~~~~~
SABER_indcpa.c: compilation terminated.

Number of similar (implementation,compiler) pairs: 4, namely:
Implementation  Compiler
T:avx2_nttmul   gcc -march=native -mtune=native -O2 -fwrapv -fPIC -fPIE -gdwarf-4 -Wall (11.4.0)
T:avx2_nttmul   gcc -march=native -mtune=native -O3 -fwrapv -fPIC -fPIE -gdwarf-4 -Wall (11.4.0)
T:avx2_nttmul   gcc -march=native -mtune=native -O -fwrapv -fPIC -fPIE -gdwarf-4 -Wall (11.4.0)
T:avx2_nttmul   gcc -march=native -mtune=native -Os -fwrapv -fPIC -fPIE -gdwarf-4 -Wall (11.4.0)
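
Note: the avx2 and avx2_nttmul builds above stop at the #include <immintrin.h> on line 4 of SABER_indcpa.h. That header is provided only for x86 targets, so this failure is what one would expect when an x86-only implementation is compiled on aarch64. Where a single source tree has to build on several architectures, one common approach is to select the intrinsics header from compiler-predefined macros. The sketch below illustrates that idea only; it is not taken from the saber2 sources, and the names SIMD_BACKEND, simd_backend and simd_backend_is are hypothetical.

```c
#include <assert.h>
#include <string.h>

/* Select the intrinsics header from compiler-predefined architecture macros. */
#if defined(__x86_64__) || defined(__i386__)
#include <immintrin.h>            /* x86 only: SSE/AVX intrinsics */
#define SIMD_BACKEND "x86"        /* hypothetical label */
#elif defined(__aarch64__)
#include <arm_neon.h>             /* AArch64: NEON intrinsics */
#define SIMD_BACKEND "neon"       /* hypothetical label */
#else
#define SIMD_BACKEND "portable"   /* no vector header included */
#endif

/* Hypothetical helper: reports which branch was compiled in. */
const char *simd_backend(void)
{
    return SIMD_BACKEND;
}

/* Hypothetical helper: returns 1 if `name` matches the compiled-in backend. */
int simd_backend_is(const char *name)
{
    assert(name != NULL);
    return strcmp(simd_backend(), name) == 0;
}
```

On the pi4b machine in this report, a guard of this shape would take the __aarch64__ branch, so <immintrin.h> would never be seen by the compiler.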

Compiler output


poly.c: poly.c:19:3: warning: implicit declaration of function 'cshake128_simple' is invalid in C99 [-Wimplicit-function-declaration]
poly.c:   cshake128_simple(buf,SABER_N,nonce,seed,SABER_NOISESEEDBYTES);
poly.c:   ^
poly.c: poly.c:34:3: warning: implicit declaration of function 'cshake128_simple' is invalid in C99 [-Wimplicit-function-declaration]
poly.c:   cshake128_simple(buf0,SABER_N,nonce0,seed,SABER_NOISESEEDBYTES);
poly.c:   ^
poly.c: 2 warnings generated.

Number of similar (implementation,compiler) pairs: 1, namely:
Implementation  Compiler
T:neon          clang -mcpu=native -O3 -fwrapv -Qunused-arguments -fPIC -fPIE -gdwarf-4 -Wall (Ubuntu_Clang_14.0.0)

Compiler output


SABER_indcpa.c: In file included from polymul/toom_cook_4/asimd_toom_cook_4way_neon.c:22,
SABER_indcpa.c:                  from SABER_indcpa.c:35:
SABER_indcpa.c: SABER_indcpa.c: In function 'toom_cook_4way_neon':
SABER_indcpa.c: polymul/toom_cook_4/batch_64coefficient_multiplications.c:76:20: warning: 'w1' is used uninitialized [-Wuninitialized]
SABER_indcpa.c:    76 |         c.val[0] = veorq_u16(a.val[0], b.val[0]); \
SABER_indcpa.c:       |                    ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~
SABER_indcpa.c: polymul/toom_cook_4/asimd_toom_cook_4way_neon.c:77:9: note: in expansion of macro 'vxor'
SABER_indcpa.c:    77 |         vxor(int0_avx, w1, w1);
SABER_indcpa.c:       |         ^~~~
SABER_indcpa.c: In file included from SABER_indcpa.c:35:
SABER_indcpa.c: polymul/toom_cook_4/asimd_toom_cook_4way_neon.c:70:18: note: 'w1' declared here
SABER_indcpa.c:    70 |     uint16x8x2_t w1, w2, w3, w4, w5, w6, w7;
SABER_indcpa.c:       |                  ^~
poly.c: poly.c: In function 'poly_getnoise':
poly.c: poly.c:19:3: warning: implicit declaration of function 'cshake128_simple' [-Wimplicit-function-declaration]
poly.c:    19 |   cshake128_simple(buf,SABER_N,nonce,seed,SABER_NOISESEEDBYTES);
poly.c:       |   ^~~~~~~~~~~~~~~~

Number of similar (implementation,compiler) pairs: 4, namely:
Implementation  Compiler
T:neon          gcc -march=native -mtune=native -O2 -fwrapv -fPIC -fPIE -gdwarf-4 -Wall (11.4.0)
T:neon          gcc -march=native -mtune=native -O3 -fwrapv -fPIC -fPIE -gdwarf-4 -Wall (11.4.0)
T:neon          gcc -march=native -mtune=native -O -fwrapv -fPIC -fPIE -gdwarf-4 -Wall (11.4.0)
T:neon          gcc -march=native -mtune=native -Os -fwrapv -fPIC -fPIE -gdwarf-4 -Wall (11.4.0)

Compiler output


SABER_indcpa.c: In file included from SABER_indcpa.c:22:
SABER_indcpa.c: ./rq_mul/neon_poly_rq_mul.c:272:15: warning: initializing 'uint16_t *' (aka 'unsigned short *') with an expression of type 'const uint16_t *' (aka 'const unsigned short *') discards qualifiers [-Wincompatible-pointer-types-discards-qualifiers]
SABER_indcpa.c:     uint16_t *c0 = poly,
SABER_indcpa.c:               ^    ~~~~
SABER_indcpa.c: ./rq_mul/neon_poly_rq_mul.c:273:15: warning: initializing 'uint16_t *' (aka 'unsigned short *') with an expression of type 'const uint16_t *' (aka 'const unsigned short *') discards qualifiers [-Wincompatible-pointer-types-discards-qualifiers]
SABER_indcpa.c:              *c1 = &poly[1 * SB1],
SABER_indcpa.c:               ^    ~~~~~~~~~~~~~~
SABER_indcpa.c: ./rq_mul/neon_poly_rq_mul.c:274:15: warning: initializing 'uint16_t *' (aka 'unsigned short *') with an expression of type 'const uint16_t *' (aka 'const unsigned short *') discards qualifiers [-Wincompatible-pointer-types-discards-qualifiers]
SABER_indcpa.c:              *c2 = &poly[2 * SB1],
SABER_indcpa.c:               ^    ~~~~~~~~~~~~~~
SABER_indcpa.c: ./rq_mul/neon_poly_rq_mul.c:275:15: warning: initializing 'uint16_t *' (aka 'unsigned short *') with an expression of type 'const uint16_t *' (aka 'const unsigned short *') discards qualifiers [-Wincompatible-pointer-types-discards-qualifiers]
SABER_indcpa.c:              *c3 = &poly[3 * SB1],
SABER_indcpa.c:               ^    ~~~~~~~~~~~~~~
SABER_indcpa.c: ./rq_mul/neon_poly_rq_mul.c:501:16: warning: unused variable 'mod' [-Wunused-variable]
SABER_indcpa.c:     uint16x8_t mod;
SABER_indcpa.c:                ^
SABER_indcpa.c: SABER_indcpa.c:188:17: warning: unused variable 'k' [-Wunused-variable]
SABER_indcpa.c:   int32_t i, j, k;
SABER_indcpa.c:                 ^
SABER_indcpa.c: SABER_indcpa.c:194:16: warning: unused variable 'acc_neon' [-Wunused-variable]
SABER_indcpa.c:   uint16x8x4_t acc_neon;
SABER_indcpa.c:                ^
SABER_indcpa.c: 7 warnings generated.
poly.c: poly.c:19:3: warning: implicit declaration of function 'cshake128_simple' is invalid in C99 [-Wimplicit-function-declaration]
poly.c:   cshake128_simple(buf,SABER_N,nonce,seed,SABER_NOISESEEDBYTES);
poly.c:   ^
poly.c: poly.c:34:3: warning: implicit declaration of function 'cshake128_simple' is invalid in C99 [-Wimplicit-function-declaration]
poly.c:   cshake128_simple(buf0,SABER_N,nonce0,seed,SABER_NOISESEEDBYTES);
poly.c:   ^
poly.c: 2 warnings generated.

Number of similar (implementation,compiler) pairs: 1, namely:
Implementation  Compiler
T:neon2         clang -mcpu=native -O3 -fwrapv -Qunused-arguments -fPIC -fPIE -gdwarf-4 -Wall (Ubuntu_Clang_14.0.0)

Compiler output


SABER_indcpa.c: In file included from SABER_indcpa.c:22:
SABER_indcpa.c: rq_mul/neon_poly_rq_mul.c: In function 'tc4_evaluate_neon_SB1':
SABER_indcpa.c: rq_mul/neon_poly_rq_mul.c:272:20: warning: initialization discards 'const' qualifier from pointer target type [-Wdiscarded-qualifiers]
SABER_indcpa.c:   272 |     uint16_t *c0 = poly,
SABER_indcpa.c:       |                    ^~~~
SABER_indcpa.c: rq_mul/neon_poly_rq_mul.c:273:20: warning: initialization discards 'const' qualifier from pointer target type [-Wdiscarded-qualifiers]
SABER_indcpa.c:   273 |              *c1 = &poly[1 * SB1],
SABER_indcpa.c:       |                    ^
SABER_indcpa.c: rq_mul/neon_poly_rq_mul.c:274:20: warning: initialization discards 'const' qualifier from pointer target type [-Wdiscarded-qualifiers]
SABER_indcpa.c:   274 |              *c2 = &poly[2 * SB1],
SABER_indcpa.c:       |                    ^
SABER_indcpa.c: rq_mul/neon_poly_rq_mul.c:275:20: warning: initialization discards 'const' qualifier from pointer target type [-Wdiscarded-qualifiers]
SABER_indcpa.c:   275 |              *c3 = &poly[3 * SB1],
SABER_indcpa.c:       |                    ^
SABER_indcpa.c: rq_mul/neon_poly_rq_mul.c: In function 'neon_vector_vector_mul':
SABER_indcpa.c: rq_mul/neon_poly_rq_mul.c:501:16: warning: unused variable 'mod' [-Wunused-variable]
SABER_indcpa.c:   501 |     uint16x8_t mod;
SABER_indcpa.c:       |                ^~~
SABER_indcpa.c: SABER_indcpa.c: In function 'indcpa_kem_keypair':
SABER_indcpa.c: SABER_indcpa.c:194:16: warning: unused variable 'acc_neon' [-Wunused-variable]
SABER_indcpa.c:   194 |   uint16x8x4_t acc_neon;
SABER_indcpa.c:       |                ^~~~~~~~
SABER_indcpa.c: SABER_indcpa.c:188:17: warning: unused variable 'k' [-Wunused-variable]
SABER_indcpa.c:   188 |   int32_t i, j, k;
SABER_indcpa.c:       |                 ^
SABER_indcpa.c: ...

Number of similar (implementation,compiler) pairs: 4, namely:
Implementation  Compiler
T:neon2         gcc -march=native -mtune=native -O2 -fwrapv -fPIC -fPIE -gdwarf-4 -Wall (11.4.0)
T:neon2         gcc -march=native -mtune=native -O3 -fwrapv -fPIC -fPIE -gdwarf-4 -Wall (11.4.0)
T:neon2         gcc -march=native -mtune=native -O -fwrapv -fPIC -fPIE -gdwarf-4 -Wall (11.4.0)
T:neon2         gcc -march=native -mtune=native -Os -fwrapv -fPIC -fPIE -gdwarf-4 -Wall (11.4.0)
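
Note: the qualifier warnings reported for rq_mul/neon_poly_rq_mul.c lines 272 to 275 arise because uint16_t * pointers are initialized from poly, which the clang diagnostics identify as a const uint16_t *. If those pointers are only read through, declaring them as const uint16_t * would avoid the warning without a cast. The following is a minimal sketch under that assumption; the function sum_block_heads and the value given to SB1 are hypothetical and do not come from the saber2 sources.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SB1 16 /* hypothetical block length; the real value is not shown in the log */

/* Hypothetical read-only routine: adds the first coefficient of each of
 * four SB1-sized blocks. The block pointers keep the const qualifier of
 * `poly`, so no qualifier is discarded in the initializers. */
uint16_t sum_block_heads(const uint16_t *poly)
{
    assert(poly != NULL);
    const uint16_t *c0 = poly,
                   *c1 = &poly[1 * SB1],
                   *c2 = &poly[2 * SB1],
                   *c3 = &poly[3 * SB1];
    return (uint16_t)(*c0 + *c1 + *c2 + *c3);
}
```

If the real code does write through these pointers, the alternative would be to drop const from the parameter instead, which changes the function's interface and is a decision for the implementation's authors.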