Implementation notes: aarch64, gcc185, crypto_kem/lightsaber2

Computer: gcc185
Microarchitecture: aarch64; Skylark (503f0002)
Architecture: aarch64
CPU ID: 503f0002
SUPERCOP version: 20240107
Operation: crypto_kem
Primitive: lightsaber2
Time    Object size  Test size       Implementation  Compiler                                                                      Benchmark date  SUPERCOP version
332400  52833 0 0    70205 824 1584  T:neon2         gcc_-march=native_-mtune=native_-O3_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE  20240102        20231212
346800  19169 0 0    35405 824 1568  T:neon2         gcc_-march=native_-mtune=native_-O2_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE  20240102        20231212
352500  17729 0 0    32789 808 1552  T:neon2         gcc_-march=native_-mtune=native_-Os_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE  20240102        20231212
394575  19261 0 0    35293 824 1568  T:neon2         gcc_-march=native_-mtune=native_-O_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE   20240102        20231212
396075  62113 0 0    79398 872 4048  T:neon          gcc_-march=native_-mtune=native_-O3_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE  20240102        20231212
401925  19189 0 0    34126 856 4016  T:neon          gcc_-march=native_-mtune=native_-Os_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE  20240102        20231212
420750  22469 0 0    38606 872 4032  T:neon          gcc_-march=native_-mtune=native_-O2_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE  20240102        20231212
446250  21885 0 0    37814 872 4032  T:neon          gcc_-march=native_-mtune=native_-O_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE   20240102        20231212
479175  39037 0 0    56645 824 1632  T:ref           gcc_-march=native_-mtune=native_-O3_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE  20240102        20231212
953925  12001 0 0    28261 824 1616  T:ref           gcc_-march=native_-mtune=native_-O_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE   20240102        20231212
960450  13021 0 0    29533 824 1616  T:ref           gcc_-march=native_-mtune=native_-O2_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE  20240102        20231212
986775  11547 0 0    26805 808 1600  T:ref           gcc_-march=native_-mtune=native_-Os_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE  20240102        20231212

Compiler output

Implementation: T:avx2
Security model: timingleaks
Compiler: clang -march=native -O2 -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE
SABER_indcpa.c: In file included from SABER_indcpa.c:5:
SABER_indcpa.c: In file included from ./SABER_indcpa.h:4:
SABER_indcpa.c: /usr/bin/../lib/clang/17/include/immintrin.h:14:2: error: "This header is only meant to be used on x86 and x64 architecture"
SABER_indcpa.c: 14 | #error "This header is only meant to be used on x86 and x64 architecture"
SABER_indcpa.c: | ^
SABER_indcpa.c: In file included from SABER_indcpa.c:5:
SABER_indcpa.c: In file included from ./SABER_indcpa.h:4:
SABER_indcpa.c: In file included from /usr/bin/../lib/clang/17/include/immintrin.h:17:
SABER_indcpa.c: In file included from /usr/bin/../lib/clang/17/include/x86gprintrin.h:15:
SABER_indcpa.c: /usr/bin/../lib/clang/17/include/hresetintrin.h:42:27: error: invalid input constraint 'a' in asm
SABER_indcpa.c: 42 | __asm__ ("hreset $0" :: "a"(__eax));
SABER_indcpa.c: | ^
SABER_indcpa.c: In file included from SABER_indcpa.c:5:
SABER_indcpa.c: In file included from ./SABER_indcpa.h:4:
SABER_indcpa.c: In file included from /usr/bin/../lib/clang/17/include/immintrin.h:21:
SABER_indcpa.c: /usr/bin/../lib/clang/17/include/mmintrin.h:14:2: error: "This header is only meant to be used on x86 and x64 architecture"
SABER_indcpa.c: 14 | #error "This header is only meant to be used on x86 and x64 architecture"
SABER_indcpa.c: | ^
SABER_indcpa.c: /usr/bin/../lib/clang/17/include/mmintrin.h:54:12: error: invalid conversion between vector type '__m64' (vector of 1 'long long' value) and integer type 'int' of different size
SABER_indcpa.c: 54 | return (__m64)__builtin_ia32_vec_init_v2si(__i, 0);
SABER_indcpa.c: | ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
SABER_indcpa.c: /usr/bin/../lib/clang/17/include/mmintrin.h:133:12: error: invalid conversion between vector type '__m64' (vector of 1 'long long' value) and integer type 'int' of different size
SABER_indcpa.c: 133 | return (__m64)__builtin_ia32_packsswb((__v4hi)__m1, (__v4hi)__m2);
SABER_indcpa.c: | ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
SABER_indcpa.c: /usr/bin/../lib/clang/17/include/mmintrin.h:163:12: error: invalid conversion between vector type '__m64' (vector of 1 'long long' value) and integer type 'int' of different size
SABER_indcpa.c: ...

Number of similar (compiler,implementation) pairs: 5, namely:
Compiler  Implementations
clang -march=native -O2 -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE T:avx2
clang -march=native -O3 -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE T:avx2
clang -march=native -O -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE T:avx2
clang -march=native -Os -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE T:avx2
clang -mcpu=native -O3 -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE T:avx2
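
The errors above come from including `immintrin.h`, an x86-only header, while building on aarch64. A minimal sketch of guarding architecture-specific includes with the compiler's predefined target macros follows. The helper `simd_backend()` and the `SIMD_BACKEND` macro are hypothetical names for illustration and are not part of the SABER sources.

```c
/* Select the intrinsics header by target architecture.
   __x86_64__, __i386__ and __aarch64__ are predefined by gcc and clang.
   SIMD_BACKEND and simd_backend() are hypothetical, for illustration only. */
#if defined(__x86_64__) || defined(__i386__)
#include <immintrin.h>   /* x86 only: SSE/AVX/AVX2 intrinsics */
#define SIMD_BACKEND "x86"
#elif defined(__aarch64__)
#include <arm_neon.h>    /* AArch64: NEON intrinsics */
#define SIMD_BACKEND "neon"
#else
#define SIMD_BACKEND "portable"
#endif

/* Report which branch of the guard was taken at compile time. */
const char *simd_backend(void)
{
    return SIMD_BACKEND;
}
```

With a guard of this kind, an AVX2-only source file compiled on this machine would never reach the `#error` inside `immintrin.h`; whether it should then fall back to other code or fail with its own message is a design choice left to the implementation.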

Compiler output

Implementation: T:avx2
Security model: timingleaks
Compiler: gcc -march=native -mtune=native -O2 -fomit-frame-pointer -fwrapv -fPIC -fPIE
SABER_indcpa.c: In file included from SABER_indcpa.c:5:
SABER_indcpa.c: SABER_indcpa.h:4:10: fatal error: immintrin.h: No such file or directory
SABER_indcpa.c: #include <immintrin.h>
SABER_indcpa.c: ^~~~~~~~~~~~~
SABER_indcpa.c: compilation terminated.

Number of similar (compiler,implementation) pairs: 4, namely:
Compiler  Implementations
gcc -march=native -mtune=native -O2 -fomit-frame-pointer -fwrapv -fPIC -fPIE T:avx2
gcc -march=native -mtune=native -O3 -fomit-frame-pointer -fwrapv -fPIC -fPIE T:avx2
gcc -march=native -mtune=native -O -fomit-frame-pointer -fwrapv -fPIC -fPIE T:avx2
gcc -march=native -mtune=native -Os -fomit-frame-pointer -fwrapv -fPIC -fPIE T:avx2

Compiler output

Implementation: T:avx2_nttmul
Security model: timingleaks
Compiler: clang -march=native -O2 -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE
SABER_indcpa.c: In file included from SABER_indcpa.c:20:
SABER_indcpa.c: In file included from ./SABER_indcpa.h:4:
SABER_indcpa.c: /usr/bin/../lib/clang/17/include/immintrin.h:14:2: error: "This header is only meant to be used on x86 and x64 architecture"
SABER_indcpa.c: 14 | #error "This header is only meant to be used on x86 and x64 architecture"
SABER_indcpa.c: | ^
SABER_indcpa.c: In file included from SABER_indcpa.c:20:
SABER_indcpa.c: In file included from ./SABER_indcpa.h:4:
SABER_indcpa.c: In file included from /usr/bin/../lib/clang/17/include/immintrin.h:17:
SABER_indcpa.c: In file included from /usr/bin/../lib/clang/17/include/x86gprintrin.h:15:
SABER_indcpa.c: /usr/bin/../lib/clang/17/include/hresetintrin.h:42:27: error: invalid input constraint 'a' in asm
SABER_indcpa.c: 42 | __asm__ ("hreset $0" :: "a"(__eax));
SABER_indcpa.c: | ^
SABER_indcpa.c: In file included from SABER_indcpa.c:20:
SABER_indcpa.c: In file included from ./SABER_indcpa.h:4:
SABER_indcpa.c: In file included from /usr/bin/../lib/clang/17/include/immintrin.h:21:
SABER_indcpa.c: /usr/bin/../lib/clang/17/include/mmintrin.h:14:2: error: "This header is only meant to be used on x86 and x64 architecture"
SABER_indcpa.c: 14 | #error "This header is only meant to be used on x86 and x64 architecture"
SABER_indcpa.c: | ^
SABER_indcpa.c: /usr/bin/../lib/clang/17/include/mmintrin.h:54:12: error: invalid conversion between vector type '__m64' (vector of 1 'long long' value) and integer type 'int' of different size
SABER_indcpa.c: 54 | return (__m64)__builtin_ia32_vec_init_v2si(__i, 0);
SABER_indcpa.c: | ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
SABER_indcpa.c: /usr/bin/../lib/clang/17/include/mmintrin.h:133:12: error: invalid conversion between vector type '__m64' (vector of 1 'long long' value) and integer type 'int' of different size
SABER_indcpa.c: 133 | return (__m64)__builtin_ia32_packsswb((__v4hi)__m1, (__v4hi)__m2);
SABER_indcpa.c: | ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
SABER_indcpa.c: /usr/bin/../lib/clang/17/include/mmintrin.h:163:12: error: invalid conversion between vector type '__m64' (vector of 1 'long long' value) and integer type 'int' of different size
SABER_indcpa.c: ...

Number of similar (compiler,implementation) pairs: 5, namely:
Compiler  Implementations
clang -march=native -O2 -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE T:avx2_nttmul
clang -march=native -O3 -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE T:avx2_nttmul
clang -march=native -O -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE T:avx2_nttmul
clang -march=native -Os -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE T:avx2_nttmul
clang -mcpu=native -O3 -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE T:avx2_nttmul

Compiler output

Implementation: T:avx2_nttmul
Security model: timingleaks
Compiler: gcc -march=native -mtune=native -O2 -fomit-frame-pointer -fwrapv -fPIC -fPIE
SABER_indcpa.c: In file included from SABER_indcpa.c:20:
SABER_indcpa.c: SABER_indcpa.h:4:10: fatal error: immintrin.h: No such file or directory
SABER_indcpa.c: #include <immintrin.h>
SABER_indcpa.c: ^~~~~~~~~~~~~
SABER_indcpa.c: compilation terminated.

Number of similar (compiler,implementation) pairs: 4, namely:
Compiler  Implementations
gcc -march=native -mtune=native -O2 -fomit-frame-pointer -fwrapv -fPIC -fPIE T:avx2_nttmul
gcc -march=native -mtune=native -O3 -fomit-frame-pointer -fwrapv -fPIC -fPIE T:avx2_nttmul
gcc -march=native -mtune=native -O -fomit-frame-pointer -fwrapv -fPIC -fPIE T:avx2_nttmul
gcc -march=native -mtune=native -Os -fomit-frame-pointer -fwrapv -fPIC -fPIE T:avx2_nttmul

Compiler output

Implementation: T:neon
Security model: timingleaks
Compiler: clang -march=native -O2 -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE
poly.c: poly.c:19:3: error: call to undeclared function 'cshake128_simple'; ISO C99 and later do not support implicit function declarations [-Wimplicit-function-declaration]
poly.c: 19 | cshake128_simple(buf,SABER_N,nonce,seed,SABER_NOISESEEDBYTES);
poly.c: | ^
poly.c: poly.c:34:3: error: call to undeclared function 'cshake128_simple'; ISO C99 and later do not support implicit function declarations [-Wimplicit-function-declaration]
poly.c: 34 | cshake128_simple(buf0,SABER_N,nonce0,seed,SABER_NOISESEEDBYTES);
poly.c: | ^
poly.c: 2 errors generated.

Number of similar (compiler,implementation) pairs: 5, namely:
Compiler  Implementations
clang -march=native -O2 -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE T:neon
clang -march=native -O3 -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE T:neon
clang -march=native -O -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE T:neon
clang -march=native -Os -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE T:neon
clang -mcpu=native -O3 -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE T:neon
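
clang rejects the calls to `cshake128_simple` because no declaration is in scope at the call site, whereas the gcc runs in this report only warn about the same calls. The usual remedy is to make a prototype visible before the first call, normally through a header. The sketch below shows the pattern with hypothetical stand-ins: `xof_fill` and its signature are assumptions for illustration, not the real `cshake128_simple` interface.

```c
#include <stddef.h>
#include <stdint.h>

/* Prototype visible before the first call. In a real project this line
   would normally come from a header that the calling file includes.
   xof_fill and its signature are hypothetical stand-ins. */
void xof_fill(uint8_t *out, size_t outlen, uint16_t nonce);

/* Caller: accepted under C99 and later because the prototype is in scope. */
uint8_t first_byte_for_nonce(uint16_t nonce)
{
    uint8_t buf[4];
    xof_fill(buf, sizeof buf, nonce);
    return buf[0];
}

/* Definition, placed after the caller (it could equally live in another
   translation unit). Fills the buffer with a simple counter pattern. */
void xof_fill(uint8_t *out, size_t outlen, uint16_t nonce)
{
    for (size_t i = 0; i < outlen; i++)
        out[i] = (uint8_t)(nonce + i);
}
```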

Compiler output

Implementation: T:neon
Security model: timingleaks
Compiler: gcc -march=native -mtune=native -O2 -fomit-frame-pointer -fwrapv -fPIC -fPIE
poly.c: poly.c: In function 'poly_getnoise':
poly.c: poly.c:19:3: warning: implicit declaration of function 'cshake128_simple'; did you mean 'shake128'? [-Wimplicit-function-declaration]
poly.c: cshake128_simple(buf,SABER_N,nonce,seed,SABER_NOISESEEDBYTES);
poly.c: ^~~~~~~~~~~~~~~~
poly.c: shake128

Number of similar (compiler,implementation) pairs: 4, namely:
Compiler  Implementations
gcc -march=native -mtune=native -O2 -fomit-frame-pointer -fwrapv -fPIC -fPIE T:neon
gcc -march=native -mtune=native -O3 -fomit-frame-pointer -fwrapv -fPIC -fPIE T:neon
gcc -march=native -mtune=native -O -fomit-frame-pointer -fwrapv -fPIC -fPIE T:neon
gcc -march=native -mtune=native -Os -fomit-frame-pointer -fwrapv -fPIC -fPIE T:neon

Compiler output

Implementation: T:neon2
Security model: timingleaks
Compiler: clang -march=native -O2 -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE
SABER_indcpa.c: In file included from SABER_indcpa.c:22:
SABER_indcpa.c: ./rq_mul/neon_poly_rq_mul.c:272:15: warning: initializing 'uint16_t *' (aka 'unsigned short *') with an expression of type 'const uint16_t *' (aka 'const unsigned short *') discards qualifiers [-Wincompatible-pointer-types-discards-qualifiers]
SABER_indcpa.c: 272 | uint16_t *c0 = poly,
SABER_indcpa.c: | ^ ~~~~
SABER_indcpa.c: ./rq_mul/neon_poly_rq_mul.c:273:15: warning: initializing 'uint16_t *' (aka 'unsigned short *') with an expression of type 'const uint16_t *' (aka 'const unsigned short *') discards qualifiers [-Wincompatible-pointer-types-discards-qualifiers]
SABER_indcpa.c: 273 | *c1 = &poly[1 * SB1],
SABER_indcpa.c: | ^ ~~~~~~~~~~~~~~
SABER_indcpa.c: ./rq_mul/neon_poly_rq_mul.c:274:15: warning: initializing 'uint16_t *' (aka 'unsigned short *') with an expression of type 'const uint16_t *' (aka 'const unsigned short *') discards qualifiers [-Wincompatible-pointer-types-discards-qualifiers]
SABER_indcpa.c: 274 | *c2 = &poly[2 * SB1],
SABER_indcpa.c: | ^ ~~~~~~~~~~~~~~
SABER_indcpa.c: ./rq_mul/neon_poly_rq_mul.c:275:15: warning: initializing 'uint16_t *' (aka 'unsigned short *') with an expression of type 'const uint16_t *' (aka 'const unsigned short *') discards qualifiers [-Wincompatible-pointer-types-discards-qualifiers]
SABER_indcpa.c: 275 | *c3 = &poly[3 * SB1],
SABER_indcpa.c: | ^ ~~~~~~~~~~~~~~
SABER_indcpa.c: 4 warnings generated.
poly.c: poly.c:19:3: error: call to undeclared function 'cshake128_simple'; ISO C99 and later do not support implicit function declarations [-Wimplicit-function-declaration]
poly.c: 19 | cshake128_simple(buf,SABER_N,nonce,seed,SABER_NOISESEEDBYTES);
poly.c: | ^
poly.c: poly.c:34:3: error: call to undeclared function 'cshake128_simple'; ISO C99 and later do not support implicit function declarations [-Wimplicit-function-declaration]
poly.c: 34 | cshake128_simple(buf0,SABER_N,nonce0,seed,SABER_NOISESEEDBYTES);
poly.c: | ^
poly.c: 2 errors generated.

Number of similar (compiler,implementation) pairs: 5, namely:
Compiler  Implementations
clang -march=native -O2 -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE T:neon2
clang -march=native -O3 -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE T:neon2
clang -march=native -O -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE T:neon2
clang -march=native -Os -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE T:neon2
clang -mcpu=native -O3 -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE T:neon2
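
The four `-Wincompatible-pointer-types-discards-qualifiers` warnings above are raised when a `const uint16_t *` is used to initialize a plain `uint16_t *`. If the pointers are only read through, keeping `const` on the local pointers silences the warning without a cast. A minimal sketch, assuming read-only access; `sum_block` is a hypothetical function and is not taken from `neon_poly_rq_mul.c`:

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical reader over a read-only coefficient array.
   Sums the len coefficients of the block with index `block`. */
uint32_t sum_block(const uint16_t *poly, size_t block, size_t len)
{
    /* The local pointer keeps the const qualifier, so nothing is discarded. */
    const uint16_t *c = &poly[block * len];
    uint32_t acc = 0;

    for (size_t i = 0; i < len; i++)
        acc += c[i];
    return acc;
}
```

If the function does write through these pointers, the parameter itself would need to lose its `const` instead.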

Compiler output

Implementation: T:neon2
Security model: timingleaks
Compiler: gcc -march=native -mtune=native -O2 -fomit-frame-pointer -fwrapv -fPIC -fPIE
SABER_indcpa.c: In file included from SABER_indcpa.c:22:
SABER_indcpa.c: rq_mul/neon_poly_rq_mul.c: In function 'tc4_evaluate_neon_SB1':
SABER_indcpa.c: rq_mul/neon_poly_rq_mul.c:272:20: warning: initialization discards 'const' qualifier from pointer target type [-Wdiscarded-qualifiers]
SABER_indcpa.c: uint16_t *c0 = poly,
SABER_indcpa.c: ^~~~
SABER_indcpa.c: rq_mul/neon_poly_rq_mul.c:273:20: warning: initialization discards 'const' qualifier from pointer target type [-Wdiscarded-qualifiers]
SABER_indcpa.c: *c1 = &poly[1 * SB1],
SABER_indcpa.c: ^
SABER_indcpa.c: rq_mul/neon_poly_rq_mul.c:274:20: warning: initialization discards 'const' qualifier from pointer target type [-Wdiscarded-qualifiers]
SABER_indcpa.c: *c2 = &poly[2 * SB1],
SABER_indcpa.c: ^
SABER_indcpa.c: rq_mul/neon_poly_rq_mul.c:275:20: warning: initialization discards 'const' qualifier from pointer target type [-Wdiscarded-qualifiers]
SABER_indcpa.c: *c3 = &poly[3 * SB1],
SABER_indcpa.c: ^
poly.c: poly.c: In function 'poly_getnoise':
poly.c: poly.c:19:3: warning: implicit declaration of function 'cshake128_simple'; did you mean 'shake128'? [-Wimplicit-function-declaration]
poly.c: cshake128_simple(buf,SABER_N,nonce,seed,SABER_NOISESEEDBYTES);
poly.c: ^~~~~~~~~~~~~~~~
poly.c: shake128

Number of similar (compiler,implementation) pairs: 4, namely:
Compiler  Implementations
gcc -march=native -mtune=native -O2 -fomit-frame-pointer -fwrapv -fPIC -fPIE T:neon2
gcc -march=native -mtune=native -O3 -fomit-frame-pointer -fwrapv -fPIC -fPIE T:neon2
gcc -march=native -mtune=native -O -fomit-frame-pointer -fwrapv -fPIC -fPIE T:neon2
gcc -march=native -mtune=native -Os -fomit-frame-pointer -fwrapv -fPIC -fPIE T:neon2

Compiler output

Implementation: T:ref
Security model: timingleaks
Compiler: clang -march=native -O2 -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE
try.c: libcrypto_kem_lightsaber2.a(SABER_indcpa.o):(.bss+0x0): multiple definition of `clock1'
try.c: libcrypto_kem_lightsaber2.a(kem.o):(.bss+0x0): first defined here
try.c: libcrypto_kem_lightsaber2.a(SABER_indcpa.o):(.bss+0x8): multiple definition of `clock2'
try.c: libcrypto_kem_lightsaber2.a(kem.o):(.bss+0x8): first defined here
try.c: libcrypto_kem_lightsaber2.a(SABER_indcpa.o):(.bss+0x10): multiple definition of `clock_kp_mv'
try.c: libcrypto_kem_lightsaber2.a(kem.o):(.bss+0x10): first defined here
try.c: libcrypto_kem_lightsaber2.a(SABER_indcpa.o):(.bss+0x18): multiple definition of `clock_cl_mv'
try.c: libcrypto_kem_lightsaber2.a(kem.o):(.bss+0x18): first defined here
try.c: libcrypto_kem_lightsaber2.a(SABER_indcpa.o):(.bss+0x20): multiple definition of `clock_kp_sm'
try.c: libcrypto_kem_lightsaber2.a(kem.o):(.bss+0x20): first defined here
try.c: libcrypto_kem_lightsaber2.a(SABER_indcpa.o):(.bss+0x28): multiple definition of `clock_cl_sm'
try.c: libcrypto_kem_lightsaber2.a(kem.o):(.bss+0x28): first defined here
try.c: clang: error: linker command failed with exit code 1 (use -v to see invocation)

Number of similar (compiler,implementation) pairs: 5, namely:
Compiler  Implementations
clang -march=native -O2 -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE T:ref
clang -march=native -O3 -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE T:ref
clang -march=native -O -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE T:ref
clang -march=native -Os -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE T:ref
clang -mcpu=native -O3 -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE T:ref
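
The linker reports `clock1`, `clock2` and the other counters as defined in both `kem.o` and `SABER_indcpa.o`. This typically happens when a header shared by both files defines the variables rather than declaring them, and the toolchain builds with `-fno-common` (the default in clang 11 and later and gcc 10 and later). The sketch below shows the usual declare-once, define-once layout in a single file for brevity. The `uint64_t` type, the file split and the `elapsed()` helper are assumptions for illustration.

```c
#include <stdint.h>

/* --- would live in a shared header (hypothetical name: timing.h) --- */
extern uint64_t clock1;   /* declaration only: allocates no storage */
extern uint64_t clock2;

/* --- would live in exactly one .c file --- */
uint64_t clock1 = 0;      /* the single definition */
uint64_t clock2 = 0;

/* Example use from any file that includes the header. */
uint64_t elapsed(void)
{
    return clock2 - clock1;
}
```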