Implementation notes: aarch64, pi3aplus, crypto_aead/pi16cipher096v2

Computer: pi3aplus
Microarchitecture: aarch64; Cortex-A53 (410fd034)
Architecture: aarch64
CPU ID: 410fd034
SUPERCOP version: 20240107
Operation: crypto_aead
Primitive: pi16cipher096v2
TimeObject sizeTest sizeImplementationCompilerBenchmark dateSUPERCOP version
17431817035 8 030736 872 864T:goptvgcc_-march=native_-mtune=native_-O3_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE2023121720231212
21364714795 8 027215 864 848T:goptvgcc_-march=native_-mtune=native_-O2_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE2023121720231212
34380015539 8 028063 864 848T:goptvgcc_-march=native_-mtune=native_-O_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE2023121720231212
5877346491 8 020357 776 856T:ref3clang_-mcpu=native_-O3_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2023121720231212
6349598263 8 022117 776 856T:ref2clang_-mcpu=native_-O3_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2023121720231212
7158737027 8 018695 848 840T:goptvgcc_-march=native_-mtune=native_-Os_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE2023121720231212
9109157875 8 021544 872 864T:ref3gcc_-march=native_-mtune=native_-O3_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE2023121720231212
10466575455 8 017839 864 848T:ref3gcc_-march=native_-mtune=native_-O2_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE2023121720231212
11254444395 8 015975 848 840T:ref3gcc_-march=native_-mtune=native_-Os_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE2023121720231212
15105504739 8 017255 864 848T:ref3gcc_-march=native_-mtune=native_-O_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE2023121720231212
241903210443 8 024104 872 864T:ref2gcc_-march=native_-mtune=native_-O3_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE2023121720231212
37506105411 8 017823 864 848T:ref2gcc_-march=native_-mtune=native_-O2_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE2023121720231212
47738714663 8 016247 848 840T:ref2gcc_-march=native_-mtune=native_-Os_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE2023121720231212
55212585087 8 017599 864 848T:ref2gcc_-march=native_-mtune=native_-O_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE2023121720231212

Test failure

Implementation: T:optimized_nonSSE
Security model: timingleaks
Compiler: clang -mcpu=native -O3 -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE
error 111
crypto_aead_encrypt returns more than crypto_aead_ABYTES extra bytes

Number of similar (compiler,implementation) pairs: 10, namely:
CompilerImplementations
clang -mcpu=native -O3 -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE T:optimized_nonSSE
gcc -march=native -mtune=native -O2 -fomit-frame-pointer -fwrapv -fPIC -fPIE T:optimized_nonSSE
gcc -march=native -mtune=native -O3 -fomit-frame-pointer -fwrapv -fPIC -fPIE T:optimized_nonSSE
gcc -march=native -mtune=native -O -fomit-frame-pointer -fwrapv -fPIC -fPIE T:optimized_nonSSE
gcc -march=native -mtune=native -Os -fomit-frame-pointer -fwrapv -fPIC -fPIE T:optimized_nonSSE
clang -mcpu=native -O3 -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE T:ref
gcc -march=native -mtune=native -O2 -fomit-frame-pointer -fwrapv -fPIC -fPIE T:ref
gcc -march=native -mtune=native -O3 -fomit-frame-pointer -fwrapv -fPIC -fPIE T:ref
gcc -march=native -mtune=native -O -fomit-frame-pointer -fwrapv -fPIC -fPIE T:ref
gcc -march=native -mtune=native -Os -fomit-frame-pointer -fwrapv -fPIC -fPIE T:ref

Compiler output

Implementation: T:goptv
Security model: timingleaks
Compiler: clang -mcpu=native -O3 -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE
pi-cipher.c: pi-cipher.c:286:9: error: use of unknown builtin '__builtin_shuffle' [-Wimplicit-function-declaration]
pi-cipher.c: n_t += __builtin_shuffle(y, g_mask);
pi-cipher.c: ^
pi-cipher.c: pi-cipher.c:286:6: error: cannot convert between scalar type 'int' and vector type 'vchunk_t' (vector of 4 'word_t' values) as implicit conversion would cause truncation
pi-cipher.c: n_t += __builtin_shuffle(y, g_mask);
pi-cipher.c: ^
pi-cipher.c: pi-cipher.c:287:6: error: cannot convert between scalar type 'int' and vector type 'vchunk_t' (vector of 4 'word_t' values) as implicit conversion would cause truncation
pi-cipher.c: n_t += __builtin_shuffle(y, n_mask);
pi-cipher.c: ^
pi-cipher.c: pi-cipher.c:290:6: error: cannot convert between scalar type 'int' and vector type 'vchunk_t' (vector of 4 'word_t' values) as implicit conversion would cause truncation
pi-cipher.c: n_t ^= __builtin_shuffle(n_t, n_x_1) ^ __builtin_shuffle(n_t, n_x_2);
pi-cipher.c: ^
pi-cipher.c: pi-cipher.c:305:9: error: use of unknown builtin '__builtin_shuffle' [-Wimplicit-function-declaration]
pi-cipher.c: m_t += __builtin_shuffle(x, g_mask);
pi-cipher.c: ^
pi-cipher.c: pi-cipher.c:305:6: error: cannot convert between scalar type 'int' and vector type 'vchunk_t' (vector of 4 'word_t' values) as implicit conversion would cause truncation
pi-cipher.c: m_t += __builtin_shuffle(x, g_mask);
pi-cipher.c: ^
pi-cipher.c: pi-cipher.c:306:6: error: cannot convert between scalar type 'int' and vector type 'vchunk_t' (vector of 4 'word_t' values) as implicit conversion would cause truncation
pi-cipher.c: m_t += __builtin_shuffle(x, m_mask);
pi-cipher.c: ^
pi-cipher.c: pi-cipher.c:309:6: error: cannot convert between scalar type 'int' and vector type 'vchunk_t' (vector of 4 'word_t' values) as implicit conversion would cause truncation
pi-cipher.c: m_t ^= __builtin_shuffle(m_t, m_x_1) ^ __builtin_shuffle(m_t, m_x_2);
pi-cipher.c: ^
pi-cipher.c: pi-cipher.c:354:9: error: use of unknown builtin '__builtin_shuffle' [-Wimplicit-function-declaration]
pi-cipher.c: ...

Number of similar (compiler,implementation) pairs: 1, namely:
CompilerImplementations
clang -mcpu=native -O3 -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE T:goptv

Namespace violations

Implementation: T:goptv
Security model: timingleaks
Compiler: gcc -march=native -mtune=native -O2 -fomit-frame-pointer -fwrapv -fPIC -fPIE
pi-cipher.o PI_DECRYPT_BLOCK_Q T
pi-cipher.o pi16_decrypt_block T
pi-cipher.o pi16_decrypt_last_block T
pi-cipher.o pi16_decrypt_simple T
pi-cipher.o pi16_decrypt_smn T
pi-cipher.o pi16_encrypt_block T
pi-cipher.o pi16_encrypt_block_q T
pi-cipher.o pi16_encrypt_last_block T
pi-cipher.o pi16_encrypt_simple T
pi-cipher.o pi16_encrypt_smn T
pi-cipher.o pi16_extract_tag T
pi-cipher.o pi16_init T
pi-cipher.o pi16_process_ad_block T
pi-cipher.o pi16_process_ad_block_q T
pi-cipher.o pi16_process_ad_last_block T
pi-cipher.o pi_cipher_name D

Number of similar (compiler,implementation) pairs: 4, namely:
CompilerImplementations
gcc -march=native -mtune=native -O2 -fomit-frame-pointer -fwrapv -fPIC -fPIE T:goptv
gcc -march=native -mtune=native -O3 -fomit-frame-pointer -fwrapv -fPIC -fPIE T:goptv
gcc -march=native -mtune=native -O -fomit-frame-pointer -fwrapv -fPIC -fPIE T:goptv
gcc -march=native -mtune=native -Os -fomit-frame-pointer -fwrapv -fPIC -fPIE T:goptv

Namespace violations

Implementation: T:ref2
Security model: timingleaks
Compiler: clang -mcpu=native -O3 -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE
pi-cipher.o pi16_decrypt_block T
pi-cipher.o pi16_decrypt_last_block T
pi-cipher.o pi16_decrypt_simple T
pi-cipher.o pi16_decrypt_smn T
pi-cipher.o pi16_encrypt_block T
pi-cipher.o pi16_encrypt_last_block T
pi-cipher.o pi16_encrypt_simple T
pi-cipher.o pi16_extract_tag T
pi-cipher.o pi16_init T
pi-cipher.o pi16_process_ad_block T
pi-cipher.o pi16_process_ad_last_block T
pi-cipher.o pi16_process_smn T
pi-cipher.o pi_cipher_name D

Number of similar (compiler,implementation) pairs: 5, namely:
CompilerImplementations
clang -mcpu=native -O3 -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE T:ref2
gcc -march=native -mtune=native -O2 -fomit-frame-pointer -fwrapv -fPIC -fPIE T:ref2
gcc -march=native -mtune=native -O3 -fomit-frame-pointer -fwrapv -fPIC -fPIE T:ref2
gcc -march=native -mtune=native -O -fomit-frame-pointer -fwrapv -fPIC -fPIE T:ref2
gcc -march=native -mtune=native -Os -fomit-frame-pointer -fwrapv -fPIC -fPIE T:ref2

Namespace violations

Implementation: T:ref3
Security model: timingleaks
Compiler: clang -mcpu=native -O3 -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE
pi-cipher.o pi16_decrypt_block T
pi-cipher.o pi16_decrypt_last_block T
pi-cipher.o pi16_decrypt_simple T
pi-cipher.o pi16_decrypt_smn T
pi-cipher.o pi16_encrypt_block T
pi-cipher.o pi16_encrypt_last_block T
pi-cipher.o pi16_encrypt_simple T
pi-cipher.o pi16_encrypt_smn T
pi-cipher.o pi16_extract_tag T
pi-cipher.o pi16_init T
pi-cipher.o pi16_process_ad_block T
pi-cipher.o pi16_process_ad_last_block T
pi-cipher.o pi_cipher_name D

Number of similar (compiler,implementation) pairs: 5, namely:
CompilerImplementations
clang -mcpu=native -O3 -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE T:ref3
gcc -march=native -mtune=native -O2 -fomit-frame-pointer -fwrapv -fPIC -fPIE T:ref3
gcc -march=native -mtune=native -O3 -fomit-frame-pointer -fwrapv -fPIC -fPIE T:ref3
gcc -march=native -mtune=native -O -fomit-frame-pointer -fwrapv -fPIC -fPIE T:ref3
gcc -march=native -mtune=native -Os -fomit-frame-pointer -fwrapv -fPIC -fPIE T:ref3