Implementation notes: aarch64, pi4b, crypto_sign/picnic2l5fs

Computer: pi4b
Microarchitecture: aarch64; Cortex-A72 (410fd083)
Architecture: aarch64
CPU ID: 410fd083
SUPERCOP version: 20240716
Operation: crypto_sign
Primitive: picnic2l5fs
TimeObject sizeTest sizeImplementationCompilerBenchmark dateSUPERCOP version
6312826338175997 1936 10217710 2880 1584T:optimizedct/neongcc_-march=native_-mtune=native_-Os_-fwrapv_-fPIC_-fPIE_-gdwarf-4_-Wall2024070320240625
6323199576181927 1936 10224614 2896 1600T:optimizedct/neongcc_-march=native_-mtune=native_-O2_-fwrapv_-fPIC_-fPIE_-gdwarf-4_-Wall2024070320240625
6349038864187447 1936 10231126 2896 1616T:optimizedct/neongcc_-march=native_-mtune=native_-O3_-fwrapv_-fPIC_-fPIE_-gdwarf-4_-Wall2024070320240625
6461685832182011 1936 10224734 2896 1600T:optimizedct/neongcc_-march=native_-mtune=native_-O_-fwrapv_-fPIC_-fPIE_-gdwarf-4_-Wall2024070320240625
6566782352183735 1936 10228157 2928 1600T:optimizedct/cclang_-mcpu=native_-O3_-fwrapv_-Qunused-arguments_-fPIC_-fPIE_-gdwarf-4_-Wall2024070320240625
6826031814178647 1936 10222062 2896 1616T:optimizedct/cgcc_-march=native_-mtune=native_-O3_-fwrapv_-fPIC_-fPIE_-gdwarf-4_-Wall2024070320240625
6900407722174403 1936 10216822 2896 1600T:optimizedct/cgcc_-march=native_-mtune=native_-O2_-fwrapv_-fPIC_-fPIE_-gdwarf-4_-Wall2024070320240625
6991872116169169 1936 10210606 2880 1584T:optimizedct/cgcc_-march=native_-mtune=native_-Os_-fwrapv_-fPIC_-fPIE_-gdwarf-4_-Wall2024070320240625
7549278653174507 1936 10216942 2896 1600T:optimizedct/cgcc_-march=native_-mtune=native_-O_-fwrapv_-fPIC_-fPIE_-gdwarf-4_-Wall2024070320240625

Test failure


error 142
Alarm clock

Number of similar (implementation,compiler) pairs: 3, namely:
ImplementationCompiler
T:refclang -mcpu=native -O3 -fwrapv -Qunused-arguments -fPIC -fPIE -gdwarf-4 -Wall (Ubuntu_Clang_14.0.0)
T:refgcc -march=native -mtune=native -O2 -fwrapv -fPIC -fPIE -gdwarf-4 -Wall (11.4.0)
T:refgcc -march=native -mtune=native -O3 -fwrapv -fPIC -fPIE -gdwarf-4 -Wall (11.4.0)

Compiler output


picnic2_simulate_mul.c: picnic2_simulate_mul.c:295:57: error: argument to '__builtin_neon_vshrq_n_v' must be a constant integer
picnic2_simulate_mul.c:         out128[i1 / 2] = mm128_xor(mm128_and(t1, mask), mm128_sr_u64(mm128_and(t2, mask), width));
picnic2_simulate_mul.c:                          ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
picnic2_simulate_mul.c: ./simd.h:109:28: note: expanded from macro 'mm128_sr_u64'
picnic2_simulate_mul.c: #define mm128_sr_u64(x, s) vshrq_n_u64((x), (s))
picnic2_simulate_mul.c:                            ^
picnic2_simulate_mul.c: /usr/lib/llvm-14/lib/clang/14.0.0/include/arm_neon.h:25260:24: note: expanded from macro 'vshrq_n_u64'
picnic2_simulate_mul.c:   __ret = (uint64x2_t) __builtin_neon_vshrq_n_v((int8x16_t)__s0, __p1, 51); \
picnic2_simulate_mul.c:                        ^
picnic2_simulate_mul.c: ./simd.h:103:41: note: expanded from macro 'mm128_xor'
picnic2_simulate_mul.c: #define mm128_xor(l, r) veorq_u64((l), (r))
picnic2_simulate_mul.c:                                         ^
picnic2_simulate_mul.c: picnic2_simulate_mul.c:296:58: error: argument to '__builtin_neon_vshlq_n_v' must be a constant integer
picnic2_simulate_mul.c:         out128[i2 / 2] = mm128_xor(mm128_nand(mask, t2), mm128_sl_u64(mm128_nand(mask, t1), width));
picnic2_simulate_mul.c:                          ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
picnic2_simulate_mul.c: ./simd.h:108:28: note: expanded from macro 'mm128_sl_u64'
picnic2_simulate_mul.c: #define mm128_sl_u64(x, s) vshlq_n_u64((x), (s))
picnic2_simulate_mul.c:                            ^
picnic2_simulate_mul.c: /usr/lib/llvm-14/lib/clang/14.0.0/include/arm_neon.h:24852:24: note: expanded from macro 'vshlq_n_u64'
picnic2_simulate_mul.c:   __ret = (uint64x2_t) __builtin_neon_vshlq_n_v((int8x16_t)__s0, __p1, 51); \
picnic2_simulate_mul.c:                        ^
picnic2_simulate_mul.c: ./simd.h:103:41: note: expanded from macro 'mm128_xor'
picnic2_simulate_mul.c: #define mm128_xor(l, r) veorq_u64((l), (r))
picnic2_simulate_mul.c:                                         ^
picnic2_simulate_mul.c: 2 errors generated.

Number of similar (implementation,compiler) pairs: 1, namely:
ImplementationCompiler
T:optimizedct/neonclang -mcpu=native -O3 -fwrapv -Qunused-arguments -fPIC -fPIE -gdwarf-4 -Wall (Ubuntu_Clang_14.0.0)

Compiler output


picnic_impl.c: picnic_impl.c: In function 'mpc_LowMC_verify':
picnic_impl.c: picnic_impl.c:469:5: warning: 'mpc_matrix_mul' accessing 24 bytes in a region of size 16 [-Wstringop-overflow=]
picnic_impl.c:   469 |     mpc_matrix_mul(roundKey, keyShares, KMatrix(0, params), params, 2);
picnic_impl.c:       |     ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
picnic_impl.c: picnic_impl.c:469:5: note: referencing argument 1 of type 'uint32_t **' {aka 'unsigned int **'}
picnic_impl.c: picnic_impl.c:469:5: warning: 'mpc_matrix_mul' accessing 24 bytes in a region of size 16 [-Wstringop-overflow=]
picnic_impl.c: picnic_impl.c:469:5: note: referencing argument 2 of type 'uint32_t **' {aka 'unsigned int **'}
picnic_impl.c: picnic_impl.c:437:6: note: in a call to function 'mpc_matrix_mul'
picnic_impl.c:   437 | void mpc_matrix_mul(uint32_t* output[3], uint32_t* state[3], const uint32_t* matrix,
picnic_impl.c:       |      ^~~~~~~~~~~~~~
picnic_impl.c: picnic_impl.c:470:5: warning: 'mpc_xor' accessing 24 bytes in a region of size 16 [-Wstringop-overflow=]
picnic_impl.c:   470 |     mpc_xor(state, roundKey, params->stateSizeWords, 2);
picnic_impl.c:       |     ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
picnic_impl.c: picnic_impl.c:470:5: note: referencing argument 1 of type 'uint32_t **' {aka 'unsigned int **'}
picnic_impl.c: picnic_impl.c:470:5: warning: 'mpc_xor' accessing 24 bytes in a region of size 16 [-Wstringop-overflow=]
picnic_impl.c: picnic_impl.c:470:5: note: referencing argument 2 of type 'uint32_t **' {aka 'unsigned int **'}
picnic_impl.c: picnic_impl.c:190:6: note: in a call to function 'mpc_xor'
picnic_impl.c:   190 | void mpc_xor(uint32_t* state[3], uint32_t* in[3], uint32_t len, int players)
picnic_impl.c:       |      ^~~~~~~
picnic_impl.c: picnic_impl.c:473:9: warning: 'mpc_matrix_mul' accessing 24 bytes in a region of size 16 [-Wstringop-overflow=]
picnic_impl.c:   473 |         mpc_matrix_mul(roundKey, keyShares, KMatrix(r, params), params, 2);
picnic_impl.c:       |         ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
picnic_impl.c: picnic_impl.c:473:9: note: referencing argument 1 of type 'uint32_t **' {aka 'unsigned int **'}
picnic_impl.c: picnic_impl.c:473:9: warning: 'mpc_matrix_mul' accessing 24 bytes in a region of size 16 [-Wstringop-overflow=]
picnic_impl.c: picnic_impl.c:473:9: note: referencing argument 2 of type 'uint32_t **' {aka 'unsigned int **'}
picnic_impl.c: ...

Number of similar (implementation,compiler) pairs: 2, namely:
ImplementationCompiler
T:refgcc -march=native -mtune=native -O -fwrapv -fPIC -fPIE -gdwarf-4 -Wall (11.4.0)
T:refgcc -march=native -mtune=native -Os -fwrapv -fPIC -fPIE -gdwarf-4 -Wall (11.4.0)