Implementation notes: armeabi, cubox, crypto_stream/salsa2012

Computer: cubox
Architecture: armeabi
CPU ID: unknown CPU ID
SUPERCOP version: 20161026
Operation: crypto_stream
Primitive: salsa2012
Time   Implementation  Compiler                                  Benchmark date  SUPERCOP version
18418  e/merged        gcc -funroll-loops -mcpu=marvell-pj4 -O2  20161210        20161026
18529  e/merged        gcc -funroll-loops -mcpu=marvell-pj4 -O3  20161210        20161026
18584  e/merged        gcc -mcpu=marvell-pj4 -O3                 20161210        20161026
18624  e/merged        gcc -mcpu=marvell-pj4 -O2                 20161210        20161026
21175  ref             gcc -funroll-loops -mcpu=marvell-pj4 -O3  20161210        20161026
21176  ref             gcc -mcpu=marvell-pj4 -O3                 20161210        20161026
21267  e/ref           gcc -funroll-loops -mcpu=marvell-pj4 -O3  20161210        20161026
21352  e/regs          gcc -funroll-loops -mcpu=marvell-pj4 -O3  20161210        20161026
23684  e/ref           gcc -mcpu=marvell-pj4 -O3                 20161210        20161026
23696  e/regs          gcc -mcpu=marvell-pj4 -O3                 20161210        20161026
23729  e/ref           gcc -funroll-loops -mcpu=marvell-pj4 -O2  20161210        20161026
23948  e/merged        gcc -funroll-loops -mcpu=marvell-pj4 -Os  20161210        20161026
24751  e/regs          gcc -funroll-loops -mcpu=marvell-pj4 -O2  20161210        20161026
25484  e/merged        gcc -mcpu=marvell-pj4 -Os                 20161210        20161026
27105  ref             gcc -funroll-loops -mcpu=marvell-pj4 -O2  20161210        20161026
29360  e/regs          gcc -mcpu=marvell-pj4 -O2                 20161210        20161026
30480  ref             gcc -mcpu=marvell-pj4 -O2                 20161210        20161026
31339  e/ref           gcc -mcpu=marvell-pj4 -O2                 20161210        20161026
32380  ref             gcc -funroll-loops -mcpu=marvell-pj4 -Os  20161210        20161026
32456  ref             gcc -mcpu=marvell-pj4 -Os                 20161210        20161026
35272  e/regs          gcc -mcpu=marvell-pj4 -Os                 20161210        20161026
35376  e/regs          gcc -funroll-loops -mcpu=marvell-pj4 -Os  20161210        20161026
38124  e/ref           gcc -funroll-loops -mcpu=marvell-pj4 -Os  20161210        20161026
38144  e/ref           gcc -mcpu=marvell-pj4 -Os                 20161210        20161026

Test failure

Implementation: crypto_stream/salsa2012/armneon3
Compiler: gcc -funroll-loops -mcpu=marvell-pj4 -O2
error 111

Number of similar (compiler,implementation) pairs: 6, namely:
Compiler                                  Implementations
gcc -funroll-loops -mcpu=marvell-pj4 -O2  armneon3
gcc -funroll-loops -mcpu=marvell-pj4 -O3  armneon3
gcc -funroll-loops -mcpu=marvell-pj4 -Os  armneon3
gcc -mcpu=marvell-pj4 -O2                 armneon3
gcc -mcpu=marvell-pj4 -O3                 armneon3
gcc -mcpu=marvell-pj4 -Os                 armneon3

Compiler output

Implementation: crypto_stream/salsa2012/armneon2
Compiler: gcc -funroll-loops -mcpu=marvell-pj4 -O2
xor.c: In file included from xor.c:9:0:
xor.c: xor.c: In function 'crypto_stream_salsa2012_armneon2_xor':
xor.c: /usr/lib/gcc/armv7l-unknown-linux-gnueabihf/6.2.1/include/arm_neon.h:6187:1: error: inlining failed in call to always_inline 'vcombine_u32': target specific option mismatch
xor.c: vcombine_u32 (uint32x2_t __a, uint32x2_t __b)
xor.c: ^~~~~~~~~~~~
xor.c: xor.c:40:14: note: called from here
xor.c: uint32x4_t start1 = vcombine_u32(k5k0,n0k4);
xor.c: ^~~~~~
xor.c: In file included from xor.c:9:0:
xor.c: /usr/lib/gcc/armv7l-unknown-linux-gnueabihf/6.2.1/include/arm_neon.h:7553:1: error: inlining failed in call to always_inline 'vext_u32': target specific option mismatch
xor.c: ...
xor.c: xor.c:354:3: note: called from here
xor.c: vst1q_u8((uint8_t *) c,(uint8x16_t) x0x1x2x3);
xor.c: ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
xor.c: In file included from xor.c:9:0:
xor.c: /usr/lib/gcc/armv7l-unknown-linux-gnueabihf/6.2.1/include/arm_neon.h:565:1: error: inlining failed in call to always_inline 'vadd_u64': target specific option mismatch
xor.c: vadd_u64 (uint64x1_t __a, uint64x1_t __b)
xor.c: ^~~~~~~~
xor.c: xor.c:364:23: note: called from here
xor.c: n2n3 = (uint32x2_t) vadd_u64(nextblock,(uint64x1_t) n2n3);
xor.c: ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Number of similar (compiler,implementation) pairs: 6, namely:
Compiler                                  Implementations
gcc -funroll-loops -mcpu=marvell-pj4 -O2  armneon2
gcc -funroll-loops -mcpu=marvell-pj4 -O3  armneon2
gcc -funroll-loops -mcpu=marvell-pj4 -Os  armneon2
gcc -mcpu=marvell-pj4 -O2                 armneon2
gcc -mcpu=marvell-pj4 -O3                 armneon2
gcc -mcpu=marvell-pj4 -Os                 armneon2
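
Note on the "target specific option mismatch" errors: GCC typically reports this when intrinsics from arm_neon.h are called in a translation unit compiled without NEON enabled, and none of the compiler lines logged here carries a NEON option such as -mfpu=neon. The sketch below is not taken from the benchmarked sources. It only illustrates, under that assumption, how NEON code can be guarded at compile time with a scalar fallback. The function name xor_bytes and the macro HAVE_NEON are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* __ARM_NEON is predefined by the compiler only when NEON code
 * generation is enabled for this translation unit. */
#if defined(__ARM_NEON) || defined(__ARM_NEON__)
#include <arm_neon.h>
#define HAVE_NEON 1   /* hypothetical macro name */
#else
#define HAVE_NEON 0
#endif

/* Hypothetical helper: c[i] = m[i] ^ ks[i] for i in [0, n). */
void xor_bytes(uint8_t *c, const uint8_t *m, const uint8_t *ks, size_t n)
{
    size_t i = 0;

    assert(n == 0 || (c != NULL && m != NULL && ks != NULL));

#if HAVE_NEON
    /* Vector path: 16 bytes per iteration. */
    for (; i + 16 <= n; i += 16) {
        uint8x16_t a = vld1q_u8(m + i);
        uint8x16_t b = vld1q_u8(ks + i);
        vst1q_u8(c + i, veorq_u8(a, b));
    }
#endif

    /* Scalar path: the remaining tail, or everything when NEON is off. */
    for (; i < n; i++)
        c[i] = (uint8_t)(m[i] ^ ks[i]);
}
```

With a guard of this kind, a build without NEON compiles the scalar loop alone instead of failing inside arm_neon.h. Both paths produce the same output bytes.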

Compiler output

Implementation: crypto_stream/salsa2012/armneon
Compiler: gcc -funroll-loops -mcpu=marvell-pj4 -O2
xor.c: In file included from xor.c:9:0:
xor.c: xor.c: In function 'crypto_stream_salsa2012_armneon_xor':
xor.c: /usr/lib/gcc/armv7l-unknown-linux-gnueabihf/6.2.1/include/arm_neon.h:6187:1: error: inlining failed in call to always_inline 'vcombine_u32': target specific option mismatch
xor.c: vcombine_u32 (uint32x2_t __a, uint32x2_t __b)
xor.c: ^~~~~~~~~~~~
xor.c: xor.c:40:14: note: called from here
xor.c: uint32x4_t start1 = vcombine_u32(k5k0,n0k4);
xor.c: ^~~~~~
xor.c: In file included from xor.c:9:0:
xor.c: /usr/lib/gcc/armv7l-unknown-linux-gnueabihf/6.2.1/include/arm_neon.h:7553:1: error: inlining failed in call to always_inline 'vext_u32': target specific option mismatch
xor.c: ...
xor.c: xor.c:166:3: note: called from here
xor.c: vst1q_u8((uint8_t *) c,(uint8x16_t) x0x1x2x3);
xor.c: ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
xor.c: In file included from xor.c:9:0:
xor.c: /usr/lib/gcc/armv7l-unknown-linux-gnueabihf/6.2.1/include/arm_neon.h:565:1: error: inlining failed in call to always_inline 'vadd_u64': target specific option mismatch
xor.c: vadd_u64 (uint64x1_t __a, uint64x1_t __b)
xor.c: ^~~~~~~~
xor.c: xor.c:176:23: note: called from here
xor.c: n2n3 = (uint32x2_t) vadd_u64(nextblock,(uint64x1_t) n2n3);
xor.c: ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Number of similar (compiler,implementation) pairs: 6, namely:
Compiler                                  Implementations
gcc -funroll-loops -mcpu=marvell-pj4 -O2  armneon
gcc -funroll-loops -mcpu=marvell-pj4 -O3  armneon
gcc -funroll-loops -mcpu=marvell-pj4 -Os  armneon
gcc -mcpu=marvell-pj4 -O2                 armneon
gcc -mcpu=marvell-pj4 -O3                 armneon
gcc -mcpu=marvell-pj4 -Os                 armneon