Implementation notes: armeabi, cubie2, crypto_hash/groestl256

Computer: cubie2
Architecture: armeabi
CPU ID: unknown CPU ID
SUPERCOP version: 20161026
Operation: crypto_hash
Primitive: groestl256
Time     Implementation            Compiler                                              Benchmark date  SUPERCOP version
147731   neon-table                gcc -mcpu=native -mfpu=neon-vfpv4 -O2                 20161215        20161026
147835   neon-table                gcc -mcpu=native -mfpu=neon-vfpv4 -Os                 20161215        20161026
147853   neon-table                gcc -funroll-loops -mcpu=native -mfpu=neon-vfpv4 -Os  20161215        20161026
148873   neon-table                gcc -mcpu=native -mfpu=neon-vfpv4 -O3                 20161215        20161026
149626   neon-table                gcc -funroll-loops -mcpu=native -mfpu=neon-vfpv4 -O3  20161215        20161026
149904   neon-table                gcc -funroll-loops -mcpu=native -mfpu=neon-vfpv4 -O2  20161215        20161026
166794   neon-bitslice             gcc -mcpu=native -mfpu=neon-vfpv4 -O3                 20161215        20161026
166878   neon-bitslice             gcc -funroll-loops -mcpu=native -mfpu=neon-vfpv4 -O3  20161215        20161026
167007   neon-bitslice             gcc -funroll-loops -mcpu=native -mfpu=neon-vfpv4 -O2  20161215        20161026
167328   neon-bitslice             gcc -mcpu=native -mfpu=neon-vfpv4 -O2                 20161215        20161026
167392   neon-bitslice             gcc -mcpu=native -mfpu=neon-vfpv4 -Os                 20161215        20161026
167447   neon-bitslice             gcc -funroll-loops -mcpu=native -mfpu=neon-vfpv4 -Os  20161215        20161026
176889   arm11                     gcc -mcpu=native -mfpu=neon-vfpv4 -O3                 20161215        20161026
176898   arm11                     gcc -funroll-loops -mcpu=native -mfpu=neon-vfpv4 -O3  20161215        20161026
179089   arm11                     gcc -funroll-loops -mcpu=native -mfpu=neon-vfpv4 -O2  20161215        20161026
182109   arm11                     gcc -mcpu=native -mfpu=neon-vfpv4 -O2                 20161215        20161026
182385   arm11                     gcc -funroll-loops -mcpu=native -mfpu=neon-vfpv4 -Os  20161215        20161026
183626   arm11                     gcc -mcpu=native -mfpu=neon-vfpv4 -Os                 20161215        20161026
211124   arm32                     gcc -mcpu=native -mfpu=neon-vfpv4 -O2                 20161215        20161026
213349   arm32                     gcc -funroll-loops -mcpu=native -mfpu=neon-vfpv4 -Os  20161215        20161026
219483   arm32                     gcc -mcpu=native -mfpu=neon-vfpv4 -Os                 20161215        20161026
243684   opt32                     gcc -funroll-loops -mcpu=native -mfpu=neon-vfpv4 -O3  20161215        20161026
244230   opt32                     gcc -mcpu=native -mfpu=neon-vfpv4 -O3                 20161215        20161026
244788   opt32                     gcc -funroll-loops -mcpu=native -mfpu=neon-vfpv4 -O2  20161215        20161026
245637   opt32                     gcc -mcpu=native -mfpu=neon-vfpv4 -O2                 20161215        20161026
317387   sphlib                    gcc -funroll-loops -mcpu=native -mfpu=neon-vfpv4 -Os  20161215        20161026
329677   sphlib                    gcc -mcpu=native -mfpu=neon-vfpv4 -Os                 20161215        20161026
332779   sphlib-adapted            gcc -mcpu=native -mfpu=neon-vfpv4 -Os                 20161215        20161026
332783   sphlib-small              gcc -funroll-loops -mcpu=native -mfpu=neon-vfpv4 -Os  20161215        20161026
335650   sphlib-adapted            gcc -funroll-loops -mcpu=native -mfpu=neon-vfpv4 -Os  20161215        20161026
363960   32bit-2ktable             gcc -funroll-loops -mcpu=native -mfpu=neon-vfpv4 -O3  20161215        20161026
367662   sphlib-adapted            gcc -funroll-loops -mcpu=native -mfpu=neon-vfpv4 -O3  20161215        20161026
367977   32bit-2ktable             gcc -mcpu=native -mfpu=neon-vfpv4 -O3                 20161215        20161026
371357   sphlib-adapted            gcc -mcpu=native -mfpu=neon-vfpv4 -O3                 20161215        20161026
372246   32bit-2ktable             gcc -funroll-loops -mcpu=native -mfpu=neon-vfpv4 -Os  20161215        20161026
375075   32bit-2ktable             gcc -funroll-loops -mcpu=native -mfpu=neon-vfpv4 -O2  20161215        20161026
376053   sphlib-small              gcc -funroll-loops -mcpu=native -mfpu=neon-vfpv4 -O3  20161215        20161026
378606   32bit-2ktable             gcc -mcpu=native -mfpu=neon-vfpv4 -O2                 20161215        20161026
378925   sphlib-small              gcc -funroll-loops -mcpu=native -mfpu=neon-vfpv4 -O2  20161215        20161026
385881   sphlib-small              gcc -mcpu=native -mfpu=neon-vfpv4 -O2                 20161215        20161026
386127   sphlib-small              gcc -mcpu=native -mfpu=neon-vfpv4 -O3                 20161215        20161026
396245   sphlib-adapted            gcc -funroll-loops -mcpu=native -mfpu=neon-vfpv4 -O2  20161215        20161026
397657   sphlib-adapted            gcc -mcpu=native -mfpu=neon-vfpv4 -O2                 20161215        20161026
400246   sphlib-small              gcc -mcpu=native -mfpu=neon-vfpv4 -Os                 20161215        20161026
434821   sphlib                    gcc -funroll-loops -mcpu=native -mfpu=neon-vfpv4 -O3  20161215        20161026
437602   sphlib                    gcc -funroll-loops -mcpu=native -mfpu=neon-vfpv4 -O2  20161215        20161026
438533   32bit-2ktable             gcc -mcpu=native -mfpu=neon-vfpv4 -Os                 20161215        20161026
439042   sphlib                    gcc -mcpu=native -mfpu=neon-vfpv4 -O2                 20161215        20161026
444420   sphlib                    gcc -mcpu=native -mfpu=neon-vfpv4 -O3                 20161215        20161026
448228   opt64                     gcc -funroll-loops -mcpu=native -mfpu=neon-vfpv4 -Os  20161215        20161026
485520   opt32                     gcc -funroll-loops -mcpu=native -mfpu=neon-vfpv4 -Os  20161215        20161026
487164   opt32                     gcc -mcpu=native -mfpu=neon-vfpv4 -Os                 20161215        20161026
514529   32bit-bytesliced-c-fast   gcc -funroll-loops -mcpu=native -mfpu=neon-vfpv4 -O3  20161215        20161026
523545   opt64                     gcc -mcpu=native -mfpu=neon-vfpv4 -Os                 20161215        20161026
569964   32bit-bytesliced-c-fast   gcc -funroll-loops -mcpu=native -mfpu=neon-vfpv4 -O2  20161215        20161026
638873   32bit-bytesliced-c-small  gcc -funroll-loops -mcpu=native -mfpu=neon-vfpv4 -O2  20161215        20161026
640195   32bit-bytesliced-c-fast   gcc -mcpu=native -mfpu=neon-vfpv4 -O3                 20161215        20161026
653079   8bit_c                    gcc -mcpu=native -mfpu=neon-vfpv4 -O3                 20161215        20161026
661542   8bit_c                    gcc -funroll-loops -mcpu=native -mfpu=neon-vfpv4 -O3  20161215        20161026
671395   8bit_c                    gcc -mcpu=native -mfpu=neon-vfpv4 -O2                 20161215        20161026
677453   8bit_c                    gcc -funroll-loops -mcpu=native -mfpu=neon-vfpv4 -O2  20161215        20161026
688705   32bit-bytesliced-c-small  gcc -funroll-loops -mcpu=native -mfpu=neon-vfpv4 -O3  20161215        20161026
692143   32bit-bytesliced-c-small  gcc -mcpu=native -mfpu=neon-vfpv4 -O3                 20161215        20161026
727974   32bit-bytesliced-c-fast   gcc -mcpu=native -mfpu=neon-vfpv4 -O2                 20161215        20161026
742589   8bit_c                    gcc -funroll-loops -mcpu=native -mfpu=neon-vfpv4 -Os  20161215        20161026
753408   32bit-bytesliced-c-fast   gcc -funroll-loops -mcpu=native -mfpu=neon-vfpv4 -Os  20161215        20161026
785470   opt64                     gcc -mcpu=native -mfpu=neon-vfpv4 -O2                 20161215        20161026
796229   8bit_c                    gcc -mcpu=native -mfpu=neon-vfpv4 -Os                 20161215        20161026
826017   32bit-bytesliced-c-fast   gcc -mcpu=native -mfpu=neon-vfpv4 -Os                 20161215        20161026
1002543  opt64                     gcc -funroll-loops -mcpu=native -mfpu=neon-vfpv4 -O3  20161215        20161026
1012701  opt64                     gcc -funroll-loops -mcpu=native -mfpu=neon-vfpv4 -O2  20161215        20161026
1035207  opt64                     gcc -mcpu=native -mfpu=neon-vfpv4 -O3                 20161215        20161026
1229419  32bit-bytesliced-c-small  gcc -funroll-loops -mcpu=native -mfpu=neon-vfpv4 -Os  20161215        20161026
1258267  32bit-bytesliced-c-small  gcc -mcpu=native -mfpu=neon-vfpv4 -Os                 20161215        20161026
1273890  32bit-bytesliced-c-small  gcc -mcpu=native -mfpu=neon-vfpv4 -O2                 20161215        20161026

Checksum failure

Implementation: crypto_hash/groestl256/arm32
Compiler: gcc -funroll-loops -mcpu=native -mfpu=neon-vfpv4 -O2
f079b87636261cf3c9ea6c0c0fa5429569bc7bd103f8d0f0bb23bd4ba5d49053
Number of similar (compiler,implementation) pairs: 3, namely:
Compiler                                              Implementations
gcc -funroll-loops -mcpu=native -mfpu=neon-vfpv4 -O2  arm32
gcc -funroll-loops -mcpu=native -mfpu=neon-vfpv4 -O3  arm32
gcc -mcpu=native -mfpu=neon-vfpv4 -O3                 arm32

Test failure

Implementation: crypto_hash/groestl256/thumb-asm-fast
Compiler: gcc -funroll-loops -mcpu=native -mfpu=neon-vfpv4 -O2
error 142
sh: line 1: 9352 Alarm clock killafter 3600 ./try

Number of similar (compiler,implementation) pairs: 1, namely:
Compiler                                              Implementations
gcc -funroll-loops -mcpu=native -mfpu=neon-vfpv4 -O2  thumb-asm-fast
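
The status 142 reported above is consistent with the common shell convention of 128 plus the signal number: SIGALRM is 14 on Linux, and "Alarm clock" is the usual description of that signal. That suggests the run hit the 3600-second limit passed to killafter rather than failing on its own. The C sketch below reproduces the convention in isolation; the function name shell_status_after_alarm and the 1-second timeout are illustrative assumptions and are not taken from SUPERCOP.

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: fork a child that arms a 1-second alarm and then
   blocks. Returns the shell-style status (128 + signal number) if the child
   is killed by a signal, its exit code if it exits normally, -1 on error. */
int shell_status_after_alarm(void)
{
    pid_t pid = fork();
    if (pid < 0)
        return -1;

    if (pid == 0) {
        /* Child: make sure SIGALRM is unblocked and has its default
           action (terminate the process), then wait for it. */
        sigset_t set;
        sigemptyset(&set);
        sigaddset(&set, SIGALRM);
        sigprocmask(SIG_UNBLOCK, &set, NULL);
        signal(SIGALRM, SIG_DFL);
        alarm(1);
        for (;;)
            pause();
    }

    /* Parent: collect the child's termination status. */
    int status;
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    if (WIFSIGNALED(status))
        return 128 + WTERMSIG(status);
    if (WIFEXITED(status))
        return WEXITSTATUS(status);
    return -1;
}
```

On a system where SIGALRM is 14, the function returns 142, matching the value shown in the report.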

Test failure

Implementation: crypto_hash/groestl256/thumb-asm-small
Compiler: gcc -funroll-loops -mcpu=native -mfpu=neon-vfpv4 -O2
error 142
sh: line 1: 9647 Alarm clock killafter 3600 ./try

Number of similar (compiler,implementation) pairs: 1, namely:
Compiler                                              Implementations
gcc -funroll-loops -mcpu=native -mfpu=neon-vfpv4 -O2  thumb-asm-small

Test failure

Implementation: crypto_hash/groestl256/thumb-asm-fast
Compiler: gcc -funroll-loops -mcpu=native -mfpu=neon-vfpv4 -O3
error 111

Number of similar (compiler,implementation) pairs: 8, namely:
Compiler                                              Implementations
gcc -funroll-loops -mcpu=native -mfpu=neon-vfpv4 -O3  thumb-asm-fast thumb-asm-small
gcc -funroll-loops -mcpu=native -mfpu=neon-vfpv4 -Os  thumb-asm-small
gcc -mcpu=native -mfpu=neon-vfpv4 -O2                 thumb-asm-fast thumb-asm-small
gcc -mcpu=native -mfpu=neon-vfpv4 -O3                 thumb-asm-fast thumb-asm-small
gcc -mcpu=native -mfpu=neon-vfpv4 -Os                 thumb-asm-small

Test failure

Implementation: crypto_hash/groestl256/thumb-asm-fast
Compiler: gcc -funroll-loops -mcpu=native -mfpu=neon-vfpv4 -Os
error 142
sh: line 1: 9390 Alarm clock killafter 3600 ./try

Number of similar (compiler,implementation) pairs: 1, namely:
Compiler                                              Implementations
gcc -funroll-loops -mcpu=native -mfpu=neon-vfpv4 -Os  thumb-asm-fast

Test failure

Implementation: crypto_hash/groestl256/thumb-asm-fast
Compiler: gcc -mcpu=native -mfpu=neon-vfpv4 -Os
error 142
sh: line 1: 9267 Alarm clock killafter 3600 ./try

Number of similar (compiler,implementation) pairs: 1, namely:
Compiler                                              Implementations
gcc -mcpu=native -mfpu=neon-vfpv4 -Os                 thumb-asm-fast

Compiler output

Implementation: crypto_hash/groestl256/vperm-intr
Compiler: gcc -funroll-loops -mcpu=native -mfpu=neon-vfpv4 -O2
hash.c: In file included from hash.c:34:0:
hash.c: groestl-intr-vperm.h:13:23: fatal error: tmmintrin.h: No such file or directory
hash.c: #include <tmmintrin.h>
hash.c: ^
hash.c: compilation terminated.

Number of similar (compiler,implementation) pairs: 6, namely:
Compiler                                              Implementations
gcc -funroll-loops -mcpu=native -mfpu=neon-vfpv4 -O2  vperm-intr
gcc -funroll-loops -mcpu=native -mfpu=neon-vfpv4 -O3  vperm-intr
gcc -funroll-loops -mcpu=native -mfpu=neon-vfpv4 -Os  vperm-intr
gcc -mcpu=native -mfpu=neon-vfpv4 -O2                 vperm-intr
gcc -mcpu=native -mfpu=neon-vfpv4 -O3                 vperm-intr
gcc -mcpu=native -mfpu=neon-vfpv4 -Os                 vperm-intr
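
tmmintrin.h is the header for Intel SSSE3 intrinsics, so it is not available when targeting ARM, and this failure is expected on armeabi. As an illustration only (the vperm-intr sources are not shown here), portable code usually guards such includes with predefined compiler macros. The macro SIMD_BACKEND and the function simd_backend below are hypothetical names.

```c
/* Pick a SIMD header from the compiler's predefined target macros, so an
   x86-only header is never included when building for ARM. */
#if defined(__SSSE3__)
#include <tmmintrin.h>          /* x86 SSSE3 intrinsics */
#define SIMD_BACKEND "ssse3"
#elif defined(__ARM_NEON) || defined(__ARM_NEON__)
#include <arm_neon.h>           /* ARM NEON intrinsics */
#define SIMD_BACKEND "neon"
#else
#define SIMD_BACKEND "portable" /* no SIMD header selected */
#endif

/* Hypothetical helper: report which branch was chosen at compile time. */
const char *simd_backend(void)
{
    return SIMD_BACKEND;
}
```

The result depends on the target and flags used to compile the file, so the sketch only shows the selection mechanism.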

Compiler output

Implementation: crypto_hash/groestl256/neon-bitslice
Compiler: gcc -funroll-loops -mcpu=native -mfpu=neon-vfpv4 -O2
hash.c: hash.c: In function 'crypto_hash_groestl256_neon_bitslice':
hash.c: hash.c:40:12: warning: iteration 64 invokes undefined behavior [-Waggressive-loop-optimizations]
hash.c: ctx[i] = 0;
hash.c: ~~~~~~~^~~
hash.c: hash.c:39:3: note: within this loop
hash.c: for(i=0;i
hash.c: ^~~

Number of similar (compiler,implementation) pairs: 6, namely:
Compiler                                              Implementations
gcc -funroll-loops -mcpu=native -mfpu=neon-vfpv4 -O2  neon-bitslice
gcc -funroll-loops -mcpu=native -mfpu=neon-vfpv4 -O3  neon-bitslice
gcc -funroll-loops -mcpu=native -mfpu=neon-vfpv4 -Os  neon-bitslice
gcc -mcpu=native -mfpu=neon-vfpv4 -O2                 neon-bitslice
gcc -mcpu=native -mfpu=neon-vfpv4 -O3                 neon-bitslice
gcc -mcpu=native -mfpu=neon-vfpv4 -Os                 neon-bitslice
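
GCC emits -Waggressive-loop-optimizations when it can prove that a loop reaches an iteration with undefined behavior, which for a store like ctx[i] = 0 typically means the index runs past the end of ctx. The loop bound is cut off in the log above, so the actual cause in hash.c is not visible here. One defensive pattern, sketched below under the assumption of a 64-byte buffer (CTX_BYTES, ctx_t and ctx_clear are hypothetical names), is to derive the bound from the array itself.

```c
#include <stddef.h>

#define CTX_BYTES 64 /* assumed size, for illustration only */

typedef struct {
    unsigned char bytes[CTX_BYTES];
} ctx_t;

/* Zero the buffer with a bound computed from the array, so the last index
   written is always one less than the element count. Returns the number of
   elements cleared. */
size_t ctx_clear(ctx_t *c)
{
    const size_t n = sizeof c->bytes / sizeof c->bytes[0];
    size_t i;

    for (i = 0; i < n; i++)
        c->bytes[i] = 0;
    return i;
}
```

Because the bound and the declaration come from the same place, changing the buffer size cannot leave the loop writing past the end.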

Compiler output

Implementation: crypto_hash/groestl256/neon-vperm
Compiler: gcc -funroll-loops -mcpu=native -mfpu=neon-vfpv4 -O2
hash.c: hash.c: In function 'crypto_hash_groestl256_neon_vperm':
hash.c: hash.c:38:12: warning: iteration 64 invokes undefined behavior [-Waggressive-loop-optimizations]
hash.c: ctx[i] = 0;
hash.c: ~~~~~~~^~~
hash.c: hash.c:37:3: note: within this loop
hash.c: for(i=0;i
hash.c: ^~~
vperm-neon.S: vperm-neon.S: Assembler messages:
vperm-neon.S: vperm-neon.S:911: Error: expected symbol name
vperm-neon.S: vperm-neon.S:922: Error: expected symbol name

Number of similar (compiler,implementation) pairs: 6, namely:
Compiler                                              Implementations
gcc -funroll-loops -mcpu=native -mfpu=neon-vfpv4 -O2  neon-vperm
gcc -funroll-loops -mcpu=native -mfpu=neon-vfpv4 -O3  neon-vperm
gcc -funroll-loops -mcpu=native -mfpu=neon-vfpv4 -Os  neon-vperm
gcc -mcpu=native -mfpu=neon-vfpv4 -O2                 neon-vperm
gcc -mcpu=native -mfpu=neon-vfpv4 -O3                 neon-vperm
gcc -mcpu=native -mfpu=neon-vfpv4 -Os                 neon-vperm

Compiler output

Implementation: crypto_hash/groestl256/opt64
Compiler: gcc -funroll-loops -mcpu=native -mfpu=neon-vfpv4 -O2
hash.c: hash.c:194:14: warning: 'inP' is static but declared in inline function 'F1024' which is not static
hash.c: static u64 inP[COLS1024] __attribute__((aligned(16)));
hash.c: ^~~
hash.c: hash.c:193:14: warning: 'outQ' is static but declared in inline function 'F1024' which is not static
hash.c: static u64 outQ[COLS1024] __attribute__((aligned(16)));
hash.c: ^~~~
hash.c: hash.c:192:14: warning: 'z' is static but declared in inline function 'F1024' which is not static
hash.c: static u64 z[COLS1024] __attribute__((aligned(16)));
hash.c: ^
hash.c: hash.c:191:14: warning: 'y' is static but declared in inline function 'F1024' which is not static
hash.c: static u64 y[COLS1024] __attribute__((aligned(16)));
hash.c: ^

Number of similar (compiler,implementation) pairs: 6, namely:
Compiler                                              Implementations
gcc -funroll-loops -mcpu=native -mfpu=neon-vfpv4 -O2  opt64
gcc -funroll-loops -mcpu=native -mfpu=neon-vfpv4 -O3  opt64
gcc -funroll-loops -mcpu=native -mfpu=neon-vfpv4 -Os  opt64
gcc -mcpu=native -mfpu=neon-vfpv4 -O2                 opt64
gcc -mcpu=native -mfpu=neon-vfpv4 -O3                 opt64
gcc -mcpu=native -mfpu=neon-vfpv4 -Os                 opt64
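
These warnings arise because F1024 is declared inline without static, and C99 does not allow an inline definition of a function with external linkage to define a modifiable object with static storage duration. A common remedy is to give the function internal linkage with static inline. The sketch below shows the pattern with a hypothetical function xor_words and an illustrative buffer size; whether the same change suits opt64 depends on how F1024 is used in its sources, which are not shown here.

```c
#include <stdint.h>

typedef uint64_t u64;

#define SCRATCH_WORDS 8 /* illustrative size, not taken from opt64 */

/* With `static inline` the function has internal linkage, so a static local
   buffer is permitted and GCC does not emit the warning. The static buffer
   still makes the function non-reentrant. */
static inline u64 xor_words(const u64 *in, int n)
{
    static u64 scratch[SCRATCH_WORDS] __attribute__((aligned(16)));
    u64 acc = 0;
    int i;

    if (n > SCRATCH_WORDS)
        n = SCRATCH_WORDS; /* clamp to the buffer size */
    for (i = 0; i < n; i++)
        scratch[i] = in[i]; /* copy input into the aligned scratch area */
    for (i = 0; i < n; i++)
        acc ^= scratch[i];  /* fold the copied words together */
    return acc;
}
```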