Implementation notes: amd64, par, crypto_hash/groestl512

Computer: par
Architecture: amd64
CPU ID: GenuineIntel-000406c3-bfebfbff
SUPERCOP version: 20161026
Operation: crypto_hash
Primitive: groestl512
Time     Implementation            Compiler                                           Benchmark date  SUPERCOP version
82360    aesni                     gcc -march=native -mcpu=native -O3                 20161214        20161026
82520    aesni                     gcc -funroll-loops -march=native -mcpu=native -O3  20161214        20161026
82640    aesni                     gcc -funroll-loops -march=native -mcpu=native -O2  20161214        20161026
83220    aesni                     gcc -march=native -mcpu=native -O2                 20161214        20161026
84320    aesni                     gcc -funroll-loops -march=native -mcpu=native -Os  20161214        20161026
84320    aesni                     gcc -march=native -mcpu=native -Os                 20161214        20161026
87540    aesni-intr                gcc -funroll-loops -march=native -mcpu=native -O3  20161214        20161026
87780    aesni-intr                gcc -funroll-loops -march=native -mcpu=native -O2  20161214        20161026
87780    aesni-intr                gcc -march=native -mcpu=native -Os                 20161214        20161026
90100    aesni-intr                gcc -funroll-loops -march=native -mcpu=native -Os  20161214        20161026
92000    opteron                   gcc -funroll-loops -march=native -mcpu=native -O2  20161214        20161026
92540    opteron                   gcc -march=native -mcpu=native -O3                 20161214        20161026
92820    opteron                   gcc -funroll-loops -march=native -mcpu=native -O3  20161214        20161026
93700    opteron                   gcc -march=native -mcpu=native -O2                 20161214        20161026
94920    aesni-intr                gcc -march=native -mcpu=native -O3                 20161214        20161026
95900    opteron                   gcc -march=native -mcpu=native -Os                 20161214        20161026
95980    aesni-intr                gcc -march=native -mcpu=native -O2                 20161214        20161026
96440    opteron                   gcc -funroll-loops -march=native -mcpu=native -Os  20161214        20161026
122620   sphlib                    gcc -funroll-loops -march=native -mcpu=native -Os  20161214        20161026
123220   sphlib-adapted            gcc -funroll-loops -march=native -mcpu=native -Os  20161214        20161026
124440   sphlib                    gcc -funroll-loops -march=native -mcpu=native -O2  20161214        20161026
124880   sphlib-adapted            gcc -funroll-loops -march=native -mcpu=native -O3  20161214        20161026
126140   sphlib                    gcc -funroll-loops -march=native -mcpu=native -O3  20161214        20161026
126780   sphlib                    gcc -march=native -mcpu=native -O3                 20161214        20161026
129800   sphlib-adapted            gcc -march=native -mcpu=native -O3                 20161214        20161026
130860   opt64                     gcc -funroll-loops -march=native -mcpu=native -Os  20161214        20161026
132820   sphlib                    gcc -march=native -mcpu=native -O2                 20161214        20161026
134620   core2duo                  gcc -funroll-loops -march=native -mcpu=native -O2  20161214        20161026
134640   core2duo                  gcc -funroll-loops -march=native -mcpu=native -O3  20161214        20161026
134820   core2duo                  gcc -march=native -mcpu=native -O3                 20161214        20161026
136180   core2duo                  gcc -march=native -mcpu=native -O2                 20161214        20161026
136780   sphlib-adapted            gcc -funroll-loops -march=native -mcpu=native -O2  20161214        20161026
136840   core2duo                  gcc -march=native -mcpu=native -Os                 20161214        20161026
137140   core2duo                  gcc -funroll-loops -march=native -mcpu=native -Os  20161214        20161026
139260   opt64                     gcc -funroll-loops -march=native -mcpu=native -O3  20161214        20161026
139440   sphlib-adapted            gcc -march=native -mcpu=native -O2                 20161214        20161026
139620   opt64                     gcc -funroll-loops -march=native -mcpu=native -O2  20161214        20161026
140280   opt64                     gcc -march=native -mcpu=native -O2                 20161214        20161026
143200   opt64                     gcc -march=native -mcpu=native -O3                 20161214        20161026
160560   sphlib-adapted            gcc -march=native -mcpu=native -Os                 20161214        20161026
164440   sphlib                    gcc -march=native -mcpu=native -Os                 20161214        20161026
166860   opt64                     gcc -march=native -mcpu=native -Os                 20161214        20161026
193580   sphlib-small              gcc -funroll-loops -march=native -mcpu=native -O3  20161214        20161026
200040   sphlib-small              gcc -march=native -mcpu=native -O3                 20161214        20161026
200600   sphlib-small              gcc -funroll-loops -march=native -mcpu=native -O2  20161214        20161026
202680   sphlib-small              gcc -march=native -mcpu=native -O2                 20161214        20161026
221840   sphlib-small              gcc -funroll-loops -march=native -mcpu=native -Os  20161214        20161026
228400   sphlib-small              gcc -march=native -mcpu=native -Os                 20161214        20161026
230520   opt32                     gcc -funroll-loops -march=native -mcpu=native -O2  20161214        20161026
237460   opt32                     gcc -funroll-loops -march=native -mcpu=native -Os  20161214        20161026
237680   opt32                     gcc -funroll-loops -march=native -mcpu=native -O3  20161214        20161026
291600   mmx                       gcc -march=native -mcpu=native -O3                 20161214        20161026
292100   mmx                       gcc -funroll-loops -march=native -mcpu=native -O3  20161214        20161026
301120   mmx                       gcc -march=native -mcpu=native -O2                 20161214        20161026
302180   mmx                       gcc -funroll-loops -march=native -mcpu=native -O2  20161214        20161026
304480   mmx                       gcc -funroll-loops -march=native -mcpu=native -Os  20161214        20161026
312000   mmx                       gcc -march=native -mcpu=native -Os                 20161214        20161026
332100   vperm-intr                gcc -march=native -mcpu=native -Os                 20161214        20161026
337380   vperm                     gcc -march=native -mcpu=native -O3                 20161214        20161026
337400   vperm                     gcc -funroll-loops -march=native -mcpu=native -O3  20161214        20161026
337980   vperm                     gcc -funroll-loops -march=native -mcpu=native -O2  20161214        20161026
338460   vperm                     gcc -march=native -mcpu=native -O2                 20161214        20161026
339340   vperm                     gcc -funroll-loops -march=native -mcpu=native -Os  20161214        20161026
339540   vperm                     gcc -march=native -mcpu=native -Os                 20161214        20161026
342540   vperm-intr                gcc -funroll-loops -march=native -mcpu=native -O3  20161214        20161026
342820   vperm-intr                gcc -funroll-loops -march=native -mcpu=native -O2  20161214        20161026
355560   vperm-intr                gcc -march=native -mcpu=native -O3                 20161214        20161026
356800   vperm-intr                gcc -march=native -mcpu=native -O2                 20161214        20161026
372660   vperm-intr                gcc -funroll-loops -march=native -mcpu=native -Os  20161214        20161026
396180   opt32                     gcc -march=native -mcpu=native -O3                 20161214        20161026
400120   opt32                     gcc -march=native -mcpu=native -O2                 20161214        20161026
419880   opt32                     gcc -march=native -mcpu=native -Os                 20161214        20161026
539680   32bit-bytesliced-c-small  gcc -funroll-loops -march=native -mcpu=native -O3  20161214        20161026
542300   32bit-bytesliced-c-small  gcc -march=native -mcpu=native -O3                 20161214        20161026
593380   32bit-bytesliced-c-small  gcc -funroll-loops -march=native -mcpu=native -O2  20161214        20161026
1054940  32bit-bytesliced-c-small  gcc -march=native -mcpu=native -O2                 20161214        20161026
1145040  32bit-bytesliced-c-small  gcc -march=native -mcpu=native -Os                 20161214        20161026
1156140  32bit-bytesliced-c-small  gcc -funroll-loops -march=native -mcpu=native -Os  20161214        20161026

Test failure

Implementation: crypto_hash/groestl512/avx
Compiler: gcc -funroll-loops -march=native -mcpu=native -O2
error 111

Number of similar (compiler,implementation) pairs: 6, namely:
Compiler                                           Implementations
gcc -funroll-loops -march=native -mcpu=native -O2  avx
gcc -funroll-loops -march=native -mcpu=native -O3  avx
gcc -funroll-loops -march=native -mcpu=native -Os  avx
gcc -march=native -mcpu=native -O2                 avx
gcc -march=native -mcpu=native -O3                 avx
gcc -march=native -mcpu=native -Os                 avx

Compiler output

Implementation: crypto_hash/groestl512/sphlib
Compiler: gcc -funroll-loops -march=native -mcpu=native -O2
groestl.c: gcc: warning: '-mcpu=' is deprecated; use '-mtune=' or '-march=' instead
hash.c: gcc: warning: '-mcpu=' is deprecated; use '-mtune=' or '-march=' instead
try.c: gcc: warning: '-mcpu=' is deprecated; use '-mtune=' or '-march=' instead
try.c: gcc: warning: '-mcpu=' is deprecated; use '-mtune=' or '-march=' instead
measure.c: gcc: warning: '-mcpu=' is deprecated; use '-mtune=' or '-march=' instead

Number of similar (compiler,implementation) pairs: 18, namely:
Compiler                                           Implementations
gcc -funroll-loops -march=native -mcpu=native -O2  sphlib sphlib-adapted sphlib-small
gcc -funroll-loops -march=native -mcpu=native -O3  sphlib sphlib-adapted sphlib-small
gcc -funroll-loops -march=native -mcpu=native -Os  sphlib sphlib-adapted sphlib-small
gcc -march=native -mcpu=native -O2                 sphlib sphlib-adapted sphlib-small
gcc -march=native -mcpu=native -O3                 sphlib sphlib-adapted sphlib-small
gcc -march=native -mcpu=native -Os                 sphlib sphlib-adapted sphlib-small

Compiler output

Implementation: crypto_hash/groestl512/avx-intr
Compiler: gcc -funroll-loops -march=native -mcpu=native -O2
hash.c: gcc: warning: '-mcpu=' is deprecated; use '-mtune=' or '-march=' instead
hash.c: In file included from hash.c:31:0:
hash.c: groestl-intr-avx.h: In function 'TF1024':
hash.c: groestl-intr-avx.h:906:8: warning: AVX vector return without AVX enabled changes the ABI [-Wpsabi]
hash.c: ymm8 = insert_m128i_in_m256d(ymm8, xmm8, 0);
hash.c: ^
hash.c: In file included from /usr/lib/gcc/x86_64-pc-linux-gnu/6.2.1/include/immintrin.h:41:0,
hash.c: from groestl-intr-avx.h:12,
hash.c: from hash.c:31:
hash.c: /usr/lib/gcc/x86_64-pc-linux-gnu/6.2.1/include/avxintrin.h:1416:1: error: inlining failed in call to always_inline '_mm256_castsi256_pd': target specific option mismatch
hash.c: _mm256_castsi256_pd (__m256i __A)
hash.c: ^~~~~~~~~~~~~~~~~~~
hash.c: In file included from hash.c:31:0:
hash.c: groestl-intr-avx.h:33:47: note: called from here
hash.c: #define insert_m128i_in_m256d(ymm, xmm, pos) (_mm256_castsi256_pd(_mm256_insertf128_si256(_mm256_castpd_si256(ymm), xmm, pos)))
hash.c: ~^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
hash.c: groestl-intr-avx.h:922:11: note: in expansion of macro 'insert_m128i_in_m256d'
hash.c: ymm15 = insert_m128i_in_m256d(ymm15, xmm7, 1);
hash.c: ^~~~~~~~~~~~~~~~~~~~~
hash.c: In file included from /usr/lib/gcc/x86_64-pc-linux-gnu/6.2.1/include/immintrin.h:41:0,
hash.c: from groestl-intr-avx.h:12,
hash.c: from hash.c:31:
hash.c: /usr/lib/gcc/x86_64-pc-linux-gnu/6.2.1/include/avxintrin.h:742:1: error: inlining failed in call to always_inline '_mm256_insertf128_si256': target specific option mismatch
hash.c: _mm256_insertf128_si256 (__m256i __X, __m128i __Y, const int __O)
hash.c: ^~~~~~~~~~~~~~~~~~~~~~~
hash.c: ...

Number of similar (compiler,implementation) pairs: 6, namely:
Compiler                                           Implementations
gcc -funroll-loops -march=native -mcpu=native -O2  avx-intr
gcc -funroll-loops -march=native -mcpu=native -O3  avx-intr
gcc -funroll-loops -march=native -mcpu=native -Os  avx-intr
gcc -march=native -mcpu=native -O2                 avx-intr
gcc -march=native -mcpu=native -O3                 avx-intr
gcc -march=native -mcpu=native -Os                 avx-intr
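
Note on the avx-intr errors: "inlining failed in call to always_inline ... target specific option mismatch", together with the -Wpsabi warning, means the AVX intrinsics were used in code that gcc compiled without AVX enabled. The log does not show what -march=native expanded to on this machine. The following is a minimal sketch of one way such code can be made to compile and run safely, not the groestl-intr-avx.h code: AVX is enabled for a single function with the GCC target attribute, and that function is called only after a run-time CPU check. The names insert_high_lane and avx_insert_demo are hypothetical, and the sketch assumes GCC or a compatible compiler.

```c
#include <assert.h>
#include <string.h>

#if defined(__GNUC__) && (defined(__x86_64__) || defined(__i386__))
#include <immintrin.h>
#define DEMO_HAVE_X86 1

/* Hypothetical helper. The target attribute enables AVX for this function
 * only, so the always_inline AVX intrinsics can be inlined here even when
 * the rest of the file is compiled without -mavx. */
__attribute__((target("avx")))
static void insert_high_lane(const int lo[4], const int hi[4], int out[8])
{
    __m128i xlo = _mm_loadu_si128((const __m128i *)lo);
    __m128i xhi = _mm_loadu_si128((const __m128i *)hi);
    __m256i y = _mm256_castsi128_si256(xlo);   /* low 128 bits = lo */
    y = _mm256_insertf128_si256(y, xhi, 1);    /* high 128 bits = hi */
    _mm256_storeu_si256((__m256i *)out, y);
}
#endif

/* Returns 1 if the AVX path ran, 0 if the portable fallback ran.
 * Either way, out holds lo followed by hi. */
int avx_insert_demo(const int lo[4], const int hi[4], int out[8])
{
    assert(lo && hi && out);
#ifdef DEMO_HAVE_X86
    __builtin_cpu_init();
    if (__builtin_cpu_supports("avx")) {
        insert_high_lane(lo, hi, out);
        return 1;
    }
#endif
    memcpy(out, lo, 4 * sizeof(int));
    memcpy(out + 4, hi, 4 * sizeof(int));
    return 0;
}
```

The run-time check matters because enabling AVX at compile time says nothing about the CPU that executes the code; a benchmark framework that builds every implementation on every machine would otherwise fail at run time instead of at compile time.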

Compiler output

Implementation: crypto_hash/groestl512/opt64
Compiler: gcc -funroll-loops -march=native -mcpu=native -O2
hash.c: gcc: warning: '-mcpu=' is deprecated; use '-mtune=' or '-march=' instead
hash.c: hash.c:194:14: warning: 'inP' is static but declared in inline function 'F1024' which is not static
hash.c: static u64 inP[COLS1024] __attribute__((aligned(16)));
hash.c: ^~~
hash.c: hash.c:193:14: warning: 'outQ' is static but declared in inline function 'F1024' which is not static
hash.c: static u64 outQ[COLS1024] __attribute__((aligned(16)));
hash.c: ^~~~
hash.c: hash.c:192:14: warning: 'z' is static but declared in inline function 'F1024' which is not static
hash.c: static u64 z[COLS1024] __attribute__((aligned(16)));
hash.c: ^
hash.c: hash.c:191:14: warning: 'y' is static but declared in inline function 'F1024' which is not static
hash.c: static u64 y[COLS1024] __attribute__((aligned(16)));
hash.c: ^
try.c: gcc: warning: '-mcpu=' is deprecated; use '-mtune=' or '-march=' instead
try.c: gcc: warning: '-mcpu=' is deprecated; use '-mtune=' or '-march=' instead
measure.c: gcc: warning: '-mcpu=' is deprecated; use '-mtune=' or '-march=' instead

Number of similar (compiler,implementation) pairs: 6, namely:
Compiler                                           Implementations
gcc -funroll-loops -march=native -mcpu=native -O2  opt64
gcc -funroll-loops -march=native -mcpu=native -O3  opt64
gcc -funroll-loops -march=native -mcpu=native -Os  opt64
gcc -march=native -mcpu=native -O2                 opt64
gcc -march=native -mcpu=native -O3                 opt64
gcc -march=native -mcpu=native -Os                 opt64
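
Note on the opt64 warnings: gcc reports "is static but declared in inline function 'F1024' which is not static" because, under C99 rules, an inline function with external linkage may not define a modifiable object with static storage duration. A common remedy, sketched below under the assumption that the function is only called from its own source file (the log does not show this), is to give the function internal linkage with static inline. This is not the opt64 code; xor_fold, scratch, and DEMO_COLS are hypothetical names, and the aligned attribute is copied from the declarations quoted in the log.

```c
#include <assert.h>
#include <stdint.h>

typedef uint64_t u64;

#define DEMO_COLS 16  /* hypothetical column count for this sketch */

/* static inline gives the function internal linkage, so a static local
 * buffer is permitted and gcc no longer emits the warning. */
static inline u64 xor_fold(const u64 in[DEMO_COLS])
{
    static u64 scratch[DEMO_COLS] __attribute__((aligned(16)));
    u64 acc = 0;
    int i;

    assert(in != 0);
    for (i = 0; i < DEMO_COLS; i++) {
        scratch[i] = in[i];   /* stage the input in the static buffer */
        acc ^= scratch[i];    /* fold all columns together with XOR */
    }
    return acc;
}
```

A static local buffer is shared by every call, so a function written this way is not reentrant; automatic (stack) arrays avoid both the warning and that restriction.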

Compiler output

Implementation: crypto_hash/groestl512/avx
Compiler: gcc -funroll-loops -march=native -mcpu=native -O2
hash.c: gcc: warning: '-mcpu=' is deprecated; use '-mtune=' or '-march=' instead
try.c: gcc: warning: '-mcpu=' is deprecated; use '-mtune=' or '-march=' instead

Number of similar (compiler,implementation) pairs: 6, namely:
Compiler                                           Implementations
gcc -funroll-loops -march=native -mcpu=native -O2  avx
gcc -funroll-loops -march=native -mcpu=native -O3  avx
gcc -funroll-loops -march=native -mcpu=native -Os  avx
gcc -march=native -mcpu=native -O2                 avx
gcc -march=native -mcpu=native -O3                 avx
gcc -march=native -mcpu=native -Os                 avx

Compiler output

Implementation: crypto_hash/groestl512/32bit-bytesliced-c-small
Compiler: gcc -funroll-loops -march=native -mcpu=native -O2
hash.c: gcc: warning: '-mcpu=' is deprecated; use '-mtune=' or '-march=' instead
try.c: gcc: warning: '-mcpu=' is deprecated; use '-mtune=' or '-march=' instead
try.c: gcc: warning: '-mcpu=' is deprecated; use '-mtune=' or '-march=' instead
measure.c: gcc: warning: '-mcpu=' is deprecated; use '-mtune=' or '-march=' instead

Number of similar (compiler,implementation) pairs: 54, namely:
Compiler                                           Implementations
gcc -funroll-loops -march=native -mcpu=native -O2  32bit-bytesliced-c-small aesni aesni-intr core2duo mmx opt32 opteron vperm vperm-intr
gcc -funroll-loops -march=native -mcpu=native -O3  32bit-bytesliced-c-small aesni aesni-intr core2duo mmx opt32 opteron vperm vperm-intr
gcc -funroll-loops -march=native -mcpu=native -Os  32bit-bytesliced-c-small aesni aesni-intr core2duo mmx opt32 opteron vperm vperm-intr
gcc -march=native -mcpu=native -O2                 32bit-bytesliced-c-small aesni aesni-intr core2duo mmx opt32 opteron vperm vperm-intr
gcc -march=native -mcpu=native -O3                 32bit-bytesliced-c-small aesni aesni-intr core2duo mmx opt32 opteron vperm vperm-intr
gcc -march=native -mcpu=native -Os                 32bit-bytesliced-c-small aesni aesni-intr core2duo mmx opt32 opteron vperm vperm-intr