Implementation notes: x86, thoth, crypto_hash/luffa256

Computer: thoth
Architecture: x86
CPU ID: AuthenticAMD-00000622-0183f9ff
SUPERCOP version: 20160806
Operation: crypto_hash
Primitive: luffa256
  Time  Implementation  Compiler                                                                 Benchmark date  SUPERCOP version
 50746  thomaz/basic    gcc -march=native -mtune=native -O3 -fomit-frame-pointer -fwrapv         20160726        20160724
 51940  sphlib          gcc -march=native -mtune=native -O3 -fomit-frame-pointer -fwrapv         20160726        20160724
 55121  thomaz/basic    clang -march=native -O3 -fomit-frame-pointer -fwrapv -Qunused-arguments  20160726        20160724
 55346  sphlib          gcc -march=native -mtune=native -O -fomit-frame-pointer -fwrapv          20160726        20160724
 55434  arm             gcc -march=native -mtune=native -O3 -fomit-frame-pointer -fwrapv         20160726        20160724
 56947  opt32           gcc -march=native -mtune=native -O3 -fomit-frame-pointer -fwrapv         20160726        20160724
 59092  sphlib          gcc -march=native -mtune=native -O2 -fomit-frame-pointer -fwrapv         20160726        20160724
 59272  arm             gcc -march=native -mtune=native -O2 -fomit-frame-pointer -fwrapv         20160726        20160724
 62082  sphlib          gcc -march=native -mtune=native -Os -fomit-frame-pointer -fwrapv         20160726        20160724
 63304  opt32           clang -march=native -O3 -fomit-frame-pointer -fwrapv -Qunused-arguments  20160726        20160724
 64925  opt32           gcc -march=native -mtune=native -O2 -fomit-frame-pointer -fwrapv         20160726        20160724
 65096  opt32           gcc -march=native -mtune=native -Os -fomit-frame-pointer -fwrapv         20160726        20160724
 65926  arm             gcc -march=native -mtune=native -O -fomit-frame-pointer -fwrapv          20160726        20160724
 67649  arm             gcc -march=native -mtune=native -Os -fomit-frame-pointer -fwrapv         20160726        20160724
 67928  arm             clang -march=native -O3 -fomit-frame-pointer -fwrapv -Qunused-arguments  20160726        20160724
 69242  sphlib          clang -march=native -O3 -fomit-frame-pointer -fwrapv -Qunused-arguments  20160726        20160724
 70848  opt32           gcc -march=native -mtune=native -O -fomit-frame-pointer -fwrapv          20160726        20160724
150266  thomaz/basic    gcc -march=native -mtune=native -O2 -fomit-frame-pointer -fwrapv         20160726        20160724
221023  thomaz/basic    gcc -march=native -mtune=native -Os -fomit-frame-pointer -fwrapv         20160726        20160724
243411  thomaz/basic    gcc -march=native -mtune=native -O -fomit-frame-pointer -fwrapv          20160726        20160724

Test failure

Implementation: crypto_hash/luffa256/asm-PS-v2-FP
Compiler: gcc -march=native -mtune=native -O2 -fomit-frame-pointer -fwrapv
error 111

Number of similar (compiler,implementation) pairs: 12, namely:
Compiler                                                          Implementations
gcc -march=native -mtune=native -O2 -fomit-frame-pointer -fwrapv  asm-PS-v2-FP sse2_x86asm sse2_x86asm-2
gcc -march=native -mtune=native -O3 -fomit-frame-pointer -fwrapv  asm-PS-v2-FP sse2_x86asm sse2_x86asm-2
gcc -march=native -mtune=native -O -fomit-frame-pointer -fwrapv   asm-PS-v2-FP sse2_x86asm sse2_x86asm-2
gcc -march=native -mtune=native -Os -fomit-frame-pointer -fwrapv  asm-PS-v2-FP sse2_x86asm sse2_x86asm-2

Compiler output

Implementation: crypto_hash/luffa256/asm-PS-v2-FP
Compiler: clang -march=native -O3 -fomit-frame-pointer -fwrapv -Qunused-arguments
luffa_256.s: <instantiation>:1:8: error: unknown token in expression
luffa_256.s: pshufb %xmm6, maskShufLittleEndian
luffa_256.s: ^
luffa_256.s: <instantiation>:1:1: note: while in macro instantiation
luffa_256.s: mPSSTEPI %xmm6, %xmm7, maskShufLittleEndian
luffa_256.s: ^
luffa_256.s: luffa_256.s:244:2: note: while in macro instantiation
luffa_256.s: mPS %xmm6, %xmm7, %xmm0, %xmm1, %xmm2, %xmm3
luffa_256.s: ^
luffa_256.s: <instantiation>:1:8: error: unknown token in expression
luffa_256.s: pshufb %xmm6, maskShufLittleEndian
luffa_256.s: ^
luffa_256.s: <instantiation>:1:1: note: while in macro instantiation
luffa_256.s: mPSSTEPI %xmm6, %xmm7, maskShufLittleEndian
luffa_256.s: ^
luffa_256.s: luffa_256.s:244:2: note: while in macro instantiation
luffa_256.s: mPS %xmm6, %xmm7, %xmm0, %xmm1, %xmm2, %xmm3
luffa_256.s: ^
luffa_256.s: <instantiation>:2:9: error: unknown token in expression
luffa_256.s: pshufb %xmm7, maskShufLittleEndian
luffa_256.s: ^
luffa_256.s: <instantiation>:1:1: note: while in macro instantiation
luffa_256.s: mPSSTEPI %xmm6, %xmm7, maskShufLittleEndian
luffa_256.s: ^
luffa_256.s: luffa_256.s:244:2: note: while in macro instantiation
luffa_256.s: ...

Number of similar (compiler,implementation) pairs: 1, namely:
Compiler                                                                 Implementations
clang -march=native -O3 -fomit-frame-pointer -fwrapv -Qunused-arguments  asm-PS-v2-FP

Compiler output

Implementation: crypto_hash/luffa256/sse2
Compiler: clang -march=native -O3 -fomit-frame-pointer -fwrapv -Qunused-arguments
luffa_for_sse2.c: luffa_for_sse2.c:303:11: error: always_inline function '_mm_set_epi32' requires target feature 'sse2', but would be inlined into function 'Init' that is compiled without support for 'sse2'
luffa_for_sse2.c: MASK= _mm_set_epi32(0x00000000, 0x00000000, 0x00000000, 0xffffffff);
luffa_for_sse2.c: ^
luffa_for_sse2.c: luffa_for_sse2.c:306:14: error: always_inline function '_mm_set_epi32' requires target feature 'sse2', but would be inlined into function 'Init' that is compiled without support for 'sse2'
luffa_for_sse2.c: ALLONE = _mm_set_epi32(0xffffffff, 0xffffffff, 0xffffffff, 0xffffffff);
luffa_for_sse2.c: ^
luffa_for_sse2.c: luffa_for_sse2.c:310:21: error: always_inline function '_mm_loadu_si128' requires target feature 'sse2', but would be inlined into function 'Init' that is compiled without support for 'sse2'
luffa_for_sse2.c: CNS128[i] = _mm_loadu_si128((__m128i*)&CNS_INIT[i*4]);
luffa_for_sse2.c: ^
luffa_for_sse2.c: luffa_for_sse2.c:316:36: error: always_inline function '_mm_loadu_si128' requires target feature 'sse2', but would be inlined into function 'Init' that is compiled without support for 'sse2'
luffa_for_sse2.c: state->chainv[i] = _mm_loadu_si128((__m128i*)&IV[i*4]);
luffa_for_sse2.c: ^
luffa_for_sse2.c: luffa_for_sse2.c:322:36: error: always_inline function '_mm_loadu_si128' requires target feature 'sse2', but would be inlined into function 'Init' that is compiled without support for 'sse2'
luffa_for_sse2.c: state->chainv[i] = _mm_loadu_si128((__m128i*)&IV[i*4]);
luffa_for_sse2.c: ^
luffa_for_sse2.c: luffa_for_sse2.c:328:36: error: always_inline function '_mm_loadu_si128' requires target feature 'sse2', but would be inlined into function 'Init' that is compiled without support for 'sse2'
luffa_for_sse2.c: state->chainv[i] = _mm_loadu_si128((__m128i*)&IV[i*4]);
luffa_for_sse2.c: ^
luffa_for_sse2.c: luffa_for_sse2.c:334:36: error: always_inline function '_mm_loadu_si128' requires target feature 'sse2', but would be inlined into function 'Init' that is compiled without support for 'sse2'
luffa_for_sse2.c: state->chainv[i] = _mm_loadu_si128((__m128i*)&IV[i*4]);
luffa_for_sse2.c: ^
luffa_for_sse2.c: 7 errors generated.

Number of similar (compiler,implementation) pairs: 1, namely:
Compiler                                                                 Implementations
clang -march=native -O3 -fomit-frame-pointer -fwrapv -Qunused-arguments  sse2
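
Note on the errors above: they indicate that the compiler was not targeting SSE2 when it compiled luffa_for_sse2.c, so the always_inline intrinsics could not be inlined into 'Init'. The following is a minimal sketch, not the method used by the luffa256 sources, of one common way to confine SSE2 code to a single function and select it at run time. It assumes GCC or Clang, since __attribute__((target("sse2"))) and __builtin_cpu_supports are compiler extensions. The names low_lane_mask, low_lane_mask_sse2, low_lane_mask_portable and HAVE_X86 are hypothetical and do not appear in the benchmarked code.

```c
#include <stdint.h>

#if defined(__i386__) || defined(__x86_64__)
#include <emmintrin.h>          /* SSE2 intrinsics */
#define HAVE_X86 1              /* hypothetical macro, for this sketch only */
#endif

/* Portable fallback: only the lowest 32-bit lane is all ones. */
static void low_lane_mask_portable(uint32_t out[4])
{
    out[0] = 0xffffffffu;
    out[1] = out[2] = out[3] = 0;
}

#ifdef HAVE_X86
/* SSE2 is enabled for this one function only, so the always_inline
   intrinsics can be inlined here even if the rest of the file is
   compiled without SSE2 support. */
__attribute__((target("sse2")))
static void low_lane_mask_sse2(uint32_t out[4])
{
    /* Same constant as MASK in the log; the arguments run from the
       highest lane to the lowest, and -1 is 0xffffffff as an int. */
    __m128i mask = _mm_set_epi32(0, 0, 0, -1);
    _mm_storeu_si128((__m128i *)out, mask);
}
#endif

/* Hypothetical dispatcher: use SSE2 only when the running CPU has it. */
void low_lane_mask(uint32_t out[4])
{
#ifdef HAVE_X86
    if (__builtin_cpu_supports("sse2")) {
        low_lane_mask_sse2(out);
        return;
    }
#endif
    low_lane_mask_portable(out);
}
```

Both paths write the same four words, so a caller cannot tell which one ran. An implementation that requires SSE2 throughout would more likely be skipped on such a machine than rewritten this way.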

Compiler output

Implementation: crypto_hash/luffa256/sse2_x86asm
Compiler: clang -march=native -O3 -fomit-frame-pointer -fwrapv -Qunused-arguments
luffa_x86asm.s: luffa_x86asm.s:808:13: error: unknown token in expression
luffa_x86asm.s: mov %ecx, [%esp+4]
luffa_x86asm.s: ^
luffa_x86asm.s: luffa_x86asm.s:808:13: error: unknown token in expression
luffa_x86asm.s: mov %ecx, [%esp+4]
luffa_x86asm.s: ^
luffa_x86asm.s: luffa_x86asm.s:809:13: error: unknown token in expression
luffa_x86asm.s: movaps %xmm0, [IV ]
luffa_x86asm.s: ^
luffa_x86asm.s: luffa_x86asm.s:809:13: error: unknown token in expression
luffa_x86asm.s: movaps %xmm0, [IV ]
luffa_x86asm.s: ^
luffa_x86asm.s: luffa_x86asm.s:810:13: error: unknown token in expression
luffa_x86asm.s: movaps %xmm1, [IV+16]
luffa_x86asm.s: ^
luffa_x86asm.s: luffa_x86asm.s:810:13: error: unknown token in expression
luffa_x86asm.s: movaps %xmm1, [IV+16]
luffa_x86asm.s: ^
luffa_x86asm.s: luffa_x86asm.s:811:13: error: unknown token in expression
luffa_x86asm.s: movaps %xmm2, [IV+32]
luffa_x86asm.s: ^
luffa_x86asm.s: luffa_x86asm.s:811:13: error: unknown token in expression
luffa_x86asm.s: movaps %xmm2, [IV+32]
luffa_x86asm.s: ^
luffa_x86asm.s: luffa_x86asm.s:812:13: error: unknown token in expression
luffa_x86asm.s: ...

Number of similar (compiler,implementation) pairs: 1, namely:
Compiler                                                                 Implementations
clang -march=native -O3 -fomit-frame-pointer -fwrapv -Qunused-arguments  sse2_x86asm

Compiler output

Implementation: crypto_hash/luffa256/sse2_x86asm-2
Compiler: clang -march=native -O3 -fomit-frame-pointer -fwrapv -Qunused-arguments
luffa_x86asm.s: luffa_x86asm.s:808:13: error: unknown token in expression
luffa_x86asm.s: mov %ecx, [%esp+4]
luffa_x86asm.s: ^
luffa_x86asm.s: luffa_x86asm.s:808:13: error: unknown token in expression
luffa_x86asm.s: mov %ecx, [%esp+4]
luffa_x86asm.s: ^
luffa_x86asm.s: luffa_x86asm.s:809:13: error: unknown token in expression
luffa_x86asm.s: movdqa %xmm0, [IV ]
luffa_x86asm.s: ^
luffa_x86asm.s: luffa_x86asm.s:809:13: error: unknown token in expression
luffa_x86asm.s: movdqa %xmm0, [IV ]
luffa_x86asm.s: ^
luffa_x86asm.s: luffa_x86asm.s:810:13: error: unknown token in expression
luffa_x86asm.s: movdqa %xmm1, [IV+16]
luffa_x86asm.s: ^
luffa_x86asm.s: luffa_x86asm.s:810:13: error: unknown token in expression
luffa_x86asm.s: movdqa %xmm1, [IV+16]
luffa_x86asm.s: ^
luffa_x86asm.s: luffa_x86asm.s:811:13: error: unknown token in expression
luffa_x86asm.s: movdqa %xmm2, [IV+32]
luffa_x86asm.s: ^
luffa_x86asm.s: luffa_x86asm.s:811:13: error: unknown token in expression
luffa_x86asm.s: movdqa %xmm2, [IV+32]
luffa_x86asm.s: ^
luffa_x86asm.s: luffa_x86asm.s:812:13: error: unknown token in expression
luffa_x86asm.s: ...

Number of similar (compiler,implementation) pairs: 1, namely:
Compiler                                                                 Implementations
clang -march=native -O3 -fomit-frame-pointer -fwrapv -Qunused-arguments  sse2_x86asm-2

Compiler output

Implementation: crypto_hash/luffa256/sse2
Compiler: gcc -march=native -mtune=native -O2 -fomit-frame-pointer -fwrapv
luffa_for_sse2.c: luffa_for_sse2.c: In function 'Init':
luffa_for_sse2.c: luffa_for_sse2.c:303:9: warning: SSE vector return without SSE enabled changes the ABI [-Wpsabi]
luffa_for_sse2.c: MASK= _mm_set_epi32(0x00000000, 0x00000000, 0x00000000, 0xffffffff);
luffa_for_sse2.c: ^
luffa_for_sse2.c: In file included from luffa_for_sse2.c:22:0:
luffa_for_sse2.c: luffa_for_sse2.c: In function 'rnd256':
luffa_for_sse2.c: /usr/lib/gcc/i686-linux-gnu/5/include/emmintrin.h:1415:1: error: inlining failed in call to always_inline '_mm_shuffle_epi32': target specific option mismatch
luffa_for_sse2.c: _mm_shuffle_epi32 (__m128i __A, const int __mask)
luffa_for_sse2.c: ^
luffa_for_sse2.c: luffa_for_sse2.c:616:14: error: called from here
luffa_for_sse2.c: msg[1] = _mm_shuffle_epi32(msg[1], 27);
luffa_for_sse2.c: ^
luffa_for_sse2.c: In file included from luffa_for_sse2.c:22:0:
luffa_for_sse2.c: /usr/lib/gcc/i686-linux-gnu/5/include/emmintrin.h:1415:1: error: inlining failed in call to always_inline '_mm_shuffle_epi32': target specific option mismatch
luffa_for_sse2.c: _mm_shuffle_epi32 (__m128i __A, const int __mask)
luffa_for_sse2.c: ^
luffa_for_sse2.c: luffa_for_sse2.c:615:14: error: called from here
luffa_for_sse2.c: msg[0] = _mm_shuffle_epi32(msg[0], 27);
luffa_for_sse2.c: ^
luffa_for_sse2.c: In file included from luffa_for_sse2.c:22:0:
luffa_for_sse2.c: /usr/lib/gcc/i686-linux-gnu/5/include/emmintrin.h:696:1: error: inlining failed in call to always_inline '_mm_loadu_si128': target specific option mismatch
luffa_for_sse2.c: _mm_loadu_si128 (__m128i const *__P)
luffa_for_sse2.c: ^
luffa_for_sse2.c: luffa_for_sse2.c:614:14: error: called from here
luffa_for_sse2.c: msg[1] = _mm_loadu_si128((__m128i*)&state->buffer[4]);
luffa_for_sse2.c: ...

Number of similar (compiler,implementation) pairs: 4, namely:
Compiler                                                          Implementations
gcc -march=native -mtune=native -O2 -fomit-frame-pointer -fwrapv  sse2
gcc -march=native -mtune=native -O3 -fomit-frame-pointer -fwrapv  sse2
gcc -march=native -mtune=native -O -fomit-frame-pointer -fwrapv   sse2
gcc -march=native -mtune=native -Os -fomit-frame-pointer -fwrapv  sse2