Implementation notes: amd64, devoptimis, crypto_core/invhrss701

Computer: devoptimis
Architecture: amd64
CPU ID: GenuineIntel-000206c2-bfebfbff
SUPERCOP version: 20190910
Operation: crypto_core
Primitive: invhrss701
TimeObject sizeTest sizeImplementationCompilerBenchmark dateSUPERCOP version
77478036208 0 019216 784 832simplergcc_-march=native_-mtune=native_-O3_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE2019100320190910
98535504660 0 017574 776 832refgcc_-march=native_-mtune=native_-O3_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE2019100320190910
38845517909 0 011597 768 832simplergcc_-march=native_-mtune=native_-O_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE2019100320190910
42970265922 0 011677 768 832simplergcc_-march=native_-mtune=native_-O2_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE2019100320190910
587819661428 0 012149 768 832refgcc_-march=native_-mtune=native_-O2_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE2019100320190910
1146038291360 0 012037 768 832refgcc_-march=native_-mtune=native_-O_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE2019100320190910
128379003758 0 010529 752 800simplergcc_-march=native_-mtune=native_-Os_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE2019100320190910
1579662461257 0 011025 752 800refgcc_-march=native_-mtune=native_-Os_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE2019100320190910

Test failure

Implementation: avx2
Security model: unknown
Compiler: gcc -march=native -mtune=native -O2 -fomit-frame-pointer -fwrapv -fPIC -fPIE
error 111

Number of similar (compiler,implementation) pairs: 4, namely:
CompilerImplementations
gcc -march=native -mtune=native -O2 -fomit-frame-pointer -fwrapv -fPIC -fPIE avx2
gcc -march=native -mtune=native -O3 -fomit-frame-pointer -fwrapv -fPIC -fPIE avx2
gcc -march=native -mtune=native -O -fomit-frame-pointer -fwrapv -fPIC -fPIE avx2
gcc -march=native -mtune=native -Os -fomit-frame-pointer -fwrapv -fPIC -fPIE avx2

Test failure

Implementation: faster821
Security model: unknown
Compiler: gcc -march=native -mtune=native -O2 -fomit-frame-pointer -fwrapv -fPIC -fPIE
error 111
./try: Symbol `memcpy' causes overflow in R_X86_64_PC32 relocation

Number of similar (compiler,implementation) pairs: 4, namely:
CompilerImplementations
gcc -march=native -mtune=native -O2 -fomit-frame-pointer -fwrapv -fPIC -fPIE faster821
gcc -march=native -mtune=native -O3 -fomit-frame-pointer -fwrapv -fPIC -fPIE faster821
gcc -march=native -mtune=native -O -fomit-frame-pointer -fwrapv -fPIC -fPIE faster821
gcc -march=native -mtune=native -Os -fomit-frame-pointer -fwrapv -fPIC -fPIE faster821

Compiler output

Implementation: faster
Security model: unknown
Compiler: gcc -march=native -mtune=native -O2 -fomit-frame-pointer -fwrapv -fPIC -fPIE
core.c: core.c: In function 'r3_recip':
core.c: core.c:353:17: warning: AVX vector return without AVX enabled changes the ABI [-Wpsabi]
core.c: F0[0] = F0[1] = _mm256_set_epi32(-1,-1,-1,-1,-1,-1,-1,-1);
core.c: ~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
core.c: core.c: In function 'vec256_swap':
core.c: core.c:171:20: note: The ABI for passing parameters with 32-byte alignment has changed in GCC 4.6
core.c: static inline void vec256_swap(vec256 *f,vec256 *g,int len,vec256 mask)
core.c: ^~~~~~~~~~~
core.c: In file included from /usr/lib/gcc/x86_64-linux-gnu/7/include/immintrin.h:43:0,
core.c: from core.c:4:
core.c: core.c: In function 'vec256_frombits':
core.c: /usr/lib/gcc/x86_64-linux-gnu/7/include/avx2intrin.h:597:1: error: inlining failed in call to always_inline '_mm256_shuffle_epi32': target specific option mismatch
core.c: _mm256_shuffle_epi32 (__m256i __A, const int __mask)
core.c: ^~~~~~~~~~~~~~~~~~~~
core.c: core.c:56:7: note: called from here
core.c: h = _mm256_shuffle_epi32(h,0xd8);
core.c: ~~^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
core.c: In file included from /usr/lib/gcc/x86_64-linux-gnu/7/include/immintrin.h:43:0,
core.c: from core.c:4:
core.c: /usr/lib/gcc/x86_64-linux-gnu/7/include/avx2intrin.h:1068:1: error: inlining failed in call to always_inline '_mm256_permute4x64_epi64': target specific option mismatch
core.c: _mm256_permute4x64_epi64 (__m256i __X, const int __M)
core.c: ^~~~~~~~~~~~~~~~~~~~~~~~
core.c: core.c:55:7: note: called from here
core.c: h = _mm256_permute4x64_epi64(h,0xd8);
core.c: ~~^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
core.c: ...

Number of similar (compiler,implementation) pairs: 4, namely:
CompilerImplementations
gcc -march=native -mtune=native -O2 -fomit-frame-pointer -fwrapv -fPIC -fPIE faster
gcc -march=native -mtune=native -O3 -fomit-frame-pointer -fwrapv -fPIC -fPIE faster
gcc -march=native -mtune=native -O -fomit-frame-pointer -fwrapv -fPIC -fPIE faster
gcc -march=native -mtune=native -Os -fomit-frame-pointer -fwrapv -fPIC -fPIE faster