Implementation notes: amd64, oki, crypto_dh/k298

Computer: oki
Architecture: amd64
CPU ID: GenuineIntel-00050654-bfebfbff
SUPERCOP version: 20181123
Operation: crypto_dh
Primitive: k298
Time    Object size  Test size  Implementation  Compiler                                                          Benchmark date  SUPERCOP version
221786  ? ? ?        ? ? ?      ref             gcc_-march=native_-mtune=native_-O3_-fomit-frame-pointer_-fwrapv  20190110        20181123
221830  ? ? ?        ? ? ?      ref             gcc_-m64_-march=native_-mtune=native_-O3_-fomit-frame-pointer     20190110        20181123
224982  ? ? ?        ? ? ?      ref             gcc_-m64_-march=core-avx-i_-O3_-fomit-frame-pointer               20190110        20181123
225070  ? ? ?        ? ? ?      ref             gcc_-m64_-march=corei7-avx_-O3_-fomit-frame-pointer               20190110        20181123
226448  ? ? ?        ? ? ?      ref             gcc_-m64_-march=core-avx2_-O3_-fomit-frame-pointer                20190110        20181123
227438  ? ? ?        ? ? ?      ref             gcc_-m64_-march=native_-mtune=native_-O2_-fomit-frame-pointer     20190110        20181123
229168  ? ? ?        ? ? ?      ref             gcc_-m64_-march=native_-mtune=native_-Os_-fomit-frame-pointer     20190110        20181123
229830  ? ? ?        ? ? ?      ref             gcc_-march=native_-mtune=native_-O2_-fomit-frame-pointer_-fwrapv  20190110        20181123
231550  ? ? ?        ? ? ?      ref             gcc_-march=native_-mtune=native_-Os_-fomit-frame-pointer_-fwrapv  20190110        20181123
234600  ? ? ?        ? ? ?      ref             gcc_-m64_-march=core-avx-i_-O2_-fomit-frame-pointer               20190110        20181123
234638  ? ? ?        ? ? ?      ref             gcc_-m64_-march=corei7-avx_-O2_-fomit-frame-pointer               20190110        20181123
235724  ? ? ?        ? ? ?      ref             gcc_-m64_-march=core-avx2_-O2_-fomit-frame-pointer                20190110        20181123
237212  ? ? ?        ? ? ?      ref             gcc_-m64_-march=core-avx-i_-Os_-fomit-frame-pointer               20190110        20181123
237220  ? ? ?        ? ? ?      ref             gcc_-m64_-march=corei7-avx_-Os_-fomit-frame-pointer               20190110        20181123
237788  ? ? ?        ? ? ?      ref             gcc_-m64_-march=core-avx2_-Os_-fomit-frame-pointer                20190110        20181123

Checksum failure

Implementation: ref
Security model: unknown
Compiler: gcc -m64 -march=core-avx-i -O -fomit-frame-pointer
f915491b2bfd72afef772d66e5d93bd26fcf7cf7bf7513d13ee8419b0f467064
Number of similar (compiler,implementation) pairs: 5, namely:
Compiler  Implementations
gcc -m64 -march=core-avx-i -O -fomit-frame-pointer ref
gcc -m64 -march=core-avx2 -O -fomit-frame-pointer ref
gcc -m64 -march=corei7-avx -O -fomit-frame-pointer ref
gcc -m64 -march=native -mtune=native -O -fomit-frame-pointer ref
gcc -march=native -mtune=native -O -fomit-frame-pointer -fwrapv ref

Test failure

Implementation: ref
Security model: unknown
Compiler: clang -O3 -fwrapv -march=x86-64 -mcpu=core-avx2 -mavx2 -maes -mpclmul -fomit-frame-pointer -Qunused-arguments
error 111

Number of similar (compiler,implementation) pairs: 2, namely:
Compiler  Implementations
clang -O3 -fwrapv -march=x86-64 -mcpu=core-avx2 -mavx2 -maes -mpclmul -fomit-frame-pointer -Qunused-arguments ref
clang -O3 -fwrapv -mavx -maes -mpclmul -fomit-frame-pointer -Qunused-arguments ref

Compiler output

Implementation: ref
Security model: unknown
Compiler: cc
dh.c: In file included from dh.c:6:0:
dh.c: ffa.h: In function 'ffa_red_149':
dh.c: ffa.h:18:10: error: incompatible types when assigning to type '__m128i' from type 'int'
dh.c: tp_2 = _mm_clmulepi64_si128(p_149_0, tp_0, 0x00);\
dh.c: ^
dh.c: ffa.h:47:5: note: in expansion of macro 'ffa_red_149_stp'
dh.c: ffa_red_149_stp(a_00, a_01, tp_0, tp_1, tp_2, p_149_0, p_149_1);
dh.c: ^
dh.c: ffa.h:19:10: error: incompatible types when assigning to type '__m128i' from type 'int'
dh.c: tp_1 = _mm_clmulepi64_si128(p_149_0, tp_0, 0x01);\
dh.c: ^
dh.c: ffa.h:47:5: note: in expansion of macro 'ffa_red_149_stp'
dh.c: ffa_red_149_stp(a_00, a_01, tp_0, tp_1, tp_2, p_149_0, p_149_1);
dh.c: ^
dh.c: ffa.h:20:10: error: incompatible types when assigning to type '__m128i' from type 'int'
dh.c: tp_0 = _mm_clmulepi64_si128(p_149_1, tp_0, 0x00);\
dh.c: ^
dh.c: ffa.h:47:5: note: in expansion of macro 'ffa_red_149_stp'
dh.c: ffa_red_149_stp(a_00, a_01, tp_0, tp_1, tp_2, p_149_0, p_149_1);
dh.c: ^
dh.c: ffa.h:18:10: error: incompatible types when assigning to type '__m128i' from type 'int'
dh.c: tp_2 = _mm_clmulepi64_si128(p_149_0, tp_0, 0x00);\
dh.c: ^
dh.c: ffa.h:48:5: note: in expansion of macro 'ffa_red_149_stp'
dh.c: ffa_red_149_stp(a_00, a_01, tp_0, tp_1, tp_2, p_149_0, p_149_1);
dh.c: ...

Number of similar (compiler,implementation) pairs: 1, namely:
Compiler  Implementations
cc ref

Compiler output

Implementation: ref
Security model: unknown
Compiler: clang -O3 -fomit-frame-pointer -Qunused-arguments
dh.c: In file included from dh.c:6:
dh.c: ./ffa.h:47:5: error: '__builtin_ia32_pclmulqdq128' needs target feature pclmul
dh.c: ffa_red_149_stp(a_00, a_01, tp_0, tp_1, tp_2, p_149_0, p_149_1);
dh.c: ^
dh.c: ./ffa.h:18:12: note: expanded from macro 'ffa_red_149_stp'
dh.c: tp_2 = _mm_clmulepi64_si128(p_149_0, tp_0, 0x00);\
dh.c: ^
dh.c: /usr/bin/../lib64/clang/3.8.0/include/__wmmintrin_pclmul.h:27:13: note: expanded from macro '_mm_clmulepi64_si128'
dh.c: ((__m128i)__builtin_ia32_pclmulqdq128((__v2di)(__m128i)(__X), \
dh.c: ^
dh.c: In file included from dh.c:6:
dh.c: ./ffa.h:47:5: error: '__builtin_ia32_pclmulqdq128' needs target feature pclmul
dh.c: ./ffa.h:19:12: note: expanded from macro 'ffa_red_149_stp'
dh.c: tp_1 = _mm_clmulepi64_si128(p_149_0, tp_0, 0x01);\
dh.c: ^
dh.c: /usr/bin/../lib64/clang/3.8.0/include/__wmmintrin_pclmul.h:27:13: note: expanded from macro '_mm_clmulepi64_si128'
dh.c: ((__m128i)__builtin_ia32_pclmulqdq128((__v2di)(__m128i)(__X), \
dh.c: ^
dh.c: In file included from dh.c:6:
dh.c: ./ffa.h:47:5: error: '__builtin_ia32_pclmulqdq128' needs target feature pclmul
dh.c: ./ffa.h:20:12: note: expanded from macro 'ffa_red_149_stp'
dh.c: tp_0 = _mm_clmulepi64_si128(p_149_1, tp_0, 0x00);\
dh.c: ^
dh.c: /usr/bin/../lib64/clang/3.8.0/include/__wmmintrin_pclmul.h:27:13: note: expanded from macro '_mm_clmulepi64_si128'
dh.c: ((__m128i)__builtin_ia32_pclmulqdq128((__v2di)(__m128i)(__X), \
dh.c: ...

Number of similar (compiler,implementation) pairs: 6, namely:
Compiler  Implementations
clang -O3 -fomit-frame-pointer -Qunused-arguments ref
clang -O3 -fwrapv -mavx2 -fomit-frame-pointer -Qunused-arguments ref
clang -O3 -fwrapv -mavx -fomit-frame-pointer -Qunused-arguments ref
clang -mcpu=cortex-a8 -mfpu=neon -O3 -fomit-frame-pointer -fwrapv -Qunused-arguments ref
clang -mcpu=cortex-a9 -mfpu=neon -O3 -fomit-frame-pointer -fwrapv -Qunused-arguments ref
clang -mcpu=native -mfpu=neon -O3 -fomit-frame-pointer -fwrapv -Qunused-arguments ref

Compiler output

Implementation: ref
Security model: unknown
Compiler: clang -O3 -fwrapv -march=native -fomit-frame-pointer -Qunused-arguments
try.c: fatal error: error in backend: Cannot select: 0x162d4d0: v4i64 = X86ISD::VTRUNC 0x162d300
try.c: 0x162d300: v16i32 = vselect 0x1621040, 0x15d8a40, 0x162d1d0
try.c: 0x1621040: v4i1 = X86ISD::PCMPGTM 0x16155c0, 0x15d97f0
try.c: 0x16155c0: v4i64 = X86ISD::VBROADCAST 0x15d9b80
try.c: 0x15d9b80: i64,ch = load<LD8[%uglygep72]> 0x1510d70, 0x160be90, undef:i64
try.c: 0x160be90: i64 = add 0x16164a0, 0x1592a30
try.c: 0x16164a0: i64 = X86ISD::Wrapper TargetGlobalAddress:i64<[64 x i64]* @cycles> 0
try.c: 0x160d230: i64 = TargetGlobalAddress<[64 x i64]* @cycles> 0
try.c: 0x1592a30: i64 = shl 0x160c9e0, Constant:i8<3>
try.c: 0x160c9e0: i64,ch = CopyFromReg 0x1510d70, Register:i64 %vreg50
try.c: 0x1616960: i64 = Register %vreg50
try.c: 0x15b6bf0: i8 = Constant<3>
try.c: 0x1613c30: i64 = undef
try.c: 0x15d97f0: v4i64,ch = CopyFromReg 0x1510d70, Register:v4i64 %vreg13
try.c: 0x15b9a00: v4i64 = Register %vreg13
try.c: 0x15d8a40: v16i32 = X86ISD::VBROADCAST 0x160cc40
try.c: 0x160cc40: i32,ch = load<LD4[ConstantPool]> 0x1510d70, 0x162ce40, undef:i64
try.c: 0x162ce40: i64 = X86ISD::Wrapper TargetConstantPool:i64<i32 1> 0
try.c: 0x15b0ea0: i64 = TargetConstantPool<i32 1> 0
try.c: 0x1613c30: i64 = undef
try.c: 0x162d1d0: v16i32 = BUILD_VECTOR Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>
try.c: 0x162d0a0: i32 = Constant<0>
try.c: 0x162d0a0: i32 = Constant<0>
try.c: 0x162d0a0: i32 = Constant<0>
try.c: 0x162d0a0: i32 = Constant<0>
try.c: ...

Number of similar (compiler,implementation) pairs: 1, namely:
Compiler  Implementations
clang -O3 -fwrapv -march=native -fomit-frame-pointer -Qunused-arguments ref

Compiler output

Implementation: ref
Security model: unknown
Compiler: clang -march=native -O3 -fomit-frame-pointer -fwrapv -Qunused-arguments
try.c: fatal error: error in backend: Cannot select: 0x2039f40: v4i64 = X86ISD::VTRUNC 0x2039d70
try.c: 0x2039d70: v16i32 = vselect 0x202a050, 0x1fcb9c0, 0x2039c40
try.c: 0x202a050: v4i1 = X86ISD::PCMPGTM 0x2022430, 0x1fcc770
try.c: 0x2022430: v4i64 = X86ISD::VBROADCAST 0x1fccb00
try.c: 0x1fccb00: i64,ch = load<LD8[%uglygep72]> 0x1f1bd70, 0x2008970, undef:i64
try.c: 0x2008970: i64 = add 0x2023310, 0x1fbb000
try.c: 0x2023310: i64 = X86ISD::Wrapper TargetGlobalAddress:i64<[64 x i64]* @cycles> 0
try.c: 0x20176c0: i64 = TargetGlobalAddress<[64 x i64]* @cycles> 0
try.c: 0x1fbb000: i64 = shl 0x2016e70, Constant:i8<3>
try.c: 0x2016e70: i64,ch = CopyFromReg 0x1f1bd70, Register:i64 %vreg50
try.c: 0x20237d0: i64 = Register %vreg50
try.c: 0x1fd0ca0: i8 = Constant<3>
try.c: 0x2020aa0: i64 = undef
try.c: 0x1fcc770: v4i64,ch = CopyFromReg 0x1f1bd70, Register:v4i64 %vreg13
try.c: 0x1fdb690: v4i64 = Register %vreg13
try.c: 0x1fcb9c0: v16i32 = X86ISD::VBROADCAST 0x20170d0
try.c: 0x20170d0: i32,ch = load<LD4[ConstantPool]> 0x1f1bd70, 0x20398b0, undef:i64
try.c: 0x20398b0: i64 = X86ISD::Wrapper TargetConstantPool:i64<i32 1> 0
try.c: 0x1fce530: i64 = TargetConstantPool<i32 1> 0
try.c: 0x2020aa0: i64 = undef
try.c: 0x2039c40: v16i32 = BUILD_VECTOR Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0>
try.c: 0x2039b10: i32 = Constant<0>
try.c: 0x2039b10: i32 = Constant<0>
try.c: 0x2039b10: i32 = Constant<0>
try.c: 0x2039b10: i32 = Constant<0>
try.c: ...

Number of similar (compiler,implementation) pairs: 1, namely:
Compiler  Implementations
clang -march=native -O3 -fomit-frame-pointer -fwrapv -Qunused-arguments ref

Compiler output

Implementation: ref
Security model: unknown
Compiler: gcc
dh.c: In file included from /usr/lib64/gcc/x86_64-suse-linux/7/include/x86intrin.h:43:0,
dh.c: from lib.h:2,
dh.c: from dh.c:2:
dh.c: smu.h: In function 'smu_3nf_ltr':
dh.c: /usr/lib64/gcc/x86_64-suse-linux/7/include/smmintrin.h:268:1: error: inlining failed in call to always_inline '_mm_cmpeq_epi64': target specific option mismatch
dh.c: _mm_cmpeq_epi64 (__m128i __X, __m128i __Y)
dh.c: ^~~~~~~~~~~~~~~
dh.c: In file included from dh.c:8:0:
dh.c: smu.h:337:19: note: called from here
dh.c: mask_lps[7] = _mm_cmpeq_epi64(digits[7], dig_sse);
dh.c: ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
dh.c: In file included from /usr/lib64/gcc/x86_64-suse-linux/7/include/x86intrin.h:43:0,
dh.c: from lib.h:2,
dh.c: from dh.c:2:
dh.c: /usr/lib64/gcc/x86_64-suse-linux/7/include/smmintrin.h:268:1: error: inlining failed in call to always_inline '_mm_cmpeq_epi64': target specific option mismatch
dh.c: _mm_cmpeq_epi64 (__m128i __X, __m128i __Y)
dh.c: ^~~~~~~~~~~~~~~
dh.c: In file included from dh.c:8:0:
dh.c: smu.h:336:19: note: called from here
dh.c: mask_lps[6] = _mm_cmpeq_epi64(digits[6], dig_sse);
dh.c: ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
dh.c: In file included from /usr/lib64/gcc/x86_64-suse-linux/7/include/x86intrin.h:43:0,
dh.c: from lib.h:2,
dh.c: from dh.c:2:
dh.c: /usr/lib64/gcc/x86_64-suse-linux/7/include/smmintrin.h:268:1: error: inlining failed in call to always_inline '_mm_cmpeq_epi64': target specific option mismatch
dh.c: ...

Number of similar (compiler,implementation) pairs: 2, namely:
Compiler  Implementations
gcc ref
gcc -funroll-loops ref

Compiler output

Implementation: ref
Security model: unknown
Compiler: gcc -O2 -fomit-frame-pointer
dh.c: In file included from /usr/lib64/gcc/x86_64-suse-linux/7/include/x86intrin.h:45:0,
dh.c: from lib.h:2,
dh.c: from dh.c:2:
dh.c: ffa.h: In function 'ffa_red_149':
dh.c: /usr/lib64/gcc/x86_64-suse-linux/7/include/wmmintrin.h:116:1: error: inlining failed in call to always_inline '_mm_clmulepi64_si128': target specific option mismatch
dh.c: _mm_clmulepi64_si128 (__m128i __X, __m128i __Y, const int __I)
dh.c: ^~~~~~~~~~~~~~~~~~~~
dh.c: In file included from dh.c:6:0:
dh.c: ffa.h:20:10: note: called from here
dh.c: tp_0 = _mm_clmulepi64_si128(p_149_1, tp_0, 0x00);\
dh.c: ^
dh.c: ffa.h:82:5: note: in expansion of macro 'ffa_red_149_stp'
dh.c: ffa_red_149_stp(b_00, b_01, tp_0, tp_1, tp_2, p_149_0, p_149_1);
dh.c: ^~~~~~~~~~~~~~~
dh.c: In file included from /usr/lib64/gcc/x86_64-suse-linux/7/include/x86intrin.h:45:0,
dh.c: from lib.h:2,
dh.c: from dh.c:2:
dh.c: /usr/lib64/gcc/x86_64-suse-linux/7/include/wmmintrin.h:116:1: error: inlining failed in call to always_inline '_mm_clmulepi64_si128': target specific option mismatch
dh.c: _mm_clmulepi64_si128 (__m128i __X, __m128i __Y, const int __I)
dh.c: ^~~~~~~~~~~~~~~~~~~~
dh.c: In file included from dh.c:6:0:
dh.c: ffa.h:19:10: note: called from here
dh.c: tp_1 = _mm_clmulepi64_si128(p_149_0, tp_0, 0x01);\
dh.c: ^
dh.c: ffa.h:82:5: note: in expansion of macro 'ffa_red_149_stp'
dh.c: ...

Number of similar (compiler,implementation) pairs: 84, namely:
Compiler  Implementations
gcc -O2 -fomit-frame-pointer ref
gcc -O3 -fomit-frame-pointer ref
gcc -O -fomit-frame-pointer ref
gcc -Os -fomit-frame-pointer ref
gcc -fno-schedule-insns -O2 -fomit-frame-pointer ref
gcc -fno-schedule-insns -O3 -fomit-frame-pointer ref
gcc -fno-schedule-insns -O -fomit-frame-pointer ref
gcc -fno-schedule-insns -Os -fomit-frame-pointer ref
gcc -funroll-loops -O2 -fomit-frame-pointer ref
gcc -funroll-loops -O3 -fomit-frame-pointer ref
gcc -funroll-loops -O -fomit-frame-pointer ref
gcc -funroll-loops -Os -fomit-frame-pointer ref
gcc -funroll-loops -fno-schedule-insns -O2 -fomit-frame-pointer ref
gcc -funroll-loops -fno-schedule-insns -O3 -fomit-frame-pointer ref
gcc -funroll-loops -fno-schedule-insns -O -fomit-frame-pointer ref
gcc -funroll-loops -fno-schedule-insns -Os -fomit-frame-pointer ref
gcc -funroll-loops -m64 -O2 -fomit-frame-pointer ref
gcc -funroll-loops -m64 -O3 -fomit-frame-pointer ref
gcc -funroll-loops -m64 -O -fomit-frame-pointer ref
gcc -funroll-loops -m64 -Os -fomit-frame-pointer ref
gcc -funroll-loops -m64 -march=barcelona -O2 -fomit-frame-pointer ref
gcc -funroll-loops -m64 -march=barcelona -O3 -fomit-frame-pointer ref
gcc -funroll-loops -m64 -march=barcelona -O -fomit-frame-pointer ref
gcc -funroll-loops -m64 -march=barcelona -Os -fomit-frame-pointer ref
gcc -funroll-loops -m64 -march=k8 -O2 -fomit-frame-pointer ref
gcc -funroll-loops -m64 -march=k8 -O3 -fomit-frame-pointer ref
gcc -funroll-loops -m64 -march=k8 -O -fomit-frame-pointer ref
gcc -funroll-loops -m64 -march=k8 -Os -fomit-frame-pointer ref
gcc -funroll-loops -m64 -march=nocona -O2 -fomit-frame-pointer ref
gcc -funroll-loops -m64 -march=nocona -O3 -fomit-frame-pointer ref
gcc -funroll-loops -m64 -march=nocona -O -fomit-frame-pointer ref
gcc -funroll-loops -m64 -march=nocona -Os -fomit-frame-pointer ref
gcc -funroll-loops -march=barcelona -O2 -fomit-frame-pointer ref
gcc -funroll-loops -march=barcelona -O3 -fomit-frame-pointer ref
gcc -funroll-loops -march=barcelona -O -fomit-frame-pointer ref
gcc -funroll-loops -march=barcelona -Os -fomit-frame-pointer ref
gcc -funroll-loops -march=k8 -O2 -fomit-frame-pointer ref
gcc -funroll-loops -march=k8 -O3 -fomit-frame-pointer ref
gcc -funroll-loops -march=k8 -O -fomit-frame-pointer ref
gcc -funroll-loops -march=k8 -Os -fomit-frame-pointer ref
gcc -funroll-loops -march=nocona -O2 -fomit-frame-pointer ref
gcc -funroll-loops -march=nocona -O3 -fomit-frame-pointer ref
gcc -funroll-loops -march=nocona -O -fomit-frame-pointer ref
gcc -funroll-loops -march=nocona -Os -fomit-frame-pointer ref
gcc -m64 -O2 -fomit-frame-pointer ref
gcc -m64 -O3 -fomit-frame-pointer ref
gcc -m64 -O -fomit-frame-pointer ref
gcc -m64 -Os -fomit-frame-pointer ref
gcc -m64 -march=core2 -O2 -fomit-frame-pointer ref
gcc -m64 -march=core2 -O3 -fomit-frame-pointer ref
gcc -m64 -march=core2 -O -fomit-frame-pointer ref
gcc -m64 -march=core2 -Os -fomit-frame-pointer ref
gcc -m64 -march=core2 -msse4.1 -O2 -fomit-frame-pointer ref
gcc -m64 -march=core2 -msse4.1 -O3 -fomit-frame-pointer ref
gcc -m64 -march=core2 -msse4.1 -O -fomit-frame-pointer ref
gcc -m64 -march=core2 -msse4.1 -Os -fomit-frame-pointer ref
gcc -m64 -march=core2 -msse4 -O2 -fomit-frame-pointer ref
gcc -m64 -march=core2 -msse4 -O3 -fomit-frame-pointer ref
gcc -m64 -march=core2 -msse4 -O -fomit-frame-pointer ref
gcc -m64 -march=core2 -msse4 -Os -fomit-frame-pointer ref
gcc -m64 -march=corei7 -O2 -fomit-frame-pointer ref
gcc -m64 -march=corei7 -O3 -fomit-frame-pointer ref
gcc -m64 -march=corei7 -O -fomit-frame-pointer ref
gcc -m64 -march=corei7 -Os -fomit-frame-pointer ref
gcc -m64 -march=k8 -O2 -fomit-frame-pointer ref
gcc -m64 -march=k8 -O3 -fomit-frame-pointer ref
gcc -m64 -march=k8 -O -fomit-frame-pointer ref
gcc -m64 -march=k8 -Os -fomit-frame-pointer ref
gcc -m64 -march=nocona -O2 -fomit-frame-pointer ref
gcc -m64 -march=nocona -O3 -fomit-frame-pointer ref
gcc -m64 -march=nocona -O -fomit-frame-pointer ref
gcc -m64 -march=nocona -Os -fomit-frame-pointer ref
gcc -march=barcelona -O2 -fomit-frame-pointer ref
gcc -march=barcelona -O3 -fomit-frame-pointer ref
gcc -march=barcelona -O -fomit-frame-pointer ref
gcc -march=barcelona -Os -fomit-frame-pointer ref
gcc -march=k8 -O2 -fomit-frame-pointer ref
gcc -march=k8 -O3 -fomit-frame-pointer ref
gcc -march=k8 -O -fomit-frame-pointer ref
gcc -march=k8 -Os -fomit-frame-pointer ref
gcc -march=nocona -O2 -fomit-frame-pointer ref
gcc -march=nocona -O3 -fomit-frame-pointer ref
gcc -march=nocona -O -fomit-frame-pointer ref
gcc -march=nocona -Os -fomit-frame-pointer ref

Compiler output

Implementation: ref
Security model: unknown
Compiler: gcc -m64 -march=barcelona -O2 -fomit-frame-pointer
dh.c: In file included from /usr/lib64/gcc/x86_64-suse-linux/7/include/x86intrin.h:45:0,
dh.c: from lib.h:2,
dh.c: from dh.c:2:
dh.c: ffa.h: In function 'ffa_red_149':
dh.c: /usr/lib64/gcc/x86_64-suse-linux/7/include/wmmintrin.h:116:1: error: inlining failed in call to always_inline '_mm_clmulepi64_si128': target specific option mismatch
dh.c: _mm_clmulepi64_si128 (__m128i __X, __m128i __Y, const int __I)
dh.c: ^~~~~~~~~~~~~~~~~~~~
dh.c: In file included from dh.c:6:0:
dh.c: ffa.h:20:10: note: called from here
dh.c: tp_0 = _mm_clmulepi64_si128(p_149_1, tp_0, 0x00);\
dh.c: ^
dh.c: ffa.h:82:5: note: in expansion of macro 'ffa_red_149_stp'
dh.c: ffa_red_149_stp(b_00, b_01, tp_0, tp_1, tp_2, p_149_0, p_149_1);
dh.c: ^~~~~~~~~~~~~~~
dh.c: In file included from /usr/lib64/gcc/x86_64-suse-linux/7/include/x86intrin.h:45:0,
dh.c: from lib.h:2,
dh.c: from dh.c:2:
dh.c: /usr/lib64/gcc/x86_64-suse-linux/7/include/wmmintrin.h:116:1: error: inlining failed in call to always_inline '_mm_clmulepi64_si128': target specific option mismatch
dh.c: _mm_clmulepi64_si128 (__m128i __X, __m128i __Y, const int __I)
dh.c: ^~~~~~~~~~~~~~~~~~~~
dh.c: In file included from dh.c:6:0:
dh.c: ffa.h:19:10: note: called from here
dh.c: tp_1 = _mm_clmulepi64_si128(p_149_0, tp_0, 0x01);\
dh.c: ^
dh.c: ffa.h:82:5: note: in expansion of macro 'ffa_red_149_stp'
dh.c: ...

Number of similar (compiler,implementation) pairs: 4, namely:
Compiler  Implementations
gcc -m64 -march=barcelona -O2 -fomit-frame-pointer ref
gcc -m64 -march=barcelona -O3 -fomit-frame-pointer ref
gcc -m64 -march=barcelona -O -fomit-frame-pointer ref
gcc -m64 -march=barcelona -Os -fomit-frame-pointer ref