Implementation notes: amd64, hertz, crypto_hash/blake256

Computer: hertz
Microarchitecture: amd64; Zen 4 (a60f12)
Architecture: amd64
CPU ID: AuthenticAMD-00a60f12-178bfbff
SUPERCOP version: 20240425
Operation: crypto_hash
Primitive: blake256
TimeObject sizeTest sizeImplementationCompilerBenchmark dateSUPERCOP version
124305897 0 023510 828 968sse41-2clang_-march=native_-O2_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2024042920240425
124325854 0 017352 820 968sse41-2clang_-march=native_-Os_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2024042920240425
124376417 0 024142 828 968sse41-2clang_-march=native_-O3_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2024042920240425
127755746 0 023358 828 968sse2clang_-march=native_-O2_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2024042920240425
127755690 0 017208 820 968sse2clang_-march=native_-Os_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2024042920240425
127816218 0 023942 828 968sse2clang_-march=native_-O3_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2024042920240425
127936668 0 024423 828 968avxsclang_-march=native_-O3_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2024042920240425
128196148 0 023775 828 968avxsclang_-march=native_-O2_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2024042920240425
128246112 0 017704 820 968avxsclang_-march=native_-Os_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2024042920240425
128406531 0 017773 804 968avxsgcc_-march=native_-mtune=native_-O2_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE2024042920240425
128486171 0 016040 780 936avxsgcc_-march=native_-mtune=native_-Os_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE2024042920240425
128726531 0 019797 804 1032avxsgcc_-march=native_-mtune=native_-O3_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE2024042920240425
135509041 0 026823 828 968sphlib-smallclang_-march=native_-O2_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2024042920240425
135989041 0 026935 828 968sphlib-smallclang_-march=native_-O3_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2024042920240425
1377313066 0 024757 804 968bswapgcc_-march=native_-mtune=native_-O2_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE2023121920231217
1380413736 0 025405 804 968regsgcc_-march=native_-mtune=native_-O2_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE2023121920231217
138107260 0 018912 820 968sphlib-smallclang_-march=native_-Os_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2024042920240425
1382313502 0 027229 804 1032bswapgcc_-march=native_-mtune=native_-O3_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE2023121920231217
1382414112 0 027853 804 1032regsgcc_-march=native_-mtune=native_-O3_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE2023121920231217
139546525 0 018056 820 968sse2-2clang_-march=native_-Os_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2024042920240425
1395810468 0 020712 780 936bswapgcc_-march=native_-mtune=native_-Os_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE2023121920231217
139606581 0 024190 828 968sse2-2clang_-march=native_-O2_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2024042920240425
139977053 0 024774 828 968sse2-2clang_-march=native_-O3_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2024042920240425
1400210462 0 020696 780 936regsgcc_-march=native_-mtune=native_-Os_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE2023121920231217
140442172 0 013736 820 968refclang_-march=native_-Os_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2024042920240425
1424628962 0 040733 804 968sphlibgcc_-march=native_-mtune=native_-O2_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE2023121920231217
142515988 0 016232 780 936sse41-2gcc_-march=native_-mtune=native_-Os_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE2023121920231217
1427529513 0 043357 804 1032sphlibgcc_-march=native_-mtune=native_-O3_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE2023121920231217
144587510 0 021229 804 1032sse41gcc_-march=native_-mtune=native_-O3_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE2023121920231217
144746154 0 016392 780 936sse41gcc_-march=native_-mtune=native_-Os_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE2023121920231217
144907160 0 018845 804 968sse41gcc_-march=native_-mtune=native_-O2_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE2023121920231217
146566478 0 016720 780 936ssse3gcc_-march=native_-mtune=native_-Os_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE2023121920231217
146737144 0 018829 804 968ssse3gcc_-march=native_-mtune=native_-O2_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE2023121920231217
147147494 0 021213 804 1032ssse3gcc_-march=native_-mtune=native_-O3_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE2023121920231217
149012778 0 020447 828 968refclang_-march=native_-O2_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2024042920240425
1494010409 0 028151 828 968bswapclang_-march=native_-O3_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2024042920240425
149759929 0 027559 828 968bswapclang_-march=native_-O2_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2024042920240425
150718006 0 021741 804 1032sse2gcc_-march=native_-mtune=native_-O3_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE2023121920231217
151317656 0 019325 804 968sse2gcc_-march=native_-mtune=native_-O2_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE2023121920231217
151367274 0 017512 780 936sse2gcc_-march=native_-mtune=native_-Os_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE2023121920231217
152953714 0 021495 828 968refclang_-march=native_-O3_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2024042920240425
153607896 0 018136 780 936sse2-2gcc_-march=native_-mtune=native_-Os_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE2023121920231217
154036559 0 020285 804 1032sse41-2gcc_-march=native_-mtune=native_-O3_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE2023121920231217
154266559 0 018253 804 968sse41-2gcc_-march=native_-mtune=native_-O2_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE2023121920231217
154519174 0 022909 804 1032sse2-2gcc_-march=native_-mtune=native_-O3_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE2023121920231217
154678824 0 020493 804 968sse2-2gcc_-march=native_-mtune=native_-O2_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE2023121920231217
1573326061 0 043895 828 968sphlibclang_-march=native_-O3_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2024042920240425
1579326061 0 043783 828 968sphlibclang_-march=native_-O2_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2024042920240425
1600210264 0 021824 820 968bswapclang_-march=native_-Os_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2024042920240425
1611426100 0 037744 820 968sphlibclang_-march=native_-Os_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2024042920240425
1625324536 0 034872 780 936sphlibgcc_-march=native_-mtune=native_-Os_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE2023121920231217
1651511088 0 028734 828 968regsclang_-march=native_-O2_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2024042920240425
1651511560 0 029318 828 968regsclang_-march=native_-O3_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2024042920240425
1760311048 0 022568 820 968regsclang_-march=native_-Os_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2024042920240425
187577650 0 017976 780 936sphlib-smallgcc_-march=native_-mtune=native_-Os_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE2023121920231217
210132693 0 012936 780 936refgcc_-march=native_-mtune=native_-Os_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE2023121920231217
2285811074 0 022845 804 968sphlib-smallgcc_-march=native_-mtune=native_-O2_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE2023121920231217
2288611609 0 025453 804 1032sphlib-smallgcc_-march=native_-mtune=native_-O3_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE2023121920231217
239415280 0 018989 804 1032refgcc_-march=native_-mtune=native_-O3_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE2023121920231217
247504840 0 016509 804 968refgcc_-march=native_-mtune=native_-O2_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE2023121920231217
2699311074 0 021320 780 936sandygcc_-march=native_-mtune=native_-Os_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE2023121920231217
2712110824 0 028583 828 968sandyclang_-march=native_-O3_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2024042920240425
2713510344 0 027991 828 968sandyclang_-march=native_-O2_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2024042920240425
2750710442 0 021984 820 968sandyclang_-march=native_-Os_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2024042920240425
2762512906 0 024597 804 968sandygcc_-march=native_-mtune=native_-O2_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE2023121920231217
2773413342 0 027069 804 1032sandygcc_-march=native_-mtune=native_-O3_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE2023121920231217

Compiler output

Implementation: sse41
Security model: constbranchindex
Compiler: clang -march=native -O2 -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE
hash.c: In file included from hash.c:121:
hash.c: ./rounds.sse41.h:17:55: warning: implicit conversion from 'long' to 'int' changes value from 2242054355 to -2052912941 [-Wconstant-conversion]
hash.c: 17 | buf2 = _mm_set_epi32(3964562569, 698298832, 57701188, 2242054355);
hash.c: | ~~~~~~~~~~~~~ ^~~~~~~~~~
hash.c: ./rounds.sse41.h:17:22: warning: implicit conversion from 'long' to 'int' changes value from 3964562569 to -330404727 [-Wconstant-conversion]
hash.c: 17 | buf2 = _mm_set_epi32(3964562569, 698298832, 57701188, 2242054355);
hash.c: | ~~~~~~~~~~~~~ ^~~~~~~~~~
hash.c: ./rounds.sse41.h:20:33: warning: implicit conversion from 'long' to 'int' changes value from 2752067618 to -1542899678 [-Wconstant-conversion]
hash.c: 20 | buf1 = _mm_set_epi32(137296536, 2752067618, 320440878, 608135816);
hash.c: | ~~~~~~~~~~~~~ ^~~~~~~~~~
hash.c: ./rounds.sse41.h:47:34: warning: implicit conversion from 'long' to 'int' changes value from 3380367581 to -914599715 [-Wconstant-conversion]
hash.c: 47 | buf2 = _mm_set_epi32(3041331479, 3380367581, 887688300, 953160567);
hash.c: | ~~~~~~~~~~~~~ ^~~~~~~~~~
hash.c: ./rounds.sse41.h:47:22: warning: implicit conversion from 'long' to 'int' changes value from 3041331479 to -1253635817 [-Wconstant-conversion]
hash.c: 47 | buf2 = _mm_set_epi32(3041331479, 3380367581, 887688300, 953160567);
hash.c: | ~~~~~~~~~~~~~ ^~~~~~~~~~
hash.c: ./rounds.sse41.h:50:46: warning: implicit conversion from 'long' to 'int' changes value from 3193202383 to -1101764913 [-Wconstant-conversion]
hash.c: 50 | buf1 = _mm_set_epi32(1065670069, 3232508343, 3193202383, 1160258022);
hash.c: | ~~~~~~~~~~~~~ ^~~~~~~~~~
hash.c: ./rounds.sse41.h:50:34: warning: implicit conversion from 'long' to 'int' changes value from 3232508343 to -1062458953 [-Wconstant-conversion]
hash.c: 50 | buf1 = _mm_set_epi32(1065670069, 3232508343, 3193202383, 1160258022);
hash.c: | ~~~~~~~~~~~~~ ^~~~~~~~~~
hash.c: ./rounds.sse41.h:81:57: warning: implicit conversion from 'long' to 'int' changes value from 3193202383 to -1101764913 [-Wconstant-conversion]
hash.c: 81 | buf2 = _mm_set_epi32(137296536, 3041331479, 1160258022, 3193202383);
hash.c: | ~~~~~~~~~~~~~ ^~~~~~~~~~
hash.c: ...

Number of similar (compiler,implementation) pairs: 3, namely:
CompilerImplementations
clang -march=native -O2 -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE sse41
clang -march=native -O3 -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE sse41
clang -march=native -Os -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE sse41

Compiler output

Implementation: sse41-2
Security model: constbranchindex
Compiler: clang -march=native -O2 -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE
hash.c: In file included from hash.c:2:
hash.c: ./blake256.h:105:15: warning: '_mm_roti_epi32' macro redefined [-Wmacro-redefined]
hash.c: 105 | #define _mm_roti_epi32(r, c) ((8==-c) ? _mm_shuffle_epi8(r,r8) : ( (16==-c) ? _mm_shuffle_epi8(r,r16) : _mm_xor_si128(_mm_srli_epi32( (r), -(c) ),_mm_slli_epi32( (r), 32-(-c) )) ) )
hash.c: | ^
hash.c: /usr/lib/llvm-18/lib/clang/18/include/xopintrin.h:233:9: note: previous definition is here
hash.c: 233 | #define _mm_roti_epi32(A, N) \
hash.c: | ^
hash.c: 1 warning generated.

Number of similar (compiler,implementation) pairs: 3, namely:
CompilerImplementations
clang -march=native -O2 -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE sse41-2
clang -march=native -O3 -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE sse41-2
clang -march=native -Os -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE sse41-2

Compiler output

Implementation: ssse3
Security model: constbranchindex
Compiler: clang -march=native -O2 -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE
hash.c: In file included from hash.c:122:
hash.c: ./rounds.ssse3.h:3:55: warning: implicit conversion from 'long' to 'int' changes value from 2242054355 to -2052912941 [-Wconstant-conversion]
hash.c: 3 | buf2 = _mm_set_epi32(3964562569, 698298832, 57701188, 2242054355);
hash.c: | ~~~~~~~~~~~~~ ^~~~~~~~~~
hash.c: ./rounds.ssse3.h:3:22: warning: implicit conversion from 'long' to 'int' changes value from 3964562569 to -330404727 [-Wconstant-conversion]
hash.c: 3 | buf2 = _mm_set_epi32(3964562569, 698298832, 57701188, 2242054355);
hash.c: | ~~~~~~~~~~~~~ ^~~~~~~~~~
hash.c: ./rounds.ssse3.h:6:33: warning: implicit conversion from 'long' to 'int' changes value from 2752067618 to -1542899678 [-Wconstant-conversion]
hash.c: 6 | buf1 = _mm_set_epi32(137296536, 2752067618, 320440878, 608135816);
hash.c: | ~~~~~~~~~~~~~ ^~~~~~~~~~
hash.c: ./rounds.ssse3.h:27:34: warning: implicit conversion from 'long' to 'int' changes value from 3380367581 to -914599715 [-Wconstant-conversion]
hash.c: 27 | buf2 = _mm_set_epi32(3041331479, 3380367581, 887688300, 953160567);
hash.c: | ~~~~~~~~~~~~~ ^~~~~~~~~~
hash.c: ./rounds.ssse3.h:27:22: warning: implicit conversion from 'long' to 'int' changes value from 3041331479 to -1253635817 [-Wconstant-conversion]
hash.c: 27 | buf2 = _mm_set_epi32(3041331479, 3380367581, 887688300, 953160567);
hash.c: | ~~~~~~~~~~~~~ ^~~~~~~~~~
hash.c: ./rounds.ssse3.h:30:46: warning: implicit conversion from 'long' to 'int' changes value from 3193202383 to -1101764913 [-Wconstant-conversion]
hash.c: 30 | buf1 = _mm_set_epi32(1065670069, 3232508343, 3193202383, 1160258022);
hash.c: | ~~~~~~~~~~~~~ ^~~~~~~~~~
hash.c: ./rounds.ssse3.h:30:34: warning: implicit conversion from 'long' to 'int' changes value from 3232508343 to -1062458953 [-Wconstant-conversion]
hash.c: 30 | buf1 = _mm_set_epi32(1065670069, 3232508343, 3193202383, 1160258022);
hash.c: | ~~~~~~~~~~~~~ ^~~~~~~~~~
hash.c: ./rounds.ssse3.h:51:57: warning: implicit conversion from 'long' to 'int' changes value from 3193202383 to -1101764913 [-Wconstant-conversion]
hash.c: 51 | buf2 = _mm_set_epi32(137296536, 3041331479, 1160258022, 3193202383);
hash.c: | ~~~~~~~~~~~~~ ^~~~~~~~~~
hash.c: ...

Number of similar (compiler,implementation) pairs: 3, namely:
CompilerImplementations
clang -march=native -O2 -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE ssse3
clang -march=native -O3 -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE ssse3
clang -march=native -Os -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE ssse3

Compiler output

Implementation: xop
Security model: constbranchindex
Compiler: clang -march=native -O2 -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE
hash.c: hash.c:115:3: error: always_inline function '_mm_perm_epi8' requires target feature 'xop', but would be inlined into function 'blake256_compress' that is compiled without support for 'xop'
hash.c: 115 | ROUND( 0);
hash.c: | ^
hash.c: ./rounds.h:51:3: note: expanded from macro 'ROUND'
hash.c: 51 | LOAD_MSG_ ##r ##_1(buf1); \
hash.c: | ^
hash.c: <scratch space>:215:1: note: expanded from here
hash.c: 215 | LOAD_MSG_0_1
hash.c: | ^
hash.c: ./load.xop.h:19:6: note: expanded from macro 'LOAD_MSG_0_1'
hash.c: 19 | s0 = _mm_perm_epi8(m0, m1, _mm_set_epi32(TOB(6),TOB(4),TOB(2),TOB(0)) ); \
hash.c: | ^
hash.c: hash.c:115:3: error: '__builtin_ia32_vprotdi' needs target feature xop
hash.c: ./rounds.h:52:3: note: expanded from macro 'ROUND'
hash.c: 52 | G1(row1,row2,row3,row4,buf1); \
hash.c: | ^
hash.c: ./rounds.h:8:10: note: expanded from macro 'G1'
hash.c: 8 | row4 = _mm_roti_epi32(row4, -16); \
hash.c: | ^
hash.c: /usr/lib/llvm-18/lib/clang/18/include/xopintrin.h:234:13: note: expanded from macro '_mm_roti_epi32'
hash.c: 234 | ((__m128i)__builtin_ia32_vprotdi((__v4si)(__m128i)(A), (N)))
hash.c: | ^
hash.c: hash.c:115:3: error: '__builtin_ia32_vprotdi' needs target feature xop
hash.c: ./rounds.h:52:3: note: expanded from macro 'ROUND'
hash.c: 52 | G1(row1,row2,row3,row4,buf1); \
hash.c: ...

Number of similar (compiler,implementation) pairs: 3, namely:
CompilerImplementations
clang -march=native -O2 -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE xop
clang -march=native -O3 -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE xop
clang -march=native -Os -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE xop

Compiler output

Implementation: xop
Security model: constbranchindex
Compiler: gcc -march=native -mtune=native -O2 -fomit-frame-pointer -fwrapv -fPIC -fPIE
hash.c: In file included from /usr/lib/gcc/x86_64-linux-gnu/13/include/x86intrin.h:38,
hash.c: from blake256.h:7,
hash.c: from hash.c:2:
hash.c: /usr/lib/gcc/x86_64-linux-gnu/13/include/xopintrin.h: In function 'blake256_compress':
hash.c: /usr/lib/gcc/x86_64-linux-gnu/13/include/xopintrin.h:266:1: error: inlining failed in call to 'always_inline' '_mm_roti_epi32': target specific option mismatch
hash.c: 266 | _mm_roti_epi32(__m128i __A, const int __B)
hash.c: | ^~~~~~~~~~~~~~
hash.c: In file included from blake256.h:127:
hash.c: rounds.h:19:10: note: called from here
hash.c: 19 | row2 = _mm_roti_epi32(row2, -7); \
hash.c: | ^~~~~~~~~~~~~~~~~~~~~~~~
hash.c: rounds.h:59:3: note: in expansion of macro 'G2'
hash.c: 59 | G2(row1,row2,row3,row4,buf4); \
hash.c: | ^~
hash.c: hash.c:128:3: note: in expansion of macro 'ROUND'
hash.c: 128 | ROUND(13);
hash.c: | ^~~~~
hash.c: /usr/lib/gcc/x86_64-linux-gnu/13/include/xopintrin.h:266:1: error: inlining failed in call to 'always_inline' '_mm_roti_epi32': target specific option mismatch
hash.c: 266 | _mm_roti_epi32(__m128i __A, const int __B)
hash.c: | ^~~~~~~~~~~~~~
hash.c: rounds.h:16:10: note: called from here
hash.c: 16 | row4 = _mm_roti_epi32(row4, -8); \
hash.c: | ^~~~~~~~~~~~~~~~~~~~~~~~
hash.c: rounds.h:59:3: note: in expansion of macro 'G2'
hash.c: 59 | G2(row1,row2,row3,row4,buf4); \
hash.c: ...

Number of similar (compiler,implementation) pairs: 3, namely:
CompilerImplementations
gcc -march=native -mtune=native -O2 -fomit-frame-pointer -fwrapv -fPIC -fPIE xop
gcc -march=native -mtune=native -O3 -fomit-frame-pointer -fwrapv -fPIC -fPIE xop
gcc -march=native -mtune=native -Os -fomit-frame-pointer -fwrapv -fPIC -fPIE xop