Implementation notes: amd64, hydra7, crypto_sort/int32

Computer: hydra7
Microarchitecture: amd64; Sandy Bridge+AES (206a7)
Architecture: amd64
CPU ID: GenuineIntel-000206a7-bfebfbff
SUPERCOP version: 20240625
Operation: crypto_sort
Primitive: int32
TimeObject sizeTest sizeImplementationCompilerBenchmark dateSUPERCOP version
244411525 0 012845 804 960T:radix256mlgcc_-march=native_-mtune=native_-O2_-fwrapv_-fPIC_-fPIE_-gdwarf-4_-Wall2024062620240625
247191525 0 014093 804 960T:radix256mlgcc_-march=native_-mtune=native_-O3_-fwrapv_-fPIC_-fPIE_-gdwarf-4_-Wall2024062620240625
250762205 0 013533 804 960T:radix256smlgcc_-march=native_-mtune=native_-O2_-fwrapv_-fPIC_-fPIE_-gdwarf-4_-Wall2024062620240625
252452205 0 014781 804 960T:radix256smlgcc_-march=native_-mtune=native_-O3_-fwrapv_-fPIC_-fPIE_-gdwarf-4_-Wall2024062620240625
273202264 0 013260 796 960T:radix256smlgcc_-march=native_-mtune=native_-O_-fwrapv_-fPIC_-fPIE_-gdwarf-4_-Wall2024062620240625
277111303 0 011224 780 928T:radix256mlgcc_-march=native_-mtune=native_-Os_-fwrapv_-fPIC_-fPIE_-gdwarf-4_-Wall2024062620240625
277831535 0 012524 796 960T:radix256mlgcc_-march=native_-mtune=native_-O_-fwrapv_-fPIC_-fPIE_-gdwarf-4_-Wall2024062620240625
284841946 0 011872 780 928T:radix256smlgcc_-march=native_-mtune=native_-Os_-fwrapv_-fPIC_-fPIE_-gdwarf-4_-Wall2024062620240625
314871141 0 012238 804 992T:stdsortg++_-march=native_-mtune=native_-O_-fwrapv_-fPIC_-fPIE_-gdwarf-4_-Wall2024062620240625
323371328 0 012719 812 992T:stdsortg++_-march=native_-mtune=native_-O2_-fwrapv_-fPIC_-fPIE_-gdwarf-4_-Wall2024062620240625
33474710 0 012005 804 960T:herfgcc_-march=native_-mtune=native_-O2_-fwrapv_-fPIC_-fPIE_-gdwarf-4_-Wall2024062620240625
33686632 0 010536 780 928T:herfgcc_-march=native_-mtune=native_-Os_-fwrapv_-fPIC_-fPIE_-gdwarf-4_-Wall2024062620240625
33898709 0 013253 804 960T:herfgcc_-march=native_-mtune=native_-O3_-fwrapv_-fPIC_-fPIE_-gdwarf-4_-Wall2024062620240625
364961364 0 014023 812 992T:stdsortg++_-march=native_-mtune=native_-O3_-fwrapv_-fPIC_-fPIE_-gdwarf-4_-Wall2024062620240625
39690704 0 011684 796 960T:herfgcc_-march=native_-mtune=native_-O_-fwrapv_-fPIC_-fPIE_-gdwarf-4_-Wall2024062620240625
41792930 0 010946 788 960T:stdsortg++_-march=native_-mtune=native_-Os_-fwrapv_-fPIC_-fPIE_-gdwarf-4_-Wall2024062620240625
746588840 0 020141 804 960x863gcc_-march=native_-mtune=native_-O2_-fwrapv_-fPIC_-fPIE_-gdwarf-4_-Wall2024062620240625
759829258 0 021813 804 960x863gcc_-march=native_-mtune=native_-O3_-fwrapv_-fPIC_-fPIE_-gdwarf-4_-Wall2024062620240625
76782717 0 012013 804 960portable4gcc_-march=native_-mtune=native_-O2_-fwrapv_-fPIC_-fPIE_-gdwarf-4_-Wall2024062620240625
77111717 0 013261 804 960portable4gcc_-march=native_-mtune=native_-O3_-fwrapv_-fPIC_-fPIE_-gdwarf-4_-Wall2024062620240625
781609360 0 020348 796 960x863gcc_-march=native_-mtune=native_-O_-fwrapv_-fPIC_-fPIE_-gdwarf-4_-Wall2024062620240625
83062801 0 011772 796 960portable4gcc_-march=native_-mtune=native_-O_-fwrapv_-fPIC_-fPIE_-gdwarf-4_-Wall2024062620240625
87323375 0 012925 804 960portable3gcc_-march=native_-mtune=native_-O3_-fwrapv_-fPIC_-fPIE_-gdwarf-4_-Wall2024062620240625
87337375 0 012925 804 960compactgcc_-march=native_-mtune=native_-O3_-fwrapv_-fPIC_-fPIE_-gdwarf-4_-Wall2024062620240625
886751315 0 013861 804 960x86gcc_-march=native_-mtune=native_-O3_-fwrapv_-fPIC_-fPIE_-gdwarf-4_-Wall2024062620240625
888711283 0 013829 804 960portable5gcc_-march=native_-mtune=native_-O3_-fwrapv_-fPIC_-fPIE_-gdwarf-4_-Wall2024062620240625
894521283 0 012581 804 960portable5gcc_-march=native_-mtune=native_-O2_-fwrapv_-fPIC_-fPIE_-gdwarf-4_-Wall2024062620240625
899291315 0 012613 804 960x86gcc_-march=native_-mtune=native_-O2_-fwrapv_-fPIC_-fPIE_-gdwarf-4_-Wall2024062620240625
91370375 0 011677 804 960portable3gcc_-march=native_-mtune=native_-O2_-fwrapv_-fPIC_-fPIE_-gdwarf-4_-Wall2024062620240625
91472375 0 011677 804 960compactgcc_-march=native_-mtune=native_-O2_-fwrapv_-fPIC_-fPIE_-gdwarf-4_-Wall2024062620240625
915731416 0 012396 796 960x86gcc_-march=native_-mtune=native_-O_-fwrapv_-fPIC_-fPIE_-gdwarf-4_-Wall2024062620240625
915931401 0 012380 796 960portable5gcc_-march=native_-mtune=native_-O_-fwrapv_-fPIC_-fPIE_-gdwarf-4_-Wall2024062620240625
923381115 0 011024 780 928x86gcc_-march=native_-mtune=native_-Os_-fwrapv_-fPIC_-fPIE_-gdwarf-4_-Wall2024062620240625
959983935 0 013872 780 928x863gcc_-march=native_-mtune=native_-Os_-fwrapv_-fPIC_-fPIE_-gdwarf-4_-Wall2024062620240625
96797406 0 011380 796 960portable3gcc_-march=native_-mtune=native_-O_-fwrapv_-fPIC_-fPIE_-gdwarf-4_-Wall2024062620240625
97024406 0 011380 796 960compactgcc_-march=native_-mtune=native_-O_-fwrapv_-fPIC_-fPIE_-gdwarf-4_-Wall2024062620240625
97745320 0 010232 780 928portable3gcc_-march=native_-mtune=native_-Os_-fwrapv_-fPIC_-fPIE_-gdwarf-4_-Wall2024062620240625
97954320 0 010232 780 928compactgcc_-march=native_-mtune=native_-Os_-fwrapv_-fPIC_-fPIE_-gdwarf-4_-Wall2024062620240625
181110695 0 010616 780 928portable4gcc_-march=native_-mtune=native_-Os_-fwrapv_-fPIC_-fPIE_-gdwarf-4_-Wall2024062620240625
1963741179 0 011096 780 928portable5gcc_-march=native_-mtune=native_-Os_-fwrapv_-fPIC_-fPIE_-gdwarf-4_-Wall2024062620240625

Test failure


error 111

Number of similar (implementation,compiler) pairs: 4, namely:
ImplementationCompiler
T:krasnovgcc -march=native -mtune=native -O2 -fwrapv -fPIC -fPIE -gdwarf-4 -Wall (11.4.0)
T:krasnovgcc -march=native -mtune=native -O3 -fwrapv -fPIC -fPIE -gdwarf-4 -Wall (11.4.0)
T:krasnovgcc -march=native -mtune=native -O -fwrapv -fPIC -fPIE -gdwarf-4 -Wall (11.4.0)
T:krasnovgcc -march=native -mtune=native -Os -fwrapv -fPIC -fPIE -gdwarf-4 -Wall (11.4.0)

Compiler output


api.cpp: In file included from sorter_avx.tcc:27,
api.cpp:                  from sorter.h:123,
api.cpp:                  from aspas.tcc:31,
api.cpp:                  from aspas.h:204,
api.cpp:                  from api.cpp:3:
api.cpp: util/extintrin.h: In function '__m256i util::_my_mm256_min_epi32(__m256i, __m256i)':
api.cpp: util/extintrin.h:37:13: warning: unused variable 'max1' [-Wunused-variable]
api.cpp:    37 |     __m128i max1, min1, max2, min2;
api.cpp:       |             ^~~~
api.cpp: util/extintrin.h:37:25: warning: unused variable 'max2' [-Wunused-variable]
api.cpp:    37 |     __m128i max1, min1, max2, min2;
api.cpp:       |                         ^~~~
api.cpp: util/extintrin.h: In function '__m256i util::_my_mm256_max_epi32(__m256i, __m256i)':
api.cpp: util/extintrin.h:53:19: warning: unused variable 'min1' [-Wunused-variable]
api.cpp:    53 |     __m128i max1, min1, max2, min2;
api.cpp:       |                   ^~~~
api.cpp: util/extintrin.h:53:31: warning: unused variable 'min2' [-Wunused-variable]
api.cpp:    53 |     __m128i max1, min1, max2, min2;
api.cpp:       |                               ^~~~
api.cpp: In file included from sorter.h:123,
api.cpp:                  from aspas.tcc:31,
api.cpp:                  from aspas.h:204,
api.cpp:                  from api.cpp:3:
api.cpp: sorter_avx.tcc: At global scope:
api.cpp: sorter_avx.tcc:95:47: warning: ignoring attributes on template argument '__m256' [-Wignored-attributes]
api.cpp: ...

Number of similar (implementation,compiler) pairs: 4, namely:
ImplementationCompiler
T:aspasg++ -march=native -mtune=native -O2 -fwrapv -fPIC -fPIE -gdwarf-4 -Wall (11.4.0)
T:aspasg++ -march=native -mtune=native -O3 -fwrapv -fPIC -fPIE -gdwarf-4 -Wall (11.4.0)
T:aspasg++ -march=native -mtune=native -O -fwrapv -fPIC -fPIE -gdwarf-4 -Wall (11.4.0)
T:aspasg++ -march=native -mtune=native -Os -fwrapv -fPIC -fPIE -gdwarf-4 -Wall (11.4.0)

Compiler output


sort.c: In file included from /usr/lib/gcc/x86_64-linux-gnu/11/include/immintrin.h:47,
sort.c:                  from sort.c:4:
sort.c: sort.c: In function 'minmax_vector':
sort.c: /usr/lib/gcc/x86_64-linux-gnu/11/include/avx2intrin.h:363:1: error: inlining failed in call to 'always_inline' '_mm256_max_epi32': target specific option mismatch
sort.c:   363 | _mm256_max_epi32 (__m256i __A, __m256i __B)
sort.c:       | ^~~~~~~~~~~~~~~~
sort.c: sort.c:12:21: note: called from here
sort.c:    12 | #define int32x8_max _mm256_max_epi32
sort.c:       |                     ^
sort.c: sort.c:17:7: note: in expansion of macro 'int32x8_max'
sort.c:    17 |   b = int32x8_max(a,b); \
sort.c:       |       ^~~~~~~~~~~
sort.c: sort.c:36:5: note: in expansion of macro 'int32x8_MINMAX'
sort.c:    36 |     int32x8_MINMAX(x0,y0);
sort.c:       |     ^~~~~~~~~~~~~~
sort.c: In file included from /usr/lib/gcc/x86_64-linux-gnu/11/include/immintrin.h:47,
sort.c:                  from sort.c:4:
sort.c: /usr/lib/gcc/x86_64-linux-gnu/11/include/avx2intrin.h:405:1: error: inlining failed in call to 'always_inline' '_mm256_min_epi32': target specific option mismatch
sort.c:   405 | _mm256_min_epi32 (__m256i __A, __m256i __B)
sort.c:       | ^~~~~~~~~~~~~~~~
sort.c: sort.c:11:21: note: called from here
sort.c:    11 | #define int32x8_min _mm256_min_epi32
sort.c:       |                     ^
sort.c: sort.c:16:15: note: in expansion of macro 'int32x8_min'
sort.c:    16 |   int32x8 c = int32x8_min(a,b); \
sort.c: ...

Number of similar (implementation,compiler) pairs: 4, namely:
ImplementationCompiler
avx2gcc -march=native -mtune=native -O2 -fwrapv -fPIC -fPIE -gdwarf-4 -Wall (11.4.0)
avx2gcc -march=native -mtune=native -O3 -fwrapv -fPIC -fPIE -gdwarf-4 -Wall (11.4.0)
avx2gcc -march=native -mtune=native -O -fwrapv -fPIC -fPIE -gdwarf-4 -Wall (11.4.0)
avx2gcc -march=native -mtune=native -Os -fwrapv -fPIC -fPIE -gdwarf-4 -Wall (11.4.0)

Compiler output


sort.c: sort.c: In function 'crypto_sort_int32_herf_timingleaks':
sort.c: sort.c:133:17: warning: pointer targets in passing argument 2 of 'RadixSort11' differ in signedness [-Wpointer-sign]
sort.c:   133 |   RadixSort11(x,y,n);
sort.c:       |                 ^
sort.c:       |                 |
sort.c:       |                 int32 * {aka int *}
sort.c: sort.c:47:48: note: expected 'uint32 *' {aka 'unsigned int *'} but argument is of type 'int32 *' {aka 'int *'}
sort.c:    47 | static void RadixSort11(uint32 *array, uint32 *sort, uint32 elements)
sort.c:       |                                        ~~~~~~~~^~~~

Number of similar (implementation,compiler) pairs: 4, namely:
ImplementationCompiler
T:herfgcc -march=native -mtune=native -O2 -fwrapv -fPIC -fPIE -gdwarf-4 -Wall (11.4.0)
T:herfgcc -march=native -mtune=native -O3 -fwrapv -fPIC -fPIE -gdwarf-4 -Wall (11.4.0)
T:herfgcc -march=native -mtune=native -O -fwrapv -fPIC -fPIE -gdwarf-4 -Wall (11.4.0)
T:herfgcc -march=native -mtune=native -Os -fwrapv -fPIC -fPIE -gdwarf-4 -Wall (11.4.0)

Compiler output


int32_sort.c: In file included from /usr/lib/gcc/x86_64-linux-gnu/11/include/immintrin.h:47,
int32_sort.c:                  from int32_sort.c:3:
int32_sort.c: int32_sort.c: In function 'minmax02through1315':
int32_sort.c: /usr/lib/gcc/x86_64-linux-gnu/11/include/avx2intrin.h:878:1: error: inlining failed in call to 'always_inline' '_mm256_unpackhi_epi64': target specific option mismatch
int32_sort.c:   878 | _mm256_unpackhi_epi64 (__m256i __A, __m256i __B)
int32_sort.c:       | ^~~~~~~~~~~~~~~~~~~~~
int32_sort.c: int32_sort.c:23:7: note: called from here
int32_sort.c:    23 |   b = _mm256_unpackhi_epi64(g,h);
int32_sort.c:       |       ^~~~~~~~~~~~~~~~~~~~~~~~~~
int32_sort.c: In file included from /usr/lib/gcc/x86_64-linux-gnu/11/include/immintrin.h:47,
int32_sort.c:                  from int32_sort.c:3:
int32_sort.c: /usr/lib/gcc/x86_64-linux-gnu/11/include/avx2intrin.h:906:1: error: inlining failed in call to 'always_inline' '_mm256_unpacklo_epi64': target specific option mismatch
int32_sort.c:   906 | _mm256_unpacklo_epi64 (__m256i __A, __m256i __B)
int32_sort.c:       | ^~~~~~~~~~~~~~~~~~~~~
int32_sort.c: int32_sort.c:22:7: note: called from here
int32_sort.c:    22 |   a = _mm256_unpacklo_epi64(g,h);
int32_sort.c:       |       ^~~~~~~~~~~~~~~~~~~~~~~~~~
int32_sort.c: In file included from /usr/lib/gcc/x86_64-linux-gnu/11/include/immintrin.h:47,
int32_sort.c:                  from int32_sort.c:3:
int32_sort.c: /usr/lib/gcc/x86_64-linux-gnu/11/include/avx2intrin.h:363:1: error: inlining failed in call to 'always_inline' '_mm256_max_epi32': target specific option mismatch
int32_sort.c:   363 | _mm256_max_epi32 (__m256i __A, __m256i __B)
int32_sort.c:       | ^~~~~~~~~~~~~~~~
int32_sort.c: int32_sort.c:21:15: note: called from here
int32_sort.c:    21 |   __m256i h = _mm256_max_epi32(c,d);
int32_sort.c:       |               ^~~~~~~~~~~~~~~~~~~~~
int32_sort.c: ...

Number of similar (implementation,compiler) pairs: 4, namely:
ImplementationCompiler
oldavx2gcc -march=native -mtune=native -O2 -fwrapv -fPIC -fPIE -gdwarf-4 -Wall (11.4.0)
oldavx2gcc -march=native -mtune=native -O3 -fwrapv -fPIC -fPIE -gdwarf-4 -Wall (11.4.0)
oldavx2gcc -march=native -mtune=native -O -fwrapv -fPIC -fPIE -gdwarf-4 -Wall (11.4.0)
oldavx2gcc -march=native -mtune=native -Os -fwrapv -fPIC -fPIE -gdwarf-4 -Wall (11.4.0)

Compiler output


merge_sort.cpp: In file included from /usr/lib/gcc/x86_64-linux-gnu/11/include/immintrin.h:47,
merge_sort.cpp:                  from merge_sort.h:1,
merge_sort.cpp:                  from merge_sort.cpp:1:
merge_sort.cpp: /usr/lib/gcc/x86_64-linux-gnu/11/include/avx2intrin.h: In function '__m256i reverse(__m256i&)':
merge_sort.cpp: /usr/lib/gcc/x86_64-linux-gnu/11/include/avx2intrin.h:1044:1: error: inlining failed in call to 'always_inline' '__m256i _mm256_permutevar8x32_epi32(__m256i, __m256i)': target specific option mismatch
merge_sort.cpp:  1044 | _mm256_permutevar8x32_epi32 (__m256i __X, __m256i __Y)
merge_sort.cpp:       | ^~~~~~~~~~~~~~~~~~~~~~~~~~~
merge_sort.cpp: merge_sort.cpp:11:37: note: called from here
merge_sort.cpp:    11 |   return _mm256_permutevar8x32_epi32(v, global_masks.rev_idx_mask);
merge_sort.cpp:       |          ~~~~~~~~~~~~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Number of similar (implementation,compiler) pairs: 4, namely:
ImplementationCompiler
T:sid1607g++ -march=native -mtune=native -O2 -fwrapv -fPIC -fPIE -gdwarf-4 -Wall (11.4.0)
T:sid1607g++ -march=native -mtune=native -O3 -fwrapv -fPIC -fPIE -gdwarf-4 -Wall (11.4.0)
T:sid1607g++ -march=native -mtune=native -O -fwrapv -fPIC -fPIE -gdwarf-4 -Wall (11.4.0)
T:sid1607g++ -march=native -mtune=native -Os -fwrapv -fPIC -fPIE -gdwarf-4 -Wall (11.4.0)

Passed TIMECOP


TIMECOP iterations: 10

Number of similar (implementation,compiler) pairs: 24, namely:
ImplementationCompiler
compactgcc -march=native -mtune=native -O2 -fwrapv -fPIC -fPIE -gdwarf-4 -Wall (11.4.0)
compactgcc -march=native -mtune=native -O3 -fwrapv -fPIC -fPIE -gdwarf-4 -Wall (11.4.0)
compactgcc -march=native -mtune=native -O -fwrapv -fPIC -fPIE -gdwarf-4 -Wall (11.4.0)
compactgcc -march=native -mtune=native -Os -fwrapv -fPIC -fPIE -gdwarf-4 -Wall (11.4.0)
portable3gcc -march=native -mtune=native -O2 -fwrapv -fPIC -fPIE -gdwarf-4 -Wall (11.4.0)
portable3gcc -march=native -mtune=native -O3 -fwrapv -fPIC -fPIE -gdwarf-4 -Wall (11.4.0)
portable3gcc -march=native -mtune=native -O -fwrapv -fPIC -fPIE -gdwarf-4 -Wall (11.4.0)
portable3gcc -march=native -mtune=native -Os -fwrapv -fPIC -fPIE -gdwarf-4 -Wall (11.4.0)
portable4gcc -march=native -mtune=native -O2 -fwrapv -fPIC -fPIE -gdwarf-4 -Wall (11.4.0)
portable4gcc -march=native -mtune=native -O3 -fwrapv -fPIC -fPIE -gdwarf-4 -Wall (11.4.0)
portable4gcc -march=native -mtune=native -O -fwrapv -fPIC -fPIE -gdwarf-4 -Wall (11.4.0)
portable4gcc -march=native -mtune=native -Os -fwrapv -fPIC -fPIE -gdwarf-4 -Wall (11.4.0)
portable5gcc -march=native -mtune=native -O2 -fwrapv -fPIC -fPIE -gdwarf-4 -Wall (11.4.0)
portable5gcc -march=native -mtune=native -O3 -fwrapv -fPIC -fPIE -gdwarf-4 -Wall (11.4.0)
portable5gcc -march=native -mtune=native -O -fwrapv -fPIC -fPIE -gdwarf-4 -Wall (11.4.0)
portable5gcc -march=native -mtune=native -Os -fwrapv -fPIC -fPIE -gdwarf-4 -Wall (11.4.0)
x86gcc -march=native -mtune=native -O2 -fwrapv -fPIC -fPIE -gdwarf-4 -Wall (11.4.0)
x86gcc -march=native -mtune=native -O3 -fwrapv -fPIC -fPIE -gdwarf-4 -Wall (11.4.0)
x86gcc -march=native -mtune=native -O -fwrapv -fPIC -fPIE -gdwarf-4 -Wall (11.4.0)
x86gcc -march=native -mtune=native -Os -fwrapv -fPIC -fPIE -gdwarf-4 -Wall (11.4.0)
x863gcc -march=native -mtune=native -O2 -fwrapv -fPIC -fPIE -gdwarf-4 -Wall (11.4.0)
x863gcc -march=native -mtune=native -O3 -fwrapv -fPIC -fPIE -gdwarf-4 -Wall (11.4.0)
x863gcc -march=native -mtune=native -O -fwrapv -fPIC -fPIE -gdwarf-4 -Wall (11.4.0)
x863gcc -march=native -mtune=native -Os -fwrapv -fPIC -fPIE -gdwarf-4 -Wall (11.4.0)