Implementation notes: amd64, hydra7, crypto_sort/int32

Computer: hydra7
Microarchitecture: amd64; Sandy Bridge+AES (206a7)
Architecture: amd64
CPU ID: GenuineIntel-000206a7-bfebfbff
SUPERCOP version: 20240425
Operation: crypto_sort
Primitive: int32
TimeObject sizeTest sizeImplementationCompilerBenchmark dateSUPERCOP version
239381525 0 014093 804 960T:radix256mlgcc_-march=native_-mtune=native_-O3_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE2024042620240425
246121303 0 011224 780 928T:radix256mlgcc_-march=native_-mtune=native_-Os_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE2024042620240425
247031525 0 012845 804 960T:radix256mlgcc_-march=native_-mtune=native_-O2_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE2024042620240425
264291946 0 011872 780 928T:radix256smlgcc_-march=native_-mtune=native_-Os_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE2024042620240425
270672205 0 013533 804 960T:radix256smlgcc_-march=native_-mtune=native_-O2_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE2024042620240425
272612264 0 013260 796 960T:radix256smlgcc_-march=native_-mtune=native_-O_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE2024042620240425
276802205 0 014781 804 960T:radix256smlgcc_-march=native_-mtune=native_-O3_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE2024042620240425
278421535 0 012524 796 960T:radix256mlgcc_-march=native_-mtune=native_-O_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE2024042620240425
312311328 0 012719 812 992T:stdsortg++_-march=native_-mtune=native_-O2_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE2024042620240425
317261141 0 012238 804 992T:stdsortg++_-march=native_-mtune=native_-O_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE2024042620240425
32565632 0 010536 780 928T:herfgcc_-march=native_-mtune=native_-Os_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE2024042620240425
33147710 0 012005 804 960T:herfgcc_-march=native_-mtune=native_-O2_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE2024042620240425
33995709 0 013253 804 960T:herfgcc_-march=native_-mtune=native_-O3_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE2024042620240425
350081364 0 014023 812 992T:stdsortg++_-march=native_-mtune=native_-O3_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE2024042620240425
38671704 0 011684 796 960T:herfgcc_-march=native_-mtune=native_-O_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE2024042620240425
39256930 0 010946 788 960T:stdsortg++_-march=native_-mtune=native_-Os_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE2024042620240425
746478840 0 020141 804 960x863gcc_-march=native_-mtune=native_-O2_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE2024042620240425
761909258 0 021813 804 960x863gcc_-march=native_-mtune=native_-O3_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE2024042620240425
779989360 0 020348 796 960x863gcc_-march=native_-mtune=native_-O_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE2024042620240425
883281315 0 013861 804 960x86gcc_-march=native_-mtune=native_-O3_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE2024042620240425
899801315 0 012613 804 960x86gcc_-march=native_-mtune=native_-O2_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE2024042620240425
913371115 0 011024 780 928x86gcc_-march=native_-mtune=native_-Os_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE2024042620240425
916471416 0 012396 796 960x86gcc_-march=native_-mtune=native_-O_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE2024042620240425
957403935 0 013872 780 928x863gcc_-march=native_-mtune=native_-Os_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE2024042620240425
1080391664 0 014205 804 960portable4gcc_-march=native_-mtune=native_-O3_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE2024042620240425
116806731 0 010632 780 928portable4gcc_-march=native_-mtune=native_-Os_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE2024042620240425
118030887 0 012189 804 960portable4gcc_-march=native_-mtune=native_-O2_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE2024042620240425
125839947 0 011916 796 960portable4gcc_-march=native_-mtune=native_-O_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE2024042620240425
1262201333 0 011248 780 928portable5gcc_-march=native_-mtune=native_-Os_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE2024042620240425
128304469 0 011765 804 960compactgcc_-march=native_-mtune=native_-O2_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE2024042620240425
128462445 0 011749 804 960portable3gcc_-march=native_-mtune=native_-O2_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE2024042620240425
128779445 0 012997 804 960portable3gcc_-march=native_-mtune=native_-O3_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE2024042620240425
128832445 0 012997 804 960compactgcc_-march=native_-mtune=native_-O3_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE2024042620240425
1296152328 0 014877 804 960portable5gcc_-march=native_-mtune=native_-O3_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE2024042620240425
1301711555 0 012861 804 960portable5gcc_-march=native_-mtune=native_-O2_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE2024042620240425
135695474 0 011444 796 960portable3gcc_-march=native_-mtune=native_-O_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE2024042620240425
135705385 0 010296 780 928portable3gcc_-march=native_-mtune=native_-Os_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE2024042620240425
1370611651 0 012620 796 960portable5gcc_-march=native_-mtune=native_-O_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE2024042620240425
185169460 0 010376 780 928compactgcc_-march=native_-mtune=native_-Os_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE2024042620240425
205283612 0 011588 796 960compactgcc_-march=native_-mtune=native_-O_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE2024042620240425

Test failure

Implementation: T:krasnov
Security model: timingleaks
Compiler: gcc -march=native -mtune=native -O2 -fomit-frame-pointer -fwrapv -fPIC -fPIE
error 111

Number of similar (compiler,implementation) pairs: 4, namely:
CompilerImplementations
gcc -march=native -mtune=native -O2 -fomit-frame-pointer -fwrapv -fPIC -fPIE T:krasnov
gcc -march=native -mtune=native -O3 -fomit-frame-pointer -fwrapv -fPIC -fPIE T:krasnov
gcc -march=native -mtune=native -O -fomit-frame-pointer -fwrapv -fPIC -fPIE T:krasnov
gcc -march=native -mtune=native -Os -fomit-frame-pointer -fwrapv -fPIC -fPIE T:krasnov

Compiler output

Implementation: T:aspas
Security model: timingleaks
Compiler: g++ -march=native -mtune=native -O2 -fomit-frame-pointer -fwrapv -fPIC -fPIE
api.cpp: In file included from sorter.h:123,
api.cpp: from aspas.tcc:31,
api.cpp: from aspas.h:204,
api.cpp: from api.cpp:3:
api.cpp: sorter_avx.tcc:95:47: warning: ignoring attributes on template argument '__m256' [-Wignored-attributes]
api.cpp: 95 | typename std::enable_if<std::is_same<T, __m256>::value>::type
api.cpp: | ^
api.cpp: sorter_avx.tcc:149:47: warning: ignoring attributes on template argument '__m256' [-Wignored-attributes]
api.cpp: 149 | typename std::enable_if<std::is_same<T, __m256>::value>::type
api.cpp: | ^
api.cpp: sorter_avx.tcc:317:47: warning: ignoring attributes on template argument '__m256' [-Wignored-attributes]
api.cpp: 317 | typename std::enable_if<std::is_same<T, __m256>::value>::type
api.cpp: | ^
api.cpp: sorter_avx.tcc:647:48: warning: ignoring attributes on template argument '__m256i' [-Wignored-attributes]
api.cpp: 647 | typename std::enable_if<std::is_same<T, __m256i>::value>::type
api.cpp: | ^
api.cpp: sorter_avx.tcc:701:48: warning: ignoring attributes on template argument '__m256i' [-Wignored-attributes]
api.cpp: 701 | typename std::enable_if<std::is_same<T, __m256i>::value>::type
api.cpp: | ^
api.cpp: sorter_avx.tcc:854:48: warning: ignoring attributes on template argument '__m256i' [-Wignored-attributes]
api.cpp: 854 | typename std::enable_if<std::is_same<T, __m256i>::value>::type
api.cpp: | ^
api.cpp: sorter_avx.tcc:1184:48: warning: ignoring attributes on template argument '__m256d' [-Wignored-attributes]
api.cpp: 1184 | typename std::enable_if<std::is_same<T, __m256d>::value>::type
api.cpp: | ^
api.cpp: ...

Number of similar (compiler,implementation) pairs: 4, namely:
CompilerImplementations
g++ -march=native -mtune=native -O2 -fomit-frame-pointer -fwrapv -fPIC -fPIE T:aspas
g++ -march=native -mtune=native -O3 -fomit-frame-pointer -fwrapv -fPIC -fPIE T:aspas
g++ -march=native -mtune=native -O -fomit-frame-pointer -fwrapv -fPIC -fPIE T:aspas
g++ -march=native -mtune=native -Os -fomit-frame-pointer -fwrapv -fPIC -fPIE T:aspas

Compiler output

Implementation: avx2
Security model: constbranchindex
Compiler: gcc -march=native -mtune=native -O2 -fomit-frame-pointer -fwrapv -fPIC -fPIE
sort.c: In file included from /usr/lib/gcc/x86_64-linux-gnu/11/include/immintrin.h:47,
sort.c: from sort.c:4:
sort.c: sort.c: In function 'minmax_vector':
sort.c: /usr/lib/gcc/x86_64-linux-gnu/11/include/avx2intrin.h:363:1: error: inlining failed in call to 'always_inline' '_mm256_max_epi32': target specific option mismatch
sort.c: 363 | _mm256_max_epi32 (__m256i __A, __m256i __B)
sort.c: | ^~~~~~~~~~~~~~~~
sort.c: sort.c:12:21: note: called from here
sort.c: 12 | #define int32x8_max _mm256_max_epi32
sort.c: | ^
sort.c: sort.c:17:7: note: in expansion of macro 'int32x8_max'
sort.c: 17 | b = int32x8_max(a,b); \
sort.c: | ^~~~~~~~~~~
sort.c: sort.c:36:5: note: in expansion of macro 'int32x8_MINMAX'
sort.c: 36 | int32x8_MINMAX(x0,y0);
sort.c: | ^~~~~~~~~~~~~~
sort.c: In file included from /usr/lib/gcc/x86_64-linux-gnu/11/include/immintrin.h:47,
sort.c: from sort.c:4:
sort.c: /usr/lib/gcc/x86_64-linux-gnu/11/include/avx2intrin.h:405:1: error: inlining failed in call to 'always_inline' '_mm256_min_epi32': target specific option mismatch
sort.c: 405 | _mm256_min_epi32 (__m256i __A, __m256i __B)
sort.c: | ^~~~~~~~~~~~~~~~
sort.c: sort.c:11:21: note: called from here
sort.c: 11 | #define int32x8_min _mm256_min_epi32
sort.c: | ^
sort.c: sort.c:16:15: note: in expansion of macro 'int32x8_min'
sort.c: 16 | int32x8 c = int32x8_min(a,b); \
sort.c: ...

Number of similar (compiler,implementation) pairs: 4, namely:
CompilerImplementations
gcc -march=native -mtune=native -O2 -fomit-frame-pointer -fwrapv -fPIC -fPIE avx2
gcc -march=native -mtune=native -O3 -fomit-frame-pointer -fwrapv -fPIC -fPIE avx2
gcc -march=native -mtune=native -O -fomit-frame-pointer -fwrapv -fPIC -fPIE avx2
gcc -march=native -mtune=native -Os -fomit-frame-pointer -fwrapv -fPIC -fPIE avx2

Compiler output

Implementation: oldavx2
Security model: constbranchindex
Compiler: gcc -march=native -mtune=native -O2 -fomit-frame-pointer -fwrapv -fPIC -fPIE
int32_sort.c: In file included from /usr/lib/gcc/x86_64-linux-gnu/11/include/immintrin.h:47,
int32_sort.c: from int32_sort.c:3:
int32_sort.c: int32_sort.c: In function 'minmax02through1315':
int32_sort.c: /usr/lib/gcc/x86_64-linux-gnu/11/include/avx2intrin.h:878:1: error: inlining failed in call to 'always_inline' '_mm256_unpackhi_epi64': target specific option mismatch
int32_sort.c: 878 | _mm256_unpackhi_epi64 (__m256i __A, __m256i __B)
int32_sort.c: | ^~~~~~~~~~~~~~~~~~~~~
int32_sort.c: int32_sort.c:23:7: note: called from here
int32_sort.c: 23 | b = _mm256_unpackhi_epi64(g,h);
int32_sort.c: | ^~~~~~~~~~~~~~~~~~~~~~~~~~
int32_sort.c: In file included from /usr/lib/gcc/x86_64-linux-gnu/11/include/immintrin.h:47,
int32_sort.c: from int32_sort.c:3:
int32_sort.c: /usr/lib/gcc/x86_64-linux-gnu/11/include/avx2intrin.h:906:1: error: inlining failed in call to 'always_inline' '_mm256_unpacklo_epi64': target specific option mismatch
int32_sort.c: 906 | _mm256_unpacklo_epi64 (__m256i __A, __m256i __B)
int32_sort.c: | ^~~~~~~~~~~~~~~~~~~~~
int32_sort.c: int32_sort.c:22:7: note: called from here
int32_sort.c: 22 | a = _mm256_unpacklo_epi64(g,h);
int32_sort.c: | ^~~~~~~~~~~~~~~~~~~~~~~~~~
int32_sort.c: In file included from /usr/lib/gcc/x86_64-linux-gnu/11/include/immintrin.h:47,
int32_sort.c: from int32_sort.c:3:
int32_sort.c: /usr/lib/gcc/x86_64-linux-gnu/11/include/avx2intrin.h:363:1: error: inlining failed in call to 'always_inline' '_mm256_max_epi32': target specific option mismatch
int32_sort.c: 363 | _mm256_max_epi32 (__m256i __A, __m256i __B)
int32_sort.c: | ^~~~~~~~~~~~~~~~
int32_sort.c: int32_sort.c:21:15: note: called from here
int32_sort.c: 21 | __m256i h = _mm256_max_epi32(c,d);
int32_sort.c: | ^~~~~~~~~~~~~~~~~~~~~
int32_sort.c: ...

Number of similar (compiler,implementation) pairs: 4, namely:
CompilerImplementations
gcc -march=native -mtune=native -O2 -fomit-frame-pointer -fwrapv -fPIC -fPIE oldavx2
gcc -march=native -mtune=native -O3 -fomit-frame-pointer -fwrapv -fPIC -fPIE oldavx2
gcc -march=native -mtune=native -O -fomit-frame-pointer -fwrapv -fPIC -fPIE oldavx2
gcc -march=native -mtune=native -Os -fomit-frame-pointer -fwrapv -fPIC -fPIE oldavx2

Compiler output

Implementation: T:sid1607
Security model: timingleaks
Compiler: g++ -march=native -mtune=native -O2 -fomit-frame-pointer -fwrapv -fPIC -fPIE
merge_sort.cpp: In file included from /usr/lib/gcc/x86_64-linux-gnu/11/include/immintrin.h:47,
merge_sort.cpp: from merge_sort.h:1,
merge_sort.cpp: from merge_sort.cpp:1:
merge_sort.cpp: /usr/lib/gcc/x86_64-linux-gnu/11/include/avx2intrin.h: In function '__m256i reverse(__m256i&)':
merge_sort.cpp: /usr/lib/gcc/x86_64-linux-gnu/11/include/avx2intrin.h:1044:1: error: inlining failed in call to 'always_inline' '__m256i _mm256_permutevar8x32_epi32(__m256i, __m256i)': target specific option mismatch
merge_sort.cpp: 1044 | _mm256_permutevar8x32_epi32 (__m256i __X, __m256i __Y)
merge_sort.cpp: | ^~~~~~~~~~~~~~~~~~~~~~~~~~~
merge_sort.cpp: merge_sort.cpp:11:37: note: called from here
merge_sort.cpp: 11 | return _mm256_permutevar8x32_epi32(v, global_masks.rev_idx_mask);
merge_sort.cpp: | ~~~~~~~~~~~~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Number of similar (compiler,implementation) pairs: 4, namely:
CompilerImplementations
g++ -march=native -mtune=native -O2 -fomit-frame-pointer -fwrapv -fPIC -fPIE T:sid1607
g++ -march=native -mtune=native -O3 -fomit-frame-pointer -fwrapv -fPIC -fPIE T:sid1607
g++ -march=native -mtune=native -O -fomit-frame-pointer -fwrapv -fPIC -fPIE T:sid1607
g++ -march=native -mtune=native -Os -fomit-frame-pointer -fwrapv -fPIC -fPIE T:sid1607