Implementation notes: amd64, colossus7, crypto_sort/int32

Computer: colossus7
Architecture: amd64
CPU ID: AuthenticAMD-00830f10-178bfbff
SUPERCOP version: 20210125
Operation: crypto_sort
Primitive: int32
TimeObject sizeTest sizeImplementationCompilerBenchmark dateSUPERCOP version
30829245 0 018488 744 736avx2clang_-march=native_-Os_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2021031120210125
319510522 0 022174 752 752avx2clang_-march=native_-O3_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2021031120210125
32409994 0 021790 752 752avx2clang_-march=native_-O2_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2021031120210125
32409994 0 021790 752 752avx2clang_-march=native_-O_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2021031120210125
85271229 0 012966 752 752T:radix256mlclang_-march=native_-O2_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2021031120210125
85281229 0 012838 752 752T:radix256mlclang_-march=native_-O3_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2021031120210125
85501170 0 010368 744 736T:radix256mlclang_-march=native_-Os_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2021031120210125
91351229 0 012966 752 752T:radix256mlclang_-march=native_-O_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2021031120210125
93602075 0 013726 752 736T:radix256mlclang_-mcpu=native_-O3_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2021031120210125
98555074 0 016678 752 752oldavx2clang_-march=native_-O3_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2021031120210125
98553552 0 012776 744 736oldavx2clang_-march=native_-Os_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2021031120210125
100574754 0 016486 752 752oldavx2clang_-march=native_-O2_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2021031120210125
100574754 0 016486 752 752oldavx2clang_-march=native_-O_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2021031120210125
1098017912 0 029670 752 752T:krasnovclang_-march=native_-O2_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2021031120210125
1107017912 0 029526 752 752T:krasnovclang_-march=native_-O3_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2021031120210125
1111517912 0 029670 752 752T:krasnovclang_-march=native_-O_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2021031120210125
1118217913 0 029582 752 736T:krasnovclang_-mcpu=native_-O3_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2021031120210125
116552280 0 013902 752 752T:radix256smlclang_-march=native_-O3_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2021031120210125
117452280 0 014030 752 752T:radix256smlclang_-march=native_-O_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2021031120210125
121503241 0 014902 752 736T:radix256smlclang_-mcpu=native_-O3_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2021031120210125
12218546 0 012150 752 752T:herfclang_-march=native_-O3_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2021031120210125
12240513 0 09696 744 736T:herfclang_-march=native_-Os_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2021031120210125
1246517869 0 027088 744 736T:krasnovclang_-march=native_-Os_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2021031120210125
12578950 0 012590 752 736T:herfclang_-mcpu=native_-O3_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2021031120210125
12757546 0 012278 752 752T:herfclang_-march=native_-O2_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2021031120210125
12780546 0 012278 752 752T:herfclang_-march=native_-O_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2021031120210125
131402280 0 014030 752 752T:radix256smlclang_-march=native_-O2_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2021031120210125
145131831 0 011040 744 736T:radix256smlclang_-march=native_-Os_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2021031120210125
166731726 0 013589 792 744T:stdsortclang++_-march=native_-O_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2021031120210125
169421726 0 013589 792 744T:stdsortclang++_-march=native_-O2_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2021031120210125
178421245 0 010599 784 728T:stdsortclang++_-march=native_-Os_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2021031120210125
182021616 0 013349 792 744T:stdsortclang++_-march=native_-O3_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2021031120210125
418955306 0 016926 752 752x863clang_-march=native_-O3_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2021031120210125
420985336 0 017070 752 752x863clang_-march=native_-O_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2021031120210125
422784121 0 013328 744 736x863clang_-march=native_-Os_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2021031120210125
423455336 0 017070 752 752x863clang_-march=native_-O2_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2021031120210125
423675290 0 016950 752 736x863clang_-mcpu=native_-O3_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2021031120210125
443471182 0 010368 744 736x86clang_-march=native_-Os_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2021031120210125
443921441 0 013174 752 752x86clang_-march=native_-O2_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2021031120210125
444151601 0 013206 752 752x86clang_-march=native_-O3_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2021031120210125
449101441 0 013174 752 752x86clang_-march=native_-O_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2021031120210125
456751586 0 013230 752 736x86clang_-mcpu=native_-O3_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2021031120210125
541801268 0 012870 752 752portable4clang_-march=native_-O3_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2021031120210125
545851207 0 012934 752 752portable4clang_-march=native_-O2_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2021031120210125
547651207 0 012934 752 752portable4clang_-march=native_-O_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2021031120210125
576901265 0 012910 752 736portable4clang_-mcpu=native_-O3_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2021031120210125
591522520 0 014118 752 752portable5clang_-march=native_-O3_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2021031120210125
601422414 0 014134 752 752portable5clang_-march=native_-O2_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2021031120210125
605252414 0 014134 752 752portable5clang_-march=native_-O_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2021031120210125
61583747 0 09936 744 736portable4clang_-march=native_-Os_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2021031120210125
628872583 0 014222 752 736portable5clang_-mcpu=native_-O3_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2021031120210125
64283816 0 012422 752 752portable3clang_-march=native_-O3_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2021031120210125
649801447 0 010640 744 736portable5clang_-march=native_-Os_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2021031120210125
66757434 0 012078 752 736portable3clang_-mcpu=native_-O3_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2021031120210125
68895812 0 012566 752 752portable3clang_-march=native_-O_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2021031120210125
69660812 0 012566 752 752portable3clang_-march=native_-O2_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2021031120210125
71347371 0 09552 744 736portable3clang_-march=native_-Os_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2021031120210125

Test failure

Implementation: T:sid1607
Security model: timingleaks
Compiler: clang++ -march=native -O2 -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE
error 111

Number of similar (compiler,implementation) pairs: 4, namely:
CompilerImplementations
clang++ -march=native -O2 -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE T:sid1607
clang++ -march=native -O3 -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE T:sid1607
clang++ -march=native -O -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE T:sid1607
clang++ -march=native -Os -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE T:sid1607

Compiler output

Implementation: T:aspas
Security model: timingleaks
Compiler: clang++ -march=native -O2 -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE
api.cpp: In file included from api.cpp:3:
api.cpp: In file included from ./aspas.h:204:
api.cpp: In file included from ./aspas.tcc:32:
api.cpp: In file included from ./merger.h:64:
api.cpp: ./merger.tcc:187:17: warning: & has lower precedence than ==; == will be evaluated first [-Wparentheses]
api.cpp: if (num & 1 == 1)
api.cpp: ^~~~~~~~
api.cpp: ./merger.tcc:187:17: note: place parentheses around the '==' expression to silence this warning
api.cpp: if (num & 1 == 1)
api.cpp: ^
api.cpp: ( )
api.cpp: ./merger.tcc:187:17: note: place parentheses around the & expression to evaluate it first
api.cpp: if (num & 1 == 1)
api.cpp: ^
api.cpp: ( )
api.cpp: ./merger.tcc:196:13: warning: & has lower precedence than ==; == will be evaluated first [-Wparentheses]
api.cpp: if(count&1==1)
api.cpp: ^~~~~
api.cpp: ./merger.tcc:196:13: note: place parentheses around the '==' expression to silence this warning
api.cpp: if(count&1==1)
api.cpp: ^
api.cpp: ( )
api.cpp: ./merger.tcc:196:13: note: place parentheses around the & expression to evaluate it first
api.cpp: if(count&1==1)
api.cpp: ^
api.cpp: ...

Number of similar (compiler,implementation) pairs: 4, namely:
CompilerImplementations
clang++ -march=native -O2 -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE T:aspas
clang++ -march=native -O3 -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE T:aspas
clang++ -march=native -O -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE T:aspas
clang++ -march=native -Os -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE T:aspas

Compiler output

Implementation: avx2
Security model: constbranchindex
Compiler: clang -mcpu=native -O3 -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE
sort.c: sort.c:940:40: error: always_inline function '_mm256_set1_epi32' requires target feature 'avx', but would be inlined into function 'int32_sort' that is compiled without support for 'avx'
sort.c: for (i = q>>3;i < q>>2;++i) y[i] = _mm256_set1_epi32(0x7fffffff);
sort.c: ^
sort.c: sort.c:956:22: error: always_inline function '_mm256_loadu_si256' requires target feature 'avx', but would be inlined into function 'int32_sort' that is compiled without support for 'avx'
sort.c: int32x8 x0 = int32x8_load(&x[i]);
sort.c: ^
sort.c: sort.c:9:25: note: expanded from macro 'int32x8_load'
sort.c: #define int32x8_load(z) _mm256_loadu_si256((__m256i *) (z))
sort.c: ^
sort.c: sort.c:957:22: error: always_inline function '_mm256_loadu_si256' requires target feature 'avx', but would be inlined into function 'int32_sort' that is compiled without support for 'avx'
sort.c: int32x8 x1 = int32x8_load(&x[i+q]);
sort.c: ^
sort.c: sort.c:9:25: note: expanded from macro 'int32x8_load'
sort.c: #define int32x8_load(z) _mm256_loadu_si256((__m256i *) (z))
sort.c: ^
sort.c: sort.c:958:22: error: always_inline function '_mm256_loadu_si256' requires target feature 'avx', but would be inlined into function 'int32_sort' that is compiled without support for 'avx'
sort.c: int32x8 x2 = int32x8_load(&x[i+2*q]);
sort.c: ^
sort.c: sort.c:9:25: note: expanded from macro 'int32x8_load'
sort.c: #define int32x8_load(z) _mm256_loadu_si256((__m256i *) (z))
sort.c: ^
sort.c: sort.c:959:22: error: always_inline function '_mm256_loadu_si256' requires target feature 'avx', but would be inlined into function 'int32_sort' that is compiled without support for 'avx'
sort.c: int32x8 x3 = int32x8_load(&x[i+3*q]);
sort.c: ^
sort.c: sort.c:9:25: note: expanded from macro 'int32x8_load'
sort.c: ...

Number of similar (compiler,implementation) pairs: 1, namely:
CompilerImplementations
clang -mcpu=native -O3 -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE avx2

Compiler output

Implementation: T:herf
Security model: timingleaks
Compiler: clang -march=native -O2 -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE
sort.c: sort.c:133:17: warning: passing 'int32 [n]' to parameter of type 'uint32 *' (aka 'unsigned int *') converts between pointers to integer types with different sign [-Wpointer-sign]
sort.c: RadixSort11(x,y,n);
sort.c: ^
sort.c: sort.c:47:48: note: passing argument to parameter 'sort' here
sort.c: static void RadixSort11(uint32 *array, uint32 *sort, uint32 elements)
sort.c: ^
sort.c: 1 warning generated.

Number of similar (compiler,implementation) pairs: 5, namely:
CompilerImplementations
clang -march=native -O2 -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE T:herf
clang -march=native -O3 -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE T:herf
clang -march=native -O -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE T:herf
clang -march=native -Os -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE T:herf
clang -mcpu=native -O3 -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE T:herf

Compiler output

Implementation: oldavx2
Security model: constbranchindex
Compiler: clang -mcpu=native -O3 -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE
int32_sort.c: int32_sort.c:330:15: error: always_inline function '_mm256_loadu_si256' requires target feature 'avx', but would be inlined into function 'minmax8' that is compiled without support for 'avx'
int32_sort.c: __m256i a = _mm256_loadu_si256((__m256i *) x);
int32_sort.c: ^
int32_sort.c: int32_sort.c:331:15: error: always_inline function '_mm256_loadu_si256' requires target feature 'avx', but would be inlined into function 'minmax8' that is compiled without support for 'avx'
int32_sort.c: __m256i b = _mm256_loadu_si256((__m256i *) y);
int32_sort.c: ^
int32_sort.c: int32_sort.c:332:37: error: always_inline function '_mm256_min_epi32' requires target feature 'avx2', but would be inlined into function 'minmax8' that is compiled without support for 'avx2'
int32_sort.c: _mm256_storeu_si256((__m256i *) x,_mm256_min_epi32(a,b));
int32_sort.c: ^
int32_sort.c: int32_sort.c:332:3: error: always_inline function '_mm256_storeu_si256' requires target feature 'avx', but would be inlined into function 'minmax8' that is compiled without support for 'avx'
int32_sort.c: _mm256_storeu_si256((__m256i *) x,_mm256_min_epi32(a,b));
int32_sort.c: ^
int32_sort.c: int32_sort.c:333:37: error: always_inline function '_mm256_max_epi32' requires target feature 'avx2', but would be inlined into function 'minmax8' that is compiled without support for 'avx2'
int32_sort.c: _mm256_storeu_si256((__m256i *) y,_mm256_max_epi32(a,b));
int32_sort.c: ^
int32_sort.c: int32_sort.c:333:3: error: always_inline function '_mm256_storeu_si256' requires target feature 'avx', but would be inlined into function 'minmax8' that is compiled without support for 'avx'
int32_sort.c: _mm256_storeu_si256((__m256i *) y,_mm256_max_epi32(a,b));
int32_sort.c: ^
int32_sort.c: int32_sort.c:364:34: error: always_inline function '_mm_min_epi32' requires target feature 'sse4.1', but would be inlined into function 'minmax4' that is compiled without support for 'sse4.1'
int32_sort.c: _mm_storeu_si128((__m128i *) x,_mm_min_epi32(a,b));
int32_sort.c: ^
int32_sort.c: int32_sort.c:365:34: error: always_inline function '_mm_max_epi32' requires target feature 'sse4.1', but would be inlined into function 'minmax4' that is compiled without support for 'sse4.1'
int32_sort.c: _mm_storeu_si128((__m128i *) y,_mm_max_epi32(a,b));
int32_sort.c: ^
int32_sort.c: int32_sort.c:16:15: error: always_inline function '_mm256_loadu_si256' requires target feature 'avx', but would be inlined into function 'minmax02through1315' that is compiled without support for 'avx'
int32_sort.c: ...

Number of similar (compiler,implementation) pairs: 1, namely:
CompilerImplementations
clang -mcpu=native -O3 -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE oldavx2