Implementation notes: amd64, comet, crypto_kem/ntskem1380

Computer: comet
Microarchitecture: amd64; Comet Lake (806ec)
Architecture: amd64
CPU ID: GenuineIntel-000806ec-bfebfbff
SUPERCOP version: 20240425
Operation: crypto_kem
Primitive: ntskem1380
TimeObject sizeTest sizeImplementationCompilerBenchmark dateSUPERCOP version
847340102660 84 16124019 988 1728T:avx2clang_-march=native_-O2_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2024050820240425
92137095470 84 16116979 988 1728T:sse2clang_-march=native_-O2_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2024050820240425
954492165084 84 16186219 988 1824T:sse2clang_-march=native_-O3_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2024050820240425
994298100874 84 16121398 932 1792T:avx2gcc_-march=native_-mtune=native_-O3_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE2024050820240425
1038530102152 84 16120219 988 1728T:avx2clang_-march=native_-O_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2024050820240425
108102384639 84 16103294 932 1792T:avx2gcc_-march=native_-mtune=native_-O2_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE2024050820240425
108620282133 84 16100222 932 1792T:avx2gcc_-march=native_-mtune=native_-O_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE2024050820240425
113989876339 84 1695025 980 1824T:avx2clang_-march=native_-Os_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2024050820240425
116381976895 84 1693805 924 1760T:avx2gcc_-march=native_-mtune=native_-Os_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE2024050820240425
117199195655 84 16113779 988 1728T:sse2clang_-march=native_-O_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2024050820240425
117817570532 84 1689289 980 1824T:sse2clang_-march=native_-Os_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2024050820240425
1330003144240 84 16166963 988 1824T:optclang_-march=native_-O3_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2024050820240425
145238387835 84 16110022 932 1792T:optgcc_-march=native_-mtune=native_-O3_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE2024050820240425
152656292081 84 16115147 988 1728T:optclang_-march=native_-O2_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2024050820240425
159466770814 84 1691118 932 1792T:optgcc_-march=native_-mtune=native_-O2_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE2024050820240425
160544163806 84 1682125 924 1760T:optgcc_-march=native_-mtune=native_-Os_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE2024050820240425
160728567935 84 1687614 932 1792T:optgcc_-march=native_-mtune=native_-O_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE2024050820240425
164764865599 84 1685713 980 1824T:optclang_-march=native_-Os_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2024050820240425
169312690159 84 16109795 988 1728T:optclang_-march=native_-O_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2024050820240425
1794205159397 84 16181043 988 1728T:optclang_-mcpu=native_-O3_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2024050820240425
1225424342677 76 1666035 972 1728T:refclang_-march=native_-O2_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2024050820240425
1236145433506 76 1655622 900 1792T:refgcc_-march=native_-mtune=native_-O3_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE2024050820240425
1243965045475 76 1669083 972 1824T:refclang_-march=native_-O3_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2024050820240425
1277917039008 76 1660947 972 1728T:refclang_-mcpu=native_-O3_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2024050820240425
1345936625898 76 1646158 900 1792T:refgcc_-march=native_-mtune=native_-O2_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE2024050820240425
1389643629623 76 1649203 972 1728T:refclang_-march=native_-O_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2024050820240425
1415176722892 76 1643121 964 1824T:refclang_-march=native_-Os_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2024050820240425
1479642019892 76 1638189 892 1760T:refgcc_-march=native_-mtune=native_-Os_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE2024050820240425
1488497323074 76 1642726 900 1792T:refgcc_-march=native_-mtune=native_-O_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE2024050820240425

Test failure

Implementation: T:avx2
Security model: timingleaks
Compiler: clang -march=native -O3 -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE
error 111

Number of similar (compiler,implementation) pairs: 1, namely:
CompilerImplementations
clang -march=native -O3 -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE T:avx2

Compiler output

Implementation: T:avx2
Security model: timingleaks
Compiler: clang -mcpu=native -O3 -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE
bitslice_bma_128.c: bitslice_bma_128.c:92:12: error: always_inline function '_mm_shuffle_epi8' requires target feature 'ssse3', but would be inlined into function 'bit_reverse' that is compiled without support for 'ssse3'
bitslice_bma_128.c: return _mm_shuffle_epi8(x, _mm_set_epi8(0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15));
bitslice_bma_128.c: ^
bitslice_bma_128.c: 1 error generated.

Number of similar (compiler,implementation) pairs: 2, namely:
CompilerImplementations
clang -mcpu=native -O3 -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE T:avx2 T:sse2

Compiler output

Implementation: T:sse2
Security model: timingleaks
Compiler: gcc -march=native -mtune=native -O2 -fomit-frame-pointer -fwrapv -fPIC -fPIE
m4r.c: m4r.c: In function 'zero_vector':
m4r.c: m4r.c:85:20: error: incompatible types when assigning to type 'vector' {aka '__m128i'} from type '__m256i'
m4r.c: 85 | *vec_ptr = _mm256_setzero_si256(); vec_ptr++;
m4r.c: | ^~~~~~~~~~~~~~~~~~~~
m4r.c: m4r.c:86:20: error: incompatible types when assigning to type 'vector' {aka '__m128i'} from type '__m256i'
m4r.c: 86 | *vec_ptr = _mm256_setzero_si256(); vec_ptr++;
m4r.c: | ^~~~~~~~~~~~~~~~~~~~
m4r.c: m4r.c: In function '_m4ri_make_table_rev':
m4r.c: m4r.c:147:12: error: incompatible types when assigning to type 'vector' {aka '__m128i'} from type '__m256i'
m4r.c: 147 | mask = _mm256_set_epi64x(v[3], v[2], v[1], v[0]);
m4r.c: | ^~~~~~~~~~~~~~~~~
m4r.c: m4r.c:196:46: error: incompatible type for argument 1 of '_mm256_and_si256'
m4r.c: 196 | S_ptr[nblocks-1] = _mm256_and_si256(S_ptr[nblocks-1], mask);
m4r.c: | ~~~~~^~~~~~~~~~~
m4r.c: | |
m4r.c: | vector {aka __m128i}
m4r.c: In file included from /usr/lib/gcc/x86_64-linux-gnu/12/include/immintrin.h:47,
m4r.c: from bits.h:28,
m4r.c: from m4r.c:26:
m4r.c: /usr/lib/gcc/x86_64-linux-gnu/12/include/avx2intrin.h:179:27: note: expected '__m256i' but argument is of type 'vector' {aka '__m128i'}
m4r.c: 179 | _mm256_and_si256 (__m256i __A, __m256i __B)
m4r.c: | ~~~~~~~~^~~
m4r.c: m4r.c:196:59: error: incompatible type for argument 2 of '_mm256_and_si256'
m4r.c: 196 | S_ptr[nblocks-1] = _mm256_and_si256(S_ptr[nblocks-1], mask);
m4r.c: | ^~~~
m4r.c: ...

Number of similar (compiler,implementation) pairs: 4, namely:
CompilerImplementations
gcc -march=native -mtune=native -O2 -fomit-frame-pointer -fwrapv -fPIC -fPIE T:sse2
gcc -march=native -mtune=native -O3 -fomit-frame-pointer -fwrapv -fPIC -fPIE T:sse2
gcc -march=native -mtune=native -O -fomit-frame-pointer -fwrapv -fPIC -fPIE T:sse2
gcc -march=native -mtune=native -Os -fomit-frame-pointer -fwrapv -fPIC -fPIE T:sse2