Implementation notes: amd64, titan0, crypto_kem/ntskem1380

Computer: titan0
Microarchitecture: amd64; Haswell+AES (306c3)
Architecture: amd64
CPU ID: GenuineIntel-000306c3-bfebfbff
SUPERCOP version: 20240107
Operation: crypto_kem
Primitive: ntskem1380
Time      Object size    Test size        Implementation  Compiler  Benchmark date  SUPERCOP version
 1033498  102261 84 16   122855 956 1792  T:avx2  gcc_-march=native_-mtune=native_-O3_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE  20231215  20231212
 1082250  84032 84 16    102751 956 1792  T:avx2  gcc_-march=native_-mtune=native_-O2_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE  20231215  20231212
 1082792  81807 84 16    100015 956 1792  T:avx2  gcc_-march=native_-mtune=native_-O_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE  20231215  20231212
 1110066  168163 84 16   189282 996 1728  T:avx2  clang_-march=native_-O3_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE  20231215  20231212
 1148374  104063 84 16   125522 996 1728  T:avx2  clang_-march=native_-O2_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE  20231215  20231212
 1194542  77990 84 16    95062 948 1760   T:avx2  gcc_-march=native_-mtune=native_-Os_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE  20231215  20231212
 1315384  102152 84 16   120170 996 1728  T:avx2  clang_-march=native_-O_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE  20231215  20231212
 1329994  76445 84 16    95052 988 1824   T:avx2  clang_-march=native_-Os_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE  20231215  20231212
 1360066  158675 84 16   179786 996 1728  T:sse2  clang_-march=native_-O3_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE  20231215  20231212
 1368420  96822 84 16    118354 996 1728  T:sse2  clang_-march=native_-O2_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE  20231215  20231212
 1401404  146389 84 16   169114 996 1728  T:opt   clang_-march=native_-O3_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE  20231215  20231212
 1541428  95638 84 16    113666 996 1728  T:sse2  clang_-march=native_-O_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE  20231215  20231212
 1544114  89776 84 16    112063 956 1792  T:opt   gcc_-march=native_-mtune=native_-O3_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE  20231215  20231212
 1553496  69293 84 16    89671 956 1792   T:opt   gcc_-march=native_-mtune=native_-O2_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE  20231215  20231212
 1553686  70613 84 16    89252 988 1824   T:sse2  clang_-march=native_-Os_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE  20231215  20231212
 1601731  94228 84 16    117290 996 1728  T:opt   clang_-march=native_-O2_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE  20231215  20231212
 1649646  66880 84 16    86775 956 1792   T:opt   gcc_-march=native_-mtune=native_-O_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE  20231215  20231212
 1714490  65740 84 16    85724 988 1824   T:opt   clang_-march=native_-Os_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE  20231215  20231212
 1715908  64320 84 16    82878 948 1760   T:opt   gcc_-march=native_-mtune=native_-Os_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE  20231215  20231212
 1776070  90159 84 16    109746 996 1728  T:opt   clang_-march=native_-O_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE  20231215  20231212
 1844473  159397 84 16   180994 996 1728  T:opt   clang_-mcpu=native_-O3_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE  20231215  20231212
12864854  47519 76 16    71162 980 1728   T:ref   clang_-march=native_-O3_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE  20231215  20231212
13075661  44733 76 16    68026 980 1728   T:ref   clang_-march=native_-O2_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE  20231215  20231212
13161743  34713 76 16    57087 924 1792   T:ref   gcc_-march=native_-mtune=native_-O3_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE  20231215  20231212
13952556  39008 76 16    60898 980 1728   T:ref   clang_-mcpu=native_-O3_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE  20231215  20231212
14285720  29623 76 16    49154 980 1728   T:ref   clang_-march=native_-O_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE  20231215  20231212
14443200  26765 76 16    47103 924 1792   T:ref   gcc_-march=native_-mtune=native_-O2_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE  20231215  20231212
14445495  23005 76 16    43148 972 1824   T:ref   clang_-march=native_-Os_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE  20231215  20231212
15341360  24168 76 16    44055 924 1792   T:ref   gcc_-march=native_-mtune=native_-O_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE  20231215  20231212
15917178  21145 76 16    39630 916 1760   T:ref   gcc_-march=native_-mtune=native_-Os_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE  20231215  20231212

Compiler output

Implementation: T:avx2
Security model: timingleaks
Compiler: clang -mcpu=native -O3 -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE
bitslice_bma_128.c: bitslice_bma_128.c:92:12: error: always_inline function '_mm_shuffle_epi8' requires target feature 'ssse3', but would be inlined into function 'bit_reverse' that is compiled without support for 'ssse3'
bitslice_bma_128.c: return _mm_shuffle_epi8(x, _mm_set_epi8(0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15));
bitslice_bma_128.c: ^
bitslice_bma_128.c: 1 error generated.

Number of similar (compiler,implementation) pairs: 2, namely:
Compiler  Implementations
clang -mcpu=native -O3 -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE T:avx2 T:sse2
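
The diagnostic above says that bit_reverse was compiled without SSSE3 support, so the always_inline intrinsic _mm_shuffle_epi8 could not be inlined into it. One possible way to make such a function independent of the command-line target flags is a per-function target attribute, which GCC and clang both accept on x86. The following is a minimal sketch under that assumption, not the fix used by the ntskem1380 package; the names reverse_bytes_128 and reverse_bytes_buf are hypothetical, and only the shuffle expression is taken from the log.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <tmmintrin.h> /* SSSE3 intrinsics, including _mm_shuffle_epi8 */

/* Enable SSSE3 for this function only, whatever -march/-mcpu selected.
 * Hypothetical name; mirrors the shuffle shown in the error log. */
__attribute__((target("ssse3")))
static __m128i reverse_bytes_128(__m128i x)
{
    /* _mm_set_epi8 lists bytes from most to least significant, so
     * output byte i is taken from input byte 15 - i. */
    const __m128i idx =
        _mm_set_epi8(0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15);
    return _mm_shuffle_epi8(x, idx);
}

/* Hypothetical helper: reverse the byte order of a 16-byte buffer in place.
 * It carries the same target attribute so the call above can be inlined. */
__attribute__((target("ssse3")))
static void reverse_bytes_buf(uint8_t buf[16])
{
    assert(buf != NULL);
    __m128i x = _mm_loadu_si128((const __m128i *)buf); /* unaligned load */
    x = reverse_bytes_128(x);
    _mm_storeu_si128((__m128i *)buf, x);               /* unaligned store */
}
```

The attribute only changes what the compiler may emit; the CPU still has to support SSSE3 at run time, which holds for the Haswell machine listed at the top of this page.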

Compiler output

Implementation: T:sse2
Security model: timingleaks
Compiler: gcc -march=native -mtune=native -O2 -fomit-frame-pointer -fwrapv -fPIC -fPIE
m4r.c: m4r.c: In function 'zero_vector':
m4r.c: m4r.c:85:20: error: incompatible types when assigning to type 'vector' {aka '__m128i'} from type '__m256i'
m4r.c: 85 | *vec_ptr = _mm256_setzero_si256(); vec_ptr++;
m4r.c: | ^~~~~~~~~~~~~~~~~~~~
m4r.c: m4r.c:86:20: error: incompatible types when assigning to type 'vector' {aka '__m128i'} from type '__m256i'
m4r.c: 86 | *vec_ptr = _mm256_setzero_si256(); vec_ptr++;
m4r.c: | ^~~~~~~~~~~~~~~~~~~~
m4r.c: m4r.c: In function '_m4ri_make_table_rev':
m4r.c: m4r.c:147:12: error: incompatible types when assigning to type 'vector' {aka '__m128i'} from type '__m256i'
m4r.c: 147 | mask = _mm256_set_epi64x(v[3], v[2], v[1], v[0]);
m4r.c: | ^~~~~~~~~~~~~~~~~
m4r.c: m4r.c:196:46: error: incompatible type for argument 1 of '_mm256_and_si256'
m4r.c: 196 | S_ptr[nblocks-1] = _mm256_and_si256(S_ptr[nblocks-1], mask);
m4r.c: | ~~~~~^~~~~~~~~~~
m4r.c: | |
m4r.c: | vector {aka __m128i}
m4r.c: In file included from /usr/lib/gcc/x86_64-linux-gnu/11/include/immintrin.h:47,
m4r.c: from bits.h:28,
m4r.c: from m4r.c:26:
m4r.c: /usr/lib/gcc/x86_64-linux-gnu/11/include/avx2intrin.h:179:27: note: expected '__m256i' but argument is of type 'vector' {aka '__m128i'}
m4r.c: 179 | _mm256_and_si256 (__m256i __A, __m256i __B)
m4r.c: | ~~~~~~~~^~~
m4r.c: m4r.c:196:59: error: incompatible type for argument 2 of '_mm256_and_si256'
m4r.c: 196 | S_ptr[nblocks-1] = _mm256_and_si256(S_ptr[nblocks-1], mask);
m4r.c: | ^~~~
m4r.c: ...

Number of similar (compiler,implementation) pairs: 4, namely:
Compiler  Implementations
gcc -march=native -mtune=native -O2 -fomit-frame-pointer -fwrapv -fPIC -fPIE T:sse2
gcc -march=native -mtune=native -O3 -fomit-frame-pointer -fwrapv -fPIC -fPIE T:sse2
gcc -march=native -mtune=native -O -fomit-frame-pointer -fwrapv -fPIC -fPIE T:sse2
gcc -march=native -mtune=native -Os -fomit-frame-pointer -fwrapv -fPIC -fPIE T:sse2
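
The gcc errors for T:sse2 all have the same shape: the type 'vector' is __m128i in that build, while m4r.c calls 256-bit intrinsics such as _mm256_setzero_si256 and _mm256_and_si256. One way to keep the vector type and the intrinsics in step is to select both from a single width setting. The sketch below illustrates that idea only; VEC_BITS, VEC_ZERO, VEC_AND, zero_vectors, mask_last_block and vec_equal are hypothetical names, not identifiers from the ntskem1380 sources.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical build-time switch: 128 selects SSE2, 256 selects AVX2. */
#ifndef VEC_BITS
#define VEC_BITS 128
#endif

#if VEC_BITS == 256
#include <immintrin.h>              /* needs an AVX2-enabled build */
typedef __m256i vector;
#define VEC_ZERO()    _mm256_setzero_si256()
#define VEC_AND(a, b) _mm256_and_si256((a), (b))
#else
#include <emmintrin.h>              /* SSE2, baseline on amd64 */
typedef __m128i vector;
#define VEC_ZERO()    _mm_setzero_si128()
#define VEC_AND(a, b) _mm_and_si128((a), (b))
#endif

/* Set n vectors to zero using the intrinsic that matches 'vector'. */
static void zero_vectors(vector *vec_ptr, size_t n)
{
    assert(vec_ptr != NULL || n == 0);
    for (size_t i = 0; i < n; i++)
        vec_ptr[i] = VEC_ZERO();
}

/* AND the last block with a mask, as in the m4r.c:196 statement above. */
static void mask_last_block(vector *blocks, size_t nblocks, vector mask)
{
    assert(blocks != NULL && nblocks > 0);
    blocks[nblocks - 1] = VEC_AND(blocks[nblocks - 1], mask);
}

/* Bytewise comparison of two vectors; returns 1 when they are equal. */
static int vec_equal(const vector *a, const vector *b)
{
    return memcmp(a, b, sizeof(vector)) == 0;
}
```

With the default VEC_BITS of 128 the sketch builds with SSE2 alone; a 256-bit build would define VEC_BITS=256 and enable AVX2 in the compiler flags.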