Implementation notes: amd64, speed2supercop, crypto_kem/ntskem1380

Computer: speed2supercop
Microarchitecture: amd64; Haswell+AES (306c3)
Architecture: amd64
CPU ID: GenuineIntel-000306c3-1fc9cbf5
SUPERCOP version: 20240625
Operation: crypto_kem
Primitive: ntskem1380
TimeObject sizeTest sizeImplementationCompilerBenchmark dateSUPERCOP version
104476883589 84 16101419 904 1632T:avx2gcc_-march=native_-mtune=native_-O2_-fwrapv_-fPIC_-fPIE_-gdwarf-4_-Wall2024071220240625
1052932105314 84 16125139 904 1632T:avx2gcc_-march=native_-mtune=native_-O3_-fwrapv_-fPIC_-fPIE_-gdwarf-4_-Wall2024071220240625
107854880925 84 1698487 904 1632T:avx2gcc_-march=native_-mtune=native_-O_-fwrapv_-fPIC_-fPIE_-gdwarf-4_-Wall2024071220240625
1120024158581 84 16179031 928 1568T:avx2clang_-march=native_-O3_-fwrapv_-Qunused-arguments_-fPIC_-fPIE_-gdwarf-4_-Wall2024071220240625
120533276813 84 1693350 896 1600T:avx2gcc_-march=native_-mtune=native_-Os_-fwrapv_-fPIC_-fPIE_-gdwarf-4_-Wall2024071220240625
135742498036 84 16118967 928 1568T:sse2clang_-march=native_-O2_-fwrapv_-Qunused-arguments_-fPIC_-fPIE_-gdwarf-4_-Wall2024071220240625
1372436149930 84 16170447 928 1568T:sse2clang_-march=native_-O3_-fwrapv_-Qunused-arguments_-fPIC_-fPIE_-gdwarf-4_-Wall2024071220240625
1479604128476 84 16150511 928 1568T:optclang_-march=native_-O3_-fwrapv_-Qunused-arguments_-fPIC_-fPIE_-gdwarf-4_-Wall2024071220240625
150595286557 84 16108011 904 1632T:optgcc_-march=native_-mtune=native_-O3_-fwrapv_-fPIC_-fPIE_-gdwarf-4_-Wall2024071220240625
150884085830 84 16103208 928 1568T:sse2clang_-march=native_-O_-fwrapv_-Qunused-arguments_-fPIC_-fPIE_-gdwarf-4_-Wall2024071220240625
153230869624 84 1689123 904 1632T:optgcc_-march=native_-mtune=native_-O2_-fwrapv_-fPIC_-fPIE_-gdwarf-4_-Wall2024071220240625
155392073513 84 1691617 920 1664T:sse2clang_-march=native_-Os_-fwrapv_-Qunused-arguments_-fPIC_-fPIE_-gdwarf-4_-Wall2024071220240625
157562893885 84 16116327 928 1568T:optclang_-march=native_-O2_-fwrapv_-Qunused-arguments_-fPIC_-fPIE_-gdwarf-4_-Wall2024071220240625
162727266853 84 1686007 904 1632T:optgcc_-march=native_-mtune=native_-O_-fwrapv_-fPIC_-fPIE_-gdwarf-4_-Wall2024071220240625
168875269151 84 1688489 920 1664T:optclang_-march=native_-Os_-fwrapv_-Qunused-arguments_-fPIC_-fPIE_-gdwarf-4_-Wall2024071220240625
170923680606 84 1699552 928 1568T:optclang_-march=native_-O_-fwrapv_-Qunused-arguments_-fPIC_-fPIE_-gdwarf-4_-Wall2024071220240625
172309663762 84 1681670 896 1600T:optgcc_-march=native_-mtune=native_-Os_-fwrapv_-fPIC_-fPIE_-gdwarf-4_-Wall2024071220240625
1787348125259 84 16146568 928 1568T:optclang_-mcpu=native_-O3_-fwrapv_-Qunused-arguments_-fPIC_-fPIE_-gdwarf-4_-Wall2024071220240625
1306660033533 76 1654971 872 1632T:refgcc_-march=native_-mtune=native_-O3_-fwrapv_-fPIC_-fPIE_-gdwarf-4_-Wall2024071220240625
1406025225606 76 1645099 872 1632T:refgcc_-march=native_-mtune=native_-O2_-fwrapv_-fPIC_-fPIE_-gdwarf-4_-Wall2024071220240625
1443307243619 76 1666520 912 1568T:refclang_-march=native_-O3_-fwrapv_-Qunused-arguments_-fPIC_-fPIE_-gdwarf-4_-Wall2024071220240625
1450520441750 76 1664503 912 1568T:refclang_-march=native_-O2_-fwrapv_-Qunused-arguments_-fPIC_-fPIE_-gdwarf-4_-Wall2024071220240625
1492321222697 76 1641799 872 1632T:refgcc_-march=native_-mtune=native_-O_-fwrapv_-fPIC_-fPIE_-gdwarf-4_-Wall2024071220240625
1521495237238 76 1658776 912 1568T:refclang_-mcpu=native_-O3_-fwrapv_-Qunused-arguments_-fPIC_-fPIE_-gdwarf-4_-Wall2024071220240625
1612915219823 76 1637726 864 1600T:refgcc_-march=native_-mtune=native_-Os_-fwrapv_-fPIC_-fPIE_-gdwarf-4_-Wall2024071220240625
1665828429990 76 1648904 912 1568T:refclang_-march=native_-O_-fwrapv_-Qunused-arguments_-fPIC_-fPIE_-gdwarf-4_-Wall2024071220240625
1686824822986 76 1642577 904 1664T:refclang_-march=native_-Os_-fwrapv_-Qunused-arguments_-fPIC_-fPIE_-gdwarf-4_-Wall2024071220240625

Test failure


error 111

Number of similar (implementation,compiler) pairs: 3, namely:
ImplementationCompiler
T:avx2clang -march=native -O2 -fwrapv -Qunused-arguments -fPIC -fPIE -gdwarf-4 -Wall (Debian_Clang_16.0.6_(27+b1))
T:avx2clang -march=native -O -fwrapv -Qunused-arguments -fPIC -fPIE -gdwarf-4 -Wall (Debian_Clang_16.0.6_(27+b1))
T:avx2clang -march=native -Os -fwrapv -Qunused-arguments -fPIC -fPIE -gdwarf-4 -Wall (Debian_Clang_16.0.6_(27+b1))

Compiler output


nts_kem.c: nts_kem.c:112:27: warning: passing arguments to 'ff_create' without a prototype is deprecated in all versions of C and is not supported in C2x [-Wdeprecated-non-prototype]
nts_kem.c:     priv->ff2m = ff_create(priv->m);
nts_kem.c:                           ^
nts_kem.c: nts_kem.c:255:27: warning: passing arguments to 'ff_create' without a prototype is deprecated in all versions of C and is not supported in C2x [-Wdeprecated-non-prototype]
nts_kem.c:     priv->ff2m = ff_create(priv->m);
nts_kem.c:                           ^
nts_kem.c: 2 warnings generated.

Number of similar (implementation,compiler) pairs: 8, namely:
ImplementationCompiler
T:avx2clang -march=native -O2 -fwrapv -Qunused-arguments -fPIC -fPIE -gdwarf-4 -Wall (Debian_Clang_16.0.6_(27+b1))
T:avx2clang -march=native -O3 -fwrapv -Qunused-arguments -fPIC -fPIE -gdwarf-4 -Wall (Debian_Clang_16.0.6_(27+b1))
T:avx2clang -march=native -O -fwrapv -Qunused-arguments -fPIC -fPIE -gdwarf-4 -Wall (Debian_Clang_16.0.6_(27+b1))
T:avx2clang -march=native -Os -fwrapv -Qunused-arguments -fPIC -fPIE -gdwarf-4 -Wall (Debian_Clang_16.0.6_(27+b1))
T:sse2clang -march=native -O2 -fwrapv -Qunused-arguments -fPIC -fPIE -gdwarf-4 -Wall (Debian_Clang_16.0.6_(27+b1))
T:sse2clang -march=native -O3 -fwrapv -Qunused-arguments -fPIC -fPIE -gdwarf-4 -Wall (Debian_Clang_16.0.6_(27+b1))
T:sse2clang -march=native -O -fwrapv -Qunused-arguments -fPIC -fPIE -gdwarf-4 -Wall (Debian_Clang_16.0.6_(27+b1))
T:sse2clang -march=native -Os -fwrapv -Qunused-arguments -fPIC -fPIE -gdwarf-4 -Wall (Debian_Clang_16.0.6_(27+b1))

Compiler output


bitslice_bma_128.c: bitslice_bma_128.c:92:12: error: always_inline function '_mm_shuffle_epi8' requires target feature 'ssse3', but would be inlined into function 'bit_reverse' that is compiled without support for 'ssse3'
bitslice_bma_128.c:     return _mm_shuffle_epi8(x, _mm_set_epi8(0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15));
bitslice_bma_128.c:            ^
bitslice_bma_128.c: 1 error generated.

Number of similar (implementation,compiler) pairs: 2, namely:
ImplementationCompiler
T:avx2clang -mcpu=native -O3 -fwrapv -Qunused-arguments -fPIC -fPIE -gdwarf-4 -Wall (Debian_Clang_16.0.6_(27+b1))
T:sse2clang -mcpu=native -O3 -fwrapv -Qunused-arguments -fPIC -fPIE -gdwarf-4 -Wall (Debian_Clang_16.0.6_(27+b1))

Compiler output


nts_kem.c: nts_kem.c:106:27: warning: passing arguments to 'ff_create' without a prototype is deprecated in all versions of C and is not supported in C2x [-Wdeprecated-non-prototype]
nts_kem.c:     priv->ff2m = ff_create(priv->m);
nts_kem.c:                           ^
nts_kem.c: nts_kem.c:228:27: warning: passing arguments to 'ff_create' without a prototype is deprecated in all versions of C and is not supported in C2x [-Wdeprecated-non-prototype]
nts_kem.c:     priv->ff2m = ff_create(priv->m);
nts_kem.c:                           ^
nts_kem.c: 2 warnings generated.

Number of similar (implementation,compiler) pairs: 5, namely:
ImplementationCompiler
T:optclang -march=native -O2 -fwrapv -Qunused-arguments -fPIC -fPIE -gdwarf-4 -Wall (Debian_Clang_16.0.6_(27+b1))
T:optclang -march=native -O3 -fwrapv -Qunused-arguments -fPIC -fPIE -gdwarf-4 -Wall (Debian_Clang_16.0.6_(27+b1))
T:optclang -march=native -O -fwrapv -Qunused-arguments -fPIC -fPIE -gdwarf-4 -Wall (Debian_Clang_16.0.6_(27+b1))
T:optclang -march=native -Os -fwrapv -Qunused-arguments -fPIC -fPIE -gdwarf-4 -Wall (Debian_Clang_16.0.6_(27+b1))
T:optclang -mcpu=native -O3 -fwrapv -Qunused-arguments -fPIC -fPIE -gdwarf-4 -Wall (Debian_Clang_16.0.6_(27+b1))

Compiler output


nts_kem.c: nts_kem.c:106:27: warning: passing arguments to 'ff_create' without a prototype is deprecated in all versions of C and is not supported in C2x [-Wdeprecated-non-prototype]
nts_kem.c:     priv->ff2m = ff_create(priv->m);
nts_kem.c:                           ^
nts_kem.c: nts_kem.c:242:27: warning: passing arguments to 'ff_create' without a prototype is deprecated in all versions of C and is not supported in C2x [-Wdeprecated-non-prototype]
nts_kem.c:     priv->ff2m = ff_create(priv->m);
nts_kem.c:                           ^
nts_kem.c: 2 warnings generated.

Number of similar (implementation,compiler) pairs: 5, namely:
ImplementationCompiler
T:refclang -march=native -O2 -fwrapv -Qunused-arguments -fPIC -fPIE -gdwarf-4 -Wall (Debian_Clang_16.0.6_(27+b1))
T:refclang -march=native -O3 -fwrapv -Qunused-arguments -fPIC -fPIE -gdwarf-4 -Wall (Debian_Clang_16.0.6_(27+b1))
T:refclang -march=native -O -fwrapv -Qunused-arguments -fPIC -fPIE -gdwarf-4 -Wall (Debian_Clang_16.0.6_(27+b1))
T:refclang -march=native -Os -fwrapv -Qunused-arguments -fPIC -fPIE -gdwarf-4 -Wall (Debian_Clang_16.0.6_(27+b1))
T:refclang -mcpu=native -O3 -fwrapv -Qunused-arguments -fPIC -fPIE -gdwarf-4 -Wall (Debian_Clang_16.0.6_(27+b1))

Compiler output


m4r.c: m4r.c: In function 'zero_vector':
m4r.c: m4r.c:85:20: error: incompatible types when assigning to type 'vector' {aka '__m128i'} from type '__m256i'
m4r.c:    85 |         *vec_ptr = _mm256_setzero_si256(); vec_ptr++;
m4r.c:       |                    ^~~~~~~~~~~~~~~~~~~~
m4r.c: m4r.c:86:20: error: incompatible types when assigning to type 'vector' {aka '__m128i'} from type '__m256i'
m4r.c:    86 |         *vec_ptr = _mm256_setzero_si256(); vec_ptr++;
m4r.c:       |                    ^~~~~~~~~~~~~~~~~~~~
m4r.c: m4r.c: In function '_m4ri_make_table_rev':
m4r.c: m4r.c:147:12: error: incompatible types when assigning to type 'vector' {aka '__m128i'} from type '__m256i'
m4r.c:   147 |     mask = _mm256_set_epi64x(v[3], v[2], v[1], v[0]);
m4r.c:       |            ^~~~~~~~~~~~~~~~~
m4r.c: m4r.c:196:46: error: incompatible type for argument 1 of '_mm256_and_si256'
m4r.c:   196 |     S_ptr[nblocks-1] = _mm256_and_si256(S_ptr[nblocks-1], mask);
m4r.c:       |                                         ~~~~~^~~~~~~~~~~
m4r.c:       |                                              |
m4r.c:       |                                              vector {aka __m128i}
m4r.c: In file included from /usr/lib/gcc/x86_64-linux-gnu/13/include/immintrin.h:51,
m4r.c:                  from bits.h:28,
m4r.c:                  from m4r.c:26:
m4r.c: /usr/lib/gcc/x86_64-linux-gnu/13/include/avx2intrin.h:179:27: note: expected '__m256i' but argument is of type 'vector' {aka '__m128i'}
m4r.c:   179 | _mm256_and_si256 (__m256i __A, __m256i __B)
m4r.c:       |                   ~~~~~~~~^~~
m4r.c: m4r.c:196:59: error: incompatible type for argument 2 of '_mm256_and_si256'
m4r.c:   196 |     S_ptr[nblocks-1] = _mm256_and_si256(S_ptr[nblocks-1], mask);
m4r.c:       |                                                           ^~~~
m4r.c: ...

Number of similar (implementation,compiler) pairs: 4, namely:
ImplementationCompiler
T:sse2gcc -march=native -mtune=native -O2 -fwrapv -fPIC -fPIE -gdwarf-4 -Wall (13.3.0)
T:sse2gcc -march=native -mtune=native -O3 -fwrapv -fPIC -fPIE -gdwarf-4 -Wall (13.3.0)
T:sse2gcc -march=native -mtune=native -O -fwrapv -fPIC -fPIE -gdwarf-4 -Wall (13.3.0)
T:sse2gcc -march=native -mtune=native -Os -fwrapv -fPIC -fPIE -gdwarf-4 -Wall (13.3.0)