Implementation notes: amd64, kizomba, crypto_kem/ntrulpr4591761

Computer: kizomba
Microarchitecture: amd64; Kaby Lake (906e9)
Architecture: amd64
CPU ID: GenuineIntel-000906e9-1fc9cbf5
SUPERCOP version: 20240625
Operation: crypto_kem
Primitive: ntrulpr4591761
Time      Object size  Test size       Implementation  Compiler                                                                        Benchmark date  SUPERCOP version
161160    31615 0 0    74908 824 1640  T:avx           clang_-march=native_-O3_-fwrapv_-Qunused-arguments_-fPIC_-fPIE_-gdwarf-4_-Wall  20240629        20240625
161682    26402 0 0    69372 824 1576  T:avx           clang_-march=native_-O2_-fwrapv_-Qunused-arguments_-fPIC_-fPIE_-gdwarf-4_-Wall  20240629        20240625
186617    13385 0 0    52836 824 1576  T:avx           clang_-march=native_-O_-fwrapv_-Qunused-arguments_-fPIC_-fPIE_-gdwarf-4_-Wall   20240629        20240625
187371    15272 0 0    55198 816 1640  T:avx           clang_-march=native_-Os_-fwrapv_-Qunused-arguments_-fPIC_-fPIE_-gdwarf-4_-Wall  20240629        20240625
54650792  4646 0 0     44790 784 1608  T:ref           gcc_-march=native_-mtune=native_-O2_-fwrapv_-fPIC_-fPIE_-gdwarf-4_-Wall         20240629        20240625
54662080  27707 0 0    69310 784 1608  T:ref           gcc_-march=native_-mtune=native_-O3_-fwrapv_-fPIC_-fPIE_-gdwarf-4_-Wall         20240629        20240625
54682401  19526 0 0    62348 824 1576  T:ref           clang_-march=native_-O2_-fwrapv_-Qunused-arguments_-fPIC_-fPIE_-gdwarf-4_-Wall  20240629        20240625
54693353  4135 0 0     42662 776 1576  T:ref           gcc_-march=native_-mtune=native_-Os_-fwrapv_-fPIC_-fPIE_-gdwarf-4_-Wall         20240629        20240625
54705843  3862 0 0     43324 824 1576  T:ref           clang_-march=native_-O_-fwrapv_-Qunused-arguments_-fPIC_-fPIE_-gdwarf-4_-Wall   20240629        20240625
54742619  4592 0 0     44270 784 1608  T:ref           gcc_-march=native_-mtune=native_-O_-fwrapv_-fPIC_-fPIE_-gdwarf-4_-Wall          20240629        20240625
55016520  11181 0 0    52900 824 1576  T:ref           clang_-mcpu=native_-O3_-fwrapv_-Qunused-arguments_-fPIC_-fPIE_-gdwarf-4_-Wall   20240629        20240625
55159708  23231 0 0    66436 824 1640  T:ref           clang_-march=native_-O3_-fwrapv_-Qunused-arguments_-fPIC_-fPIE_-gdwarf-4_-Wall  20240629        20240625
55189726  6437 0 0     46342 816 1640  T:ref           clang_-march=native_-Os_-fwrapv_-Qunused-arguments_-fPIC_-fPIE_-gdwarf-4_-Wall  20240629        20240625

Compiler output


mult.c: mult.c:147:22: error: invalid output size for constraint '=&x'
mult.c:   MULSTEP_fromzero(0,h0,h1,h2,h3,h4)
mult.c:                      ^
mult.c: mult.c:149:26: error: invalid output size for constraint '+x'
mult.c:     MULSTEP_noload(j + 1,h1,h2,h3,h4,h0)
mult.c:                          ^
mult.c: mult.c:150:26: error: invalid output size for constraint '+x'
mult.c:     MULSTEP_noload(j + 2,h2,h3,h4,h0,h1)
mult.c:                          ^
mult.c: mult.c:151:26: error: invalid output size for constraint '+x'
mult.c:     MULSTEP_noload(j + 3,h3,h4,h0,h1,h2)
mult.c:                          ^
mult.c: mult.c:152:26: error: invalid output size for constraint '+x'
mult.c:     MULSTEP_noload(j + 4,h4,h0,h1,h2,h3)
mult.c:                          ^
mult.c: mult.c:153:26: error: invalid output size for constraint '+x'
mult.c:     MULSTEP_noload(j + 5,h0,h1,h2,h3,h4)
mult.c:                          ^
mult.c: mult.c:155:24: error: invalid output size for constraint '+x'
mult.c:   MULSTEP_noload(j + 1,h1,h2,h3,h4,h0)
mult.c:                        ^
mult.c: mult.c:156:24: error: invalid output size for constraint '+x'
mult.c:   MULSTEP_noload(j + 2,h2,h3,h4,h0,h1)
mult.c:                        ^
mult.c: mult.c:157:24: error: invalid output size for constraint '+x'
mult.c: ...

Number of similar (implementation,compiler) pairs: 1, namely:
Implementation  Compiler
T:avx           clang -mcpu=native -O3 -fwrapv -Qunused-arguments -fPIC -fPIE -gdwarf-4 -Wall (Ubuntu_Clang_14.0.0)

Compiler output


mult.c: In function 'mult768_mix2_m256i',
mult.c:     inlined from 'rq_mult' at mult.c:722:3:
mult.c: mult.c:568:3: warning: 'mult96x16' accessing 6144 bytes in a region of size 512 [-Wstringop-overflow=]
mult.c:   568 |   mult96x16(hkara[12],fkara[6],(__m256i *) (1 + (__m128i *) gkara));
mult.c:       |   ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
mult.c: mult.c: In function 'rq_mult':
mult.c: mult.c:568:3: note: referencing argument 1 of type '__m256i *'
mult.c: In function 'mult768_mix2_m256i',
mult.c:     inlined from 'rq_mult' at mult.c:722:3:
mult.c: mult.c:568:3: warning: 'mult96x16' reading 3072 bytes from a region of size 512 [-Wstringop-overread]
mult.c: mult.c: In function 'rq_mult':
mult.c: mult.c:568:3: note: referencing argument 2 of type 'const __m256i *'
mult.c: In function 'mult768_mix2_m256i',
mult.c:     inlined from 'rq_mult' at mult.c:722:3:
mult.c: mult.c:568:3: warning: 'mult96x16' reading 3072 bytes from a region of size 3056 [-Wstringop-overread]
mult.c: mult.c: In function 'rq_mult':
mult.c: mult.c:568:3: note: referencing argument 3 of type 'const __m256i *'
mult.c: mult.c:279:13: note: in a call to function 'mult96x16'
mult.c:   279 | static void mult96x16(__m256i h[192],const __m256i f[96],const __m256i g[96])
mult.c:       |             ^~~~~~~~~
mult.c: In function 'mult768_mix2_m256i',
mult.c:     inlined from 'rq_mult' at mult.c:722:3:
mult.c: mult.c:569:3: warning: 'mult96x16' accessing 6144 bytes in a region of size 512 [-Wstringop-overflow=]
mult.c:   569 |   mult96x16(hkara[0],fkara[0],gkara[0]);
mult.c:       |   ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
mult.c: ...

Number of similar (implementation,compiler) pairs: 4, namely:
Implementation  Compiler
T:avx           gcc -march=native -mtune=native -O2 -fwrapv -fPIC -fPIE -gdwarf-4 -Wall (11.4.0)
T:avx           gcc -march=native -mtune=native -O3 -fwrapv -fPIC -fPIE -gdwarf-4 -Wall (11.4.0)
T:avx           gcc -march=native -mtune=native -O -fwrapv -fPIC -fPIE -gdwarf-4 -Wall (11.4.0)
T:avx           gcc -march=native -mtune=native -Os -fwrapv -fPIC -fPIE -gdwarf-4 -Wall (11.4.0)
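
Note on the gcc warnings above: the byte counts match the array bounds declared on mult96x16's parameters, given that sizeof(__m256i) is 32. That is, 192 * 32 = 6144 for h[192] and 96 * 32 = 3072 for f[96] and g[96]. The reported region of 512 bytes would correspond to 16 such elements, which suggests (as an assumption; the declarations of hkara and fkara are not shown here) that each call passes one row of a two-dimensional array as the start of a longer span. The following is a minimal sketch of that pattern, not the code from mult.c. The names declared_bytes, sum_span and sum_two_rows, and the sizes ROWS, COLS and SPAN, are hypothetical. It addresses the span through a flat buffer, so the region seen by the compiler is the remainder of the whole buffer rather than a single row.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum { ROWS = 4, COLS = 4, TOTAL = ROWS * COLS, SPAN = 8 };

/* Bytes implied by a parameter declared as T p[bound] (hypothetical helper). */
static size_t declared_bytes(size_t bound, size_t elem_size) {
  return bound * elem_size;
}

/* Hypothetical callee: the [SPAN] bound documents that SPAN elements are read. */
static int32_t sum_span(const int32_t v[SPAN]) {
  int32_t s = 0;
  for (int i = 0; i < SPAN; i++) s += v[i];
  return s;
}

/* Sum SPAN consecutive elements starting at row `row` of a flat buffer
   that holds ROWS rows of COLS elements, filled with 0, 1, 2, ... */
static int32_t sum_two_rows(int row) {
  int32_t flat[TOTAL];
  for (int i = 0; i < TOTAL; i++) flat[i] = i;
  /* The span must fit inside the buffer. */
  assert(row >= 0 && row * COLS + SPAN <= TOTAL);
  /* &flat[row * COLS] points into the whole buffer, so the accessible
     region is the rest of flat, not one COLS-element row. */
  return sum_span(&flat[row * COLS]);
}
```

With these definitions, declared_bytes(192, 32) evaluates to 6144 and sum_two_rows(1) adds the values 4 through 11. Whether the warnings in mult.c indicate a real out-of-bounds access or only a mismatch between the declared bounds and the array layout cannot be determined from the truncated output.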