Implementation notes: amd64, comet, crypto_kem/ntrulpr4591761
Computer: comet
Microarchitecture: amd64; Comet Lake (806ec)
Architecture: amd64
CPU ID: GenuineIntel-000806ec-bfebfbff
SUPERCOP version: 20240625
Operation: crypto_kem
Primitive: ntrulpr4591761
Time | Object size | Test size | Implementation | Compiler | Benchmark date | SUPERCOP version |
166415 | 31615 0 0 | 76521 860 1792 | T:avx | clang_-march=native_-O3_-fwrapv_-Qunused-arguments_-fPIC_-fPIE_-gdwarf-4_-Wall | 20240711 | 20240625 |
167123 | 26402 0 0 | 71113 860 1728 | T:avx | clang_-march=native_-O2_-fwrapv_-Qunused-arguments_-fPIC_-fPIE_-gdwarf-4_-Wall | 20240711 | 20240625 |
192703 | 13385 0 0 | 54513 860 1728 | T:avx | clang_-march=native_-O_-fwrapv_-Qunused-arguments_-fPIC_-fPIE_-gdwarf-4_-Wall | 20240711 | 20240625 |
193874 | 15272 0 0 | 56935 852 1792 | T:avx | clang_-march=native_-Os_-fwrapv_-Qunused-arguments_-fPIC_-fPIE_-gdwarf-4_-Wall | 20240711 | 20240625 |
54621862 | 4787 0 0 | 46292 788 1760 | T:ref | gcc_-march=native_-mtune=native_-O2_-fwrapv_-fPIC_-fPIE_-gdwarf-4_-Wall | 20240711 | 20240625 |
54642585 | 23468 0 0 | 66988 788 1760 | T:ref | gcc_-march=native_-mtune=native_-O3_-fwrapv_-fPIC_-fPIE_-gdwarf-4_-Wall | 20240711 | 20240625 |
54659312 | 3369 0 0 | 43276 780 1728 | T:ref | gcc_-march=native_-mtune=native_-Os_-fwrapv_-fPIC_-fPIE_-gdwarf-4_-Wall | 20240711 | 20240625 |
54677328 | 6437 0 0 | 48047 852 1792 | T:ref | clang_-march=native_-Os_-fwrapv_-Qunused-arguments_-fPIC_-fPIE_-gdwarf-4_-Wall | 20240711 | 20240625 |
54697433 | 11181 0 0 | 54545 860 1728 | T:ref | clang_-mcpu=native_-O3_-fwrapv_-Qunused-arguments_-fPIC_-fPIE_-gdwarf-4_-Wall | 20240711 | 20240625 |
54716905 | 3826 0 0 | 44740 788 1760 | T:ref | gcc_-march=native_-mtune=native_-O_-fwrapv_-fPIC_-fPIE_-gdwarf-4_-Wall | 20240711 | 20240625 |
54721629 | 3862 0 0 | 44969 860 1728 | T:ref | clang_-march=native_-O_-fwrapv_-Qunused-arguments_-fPIC_-fPIE_-gdwarf-4_-Wall | 20240711 | 20240625 |
54732895 | 19526 0 0 | 64057 860 1728 | T:ref | clang_-march=native_-O2_-fwrapv_-Qunused-arguments_-fPIC_-fPIE_-gdwarf-4_-Wall | 20240711 | 20240625 |
54741094 | 23231 0 0 | 68145 860 1792 | T:ref | clang_-march=native_-O3_-fwrapv_-Qunused-arguments_-fPIC_-fPIE_-gdwarf-4_-Wall | 20240711 | 20240625 |
Compiler output
mult.c: mult.c:147:22: error: invalid output size for constraint '=&x'
mult.c: MULSTEP_fromzero(0,h0,h1,h2,h3,h4)
mult.c: ^
mult.c: mult.c:149:26: error: invalid output size for constraint '+x'
mult.c: MULSTEP_noload(j + 1,h1,h2,h3,h4,h0)
mult.c: ^
mult.c: mult.c:150:26: error: invalid output size for constraint '+x'
mult.c: MULSTEP_noload(j + 2,h2,h3,h4,h0,h1)
mult.c: ^
mult.c: mult.c:151:26: error: invalid output size for constraint '+x'
mult.c: MULSTEP_noload(j + 3,h3,h4,h0,h1,h2)
mult.c: ^
mult.c: mult.c:152:26: error: invalid output size for constraint '+x'
mult.c: MULSTEP_noload(j + 4,h4,h0,h1,h2,h3)
mult.c: ^
mult.c: mult.c:153:26: error: invalid output size for constraint '+x'
mult.c: MULSTEP_noload(j + 5,h0,h1,h2,h3,h4)
mult.c: ^
mult.c: mult.c:155:24: error: invalid output size for constraint '+x'
mult.c: MULSTEP_noload(j + 1,h1,h2,h3,h4,h0)
mult.c: ^
mult.c: mult.c:156:24: error: invalid output size for constraint '+x'
mult.c: MULSTEP_noload(j + 2,h2,h3,h4,h0,h1)
mult.c: ^
mult.c: mult.c:157:24: error: invalid output size for constraint '+x'
mult.c: ...
Number of similar (implementation,compiler) pairs: 1, namely:
Implementation | Compiler |
T:avx | clang -mcpu=native -O3 -fwrapv -Qunused-arguments -fPIC -fPIE -gdwarf-4 -Wall (Debian_Clang_14.0.6) |
Compiler output
mult.c: mult.c: In function 'mult768_mix2_m256i':
mult.c: mult.c:568:3: warning: 'mult96x16' accessing 6144 bytes in a region of size 512 [-Wstringop-overflow=]
mult.c: 568 | mult96x16(hkara[12],fkara[6],(__m256i *) (1 + (__m128i *) gkara));
mult.c: | ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
mult.c: mult.c:568:3: note: referencing argument 1 of type '__m256i[192]'
mult.c: mult.c:568:3: warning: 'mult96x16' reading 3072 bytes from a region of size 512 [-Wstringop-overread]
mult.c: mult.c:568:3: note: referencing argument 2 of type 'const __m256i[96]'
mult.c: mult.c:568:3: warning: 'mult96x16' reading 3072 bytes from a region of size 3056 [-Wstringop-overread]
mult.c: mult.c:568:3: note: referencing argument 3 of type 'const __m256i[96]'
mult.c: mult.c:279:13: note: in a call to function 'mult96x16'
mult.c: 279 | static void mult96x16(__m256i h[192],const __m256i f[96],const __m256i g[96])
mult.c: | ^~~~~~~~~
mult.c: mult.c:569:3: warning: 'mult96x16' accessing 6144 bytes in a region of size 512 [-Wstringop-overflow=]
mult.c: 569 | mult96x16(hkara[0],fkara[0],gkara[0]);
mult.c: | ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
mult.c: mult.c:569:3: note: referencing argument 1 of type '__m256i[192]'
mult.c: mult.c:569:3: warning: 'mult96x16' reading 3072 bytes from a region of size 512 [-Wstringop-overread]
mult.c: mult.c:569:3: note: referencing argument 2 of type 'const __m256i[96]'
mult.c: mult.c:569:3: warning: 'mult96x16' reading 3072 bytes from a region of size 1024 [-Wstringop-overread]
mult.c: mult.c:569:3: note: referencing argument 3 of type 'const __m256i[96]'
mult.c: mult.c:279:13: note: in a call to function 'mult96x16'
mult.c: 279 | static void mult96x16(__m256i h[192],const __m256i f[96],const __m256i g[96])
mult.c: | ^~~~~~~~~
mult.c: In function 'mult768_mix2_m256i',
mult.c: inlined from 'rq_mult' at mult.c:722:3:
mult.c: ...
Number of similar (implementation,compiler) pairs: 4, namely:
Implementation | Compiler |
T:avx | gcc -march=native -mtune=native -O2 -fwrapv -fPIC -fPIE -gdwarf-4 -Wall (12.2.0) |
T:avx | gcc -march=native -mtune=native -O3 -fwrapv -fPIC -fPIE -gdwarf-4 -Wall (12.2.0) |
T:avx | gcc -march=native -mtune=native -O -fwrapv -fPIC -fPIE -gdwarf-4 -Wall (12.2.0) |
T:avx | gcc -march=native -mtune=native -Os -fwrapv -fPIC -fPIE -gdwarf-4 -Wall (12.2.0) |