Implementation notes: amd64, bolero, crypto_kem/ntrulpr4591761
Computer: bolero
Microarchitecture: amd64; Broadwell+AES (406f1)
Architecture: amd64
CPU ID: GenuineIntel-000406f1-1fc9cbf5
SUPERCOP version: 20240625
Operation: crypto_kem
Primitive: ntrulpr4591761
Time | Object size | Test size | Implementation | Compiler | Benchmark date | SUPERCOP version |
184952 | 31113 0 0 | 73228 824 1576 | T:avx | clang_-march=native_-O3_-fwrapv_-Qunused-arguments_-fPIC_-fPIE_-gdwarf-4_-Wall | 20240629 | 20240625 |
185040 | 25860 0 0 | 67700 824 1576 | T:avx | clang_-march=native_-O2_-fwrapv_-Qunused-arguments_-fPIC_-fPIE_-gdwarf-4_-Wall | 20240629 | 20240625 |
209624 | 14723 0 0 | 53398 816 1640 | T:avx | clang_-march=native_-Os_-fwrapv_-Qunused-arguments_-fPIC_-fPIE_-gdwarf-4_-Wall | 20240629 | 20240625 |
211404 | 13283 0 0 | 51524 824 1576 | T:avx | clang_-march=native_-O_-fwrapv_-Qunused-arguments_-fPIC_-fPIE_-gdwarf-4_-Wall | 20240629 | 20240625 |
54835036 | 22739 0 0 | 64780 824 1576 | T:ref | clang_-march=native_-O3_-fwrapv_-Qunused-arguments_-fPIC_-fPIE_-gdwarf-4_-Wall | 20240629 | 20240625 |
54839176 | 19034 0 0 | 60820 824 1576 | T:ref | clang_-march=native_-O2_-fwrapv_-Qunused-arguments_-fPIC_-fPIE_-gdwarf-4_-Wall | 20240629 | 20240625 |
54851612 | 4647 0 0 | 43606 784 1608 | T:ref | gcc_-march=native_-mtune=native_-O2_-fwrapv_-fPIC_-fPIE_-gdwarf-4_-Wall | 20240629 | 20240625 |
54875984 | 11181 0 0 | 51716 824 1576 | T:ref | clang_-mcpu=native_-O3_-fwrapv_-Qunused-arguments_-fPIC_-fPIE_-gdwarf-4_-Wall | 20240629 | 20240625 |
54893372 | 5909 0 0 | 44670 816 1640 | T:ref | clang_-march=native_-Os_-fwrapv_-Qunused-arguments_-fPIC_-fPIE_-gdwarf-4_-Wall | 20240629 | 20240625 |
54895028 | 4136 0 0 | 41478 776 1576 | T:ref | gcc_-march=native_-mtune=native_-Os_-fwrapv_-fPIC_-fPIE_-gdwarf-4_-Wall | 20240629 | 20240625 |
55045868 | 3855 0 0 | 42140 824 1576 | T:ref | clang_-march=native_-O_-fwrapv_-Qunused-arguments_-fPIC_-fPIE_-gdwarf-4_-Wall | 20240629 | 20240625 |
55052176 | 26085 0 0 | 66238 784 1608 | T:ref | gcc_-march=native_-mtune=native_-O3_-fwrapv_-fPIC_-fPIE_-gdwarf-4_-Wall | 20240629 | 20240625 |
55153312 | 4592 0 0 | 42958 784 1608 | T:ref | gcc_-march=native_-mtune=native_-O_-fwrapv_-fPIC_-fPIE_-gdwarf-4_-Wall | 20240629 | 20240625 |
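The gap between the two implementations in the table above can be made concrete with a little arithmetic. The following is a minimal sketch, not part of SUPERCOP: it copies the Time and Implementation columns by hand into a list (the names `results`, `best`, and `speedup` are illustrative assumptions) and compares the fastest T:avx row with the fastest T:ref row. It assumes the Time column is a cycle count in which lower is better.

```python
# Time and Implementation columns copied by hand from the table above.
# Assumption: "Time" is a cycle count, so a lower value is better.
results = [
    (184952, "avx"), (185040, "avx"), (209624, "avx"), (211404, "avx"),
    (54835036, "ref"), (54839176, "ref"), (54851612, "ref"),
    (54875984, "ref"), (54893372, "ref"), (54895028, "ref"),
    (55045868, "ref"), (55052176, "ref"), (55153312, "ref"),
]

# Keep the fastest (smallest) time seen for each implementation.
best = {}
for time, impl in results:
    best[impl] = min(time, best.get(impl, time))

# Ratio of the fastest ref time to the fastest avx time.
speedup = best["ref"] / best["avx"]
print(f"fastest avx: {best['avx']}, fastest ref: {best['ref']}, "
      f"ratio: {speedup:.1f}x")
```

On these numbers the fastest T:avx build is about 296 times faster than the fastest T:ref build, while the spread among compiler options within each implementation is far smaller.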
Compiler output
mult.c: mult.c:147:22: error: invalid output size for constraint '=&x'
mult.c: MULSTEP_fromzero(0,h0,h1,h2,h3,h4)
mult.c: ^
mult.c: mult.c:149:26: error: invalid output size for constraint '+x'
mult.c: MULSTEP_noload(j + 1,h1,h2,h3,h4,h0)
mult.c: ^
mult.c: mult.c:150:26: error: invalid output size for constraint '+x'
mult.c: MULSTEP_noload(j + 2,h2,h3,h4,h0,h1)
mult.c: ^
mult.c: mult.c:151:26: error: invalid output size for constraint '+x'
mult.c: MULSTEP_noload(j + 3,h3,h4,h0,h1,h2)
mult.c: ^
mult.c: mult.c:152:26: error: invalid output size for constraint '+x'
mult.c: MULSTEP_noload(j + 4,h4,h0,h1,h2,h3)
mult.c: ^
mult.c: mult.c:153:26: error: invalid output size for constraint '+x'
mult.c: MULSTEP_noload(j + 5,h0,h1,h2,h3,h4)
mult.c: ^
mult.c: mult.c:155:24: error: invalid output size for constraint '+x'
mult.c: MULSTEP_noload(j + 1,h1,h2,h3,h4,h0)
mult.c: ^
mult.c: mult.c:156:24: error: invalid output size for constraint '+x'
mult.c: MULSTEP_noload(j + 2,h2,h3,h4,h0,h1)
mult.c: ^
mult.c: mult.c:157:24: error: invalid output size for constraint '+x'
mult.c: ...
Number of similar (implementation,compiler) pairs: 1, namely:
Implementation | Compiler |
T:avx | clang -mcpu=native -O3 -fwrapv -Qunused-arguments -fPIC -fPIE -gdwarf-4 -Wall (Ubuntu_Clang_14.0.0) |
Compiler output
mult.c: In function 'mult768_mix2_m256i',
mult.c: inlined from 'rq_mult' at mult.c:722:3:
mult.c: mult.c:568:3: warning: 'mult96x16' accessing 6144 bytes in a region of size 512 [-Wstringop-overflow=]
mult.c: 568 | mult96x16(hkara[12],fkara[6],(__m256i *) (1 + (__m128i *) gkara));
mult.c: | ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
mult.c: mult.c: In function 'rq_mult':
mult.c: mult.c:568:3: note: referencing argument 1 of type '__m256i *'
mult.c: In function 'mult768_mix2_m256i',
mult.c: inlined from 'rq_mult' at mult.c:722:3:
mult.c: mult.c:568:3: warning: 'mult96x16' reading 3072 bytes from a region of size 512 [-Wstringop-overread]
mult.c: mult.c: In function 'rq_mult':
mult.c: mult.c:568:3: note: referencing argument 2 of type 'const __m256i *'
mult.c: In function 'mult768_mix2_m256i',
mult.c: inlined from 'rq_mult' at mult.c:722:3:
mult.c: mult.c:568:3: warning: 'mult96x16' reading 3072 bytes from a region of size 3056 [-Wstringop-overread]
mult.c: mult.c: In function 'rq_mult':
mult.c: mult.c:568:3: note: referencing argument 3 of type 'const __m256i *'
mult.c: mult.c:279:13: note: in a call to function 'mult96x16'
mult.c: 279 | static void mult96x16(__m256i h[192],const __m256i f[96],const __m256i g[96])
mult.c: | ^~~~~~~~~
mult.c: In function 'mult768_mix2_m256i',
mult.c: inlined from 'rq_mult' at mult.c:722:3:
mult.c: mult.c:569:3: warning: 'mult96x16' accessing 6144 bytes in a region of size 512 [-Wstringop-overflow=]
mult.c: 569 | mult96x16(hkara[0],fkara[0],gkara[0]);
mult.c: | ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
mult.c: ...
Number of similar (implementation,compiler) pairs: 4, namely:
Implementation | Compiler |
T:avx | gcc -march=native -mtune=native -O2 -fwrapv -fPIC -fPIE -gdwarf-4 -Wall (11.4.0) |
T:avx | gcc -march=native -mtune=native -O3 -fwrapv -fPIC -fPIE -gdwarf-4 -Wall (11.4.0) |
T:avx | gcc -march=native -mtune=native -O -fwrapv -fPIC -fPIE -gdwarf-4 -Wall (11.4.0) |
T:avx | gcc -march=native -mtune=native -Os -fwrapv -fPIC -fPIE -gdwarf-4 -Wall (11.4.0) |