Implementation notes: amd64, wolfdale, crypto_kem/ntruhps4096821

Computer: wolfdale
Microarchitecture: amd64; Core 2 45nm (1067a)
Architecture: amd64
CPU ID: GenuineIntel-0001067a-bfebfbff
SUPERCOP version: 20240107
Operation: crypto_kem
Primitive: ntruhps4096821
Time | Object size | Test size | Implementation | Compiler | Benchmark date | SUPERCOP version
3894991 | 35962 0 0 | 129336 820 1720 | T:compact | clang_-march=native_-O3_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE | 20240107 | 20240107
4013561 | 23920 0 0 | 116472 820 1720 | T:compact | clang_-march=native_-O2_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE | 20240107 | 20240107
4108643 | 35962 0 0 | 135344 820 1720 | compact | clang_-march=native_-O3_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE | 20240107 | 20240107
4117736 | 23920 0 0 | 122480 820 1720 | compact | clang_-march=native_-O2_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE | 20240107 | 20240107
4338252 | 33426 0 0 | 132608 820 1720 | compact | clang_-mcpu=native_-O3_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE | 20240107 | 20240107
4534545 | 33426 0 0 | 126600 820 1720 | T:compact | clang_-mcpu=native_-O3_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE | 20240107 | 20240107
13831625 | 5550 0 0 | 96086 812 1720 | T:ref | clang_-march=native_-Os_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE | 20231228 | 20231212
13843070 | 18017 0 0 | 111576 820 1720 | T:ref | clang_-march=native_-O3_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE | 20231228 | 20231212
13846731 | 12753 0 0 | 105480 820 1720 | T:ref | clang_-march=native_-O2_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE | 20231228 | 20231212
13884841 | 18017 0 0 | 117584 820 1720 | ref | clang_-march=native_-O3_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE | 20231228 | 20231212
13885578 | 12753 0 0 | 111488 820 1720 | ref | clang_-march=native_-O2_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE | 20231228 | 20231212
13896514 | 5550 0 0 | 102094 812 1720 | ref | clang_-march=native_-Os_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE | 20231228 | 20231212
14241058 | 17860 0 0 | 111176 820 1720 | T:ref | clang_-mcpu=native_-O3_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE | 20231228 | 20231212
14319792 | 17860 0 0 | 117120 820 1720 | ref | clang_-mcpu=native_-O3_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE | 20231228 | 20231212
14350636 | 4840 0 0 | 95486 812 1720 | T:compact | clang_-march=native_-Os_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE | 20240107 | 20240107
14406038 | 4840 0 0 | 101494 812 1720 | compact | clang_-march=native_-Os_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE | 20240107 | 20240107
16761373 | 6352 0 0 | 97558 812 1720 | T:ref | clang_-march=native_-O_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE | 20231228 | 20231212
16831501 | 6352 0 0 | 103582 812 1720 | ref | clang_-march=native_-O_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE | 20231228 | 20231212
17298743 | 5808 0 0 | 97166 812 1720 | T:compact | clang_-march=native_-O_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE | 20240107 | 20240107
17361273 | 5808 0 0 | 103190 812 1720 | compact | clang_-march=native_-O_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE | 20240107 | 20240107

Compiler output

Implementation: avx2
Security model: constbranchindex
Compiler: clang -march=native -O2 -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE
poly_s3_inv.c: poly_s3_inv.c:416:11: error: always_inline function '_mm256_set_epi32' requires target feature 'avx', but would be inlined into function '__poly_S3_inv' that is compiled without support for 'avx'
poly_s3_inv.c: F0[0] = _mm256_set_epi32(-1,-1,-1,-1,-1,-1,-1,-1);
poly_s3_inv.c: ^
poly_s3_inv.c: poly_s3_inv.c:416:11: error: AVX vector return of type '__m256i' (vector of 4 'long long' values) without 'avx' enabled changes the ABI
poly_s3_inv.c: poly_s3_inv.c:417:11: error: always_inline function '_mm256_set_epi32' requires target feature 'avx', but would be inlined into function '__poly_S3_inv' that is compiled without support for 'avx'
poly_s3_inv.c: F0[1] = _mm256_set_epi32(-1,-1,-1,-1,-1,-1,-1,-1);
poly_s3_inv.c: ^
poly_s3_inv.c: poly_s3_inv.c:417:11: error: AVX vector return of type '__m256i' (vector of 4 'long long' values) without 'avx' enabled changes the ABI
poly_s3_inv.c: poly_s3_inv.c:418:11: error: always_inline function '_mm256_set_epi32' requires target feature 'avx', but would be inlined into function '__poly_S3_inv' that is compiled without support for 'avx'
poly_s3_inv.c: F0[2] = _mm256_set_epi32(-1,-1,-1,-1,-1,-1,-1,-1);
poly_s3_inv.c: ^
poly_s3_inv.c: poly_s3_inv.c:418:11: error: AVX vector return of type '__m256i' (vector of 4 'long long' values) without 'avx' enabled changes the ABI
poly_s3_inv.c: poly_s3_inv.c:419:11: error: always_inline function '_mm256_set_epi32' requires target feature 'avx', but would be inlined into function '__poly_S3_inv' that is compiled without support for 'avx'
poly_s3_inv.c: F0[3] = _mm256_set_epi32(0,8191,0,8191,0,8191,0,16383);
poly_s3_inv.c: ^
poly_s3_inv.c: poly_s3_inv.c:419:11: error: AVX vector return of type '__m256i' (vector of 4 'long long' values) without 'avx' enabled changes the ABI
poly_s3_inv.c: poly_s3_inv.c:420:11: error: always_inline function '_mm256_set1_epi32' requires target feature 'avx', but would be inlined into function '__poly_S3_inv' that is compiled without support for 'avx'
poly_s3_inv.c: F1[0] = _mm256_set1_epi32(0);
poly_s3_inv.c: ^
poly_s3_inv.c: poly_s3_inv.c:420:11: error: AVX vector return of type '__m256i' (vector of 4 'long long' values) without 'avx' enabled changes the ABI
poly_s3_inv.c: poly_s3_inv.c:421:11: error: always_inline function '_mm256_set1_epi32' requires target feature 'avx', but would be inlined into function '__poly_S3_inv' that is compiled without support for 'avx'
poly_s3_inv.c: F1[1] = _mm256_set1_epi32(0);
poly_s3_inv.c: ^
poly_s3_inv.c: poly_s3_inv.c:421:11: error: AVX vector return of type '__m256i' (vector of 4 'long long' values) without 'avx' enabled changes the ABI
poly_s3_inv.c: poly_s3_inv.c:422:11: error: always_inline function '_mm256_set1_epi32' requires target feature 'avx', but would be inlined into function '__poly_S3_inv' that is compiled without support for 'avx'
poly_s3_inv.c: ...

Number of similar (compiler,implementation) pairs: 10, namely:
Compiler Implementations
clang -march=native -O2 -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE avx2 T:avx2
clang -march=native -O3 -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE avx2 T:avx2
clang -march=native -O -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE avx2 T:avx2
clang -march=native -Os -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE avx2 T:avx2
clang -mcpu=native -O3 -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE avx2 T:avx2

Compiler output

Implementation: avx2
Security model: constbranchindex
Compiler: gcc -march=native -mtune=native -O2 -fomit-frame-pointer -fwrapv -fPIC -fPIE
poly_s3_inv.c: poly_s3_inv.c: In function '__poly_S3_inv':
poly_s3_inv.c: poly_s3_inv.c:416:9: warning: AVX vector return without AVX enabled changes the ABI [-Wpsabi]
poly_s3_inv.c: 416 | F0[0] = _mm256_set_epi32(-1,-1,-1,-1,-1,-1,-1,-1);
poly_s3_inv.c: | ~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
poly_s3_inv.c: poly_s3_inv.c: In function 'vec256_swap':
poly_s3_inv.c: poly_s3_inv.c:176:20: note: the ABI for passing parameters with 32-byte alignment has changed in GCC 4.6
poly_s3_inv.c: 176 | static inline void vec256_swap(vec256 *f,vec256 *g,int len,vec256 mask)
poly_s3_inv.c: | ^~~~~~~~~~~
poly_s3_inv.c: In file included from /usr/lib/gcc/x86_64-linux-gnu/10/include/immintrin.h:53,
poly_s3_inv.c: from poly.h:4,
poly_s3_inv.c: from poly_s3_inv.c:1:
poly_s3_inv.c: poly_s3_inv.c: In function 'vec256_frombits':
poly_s3_inv.c: /usr/lib/gcc/x86_64-linux-gnu/10/include/avx2intrin.h:597:1: error: inlining failed in call to 'always_inline' '_mm256_shuffle_epi32': target specific option mismatch
poly_s3_inv.c: 597 | _mm256_shuffle_epi32 (__m256i __A, const int __mask)
poly_s3_inv.c: | ^~~~~~~~~~~~~~~~~~~~
poly_s3_inv.c: poly_s3_inv.c:67:9: note: called from here
poly_s3_inv.c: 67 | h = _mm256_shuffle_epi32(h,0xd8);
poly_s3_inv.c: | ^~~~~~~~~~~~~~~~~~~~~~~~~~~~
poly_s3_inv.c: In file included from /usr/lib/gcc/x86_64-linux-gnu/10/include/immintrin.h:53,
poly_s3_inv.c: from poly.h:4,
poly_s3_inv.c: from poly_s3_inv.c:1:
poly_s3_inv.c: /usr/lib/gcc/x86_64-linux-gnu/10/include/avx2intrin.h:1068:1: error: inlining failed in call to 'always_inline' '_mm256_permute4x64_epi64': target specific option mismatch
poly_s3_inv.c: 1068 | _mm256_permute4x64_epi64 (__m256i __X, const int __M)
poly_s3_inv.c: | ^~~~~~~~~~~~~~~~~~~~~~~~
poly_s3_inv.c: poly_s3_inv.c:66:9: note: called from here
poly_s3_inv.c: ...

Number of similar (compiler,implementation) pairs: 8, namely:
Compiler Implementations
gcc -march=native -mtune=native -O2 -fomit-frame-pointer -fwrapv -fPIC -fPIE avx2 T:avx2
gcc -march=native -mtune=native -O3 -fomit-frame-pointer -fwrapv -fPIC -fPIE avx2 T:avx2
gcc -march=native -mtune=native -O -fomit-frame-pointer -fwrapv -fPIC -fPIE avx2 T:avx2
gcc -march=native -mtune=native -Os -fomit-frame-pointer -fwrapv -fPIC -fPIE avx2 T:avx2

Compiler output

Implementation: compact
Security model: constbranchindex
Compiler: gcc -march=native -mtune=native -O2 -fomit-frame-pointer -fwrapv -fPIC -fPIE
try.c: /usr/bin/ld: /home/supercop/benchmarking/supercop-20240107/supercop-data/wolfdale/amd64/lib/libsupercop.a(crypto_core_keccakf160064bits_optimized1600AsmX86_64_constbranchindex-KeccakP-1600-x86-64-gas.o): relocation R_X86_64_32S against `.text' can not be used when making a PIE object; recompile with -fPIE
try.c: collect2: error: ld returned 1 exit status

Number of similar (compiler,implementation) pairs: 8, namely:
Compiler Implementations
gcc -march=native -mtune=native -O2 -fomit-frame-pointer -fwrapv -fPIC -fPIE compact T:compact
gcc -march=native -mtune=native -O3 -fomit-frame-pointer -fwrapv -fPIC -fPIE compact T:compact
gcc -march=native -mtune=native -O -fomit-frame-pointer -fwrapv -fPIC -fPIE compact T:compact
gcc -march=native -mtune=native -Os -fomit-frame-pointer -fwrapv -fPIC -fPIE compact T:compact

Compiler output

Implementation: ref
Security model: constbranchindex
Compiler: gcc -march=native -mtune=native -O2 -fomit-frame-pointer -fwrapv -fPIC -fPIE
try.c: /usr/bin/ld: /home/supercop/benchmarking/supercop-20231212/supercop-data/wolfdale/amd64/lib/libsupercop.a(crypto_core_keccakf160064bits_optimized1600AsmX86_64_constbranchindex-KeccakP-1600-x86-64-gas.o): relocation R_X86_64_32S against `.text' can not be used when making a PIE object; recompile with -fPIE
try.c: collect2: error: ld returned 1 exit status

Number of similar (compiler,implementation) pairs: 8, namely:
Compiler Implementations
gcc -march=native -mtune=native -O2 -fomit-frame-pointer -fwrapv -fPIC -fPIE ref T:ref
gcc -march=native -mtune=native -O3 -fomit-frame-pointer -fwrapv -fPIC -fPIE ref T:ref
gcc -march=native -mtune=native -O -fomit-frame-pointer -fwrapv -fPIC -fPIE ref T:ref
gcc -march=native -mtune=native -Os -fomit-frame-pointer -fwrapv -fPIC -fPIE ref T:ref