Implementation notes: amd64, jasper, crypto_kem/ntruhps4096821

Computer: jasper
Microarchitecture: amd64; Tremont (906c0)
Architecture: amd64
CPU ID: GenuineIntel-000906c0-20-bfebfbff
SUPERCOP version: 20240107
Operation: crypto_kem
Primitive: ntruhps4096821
TimeObject sizeTest sizeImplementationCompilerBenchmark dateSUPERCOP version
184010330660 0 053041 876 1720T:compactclang_-march=native_-O3_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2023122920231222
185298519591 0 039729 876 1720T:compactclang_-march=native_-O2_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2023122920231222
188356630216 0 051976 868 1752T:compactgcc_-march=native_-mtune=native_-O3_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE2023122920231222
188415630216 0 059896 868 1752compactgcc_-march=native_-mtune=native_-O3_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE2023122920231222
188592930660 0 061025 876 1720compactclang_-march=native_-O3_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2023122920231222
190288219591 0 047649 876 1720compactclang_-march=native_-O2_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2023122920231222
212386233426 0 054537 876 1720T:compactclang_-mcpu=native_-O3_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2023122920231222
218293733426 0 062457 876 1720compactclang_-mcpu=native_-O3_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2023122920231222
165234245293 0 023368 860 1720T:refgcc_-march=native_-mtune=native_-Os_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE2023122120231212
167418285293 0 031352 860 1720refgcc_-march=native_-mtune=native_-Os_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE2023122120231212
174786493729 0 029896 860 1720compactgcc_-march=native_-mtune=native_-Os_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE2023122920231222
176087343729 0 021976 860 1720T:compactgcc_-march=native_-mtune=native_-Os_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE2023122920231222
1787203117860 0 039049 876 1720T:refclang_-mcpu=native_-O3_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2023122120231212
1788961417860 0 047033 876 1720refclang_-mcpu=native_-O3_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2023122120231212
1912638618543 0 040048 868 1752T:refgcc_-march=native_-mtune=native_-O3_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE2023122120231212
1913995211103 0 031345 876 1720T:refclang_-march=native_-O2_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2023122120231212
1914087116383 0 038977 876 1720T:refclang_-march=native_-O3_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2023122120231212
191538106236 0 026816 868 1752T:refgcc_-march=native_-mtune=native_-O2_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE2023122120231212
191759096420 0 025511 868 1720T:refclang_-march=native_-O_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2023122120231212
191768455760 0 025654 860 1752T:refgcc_-march=native_-mtune=native_-O_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE2023122120231212
1918258818543 0 048032 868 1752refgcc_-march=native_-mtune=native_-O3_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE2023122120231212
191835635670 0 024159 868 1720T:refclang_-march=native_-Os_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2023122120231212
1919081116383 0 046897 876 1720refclang_-march=native_-O3_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2023122120231212
192010346236 0 034800 868 1752refgcc_-march=native_-mtune=native_-O2_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE2023122120231212
1920888811103 0 039329 876 1720refclang_-march=native_-O2_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2023122120231212
192101086420 0 033495 868 1720refclang_-march=native_-O_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2023122120231212
192324265670 0 032079 868 1720refclang_-march=native_-Os_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2023122120231212
192403495760 0 033574 860 1752refgcc_-march=native_-mtune=native_-O_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE2023122120231212
196769605250 0 025896 868 1752T:compactgcc_-march=native_-mtune=native_-O2_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE2023122920231222
196996635866 0 025111 868 1720T:compactclang_-march=native_-O_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2023122920231222
197130945866 0 033095 868 1720compactclang_-march=native_-O_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2023122920231222
197133045250 0 033816 868 1752compactgcc_-march=native_-mtune=native_-O2_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE2023122920231222
197137564878 0 023439 868 1720T:compactclang_-march=native_-Os_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2023122920231222
197823764878 0 031359 868 1720compactclang_-march=native_-Os_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2023122920231222
199384524406 0 024422 860 1752T:compactgcc_-march=native_-mtune=native_-O_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE2023122920231222
199886394406 0 032406 860 1752compactgcc_-march=native_-mtune=native_-O_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE2023122920231222

Compiler output

Implementation: avx2
Security model: constbranchindex
Compiler: clang -march=native -O2 -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE
poly_s3_inv.c: poly_s3_inv.c:416:11: error: always_inline function '_mm256_set_epi32' requires target feature 'avx', but would be inlined into function '__poly_S3_inv' that is compiled without support for 'avx'
poly_s3_inv.c: F0[0] = _mm256_set_epi32(-1,-1,-1,-1,-1,-1,-1,-1);
poly_s3_inv.c: ^
poly_s3_inv.c: poly_s3_inv.c:416:11: error: AVX vector return of type '__m256i' (vector of 4 'long long' values) without 'avx' enabled changes the ABI
poly_s3_inv.c: poly_s3_inv.c:417:11: error: always_inline function '_mm256_set_epi32' requires target feature 'avx', but would be inlined into function '__poly_S3_inv' that is compiled without support for 'avx'
poly_s3_inv.c: F0[1] = _mm256_set_epi32(-1,-1,-1,-1,-1,-1,-1,-1);
poly_s3_inv.c: ^
poly_s3_inv.c: poly_s3_inv.c:417:11: error: AVX vector return of type '__m256i' (vector of 4 'long long' values) without 'avx' enabled changes the ABI
poly_s3_inv.c: poly_s3_inv.c:418:11: error: always_inline function '_mm256_set_epi32' requires target feature 'avx', but would be inlined into function '__poly_S3_inv' that is compiled without support for 'avx'
poly_s3_inv.c: F0[2] = _mm256_set_epi32(-1,-1,-1,-1,-1,-1,-1,-1);
poly_s3_inv.c: ^
poly_s3_inv.c: poly_s3_inv.c:418:11: error: AVX vector return of type '__m256i' (vector of 4 'long long' values) without 'avx' enabled changes the ABI
poly_s3_inv.c: poly_s3_inv.c:419:11: error: always_inline function '_mm256_set_epi32' requires target feature 'avx', but would be inlined into function '__poly_S3_inv' that is compiled without support for 'avx'
poly_s3_inv.c: F0[3] = _mm256_set_epi32(0,8191,0,8191,0,8191,0,16383);
poly_s3_inv.c: ^
poly_s3_inv.c: poly_s3_inv.c:419:11: error: AVX vector return of type '__m256i' (vector of 4 'long long' values) without 'avx' enabled changes the ABI
poly_s3_inv.c: poly_s3_inv.c:420:11: error: always_inline function '_mm256_set1_epi32' requires target feature 'avx', but would be inlined into function '__poly_S3_inv' that is compiled without support for 'avx'
poly_s3_inv.c: F1[0] = _mm256_set1_epi32(0);
poly_s3_inv.c: ^
poly_s3_inv.c: poly_s3_inv.c:420:11: error: AVX vector return of type '__m256i' (vector of 4 'long long' values) without 'avx' enabled changes the ABI
poly_s3_inv.c: poly_s3_inv.c:421:11: error: always_inline function '_mm256_set1_epi32' requires target feature 'avx', but would be inlined into function '__poly_S3_inv' that is compiled without support for 'avx'
poly_s3_inv.c: F1[1] = _mm256_set1_epi32(0);
poly_s3_inv.c: ^
poly_s3_inv.c: poly_s3_inv.c:421:11: error: AVX vector return of type '__m256i' (vector of 4 'long long' values) without 'avx' enabled changes the ABI
poly_s3_inv.c: poly_s3_inv.c:422:11: error: always_inline function '_mm256_set1_epi32' requires target feature 'avx', but would be inlined into function '__poly_S3_inv' that is compiled without support for 'avx'
poly_s3_inv.c: ...

Number of similar (compiler,implementation) pairs: 10, namely:
CompilerImplementations
clang -march=native -O2 -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE avx2 T:avx2
clang -march=native -O3 -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE avx2 T:avx2
clang -march=native -O -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE avx2 T:avx2
clang -march=native -Os -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE avx2 T:avx2
clang -mcpu=native -O3 -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE avx2 T:avx2

Compiler output

Implementation: avx2
Security model: constbranchindex
Compiler: gcc -march=native -mtune=native -O2 -fomit-frame-pointer -fwrapv -fPIC -fPIE
poly_s3_inv.c: poly_s3_inv.c: In function '__poly_S3_inv':
poly_s3_inv.c: poly_s3_inv.c:416:9: warning: AVX vector return without AVX enabled changes the ABI [-Wpsabi]
poly_s3_inv.c: 416 | F0[0] = _mm256_set_epi32(-1,-1,-1,-1,-1,-1,-1,-1);
poly_s3_inv.c: | ~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
poly_s3_inv.c: poly_s3_inv.c: In function 'vec256_swap':
poly_s3_inv.c: poly_s3_inv.c:176:20: note: the ABI for passing parameters with 32-byte alignment has changed in GCC 4.6
poly_s3_inv.c: 176 | static inline void vec256_swap(vec256 *f,vec256 *g,int len,vec256 mask)
poly_s3_inv.c: | ^~~~~~~~~~~
poly_s3_inv.c: In file included from /usr/lib/gcc/x86_64-linux-gnu/10/include/immintrin.h:53,
poly_s3_inv.c: from poly.h:4,
poly_s3_inv.c: from poly_s3_inv.c:1:
poly_s3_inv.c: poly_s3_inv.c: In function 'vec256_frombits':
poly_s3_inv.c: /usr/lib/gcc/x86_64-linux-gnu/10/include/avx2intrin.h:597:1: error: inlining failed in call to 'always_inline' '_mm256_shuffle_epi32': target specific option mismatch
poly_s3_inv.c: 597 | _mm256_shuffle_epi32 (__m256i __A, const int __mask)
poly_s3_inv.c: | ^~~~~~~~~~~~~~~~~~~~
poly_s3_inv.c: poly_s3_inv.c:67:9: note: called from here
poly_s3_inv.c: 67 | h = _mm256_shuffle_epi32(h,0xd8);
poly_s3_inv.c: | ^~~~~~~~~~~~~~~~~~~~~~~~~~~~
poly_s3_inv.c: In file included from /usr/lib/gcc/x86_64-linux-gnu/10/include/immintrin.h:53,
poly_s3_inv.c: from poly.h:4,
poly_s3_inv.c: from poly_s3_inv.c:1:
poly_s3_inv.c: /usr/lib/gcc/x86_64-linux-gnu/10/include/avx2intrin.h:1068:1: error: inlining failed in call to 'always_inline' '_mm256_permute4x64_epi64': target specific option mismatch
poly_s3_inv.c: 1068 | _mm256_permute4x64_epi64 (__m256i __X, const int __M)
poly_s3_inv.c: | ^~~~~~~~~~~~~~~~~~~~~~~~
poly_s3_inv.c: poly_s3_inv.c:66:9: note: called from here
poly_s3_inv.c: ...

Number of similar (compiler,implementation) pairs: 8, namely:
CompilerImplementations
gcc -march=native -mtune=native -O2 -fomit-frame-pointer -fwrapv -fPIC -fPIE avx2 T:avx2
gcc -march=native -mtune=native -O3 -fomit-frame-pointer -fwrapv -fPIC -fPIE avx2 T:avx2
gcc -march=native -mtune=native -O -fomit-frame-pointer -fwrapv -fPIC -fPIE avx2 T:avx2
gcc -march=native -mtune=native -Os -fomit-frame-pointer -fwrapv -fPIC -fPIE avx2 T:avx2