Implementation notes: amd64, sectionthirtyone, crypto_kem/kyber1024

Computer: sectionthirtyone
Architecture: amd64
CPU ID: GenuineIntel-000906e9-bfebfbff
SUPERCOP version: 20191221
Operation: crypto_kem
Primitive: kyber1024
TimeObject sizeTest sizeImplementationCompilerBenchmark dateSUPERCOP version
205328192459 0 0213016 792 1608avx2gcc_-march=native_-mtune=native_-O3_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE2020011220191221
332478143621 0 0163472 784 1576avx2clang_-march=native_-O_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2020011220191221
332612143621 0 0163472 784 1576avx2clang_-march=native_-O2_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2020011220191221
351477152802 0 0172656 784 1576avx2clang_-march=native_-O3_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2020011220191221
390473126700 0 0142818 776 1576avx2clang_-march=native_-Os_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2020011220191221
415380125941 0 0144016 792 1608avx2gcc_-march=native_-mtune=native_-O2_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE2020011220191221
442658124571 0 0141632 784 1576avx2gcc_-march=native_-mtune=native_-Os_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE2020011220191221
481257124877 0 0142848 792 1608avx2gcc_-march=native_-mtune=native_-O_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE2020011220191221
79423934512 512 054152 1312 1576refclang_-march=native_-O2_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2020011220191221
83981945933 512 066385 1304 1608refgcc_-march=native_-mtune=native_-O3_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE2020011220191221
87907732864 512 052648 1312 1576refclang_-mcpu=native_-O3_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2020011220191221
92281512082 512 029865 1304 1608refgcc_-march=native_-mtune=native_-O_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE2020011220191221
102301811326 512 028193 1296 1576refgcc_-march=native_-mtune=native_-Os_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE2020011220191221
120495734512 512 054152 1312 1576refclang_-march=native_-O_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2020011220191221
124860546855 512 066576 1312 1576refclang_-march=native_-O3_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2020011220191221
150715011427 512 027914 1304 1576refclang_-march=native_-Os_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2020011220191221
160538312658 512 030569 1304 1608refgcc_-march=native_-mtune=native_-O2_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE2020011220191221

Compiler output

Implementation: avx2
Security model: unknown
Compiler: clang -mcpu=native -O3 -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE
cbd.c: cbd.c:20:26: error: always_inline function '_mm256_set1_epi32' requires target feature 'avx', but would be inlined into function 'cbd' that is compiled without support for 'avx'
cbd.c: const __m256i mask55 = _mm256_set1_epi32(0x55555555);
cbd.c: ^
cbd.c: cbd.c:21:26: error: always_inline function '_mm256_set1_epi32' requires target feature 'avx', but would be inlined into function 'cbd' that is compiled without support for 'avx'
cbd.c: const __m256i mask33 = _mm256_set1_epi32(0x33333333);
cbd.c: ^
cbd.c: cbd.c:22:26: error: always_inline function '_mm256_set1_epi32' requires target feature 'avx', but would be inlined into function 'cbd' that is compiled without support for 'avx'
cbd.c: const __m256i mask03 = _mm256_set1_epi32(0x03030303);
cbd.c: ^
cbd.c: cbd.c:26:12: error: always_inline function '_mm256_loadu_si256' requires target feature 'avx', but would be inlined into function 'cbd' that is compiled without support for 'avx'
cbd.c: vec0 = _mm256_loadu_si256((__m256i *)&buf[32*i]);
cbd.c: ^
cbd.c: cbd.c:28:12: error: always_inline function '_mm256_srli_epi32' requires target feature 'avx2', but would be inlined into function 'cbd' that is compiled without support for 'avx2'
cbd.c: vec1 = _mm256_srli_epi32(vec0, 1);
cbd.c: ^
cbd.c: cbd.c:29:12: error: always_inline function '_mm256_and_si256' requires target feature 'avx2', but would be inlined into function 'cbd' that is compiled without support for 'avx2'
cbd.c: vec0 = _mm256_and_si256(mask55, vec0);
cbd.c: ^
cbd.c: cbd.c:30:12: error: always_inline function '_mm256_and_si256' requires target feature 'avx2', but would be inlined into function 'cbd' that is compiled without support for 'avx2'
cbd.c: vec1 = _mm256_and_si256(mask55, vec1);
cbd.c: ^
cbd.c: cbd.c:31:12: error: always_inline function '_mm256_add_epi32' requires target feature 'avx2', but would be inlined into function 'cbd' that is compiled without support for 'avx2'
cbd.c: vec0 = _mm256_add_epi32(vec0, vec1);
cbd.c: ^
cbd.c: cbd.c:33:12: error: always_inline function '_mm256_srli_epi32' requires target feature 'avx2', but would be inlined into function 'cbd' that is compiled without support for 'avx2'
cbd.c: ...

Number of similar (compiler,implementation) pairs: 1, namely:
CompilerImplementations
clang -mcpu=native -O3 -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE avx2

Namespace violations

Implementation: avx2
Security model: unknown
Compiler: clang -march=native -O2 -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE
KeccakP-1600-times4-SIMD256.o KeccakF1600times4_FastLoop_Absorb T
KeccakP-1600-times4-SIMD256.o KeccakP1600times4_12rounds_FastLoop_Absorb T
KeccakP-1600-times4-SIMD256.o KeccakP1600times4_AddBytes T
KeccakP-1600-times4-SIMD256.o KeccakP1600times4_AddLanesAll T
KeccakP-1600-times4-SIMD256.o KeccakP1600times4_ExtractAndAddBytes T
KeccakP-1600-times4-SIMD256.o KeccakP1600times4_ExtractAndAddLanesAll T
KeccakP-1600-times4-SIMD256.o KeccakP1600times4_ExtractBytes T
KeccakP-1600-times4-SIMD256.o KeccakP1600times4_ExtractLanesAll T
KeccakP-1600-times4-SIMD256.o KeccakP1600times4_InitializeAll T
KeccakP-1600-times4-SIMD256.o KeccakP1600times4_OverwriteBytes T
KeccakP-1600-times4-SIMD256.o KeccakP1600times4_OverwriteLanesAll T
KeccakP-1600-times4-SIMD256.o KeccakP1600times4_OverwriteWithZeroes T
KeccakP-1600-times4-SIMD256.o KeccakP1600times4_PermuteAll_12rounds T
KeccakP-1600-times4-SIMD256.o KeccakP1600times4_PermuteAll_24rounds T
basemul.o basemul_acc_avx T
basemul.o basemul_avx T
cbd.o cbd T
consts.o _16xfhi R
consts.o _16xflo R
consts.o _16xmask R
consts.o _16xmontsqhi R
consts.o _16xmontsqlo R
consts.o _16xq R
consts.o _16xqinv R
consts.o _16xv R
consts.o zetas_exp R
consts.o zetas_inv_exp R
fips202.o KeccakF1600_StatePermute T
fips202.o sha3_256 T
fips202.o sha3_512 T
fips202.o shake128_absorb T
fips202.o shake128_squeezeblocks T
fips202.o shake256 T
fips202x4.o kyber_shake128x4_absorb T
fips202x4.o shake128x4_squeezeblocks T
fips202x4.o shake256x4_prf T
fq.o csubq_avx T
fq.o frommont_avx T
fq.o reduce_avx T
indcpa.o gen_matrix T
indcpa.o indcpa_dec T
indcpa.o indcpa_enc T
indcpa.o indcpa_keypair T
invntt.o invntt_level6_avx T
invntt.o invntt_levels0t5_avx T
ntt.o ntt_level0_avx T
ntt.o ntt_levels1t6_avx T
poly.o poly_add T
poly.o poly_basemul T
poly.o poly_compress T
poly.o poly_csubq T
poly.o poly_decompress T
poly.o poly_frombytes T
poly.o poly_frommont T
poly.o poly_frommsg T
poly.o poly_getnoise T
poly.o poly_getnoise4x T
poly.o poly_invntt T
poly.o poly_ntt T
poly.o poly_nttunpack T
poly.o poly_reduce T
poly.o poly_sub T
poly.o poly_tobytes T
poly.o poly_tomsg T
polyvec.o polyvec_add T
polyvec.o polyvec_compress T
polyvec.o polyvec_csubq T
polyvec.o polyvec_decompress T
polyvec.o polyvec_frombytes T
polyvec.o polyvec_invntt T
polyvec.o polyvec_ntt T
polyvec.o polyvec_pointwise_acc T
polyvec.o polyvec_reduce T
polyvec.o polyvec_tobytes T
rejsample.o rej_uniform T
shuffle.o nttfrombytes_avx T
shuffle.o ntttobytes_avx T
shuffle.o nttunpack_avx T
symmetric-fips202.o kyber_shake128_absorb T
symmetric-fips202.o kyber_shake128_squeezeblocks T
symmetric-fips202.o shake256_prf T
verify.o cmov T
verify.o verify T

Number of similar (compiler,implementation) pairs: 8, namely:
CompilerImplementations
clang -march=native -O2 -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE avx2
clang -march=native -O3 -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE avx2
clang -march=native -O -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE avx2
clang -march=native -Os -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE avx2
gcc -march=native -mtune=native -O2 -fomit-frame-pointer -fwrapv -fPIC -fPIE avx2
gcc -march=native -mtune=native -O3 -fomit-frame-pointer -fwrapv -fPIC -fPIE avx2
gcc -march=native -mtune=native -O -fomit-frame-pointer -fwrapv -fPIC -fPIE avx2
gcc -march=native -mtune=native -Os -fomit-frame-pointer -fwrapv -fPIC -fPIE avx2

Namespace violations

Implementation: ref
Security model: unknown
Compiler: clang -march=native -O2 -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE
cbd.o cbd T
fips202.o KeccakF1600_StatePermute T
fips202.o sha3_256 T
fips202.o sha3_512 T
fips202.o shake128_absorb T
fips202.o shake128_squeezeblocks T
fips202.o shake256 T
indcpa.o gen_matrix T
indcpa.o indcpa_dec T
indcpa.o indcpa_enc T
indcpa.o indcpa_keypair T
ntt.o basemul T
ntt.o invntt T
ntt.o ntt T
ntt.o zetas D
ntt.o zetas_inv D
poly.o poly_add T
poly.o poly_basemul T
poly.o poly_compress T
poly.o poly_csubq T
poly.o poly_decompress T
poly.o poly_frombytes T
poly.o poly_frommont T
poly.o poly_frommsg T
poly.o poly_getnoise T
poly.o poly_invntt T
poly.o poly_ntt T
poly.o poly_reduce T
poly.o poly_sub T
poly.o poly_tobytes T
poly.o poly_tomsg T
polyvec.o polyvec_add T
polyvec.o polyvec_compress T
polyvec.o polyvec_csubq T
polyvec.o polyvec_decompress T
polyvec.o polyvec_frombytes T
polyvec.o polyvec_invntt T
polyvec.o polyvec_ntt T
polyvec.o polyvec_pointwise_acc T
polyvec.o polyvec_reduce T
polyvec.o polyvec_tobytes T
reduce.o barrett_reduce T
reduce.o csubq T
reduce.o montgomery_reduce T
symmetric-fips202.o kyber_shake128_absorb T
symmetric-fips202.o kyber_shake128_squeezeblocks T
symmetric-fips202.o shake256_prf T
verify.o cmov T
verify.o verify T

Number of similar (compiler,implementation) pairs: 9, namely:
CompilerImplementations
clang -march=native -O2 -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE ref
clang -march=native -O3 -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE ref
clang -march=native -O -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE ref
clang -march=native -Os -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE ref
clang -mcpu=native -O3 -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE ref
gcc -march=native -mtune=native -O2 -fomit-frame-pointer -fwrapv -fPIC -fPIE ref
gcc -march=native -mtune=native -O3 -fomit-frame-pointer -fwrapv -fPIC -fPIE ref
gcc -march=native -mtune=native -O -fomit-frame-pointer -fwrapv -fPIC -fPIE ref
gcc -march=native -mtune=native -Os -fomit-frame-pointer -fwrapv -fPIC -fPIE ref