Implementation notes: amd64, kizomba, crypto_kem/frodokem976

Computer: kizomba
Microarchitecture: amd64; Kaby Lake (906e9)
Architecture: amd64
CPU ID: GenuineIntel-000906e9-1fc9cbf5
SUPERCOP version: 20240808
Operation: crypto_kem
Primitive: frodokem976
TimeObject sizeTest sizeImplementationCompilerBenchmark dateSUPERCOP version
648618252796 0 871503 856 1608T:optimizedgcc_-march=native_-mtune=native_-O3_-fwrapv_-fPIC_-fPIE_-gdwarf-4_-Wall2024072020240716
6770427217987 0 071759 856 1608T:x64gcc_-march=native_-mtune=native_-O3_-fwrapv_-fPIC_-fPIE_-gdwarf-4_-Wall2024072020240716
736900651066 0 870542 880 1576T:optimizedclang_-march=native_-O2_-fwrapv_-Qunused-arguments_-fPIC_-fPIE_-gdwarf-4_-Wall2024072020240716
739438137656 0 856974 880 1640T:optimizedclang_-march=native_-O3_-fwrapv_-Qunused-arguments_-fPIC_-fPIE_-gdwarf-4_-Wall2024072020240716
7412575157823 0 070190 880 1576T:x64clang_-march=native_-O2_-fwrapv_-Qunused-arguments_-fPIC_-fPIE_-gdwarf-4_-Wall2024072020240716
7450468144429 0 057054 880 1640T:x64clang_-march=native_-O3_-fwrapv_-Qunused-arguments_-fPIC_-fPIE_-gdwarf-4_-Wall2024072020240716
907272226361 0 844846 880 1576T:optimizedclang_-mcpu=native_-O3_-fwrapv_-Qunused-arguments_-fPIC_-fPIE_-gdwarf-4_-Wall2024072020240716
10599960115287 0 031176 872 1640T:x64clang_-march=native_-Os_-fwrapv_-Qunused-arguments_-fPIC_-fPIE_-gdwarf-4_-Wall2024072020240716
1061892412757 0 831128 872 1640T:optimizedclang_-march=native_-Os_-fwrapv_-Qunused-arguments_-fPIC_-fPIE_-gdwarf-4_-Wall2024072020240716
11138887120602 0 033718 880 1576T:x64clang_-march=native_-O_-fwrapv_-Qunused-arguments_-fPIC_-fPIE_-gdwarf-4_-Wall2024072020240716
11415086120700 0 034495 856 1608T:x64gcc_-march=native_-mtune=native_-O2_-fwrapv_-fPIC_-fPIE_-gdwarf-4_-Wall2024072020240716
12260526114989 0 032431 856 1608T:x64gcc_-march=native_-mtune=native_-O_-fwrapv_-fPIC_-fPIE_-gdwarf-4_-Wall2024072020240716
13630812112621 0 028975 848 1576T:x64gcc_-march=native_-mtune=native_-Os_-fwrapv_-fPIC_-fPIE_-gdwarf-4_-Wall2024072020240716
2548458216010 0 833494 880 1576T:optimizedclang_-march=native_-O_-fwrapv_-Qunused-arguments_-fPIC_-fPIE_-gdwarf-4_-Wall2024072020240716
3228161348286 38 866967 904 1608T:referencegcc_-march=native_-mtune=native_-O3_-fwrapv_-fPIC_-fPIE_-gdwarf-4_-Wall2024072020240716
3404702615664 0 834263 856 1608T:optimizedgcc_-march=native_-mtune=native_-O2_-fwrapv_-fPIC_-fPIE_-gdwarf-4_-Wall2024072020240716
3567018014163 0 832327 856 1608T:optimizedgcc_-march=native_-mtune=native_-O_-fwrapv_-fPIC_-fPIE_-gdwarf-4_-Wall2024072020240716
4378939111599 0 828799 848 1576T:optimizedgcc_-march=native_-mtune=native_-Os_-fwrapv_-fPIC_-fPIE_-gdwarf-4_-Wall2024072020240716
4541308243576 24 863542 912 1640T:referenceclang_-march=native_-O3_-fwrapv_-Qunused-arguments_-fPIC_-fPIE_-gdwarf-4_-Wall2024072020240716
4547377253482 24 873574 912 1576T:referenceclang_-march=native_-O2_-fwrapv_-Qunused-arguments_-fPIC_-fPIE_-gdwarf-4_-Wall2024072020240716
4625452211907 24 830264 904 1640T:referenceclang_-march=native_-Os_-fwrapv_-Qunused-arguments_-fPIC_-fPIE_-gdwarf-4_-Wall2024072020240716
5265763715714 24 833190 912 1576T:referenceclang_-march=native_-O_-fwrapv_-Qunused-arguments_-fPIC_-fPIE_-gdwarf-4_-Wall2024072020240716
5296151425521 24 844334 912 1576T:referenceclang_-mcpu=native_-O3_-fwrapv_-Qunused-arguments_-fPIC_-fPIE_-gdwarf-4_-Wall2024072020240716
5706376814906 38 833511 904 1608T:referencegcc_-march=native_-mtune=native_-O2_-fwrapv_-fPIC_-fPIE_-gdwarf-4_-Wall2024072020240716
5920506813582 38 831743 904 1608T:referencegcc_-march=native_-mtune=native_-O_-fwrapv_-fPIC_-fPIE_-gdwarf-4_-Wall2024072020240716
7761858110986 38 828191 896 1576T:referencegcc_-march=native_-mtune=native_-Os_-fwrapv_-fPIC_-fPIE_-gdwarf-4_-Wall2024072020240716

Compiler output


fips202.c: fips202.c:428:39: warning: argument 1 of type 'uint64_t[25]' {aka 'long unsigned int[25]'} with mismatched bound [-Warray-parameter=]
fips202.c:   428 | void cshake128_simple_absorb(uint64_t s[25], uint16_t cstm, const unsigned char *in, unsigned long long inlen)
fips202.c:       |                              ~~~~~~~~~^~~~~
fips202.c: In file included from fips202.c:16:
fips202.c: fips202.h:14:40: note: previously declared as 'uint64_t *' {aka 'long unsigned int *'}
fips202.c:    14 | void cshake128_simple_absorb(uint64_t *s, uint16_t cstm, const unsigned char *in, unsigned long long inlen);
fips202.c:       |                              ~~~~~~~~~~^
fips202.c: fips202.c:524:39: warning: argument 1 of type 'uint64_t[25]' {aka 'long unsigned int[25]'} with mismatched bound [-Warray-parameter=]
fips202.c:   524 | void cshake256_simple_absorb(uint64_t s[25], uint16_t cstm, const unsigned char *in, unsigned long long inlen)
fips202.c:       |                              ~~~~~~~~~^~~~~
fips202.c: In file included from fips202.c:16:
fips202.c: fips202.h:22:40: note: previously declared as 'uint64_t *' {aka 'long unsigned int *'}
fips202.c:    22 | void cshake256_simple_absorb(uint64_t *s, uint16_t cstm, const unsigned char *in, unsigned long long inlen);
fips202.c:       |                              ~~~~~~~~~~^

Number of similar (implementation,compiler) pairs: 12, namely:
ImplementationCompiler
T:optimizedgcc -march=native -mtune=native -O2 -fwrapv -fPIC -fPIE -gdwarf-4 -Wall (11.4.0)
T:optimizedgcc -march=native -mtune=native -O3 -fwrapv -fPIC -fPIE -gdwarf-4 -Wall (11.4.0)
T:optimizedgcc -march=native -mtune=native -O -fwrapv -fPIC -fPIE -gdwarf-4 -Wall (11.4.0)
T:optimizedgcc -march=native -mtune=native -Os -fwrapv -fPIC -fPIE -gdwarf-4 -Wall (11.4.0)
T:referencegcc -march=native -mtune=native -O2 -fwrapv -fPIC -fPIE -gdwarf-4 -Wall (11.4.0)
T:referencegcc -march=native -mtune=native -O3 -fwrapv -fPIC -fPIE -gdwarf-4 -Wall (11.4.0)
T:referencegcc -march=native -mtune=native -O -fwrapv -fPIC -fPIE -gdwarf-4 -Wall (11.4.0)
T:referencegcc -march=native -mtune=native -Os -fwrapv -fPIC -fPIE -gdwarf-4 -Wall (11.4.0)
T:x64gcc -march=native -mtune=native -O2 -fwrapv -fPIC -fPIE -gdwarf-4 -Wall (11.4.0)
T:x64gcc -march=native -mtune=native -O3 -fwrapv -fPIC -fPIE -gdwarf-4 -Wall (11.4.0)
T:x64gcc -march=native -mtune=native -O -fwrapv -fPIC -fPIE -gdwarf-4 -Wall (11.4.0)
T:x64gcc -march=native -mtune=native -Os -fwrapv -fPIC -fPIE -gdwarf-4 -Wall (11.4.0)

Compiler output


KeccakP-1600-times4-SIMD256.c: KeccakP-1600-times4-SIMD256.c:146:9: error: always_inline function '_mm256_loadu_si256' requires target feature 'avx', but would be inlined into function 'KeccakP1600times4_AddLanesAll' that is compiled without support for 'avx'
KeccakP-1600-times4-SIMD256.c:         Xor_In4( 0 );
KeccakP-1600-times4-SIMD256.c:         ^
KeccakP-1600-times4-SIMD256.c: KeccakP-1600-times4-SIMD256.c:135:42: note: expanded from macro 'Xor_In4'
KeccakP-1600-times4-SIMD256.c:     #define Xor_In4( argIndex ) lanes0 = LOAD256u( curData0[argIndex]),\
KeccakP-1600-times4-SIMD256.c:                                          ^
KeccakP-1600-times4-SIMD256.c: KeccakP-1600-times4-SIMD256.c:44:37: note: expanded from macro 'LOAD256u'
KeccakP-1600-times4-SIMD256.c:     #define LOAD256u(a)             _mm256_loadu_si256((const V256 *)&(a))
KeccakP-1600-times4-SIMD256.c:                                     ^
KeccakP-1600-times4-SIMD256.c: KeccakP-1600-times4-SIMD256.c:146:9: error: AVX vector return of type '__m256i' (vector of 4 'long long' values) without 'avx' enabled changes the ABI
KeccakP-1600-times4-SIMD256.c: KeccakP-1600-times4-SIMD256.c:135:42: note: expanded from macro 'Xor_In4'
KeccakP-1600-times4-SIMD256.c:     #define Xor_In4( argIndex ) lanes0 = LOAD256u( curData0[argIndex]),\
KeccakP-1600-times4-SIMD256.c:                                          ^
KeccakP-1600-times4-SIMD256.c: KeccakP-1600-times4-SIMD256.c:44:37: note: expanded from macro 'LOAD256u'
KeccakP-1600-times4-SIMD256.c:     #define LOAD256u(a)             _mm256_loadu_si256((const V256 *)&(a))
KeccakP-1600-times4-SIMD256.c:                                     ^
KeccakP-1600-times4-SIMD256.c: KeccakP-1600-times4-SIMD256.c:146:9: error: always_inline function '_mm256_loadu_si256' requires target feature 'avx', but would be inlined into function 'KeccakP1600times4_AddLanesAll' that is compiled without support for 'avx'
KeccakP-1600-times4-SIMD256.c: KeccakP-1600-times4-SIMD256.c:136:42: note: expanded from macro 'Xor_In4'
KeccakP-1600-times4-SIMD256.c:                                 lanes1 = LOAD256u( curData1[argIndex]),\
KeccakP-1600-times4-SIMD256.c:                                          ^
KeccakP-1600-times4-SIMD256.c: KeccakP-1600-times4-SIMD256.c:44:37: note: expanded from macro 'LOAD256u'
KeccakP-1600-times4-SIMD256.c:     #define LOAD256u(a)             _mm256_loadu_si256((const V256 *)&(a))
KeccakP-1600-times4-SIMD256.c:                                     ^
KeccakP-1600-times4-SIMD256.c: KeccakP-1600-times4-SIMD256.c:146:9: error: AVX vector return of type '__m256i' (vector of 4 'long long' values) without 'avx' enabled changes the ABI
KeccakP-1600-times4-SIMD256.c: KeccakP-1600-times4-SIMD256.c:136:42: note: expanded from macro 'Xor_In4'
KeccakP-1600-times4-SIMD256.c: ...

Number of similar (implementation,compiler) pairs: 1, namely:
ImplementationCompiler
T:x64clang -mcpu=native -O3 -fwrapv -Qunused-arguments -fPIC -fPIE -gdwarf-4 -Wall (Ubuntu_Clang_14.0.0)

Namespace violations


aes.o AES128_free_schedule T
aes.o AES256_free_schedule T
aes.o handleErrors T
aes_c.o aes128_enc_c T
aes_c.o aes128_load_schedule_c T
aes_c.o aes256_enc_c T
aes_c.o aes256_load_schedule_c T
fips202.o KeccakF1600_StatePermute T
fips202.o cshake128_simple T
fips202.o cshake128_simple_absorb T
fips202.o cshake128_simple_squeezeblocks T
fips202.o cshake256_simple T
fips202.o cshake256_simple_absorb T
fips202.o cshake256_simple_squeezeblocks T
fips202.o shake128 T
fips202.o shake128_absorb T
fips202.o shake128_squeezeblocks T
fips202.o shake256 T
fips202.o shake256_absorb T
fips202.o shake256_squeezeblocks T
frodo976.o CDF_TABLE R
frodo976.o CDF_TABLE_LEN R
frodo976.o frodo_add T
frodo976.o frodo_key_decode T
frodo976.o frodo_key_encode T
frodo976.o frodo_mul_add_as_plus_e T
frodo976.o frodo_mul_add_sa_plus_e T
frodo976.o frodo_mul_add_sb_plus_e T
frodo976.o frodo_mul_bs T
frodo976.o frodo_sample_n T
frodo976.o frodo_sub T
util.o clear_words T
util.o frodo_pack T
util.o frodo_unpack T

Number of similar (implementation,compiler) pairs: 9, namely:
ImplementationCompiler
T:optimizedclang -march=native -O2 -fwrapv -Qunused-arguments -fPIC -fPIE -gdwarf-4 -Wall (Ubuntu_Clang_14.0.0)
T:optimizedclang -march=native -O3 -fwrapv -Qunused-arguments -fPIC -fPIE -gdwarf-4 -Wall (Ubuntu_Clang_14.0.0)
T:optimizedclang -march=native -O -fwrapv -Qunused-arguments -fPIC -fPIE -gdwarf-4 -Wall (Ubuntu_Clang_14.0.0)
T:optimizedclang -march=native -Os -fwrapv -Qunused-arguments -fPIC -fPIE -gdwarf-4 -Wall (Ubuntu_Clang_14.0.0)
T:optimizedclang -mcpu=native -O3 -fwrapv -Qunused-arguments -fPIC -fPIE -gdwarf-4 -Wall (Ubuntu_Clang_14.0.0)
T:optimizedgcc -march=native -mtune=native -O2 -fwrapv -fPIC -fPIE -gdwarf-4 -Wall (11.4.0)
T:optimizedgcc -march=native -mtune=native -O3 -fwrapv -fPIC -fPIE -gdwarf-4 -Wall (11.4.0)
T:optimizedgcc -march=native -mtune=native -O -fwrapv -fPIC -fPIE -gdwarf-4 -Wall (11.4.0)
T:optimizedgcc -march=native -mtune=native -Os -fwrapv -fPIC -fPIE -gdwarf-4 -Wall (11.4.0)

Namespace violations


aes.o AES128_free_schedule T
aes.o AES256_free_schedule T
aes.o handleErrors T
aes_c.o aes128_enc_c T
aes_c.o aes128_load_schedule_c T
aes_c.o aes256_enc_c T
aes_c.o aes256_load_schedule_c T
fips202.o KeccakF1600_StatePermute T
fips202.o cshake128_simple T
fips202.o cshake128_simple_absorb T
fips202.o cshake128_simple_squeezeblocks T
fips202.o cshake256_simple T
fips202.o cshake256_simple_absorb T
fips202.o cshake256_simple_squeezeblocks T
fips202.o shake128 T
fips202.o shake128_absorb T
fips202.o shake128_squeezeblocks T
fips202.o shake256 T
fips202.o shake256_absorb T
fips202.o shake256_squeezeblocks T
frodo976.o CDF_TABLE D
frodo976.o CDF_TABLE_LEN D
frodo976.o frodo_add T
frodo976.o frodo_key_decode T
frodo976.o frodo_key_encode T
frodo976.o frodo_mul_add_as_plus_e T
frodo976.o frodo_mul_add_sa_plus_e T
frodo976.o frodo_mul_add_sb_plus_e T
frodo976.o frodo_mul_bs T
frodo976.o frodo_sample_n T
frodo976.o frodo_sub T
util.o clear_words T
util.o frodo_pack T
util.o frodo_unpack T

Number of similar (implementation,compiler) pairs: 9, namely:
ImplementationCompiler
T:referenceclang -march=native -O2 -fwrapv -Qunused-arguments -fPIC -fPIE -gdwarf-4 -Wall (Ubuntu_Clang_14.0.0)
T:referenceclang -march=native -O3 -fwrapv -Qunused-arguments -fPIC -fPIE -gdwarf-4 -Wall (Ubuntu_Clang_14.0.0)
T:referenceclang -march=native -O -fwrapv -Qunused-arguments -fPIC -fPIE -gdwarf-4 -Wall (Ubuntu_Clang_14.0.0)
T:referenceclang -march=native -Os -fwrapv -Qunused-arguments -fPIC -fPIE -gdwarf-4 -Wall (Ubuntu_Clang_14.0.0)
T:referenceclang -mcpu=native -O3 -fwrapv -Qunused-arguments -fPIC -fPIE -gdwarf-4 -Wall (Ubuntu_Clang_14.0.0)
T:referencegcc -march=native -mtune=native -O2 -fwrapv -fPIC -fPIE -gdwarf-4 -Wall (11.4.0)
T:referencegcc -march=native -mtune=native -O3 -fwrapv -fPIC -fPIE -gdwarf-4 -Wall (11.4.0)
T:referencegcc -march=native -mtune=native -O -fwrapv -fPIC -fPIE -gdwarf-4 -Wall (11.4.0)
T:referencegcc -march=native -mtune=native -Os -fwrapv -fPIC -fPIE -gdwarf-4 -Wall (11.4.0)

Namespace violations


KeccakP-1600-times4-SIMD256.o KeccakF1600times4_FastLoop_Absorb T
KeccakP-1600-times4-SIMD256.o KeccakP1600times4_12rounds_FastLoop_Absorb T
KeccakP-1600-times4-SIMD256.o KeccakP1600times4_AddBytes T
KeccakP-1600-times4-SIMD256.o KeccakP1600times4_AddLanesAll T
KeccakP-1600-times4-SIMD256.o KeccakP1600times4_ExtractAndAddBytes T
KeccakP-1600-times4-SIMD256.o KeccakP1600times4_ExtractAndAddLanesAll T
KeccakP-1600-times4-SIMD256.o KeccakP1600times4_ExtractBytes T
KeccakP-1600-times4-SIMD256.o KeccakP1600times4_ExtractLanesAll T
KeccakP-1600-times4-SIMD256.o KeccakP1600times4_InitializeAll T
KeccakP-1600-times4-SIMD256.o KeccakP1600times4_OverwriteBytes T
KeccakP-1600-times4-SIMD256.o KeccakP1600times4_OverwriteLanesAll T
KeccakP-1600-times4-SIMD256.o KeccakP1600times4_OverwriteWithZeroes T
KeccakP-1600-times4-SIMD256.o KeccakP1600times4_PermuteAll_12rounds T
KeccakP-1600-times4-SIMD256.o KeccakP1600times4_PermuteAll_24rounds T
aes.o AES_free_schedule T
aes.o handleErrors T
aes_ni.o aes128_dec_ni T
aes_ni.o aes128_enc_ni T
aes_ni.o aes128_load_schedule_ni T
aes_ni.o aes256_dec_ni T
aes_ni.o aes256_enc_ni T
aes_ni.o aes256_load_schedule_ni T
aes_ni.o aes_free_schedule_ni T
fips202.o KeccakF1600_StatePermute T
fips202.o cshake128_simple T
fips202.o cshake128_simple_absorb T
fips202.o cshake128_simple_squeezeblocks T
fips202.o cshake256_simple T
fips202.o cshake256_simple_absorb T
fips202.o cshake256_simple_squeezeblocks T
fips202.o shake128 T
fips202.o shake128_absorb T
fips202.o shake128_squeezeblocks T
fips202.o shake256 T
fips202.o shake256_absorb T
fips202.o shake256_squeezeblocks T
fips202x4.o cshake128_simple4x T
fips202x4.o cshake128_simple_absorb4x T
fips202x4.o cshake128_simple_squeezeblocks4x T
fips202x4.o cshake256_simple4x T
fips202x4.o cshake256_simple_absorb4x T
fips202x4.o cshake256_simple_squeezeblocks4x T
frodo976.o CDF_TABLE R
frodo976.o CDF_TABLE_LEN R
frodo976.o frodo_add T
frodo976.o frodo_key_decode T
frodo976.o frodo_key_encode T
frodo976.o frodo_mul_add_as_plus_e T
frodo976.o frodo_mul_add_sa_plus_e T
frodo976.o frodo_mul_add_sb_plus_e T
frodo976.o frodo_mul_bs T
frodo976.o frodo_sample_n T
frodo976.o frodo_sub T
util.o clear_words T
util.o frodo_pack T
util.o frodo_unpack T

Number of similar (implementation,compiler) pairs: 8, namely:
ImplementationCompiler
T:x64clang -march=native -O2 -fwrapv -Qunused-arguments -fPIC -fPIE -gdwarf-4 -Wall (Ubuntu_Clang_14.0.0)
T:x64clang -march=native -O3 -fwrapv -Qunused-arguments -fPIC -fPIE -gdwarf-4 -Wall (Ubuntu_Clang_14.0.0)
T:x64clang -march=native -O -fwrapv -Qunused-arguments -fPIC -fPIE -gdwarf-4 -Wall (Ubuntu_Clang_14.0.0)
T:x64clang -march=native -Os -fwrapv -Qunused-arguments -fPIC -fPIE -gdwarf-4 -Wall (Ubuntu_Clang_14.0.0)
T:x64gcc -march=native -mtune=native -O2 -fwrapv -fPIC -fPIE -gdwarf-4 -Wall (11.4.0)
T:x64gcc -march=native -mtune=native -O3 -fwrapv -fPIC -fPIE -gdwarf-4 -Wall (11.4.0)
T:x64gcc -march=native -mtune=native -O -fwrapv -fPIC -fPIE -gdwarf-4 -Wall (11.4.0)
T:x64gcc -march=native -mtune=native -Os -fwrapv -fPIC -fPIE -gdwarf-4 -Wall (11.4.0)