Implementation notes: amd64, scw1b63b1, crypto_sign/dilithium4aes

Computer: scw1b63b1
Architecture: amd64
CPU ID: GenuineIntel-000506f1-0f8bfbff
SUPERCOP version: 20191017
Operation: crypto_sign
Primitive: dilithium4aes
TimeObject sizeTest sizeImplementationCompilerBenchmark dateSUPERCOP version
400830231156 0 050161 792 1600refclang_-march=native_-O3_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2019121720191017
401588631140 0 049321 792 1600refclang_-march=native_-O2_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2019121720191017
401884631140 0 049321 792 1600refclang_-march=native_-O_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2019121720191017
404450668040 0 088692 816 1632refgcc_-march=native_-mtune=native_-O3_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE2019121720191017
410384031845 0 050785 792 1600refclang_-mcpu=native_-O3_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2019121720191017
413451021339 0 038555 784 1600refclang_-march=native_-Os_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2019121720191017
434481224970 0 043748 816 1632refgcc_-march=native_-mtune=native_-O2_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE2019121720191017
445884623811 0 042332 816 1632refgcc_-march=native_-mtune=native_-O_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE2019121720191017
484462221998 0 039380 808 1600refgcc_-march=native_-mtune=native_-Os_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE2019121720191017

Compiler output

Implementation: avx2
Security model: unknown
Compiler: clang -march=native -O2 -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE
fips202x4.c: fips202x4.c:52:12: error: always_inline function '_mm256_xor_si256' requires target feature 'xsave', but would be inlined into function 'keccak_absorb4x' that is compiled without support for 'xsave'
fips202x4.c: s[i] = _mm256_xor_si256(s[i], s[i]);
fips202x4.c: ^
fips202x4.c: 1 error generated.

Number of similar (compiler,implementation) pairs: 4, namely:
CompilerImplementations
clang -march=native -O2 -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE avx2
clang -march=native -O3 -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE avx2
clang -march=native -O -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE avx2
clang -march=native -Os -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE avx2

Compiler output

Implementation: avx2
Security model: unknown
Compiler: clang -mcpu=native -O3 -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE
aes256ctr.c: aes256ctr.c:119:3: error: '__builtin_ia32_aeskeygenassist128' needs target feature aes
aes256ctr.c: BLOCK1(0x01);
aes256ctr.c: ^
aes256ctr.c: aes256ctr.c:100:11: note: expanded from macro 'BLOCK1'
aes256ctr.c: temp1 = _mm_aeskeygenassist_si128(temp2, IMM); ^
aes256ctr.c: /usr/lib/llvm-3.8/bin/../lib/clang/3.8.1/include/__wmmintrin_aes.h:62:12: note: expanded from macro '_mm_aeskeygenassist_si128'
aes256ctr.c: (__m128i)__builtin_ia32_aeskeygenassist128((__v2di)(__m128i)(C), (int)(R))
aes256ctr.c: ^
aes256ctr.c: aes256ctr.c:120:3: error: '__builtin_ia32_aeskeygenassist128' needs target feature aes
aes256ctr.c: ...
aes256ctr.c: aes256ctr.c:137:3: error: '__builtin_ia32_aeskeygenassist128' needs target feature aes
aes256ctr.c: BLOCK1(0x40);
aes256ctr.c: ^
aes256ctr.c: aes256ctr.c:100:11: note: expanded from macro 'BLOCK1'
aes256ctr.c: temp1 = _mm_aeskeygenassist_si128(temp2, IMM); ^
aes256ctr.c: /usr/lib/llvm-3.8/bin/../lib/clang/3.8.1/include/__wmmintrin_aes.h:62:12: note: expanded from macro '_mm_aeskeygenassist_si128'
aes256ctr.c: (__m128i)__builtin_ia32_aeskeygenassist128((__v2di)(__m128i)(C), (int)(R))
aes256ctr.c: ^
aes256ctr.c: 13 errors generated.

Number of similar (compiler,implementation) pairs: 1, namely:
CompilerImplementations
clang -mcpu=native -O3 -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE avx2

Compiler output

Implementation: avx2
Security model: unknown
Compiler: gcc -march=native -mtune=native -O2 -fomit-frame-pointer -fwrapv -fPIC -fPIE
fips202x4.c: fips202x4.c: In function ‘keccak_absorb4x’:
fips202x4.c: fips202x4.c:52:10: warning: AVX vector return without AVX enabled changes the ABI [-Wpsabi]
fips202x4.c: s[i] = _mm256_xor_si256(s[i], s[i]);
fips202x4.c: ~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
fips202x4.c: In file included from /usr/lib/gcc/x86_64-linux-gnu/6/include/immintrin.h:43:0,
fips202x4.c: from fips202x4.c:2:
fips202x4.c: /usr/lib/gcc/x86_64-linux-gnu/6/include/avx2intrin.h:913:1: error: inlining failed in call to always_inline ‘_mm256_xor_si256’: target specific option mismatch
fips202x4.c: _mm256_xor_si256 (__m256i __A, __m256i __B)
fips202x4.c: ^~~~~~~~~~~~~~~~
fips202x4.c: fips202x4.c:52:12: note: called from here
fips202x4.c: s[i] = _mm256_xor_si256(s[i], s[i]);
fips202x4.c: ^~~~~~~~~~~~~~~~~~~~~~~~~~~~
fips202x4.c: In file included from /usr/lib/gcc/x86_64-linux-gnu/6/include/immintrin.h:43:0,
fips202x4.c: from fips202x4.c:2:
fips202x4.c: /usr/lib/gcc/x86_64-linux-gnu/6/include/avx2intrin.h:913:1: error: inlining failed in call to always_inline ‘_mm256_xor_si256’: target specific option mismatch
fips202x4.c: _mm256_xor_si256 (__m256i __A, __m256i __B)
fips202x4.c: ^~~~~~~~~~~~~~~~
fips202x4.c: fips202x4.c:52:12: note: called from here
fips202x4.c: s[i] = _mm256_xor_si256(s[i], s[i]);
fips202x4.c: ^~~~~~~~~~~~~~~~~~~~~~~~~~~~

Number of similar (compiler,implementation) pairs: 2, namely:
CompilerImplementations
gcc -march=native -mtune=native -O2 -fomit-frame-pointer -fwrapv -fPIC -fPIE avx2
gcc -march=native -mtune=native -O3 -fomit-frame-pointer -fwrapv -fPIC -fPIE avx2

Compiler output

Implementation: avx2
Security model: unknown
Compiler: gcc -march=native -mtune=native -O -fomit-frame-pointer -fwrapv -fPIC -fPIE
fips202x4.c: fips202x4.c: In function ‘keccak_absorb4x’:
fips202x4.c: fips202x4.c:52:10: warning: AVX vector return without AVX enabled changes the ABI [-Wpsabi]
fips202x4.c: s[i] = _mm256_xor_si256(s[i], s[i]);
fips202x4.c: ~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
fips202x4.c: In file included from /usr/lib/gcc/x86_64-linux-gnu/6/include/immintrin.h:43:0,
fips202x4.c: from fips202x4.c:2:
fips202x4.c: /usr/lib/gcc/x86_64-linux-gnu/6/include/avx2intrin.h:913:1: error: inlining failed in call to always_inline ‘_mm256_xor_si256’: target specific option mismatch
fips202x4.c: _mm256_xor_si256 (__m256i __A, __m256i __B)
fips202x4.c: ^~~~~~~~~~~~~~~~
fips202x4.c: fips202x4.c:52:12: note: called from here
fips202x4.c: s[i] = _mm256_xor_si256(s[i], s[i]);
fips202x4.c: ^~~~~~~~~~~~~~~~~~~~~~~~~~~~

Number of similar (compiler,implementation) pairs: 2, namely:
CompilerImplementations
gcc -march=native -mtune=native -O -fomit-frame-pointer -fwrapv -fPIC -fPIE avx2
gcc -march=native -mtune=native -Os -fomit-frame-pointer -fwrapv -fPIC -fPIE avx2

Namespace violations

Implementation: ref
Security model: unknown
Compiler: clang -march=native -O2 -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE
aes256ctr.o aes256_prf T
aes256ctr.o aes256ctr_init T
aes256ctr.o aes256ctr_squeezeblocks T
aes256ctr.o br_range_enc32le T
fips202.o shake128 T
fips202.o shake128_absorb T
fips202.o shake128_squeezeblocks T
fips202.o shake128_stream_init T
fips202.o shake256 T
fips202.o shake256_absorb T
fips202.o shake256_squeezeblocks T
fips202.o shake256_stream_init T
ntt.o invntt_frominvmont T
ntt.o ntt T
packing.o pack_pk T
packing.o pack_sig T
packing.o pack_sk T
packing.o unpack_pk T
packing.o unpack_sig T
packing.o unpack_sk T
poly.o poly_add T
poly.o poly_chknorm T
poly.o poly_csubq T
poly.o poly_decompose T
poly.o poly_freeze T
poly.o poly_invntt_montgomery T
poly.o poly_make_hint T
poly.o poly_ntt T
poly.o poly_pointwise_invmontgomery T
poly.o poly_power2round T
poly.o poly_reduce T
poly.o poly_shiftl T
poly.o poly_sub T
poly.o poly_uniform T
poly.o poly_uniform_eta T
poly.o poly_uniform_gamma1m1 T
poly.o poly_use_hint T
poly.o polyeta_pack T
poly.o polyeta_unpack T
poly.o polyt0_pack T
poly.o polyt0_unpack T
poly.o polyt1_pack T
poly.o polyt1_unpack T
poly.o polyw1_pack T
poly.o polyz_pack T
poly.o polyz_unpack T
polyvec.o polyveck_add T
polyvec.o polyveck_chknorm T
polyvec.o polyveck_csubq T
polyvec.o polyveck_decompose T
polyvec.o polyveck_freeze T
polyvec.o polyveck_invntt_montgomery T
polyvec.o polyveck_make_hint T
polyvec.o polyveck_ntt T
polyvec.o polyveck_power2round T
polyvec.o polyveck_reduce T
polyvec.o polyveck_shiftl T
polyvec.o polyveck_sub T
polyvec.o polyveck_use_hint T
polyvec.o polyvecl_add T
polyvec.o polyvecl_chknorm T
polyvec.o polyvecl_freeze T
polyvec.o polyvecl_ntt T
polyvec.o polyvecl_pointwise_acc_invmontgomery T
reduce.o csubq T
reduce.o freeze T
reduce.o montgomery_reduce T
reduce.o reduce32 T
rounding.o decompose T
rounding.o make_hint T
rounding.o power2round T
rounding.o use_hint T
sign.o challenge T
sign.o expand_mat T

Number of similar (compiler,implementation) pairs: 9, namely:
CompilerImplementations
clang -march=native -O2 -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE ref
clang -march=native -O3 -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE ref
clang -march=native -O -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE ref
clang -march=native -Os -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE ref
clang -mcpu=native -O3 -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE ref
gcc -march=native -mtune=native -O2 -fomit-frame-pointer -fwrapv -fPIC -fPIE ref
gcc -march=native -mtune=native -O3 -fomit-frame-pointer -fwrapv -fPIC -fPIE ref
gcc -march=native -mtune=native -O -fomit-frame-pointer -fwrapv -fPIC -fPIE ref
gcc -march=native -mtune=native -Os -fomit-frame-pointer -fwrapv -fPIC -fPIE ref