Implementation notes: amd64, genji441, crypto_sign/luov890351pc

Computer: genji441
Architecture: amd64
CPU ID: GenuineIntel-000406f1-bfebfbff
SUPERCOP version: 20180818
Operation: crypto_sign
Primitive: luov890351pc

Time	Object size	Test size	Implementation	Compiler	Benchmark date	SUPERCOP version
1445504	? ? ?	? ? ?	`avx2`	`icc_-xCORE-AVX2_-O2_-fomit-frame-pointer`	20180820	20180818
1461680	? ? ?	? ? ?	`avx2`	`icc_-xCORE-AVX2_-O3_-fomit-frame-pointer`	20180820	20180818
1480836	? ? ?	? ? ?	`avx2`	`icc_-xAVX_-O2_-fomit-frame-pointer`	20180820	20180818
1513172	? ? ?	? ? ?	`avx2`	`icc_-xCORE-AVX-I_-O2_-fomit-frame-pointer`	20180820	20180818
1525188	? ? ?	? ? ?	`avx2`	`icc_-xAVX_-O3_-fomit-frame-pointer`	20180820	20180818
1535248	? ? ?	? ? ?	`avx2`	`icc_-xSSE4.1_-O3_-fomit-frame-pointer`	20180820	20180818
1536632	? ? ?	? ? ?	`avx2`	`icc_-xCORE-AVX-I_-O3_-fomit-frame-pointer`	20180820	20180818
1545492	? ? ?	? ? ?	`avx2`	`icc_-xSSE4.2_-O3_-fomit-frame-pointer`	20180820	20180818
1548840	? ? ?	? ? ?	`avx2`	`icc_-no-vec`	20180820	20180818
1550336	? ? ?	? ? ?	`avx2`	`icc`	20180820	20180818
1556256	? ? ?	? ? ?	`avx2`	`icc_-xSSE4.2_-O2_-fomit-frame-pointer`	20180820	20180818
1566624	? ? ?	? ? ?	`avx2`	`icc_-xSSE4.1_-O2_-fomit-frame-pointer`	20180820	20180818

Compiler output

Implementation: avx2
Security model: unknown
Compiler: cc

LUOV.c: In file included from LinearAlgebra.h:9:0,
LUOV.c: from LUOV.h:13,
LUOV.c: from LUOV.c:1:
LUOV.c: AVX_Operations.h:22:21: error: unknown type name '__m256i'
LUOV.c: void print256_num32(__m256i var)
LUOV.c: ^
LUOV.c: AVX_Operations.h:31:20: error: unknown type name '__m256i'
LUOV.c: void print256_num8(__m256i var)
LUOV.c: ^
LUOV.c: In file included from LinearAlgebra.h:9:0,
LUOV.c: from LUOV.h:13,
LUOV.c: from LUOV.c:1:
LUOV.c: AVX_Operations.h:47:26: error: unknown type name '__m256i'
LUOV.c: void addScalarProductAVX(__m256i *V, FELT a, bitcontainer b) {
LUOV.c: ^
LUOV.c: AVX_Operations.h:100:27: error: unknown type name '__m256i'
LUOV.c: void addScalarProduct3AVX(__m256i *V1, FELT a1, __m256i *V2, FELT a2, __m256i *V3, FELT a3, const bitcontainer b) {
LUOV.c: ^
LUOV.c: AVX_Operations.h:100:49: error: unknown type name '__m256i'
LUOV.c: void addScalarProduct3AVX(__m256i *V1, FELT a1, __m256i *V2, FELT a2, __m256i *V3, FELT a3, const bitcontainer b) {
LUOV.c: ^
LUOV.c: AVX_Operations.h:100:71: error: unknown type name '__m256i'
LUOV.c: void addScalarProduct3AVX(__m256i *V1, FELT a1, __m256i *V2, FELT a2, __m256i *V3, FELT a3, const bitcontainer b) {
LUOV.c: ^
LUOV.c: AVX_Operations.h:158:27: error: unknown type name '__m256i'
LUOV.c: ...

Number of similar (compiler,implementation) pairs: 1, namely:

Compiler	Implementations
cc	avx2
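
Note on the cc errors above: `unknown type name '__m256i'` means the AVX vector type had not been declared when `AVX_Operations.h` was parsed. One plausible cause, which the log does not confirm, is that this compiler's intrinsics header only declares the AVX types when AVX code generation is enabled on the command line. The following is a minimal sketch of guarding intrinsic code on the predefined `__AVX2__` macro so that a build without AVX2 falls back to portable C instead of failing. The function `xor_bytes` and the macro `HAVE_AVX2_BUILD` are hypothetical names chosen for illustration and are not part of the LUOV sources.

```c
#include <stddef.h>
#include <stdint.h>

/* __AVX2__ is predefined by GCC, Clang and icc when AVX2 code generation
 * is enabled (for example via -mavx2 or -march=core-avx2). */
#if defined(__AVX2__)
#include <immintrin.h>
#define HAVE_AVX2_BUILD 1
#else
#define HAVE_AVX2_BUILD 0
#endif

/* Hypothetical helper: XOR src into dst. Uses 32-byte AVX2 operations
 * when they were enabled at compile time, and plain byte operations
 * otherwise (and for any tail shorter than 32 bytes). */
void xor_bytes(uint8_t *dst, const uint8_t *src, size_t len)
{
    size_t i = 0;
#if HAVE_AVX2_BUILD
    for (; i + 32 <= len; i += 32) {
        __m256i a = _mm256_loadu_si256((const __m256i *)(dst + i));
        __m256i b = _mm256_loadu_si256((const __m256i *)(src + i));
        _mm256_storeu_si256((__m256i *)(dst + i), _mm256_xor_si256(a, b));
    }
#endif
    for (; i < len; i++)
        dst[i] ^= src[i];
}
```

With this shape, a plain `cc` invocation compiles only the byte loop, and the vector type is never referenced unless the compiler has declared it.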

Compiler output

Implementation: avx2
Security model: unknown
Compiler: gcc

LUOV.c: In file included from /home_nfs_robin_ib/bdolbeaur/gcc-8.2.0-full+isl/lib/gcc/x86_64-pc-linux-gnu/8.2.0/include/immintrin.h:43,
LUOV.c: from LUOV.h:7,
LUOV.c: from LUOV.c:1:
LUOV.c: LUOV.c: In function 'calculateQ2':
LUOV.c: LUOV.c:110:77: warning: AVX vector return without AVX enabled changes the ABI [-Wpsabi]
LUOV.c: __m256i rr = _mm256_permute2x128_si256(_mm256_loadu_si256((__m256i *)&r),_mm256_setzero_si256(),0);
LUOV.c: ^
LUOV.c: In file included from LinearAlgebra.h:9,
LUOV.c: from LUOV.h:13,
LUOV.c: from LUOV.c:1:
LUOV.c: AVX_Operations.h: In function 'scalarMul_ct':
LUOV.c: AVX_Operations.h:529:6: note: The ABI for passing parameters with 32-byte alignment has changed in GCC 4.6
LUOV.c: void scalarMul_ct(__m256i *Out, __m256i A, FELT b){
LUOV.c: ^~~~~~~~~~~~
LUOV.c: In file included from /home_nfs_robin_ib/bdolbeaur/gcc-8.2.0-full+isl/lib/gcc/x86_64-pc-linux-gnu/8.2.0/include/immintrin.h:43,
LUOV.c: from LUOV.h:7,
LUOV.c: from LUOV.c:1:
LUOV.c: AVX_Operations.h: In function 'addScalarProductAVX':
LUOV.c: /home_nfs_robin_ib/bdolbeaur/gcc-8.2.0-full+isl/lib/gcc/x86_64-pc-linux-gnu/8.2.0/include/avx2intrin.h:186:1: error: inlining failed in call to always_inline '_mm256_andnot_si256': target specific option mismatch
LUOV.c: _mm256_andnot_si256 (__m256i __A, __m256i __B)
LUOV.c: ^~~~~~~~~~~~~~~~~~~
LUOV.c: In file included from LinearAlgebra.h:9,
LUOV.c: from LUOV.h:13,
LUOV.c: from LUOV.c:1:
LUOV.c: AVX_Operations.h:80:9: note: called from here
LUOV.c: ...

Number of similar (compiler,implementation) pairs: 2, namely:

Compiler	Implementations
gcc	avx2
gcc -funroll-loops	avx2
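
Note on the gcc errors above: `inlining failed in call to always_inline '_mm256_andnot_si256': target specific option mismatch` is what GCC reports when an AVX2 intrinsic is called from a function that is not compiled with AVX2 enabled. Besides passing `-mavx2` or a suitable `-march` on the command line, GCC and Clang allow AVX2 to be enabled per function with `__attribute__((target("avx2")))`, and a runtime check with `__builtin_cpu_supports("avx2")` can choose between that function and a portable one. The sketch below assumes a GCC-compatible compiler on x86. All function names and the macro `HAVE_AVX2_TARGET` are hypothetical and do not appear in the LUOV sources. The vector function takes pointers rather than `__m256i` values, which avoids the by-value vector passing that the `-Wpsabi` notes in the log refer to.

```c
#include <stddef.h>
#include <stdint.h>

#if defined(__GNUC__) && (defined(__x86_64__) || defined(__i386__))
#include <immintrin.h>
#define HAVE_AVX2_TARGET 1

/* AVX2 is enabled for this one function only, so the always_inline
 * intrinsics can be inlined here even without -mavx2. */
__attribute__((target("avx2")))
static void andnot_bytes_avx2(uint8_t *out, const uint8_t *a,
                              const uint8_t *b, size_t len)
{
    size_t i = 0;
    for (; i + 32 <= len; i += 32) {
        __m256i va = _mm256_loadu_si256((const __m256i *)(a + i));
        __m256i vb = _mm256_loadu_si256((const __m256i *)(b + i));
        /* _mm256_andnot_si256(x, y) computes (~x) & y. */
        _mm256_storeu_si256((__m256i *)(out + i),
                            _mm256_andnot_si256(va, vb));
    }
    for (; i < len; i++)
        out[i] = (uint8_t)(~a[i] & b[i]);
}
#else
#define HAVE_AVX2_TARGET 0
#endif

/* Portable fallback with the same result. */
static void andnot_bytes_portable(uint8_t *out, const uint8_t *a,
                                  const uint8_t *b, size_t len)
{
    for (size_t i = 0; i < len; i++)
        out[i] = (uint8_t)(~a[i] & b[i]);
}

/* Hypothetical entry point: out[i] = ~a[i] & b[i], using the AVX2
 * version only when the running CPU reports AVX2 support. */
void andnot_bytes(uint8_t *out, const uint8_t *a,
                  const uint8_t *b, size_t len)
{
#if HAVE_AVX2_TARGET
    __builtin_cpu_init();
    if (__builtin_cpu_supports("avx2")) {
        andnot_bytes_avx2(out, a, b, len);
        return;
    }
#endif
    andnot_bytes_portable(out, a, b, len);
}
```

For a benchmark such as this one, where each compiler line is meant to measure a specific option set, passing the AVX2 option on the command line is the more direct fix. The per-function attribute is mainly useful when one binary has to run on machines both with and without AVX2.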

Compiler output

Implementation: avx2
Security model: unknown
Compiler: gcc -O2 -fomit-frame-pointer

Number of similar (compiler,implementation) pairs: 84, namely:

Compiler	Implementations
gcc -O2 -fomit-frame-pointer	avx2
gcc -O3 -fomit-frame-pointer	avx2
gcc -O -fomit-frame-pointer	avx2
gcc -Os -fomit-frame-pointer	avx2
gcc -fno-schedule-insns -O2 -fomit-frame-pointer	avx2
gcc -fno-schedule-insns -O3 -fomit-frame-pointer	avx2
gcc -fno-schedule-insns -O -fomit-frame-pointer	avx2
gcc -fno-schedule-insns -Os -fomit-frame-pointer	avx2
gcc -funroll-loops -O2 -fomit-frame-pointer	avx2
gcc -funroll-loops -O3 -fomit-frame-pointer	avx2
gcc -funroll-loops -O -fomit-frame-pointer	avx2
gcc -funroll-loops -Os -fomit-frame-pointer	avx2
gcc -funroll-loops -fno-schedule-insns -O2 -fomit-frame-pointer	avx2
gcc -funroll-loops -fno-schedule-insns -O3 -fomit-frame-pointer	avx2
gcc -funroll-loops -fno-schedule-insns -O -fomit-frame-pointer	avx2
gcc -funroll-loops -fno-schedule-insns -Os -fomit-frame-pointer	avx2
gcc -funroll-loops -m64 -O2 -fomit-frame-pointer	avx2
gcc -funroll-loops -m64 -O3 -fomit-frame-pointer	avx2
gcc -funroll-loops -m64 -O -fomit-frame-pointer	avx2
gcc -funroll-loops -m64 -Os -fomit-frame-pointer	avx2
gcc -funroll-loops -m64 -march=barcelona -O2 -fomit-frame-pointer	avx2
gcc -funroll-loops -m64 -march=barcelona -O3 -fomit-frame-pointer	avx2
gcc -funroll-loops -m64 -march=barcelona -O -fomit-frame-pointer	avx2
gcc -funroll-loops -m64 -march=barcelona -Os -fomit-frame-pointer	avx2
gcc -funroll-loops -m64 -march=k8 -O2 -fomit-frame-pointer	avx2
gcc -funroll-loops -m64 -march=k8 -O3 -fomit-frame-pointer	avx2
gcc -funroll-loops -m64 -march=k8 -O -fomit-frame-pointer	avx2
gcc -funroll-loops -m64 -march=k8 -Os -fomit-frame-pointer	avx2
gcc -funroll-loops -m64 -march=nocona -O2 -fomit-frame-pointer	avx2
gcc -funroll-loops -m64 -march=nocona -O3 -fomit-frame-pointer	avx2
gcc -funroll-loops -m64 -march=nocona -O -fomit-frame-pointer	avx2
gcc -funroll-loops -m64 -march=nocona -Os -fomit-frame-pointer	avx2
gcc -funroll-loops -march=barcelona -O2 -fomit-frame-pointer	avx2
gcc -funroll-loops -march=barcelona -O3 -fomit-frame-pointer	avx2
gcc -funroll-loops -march=barcelona -O -fomit-frame-pointer	avx2
gcc -funroll-loops -march=barcelona -Os -fomit-frame-pointer	avx2
gcc -funroll-loops -march=k8 -O2 -fomit-frame-pointer	avx2
gcc -funroll-loops -march=k8 -O3 -fomit-frame-pointer	avx2
gcc -funroll-loops -march=k8 -O -fomit-frame-pointer	avx2
gcc -funroll-loops -march=k8 -Os -fomit-frame-pointer	avx2
gcc -funroll-loops -march=nocona -O2 -fomit-frame-pointer	avx2
gcc -funroll-loops -march=nocona -O3 -fomit-frame-pointer	avx2
gcc -funroll-loops -march=nocona -O -fomit-frame-pointer	avx2
gcc -funroll-loops -march=nocona -Os -fomit-frame-pointer	avx2
gcc -m64 -O2 -fomit-frame-pointer	avx2
gcc -m64 -O3 -fomit-frame-pointer	avx2
gcc -m64 -O -fomit-frame-pointer	avx2
gcc -m64 -Os -fomit-frame-pointer	avx2
gcc -m64 -march=core2 -O2 -fomit-frame-pointer	avx2
gcc -m64 -march=core2 -O3 -fomit-frame-pointer	avx2
gcc -m64 -march=core2 -O -fomit-frame-pointer	avx2
gcc -m64 -march=core2 -Os -fomit-frame-pointer	avx2
gcc -m64 -march=core2 -msse4.1 -O2 -fomit-frame-pointer	avx2
gcc -m64 -march=core2 -msse4.1 -O3 -fomit-frame-pointer	avx2
gcc -m64 -march=core2 -msse4.1 -O -fomit-frame-pointer	avx2
gcc -m64 -march=core2 -msse4.1 -Os -fomit-frame-pointer	avx2
gcc -m64 -march=core2 -msse4 -O2 -fomit-frame-pointer	avx2
gcc -m64 -march=core2 -msse4 -O3 -fomit-frame-pointer	avx2
gcc -m64 -march=core2 -msse4 -O -fomit-frame-pointer	avx2
gcc -m64 -march=core2 -msse4 -Os -fomit-frame-pointer	avx2
gcc -m64 -march=corei7 -O2 -fomit-frame-pointer	avx2
gcc -m64 -march=corei7 -O3 -fomit-frame-pointer	avx2
gcc -m64 -march=corei7 -O -fomit-frame-pointer	avx2
gcc -m64 -march=corei7 -Os -fomit-frame-pointer	avx2
gcc -m64 -march=k8 -O2 -fomit-frame-pointer	avx2
gcc -m64 -march=k8 -O3 -fomit-frame-pointer	avx2
gcc -m64 -march=k8 -O -fomit-frame-pointer	avx2
gcc -m64 -march=k8 -Os -fomit-frame-pointer	avx2
gcc -m64 -march=nocona -O2 -fomit-frame-pointer	avx2
gcc -m64 -march=nocona -O3 -fomit-frame-pointer	avx2
gcc -m64 -march=nocona -O -fomit-frame-pointer	avx2
gcc -m64 -march=nocona -Os -fomit-frame-pointer	avx2
gcc -march=barcelona -O2 -fomit-frame-pointer	avx2
gcc -march=barcelona -O3 -fomit-frame-pointer	avx2
gcc -march=barcelona -O -fomit-frame-pointer	avx2
gcc -march=barcelona -Os -fomit-frame-pointer	avx2
gcc -march=k8 -O2 -fomit-frame-pointer	avx2
gcc -march=k8 -O3 -fomit-frame-pointer	avx2
gcc -march=k8 -O -fomit-frame-pointer	avx2
gcc -march=k8 -Os -fomit-frame-pointer	avx2
gcc -march=nocona -O2 -fomit-frame-pointer	avx2
gcc -march=nocona -O3 -fomit-frame-pointer	avx2
gcc -march=nocona -O -fomit-frame-pointer	avx2
gcc -march=nocona -Os -fomit-frame-pointer	avx2

Compiler output

Implementation: avx2
Security model: unknown
Compiler: gcc -m64 -march=barcelona -O2 -fomit-frame-pointer

LUOV.c: LUOV.c: In function 'calculateQ2':
LUOV.c: LUOV.c:110:12: warning: AVX vector return without AVX enabled changes the ABI [-Wpsabi]
LUOV.c: __m256i rr = _mm256_permute2x128_si256(_mm256_loadu_si256((__m256i *)&r),_mm256_setzero_si256(),0);
LUOV.c: ^~
LUOV.c: In file included from LinearAlgebra.h:9,
LUOV.c: from LUOV.h:13,
LUOV.c: from LUOV.c:1:
LUOV.c: AVX_Operations.h: In function 'scalarMul_ct':
LUOV.c: AVX_Operations.h:529:6: note: The ABI for passing parameters with 32-byte alignment has changed in GCC 4.6
LUOV.c: void scalarMul_ct(__m256i *Out, __m256i A, FELT b){
LUOV.c: ^~~~~~~~~~~~
LUOV.c: In file included from /home_nfs_robin_ib/bdolbeaur/gcc-8.2.0-full+isl/lib/gcc/x86_64-pc-linux-gnu/8.2.0/include/immintrin.h:43,
LUOV.c: from LUOV.h:7,
LUOV.c: from LUOV.c:1:
LUOV.c: AVX_Operations.h: In function 'addScalarProductAVX':
LUOV.c: /home_nfs_robin_ib/bdolbeaur/gcc-8.2.0-full+isl/lib/gcc/x86_64-pc-linux-gnu/8.2.0/include/avx2intrin.h:186:1: error: inlining failed in call to always_inline '_mm256_andnot_si256': target specific option mismatch
LUOV.c: _mm256_andnot_si256 (__m256i __A, __m256i __B)
LUOV.c: ^~~~~~~~~~~~~~~~~~~
LUOV.c: In file included from LinearAlgebra.h:9,
LUOV.c: from LUOV.h:13,
LUOV.c: from LUOV.c:1:
LUOV.c: AVX_Operations.h:80:9: note: called from here
LUOV.c: avx3 = _mm256_andnot_si256(avx3,aa);
LUOV.c: ^~~~~~~~~~~~~~~~~~~~~~~~~~~~
LUOV.c: In file included from /home_nfs_robin_ib/bdolbeaur/gcc-8.2.0-full+isl/lib/gcc/x86_64-pc-linux-gnu/8.2.0/include/immintrin.h:43,
LUOV.c: ...

Number of similar (compiler,implementation) pairs: 4, namely:

Compiler	Implementations
gcc -m64 -march=barcelona -O2 -fomit-frame-pointer	avx2
gcc -m64 -march=barcelona -O3 -fomit-frame-pointer	avx2
gcc -m64 -march=barcelona -O -fomit-frame-pointer	avx2
gcc -m64 -march=barcelona -Os -fomit-frame-pointer	avx2

Compiler output

Implementation: avx2
Security model: unknown
Compiler: gcc -m64 -march=core-avx-i -O2 -fomit-frame-pointer

LUOV.c: In file included from /home_nfs_robin_ib/bdolbeaur/gcc-8.2.0-full+isl/lib/gcc/x86_64-pc-linux-gnu/8.2.0/include/immintrin.h:43,
LUOV.c: from LUOV.h:7,
LUOV.c: from LUOV.c:1:
LUOV.c: AVX_Operations.h: In function 'addScalarProductAVX':
LUOV.c: /home_nfs_robin_ib/bdolbeaur/gcc-8.2.0-full+isl/lib/gcc/x86_64-pc-linux-gnu/8.2.0/include/avx2intrin.h:186:1: error: inlining failed in call to always_inline '_mm256_andnot_si256': target specific option mismatch
LUOV.c: _mm256_andnot_si256 (__m256i __A, __m256i __B)
LUOV.c: ^~~~~~~~~~~~~~~~~~~
LUOV.c: In file included from LinearAlgebra.h:9,
LUOV.c: from LUOV.h:13,
LUOV.c: from LUOV.c:1:
LUOV.c: AVX_Operations.h:80:9: note: called from here
LUOV.c: avx3 = _mm256_andnot_si256(avx3,aa);
LUOV.c: ^~~~~~~~~~~~~~~~~~~~~~~~~~~~
LUOV.c: In file included from /home_nfs_robin_ib/bdolbeaur/gcc-8.2.0-full+isl/lib/gcc/x86_64-pc-linux-gnu/8.2.0/include/immintrin.h:43,
LUOV.c: from LUOV.h:7,
LUOV.c: from LUOV.c:1:
LUOV.c: /home_nfs_robin_ib/bdolbeaur/gcc-8.2.0-full+isl/lib/gcc/x86_64-pc-linux-gnu/8.2.0/include/avx2intrin.h:231:1: error: inlining failed in call to always_inline '_mm256_cmpeq_epi8': target specific option mismatch
LUOV.c: _mm256_cmpeq_epi8 (__m256i __A, __m256i __B)
LUOV.c: ^~~~~~~~~~~~~~~~~
LUOV.c: In file included from LinearAlgebra.h:9,
LUOV.c: from LUOV.h:13,
LUOV.c: from LUOV.c:1:
LUOV.c: AVX_Operations.h:79:9: note: called from here
LUOV.c: avx3 = _mm256_cmpeq_epi8(avx3,_mm256_setzero_si256());
LUOV.c: ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
LUOV.c: ...

Number of similar (compiler,implementation) pairs: 8, namely:

Compiler	Implementations
gcc -m64 -march=core-avx-i -O2 -fomit-frame-pointer	avx2
gcc -m64 -march=core-avx-i -O3 -fomit-frame-pointer	avx2
gcc -m64 -march=core-avx-i -O -fomit-frame-pointer	avx2
gcc -m64 -march=core-avx-i -Os -fomit-frame-pointer	avx2
gcc -m64 -march=corei7-avx -O2 -fomit-frame-pointer	avx2
gcc -m64 -march=corei7-avx -O3 -fomit-frame-pointer	avx2
gcc -m64 -march=corei7-avx -O -fomit-frame-pointer	avx2
gcc -m64 -march=corei7-avx -Os -fomit-frame-pointer	avx2

Compiler output

Implementation: avx2
Security model: unknown
Compiler: gcc -m64 -march=core-avx2 -O2 -fomit-frame-pointer

try.c: /scratch_lustre_DDN7k/bdolbeaur/supercop-20180818/supercop-data/genji441/amd64/lib/knownrandombytes.o: In function `randombytes':
try.c: knownrandombytes.c:(.text+0x...): undefined reference to `_intel_fast_memcpy'
try.c: knownrandombytes.c:(.text+0x...): undefined reference to `_intel_fast_memset'
try.c: /scratch_lustre_DDN7k/bdolbeaur/supercop-20180818/supercop-data/genji441/amd64/lib/libsupercop.a(crypto_stream_chacha20_dolbeau_amd64_avx2-api.o): In function `crypto_stream_chacha20_dolbeau_amd64_avx2':
try.c: api.c:(.text+0x...): undefined reference to `__intel_avx_rep_memset'
try.c: /scratch_lustre_DDN7k/bdolbeaur/supercop-20180818/supercop-data/genji441/amd64/lib/libsupercop.a(crypto_stream_chacha20_dolbeau_amd64_avx2-chacha.o): In function `crypto_stream_chacha20_dolbeau_amd64_avx2_ECRYPT_keystream_bytes':
try.c: chacha.c:(.text+0x...): undefined reference to `__intel_avx_rep_memset'
try.c: collect2: error: ld returned 1 exit status

Number of similar (compiler,implementation) pairs: 12, namely:

Compiler	Implementations
gcc -m64 -march=core-avx2 -O2 -fomit-frame-pointer	avx2
gcc -m64 -march=core-avx2 -O3 -fomit-frame-pointer	avx2
gcc -m64 -march=core-avx2 -O -fomit-frame-pointer	avx2
gcc -m64 -march=core-avx2 -Os -fomit-frame-pointer	avx2
gcc -m64 -march=native -mtune=native -O2 -fomit-frame-pointer	avx2
gcc -m64 -march=native -mtune=native -O3 -fomit-frame-pointer	avx2
gcc -m64 -march=native -mtune=native -O -fomit-frame-pointer	avx2
gcc -m64 -march=native -mtune=native -Os -fomit-frame-pointer	avx2
gcc -march=native -mtune=native -O2 -fomit-frame-pointer -fwrapv	avx2
gcc -march=native -mtune=native -O3 -fomit-frame-pointer -fwrapv	avx2
gcc -march=native -mtune=native -O -fomit-frame-pointer -fwrapv	avx2
gcc -march=native -mtune=native -Os -fomit-frame-pointer -fwrapv	avx2