Implementation notes: amd64, icelake2, crypto_scalarmult/kummer

Computer: icelake2
Architecture: amd64
CPU ID: GenuineIntel-000706e5-bfebfbff
SUPERCOP version: 20221005
Operation: crypto_scalarmult
Primitive: kummer
TimeObject sizeTest sizeImplementationCompilerBenchmark dateSUPERCOP version
19862814403 0 037914 772 1824avx2intgcc_-march=native_-mtune=native_-O3_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE2022100420220506
21074120900 0 044714 780 1824avx2intclang_-march=native_-O3_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2022100420220506
21173818452 0 042194 780 1792avx2intclang_-march=native_-O2_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2022100420220506
2147398670 0 030140 772 1824avx2intclang_-march=native_-Os_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2022100420220506
22050017567 0 038178 780 1760avx2intclang_-march=native_-O_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2022100420220506
2219369112 0 030826 772 1824avx2intgcc_-march=native_-mtune=native_-O2_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE2022100420220506
2288268382 0 029122 772 1824avx2intgcc_-march=native_-mtune=native_-O_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE2022100420220506
2310426100 0 025702 764 1792avx2intgcc_-march=native_-mtune=native_-Os_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE2022100420220506
2591429019 0 033090 780 1824avx2clang_-march=native_-O3_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2022100420220506
2602118812 0 028134 764 1792avx2gcc_-march=native_-mtune=native_-Os_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE2022100420220506
2602919180 0 032458 772 1824avx2gcc_-march=native_-mtune=native_-O3_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE2022100420220506
2605618587 0 029378 780 1760avx2clang_-march=native_-O_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2022100420220506
2607179019 0 033002 780 1792avx2clang_-march=native_-O2_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2022100420220506
2612928924 0 029418 772 1824avx2gcc_-march=native_-mtune=native_-O_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE2022100420220506
2616988956 0 030426 772 1824avx2gcc_-march=native_-mtune=native_-O2_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE2022100420220506
2617868891 0 031522 780 1760avx2clang_-mcpu=native_-O3_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2022100420220506
2620098505 0 030140 772 1824avx2clang_-march=native_-Os_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2022100420220506
3440209705 0 033642 780 1792avxclang_-march=native_-O2_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2022100420220506
3442159866 0 033162 772 1824avxgcc_-march=native_-mtune=native_-O3_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE2022100420220506
3442329705 0 033730 780 1824avxclang_-march=native_-O3_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2022100420220506
3455129642 0 031130 772 1824avxgcc_-march=native_-mtune=native_-O2_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE2022100420220506
3459429610 0 030058 772 1824avxgcc_-march=native_-mtune=native_-O_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE2022100420220506
3466879273 0 030018 780 1760avxclang_-march=native_-O_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2022100420220506
3473619498 0 028774 764 1792avxgcc_-march=native_-mtune=native_-Os_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE2022100420220506
3492329191 0 030780 772 1824avxclang_-march=native_-Os_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2022100420220506
3504879577 0 032162 780 1760avxclang_-mcpu=native_-O3_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2022100420220506
84104313797 0 037714 780 1824ref5uclang_-march=native_-O3_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2022100420220506
85377013943 0 036466 780 1760ref5clang_-mcpu=native_-O3_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2022100420220506
85988913719 0 037618 780 1824ref5clang_-march=native_-O3_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2022100420220506
8655816303 0 029714 772 1824ref5gcc_-march=native_-mtune=native_-O3_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE2022100420220506
87558112263 0 036114 780 1792ref5uclang_-march=native_-O2_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2022100420220506
88615312045 0 035874 780 1792ref5clang_-march=native_-O2_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2022100420220506
88867314185 0 036674 780 1760ref5uclang_-mcpu=native_-O3_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2022100420220506
8906934589 0 026202 772 1824ref5gcc_-march=native_-mtune=native_-O2_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE2022100420220506
89479112243 0 032898 780 1760ref5clang_-march=native_-O_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2022100420220506
9011864840 0 026490 772 1824ref5ugcc_-march=native_-mtune=native_-O2_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE2022100420220506
9084184480 0 025996 772 1824ref5uclang_-march=native_-Os_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2022100420220506
9467564522 0 026020 772 1824ref5clang_-march=native_-Os_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2022100420220506
97949712477 0 033106 780 1760ref5uclang_-march=native_-O_-fomit-frame-pointer_-fwrapv_-Qunused-arguments_-fPIC_-fPIE2022100420220506
10076176748 0 030218 772 1824ref5ugcc_-march=native_-mtune=native_-O3_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE2022100420220506
10099964257 0 023718 764 1792ref5gcc_-march=native_-mtune=native_-Os_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE2022100420220506
10379474572 0 025170 772 1824ref5gcc_-march=native_-mtune=native_-O_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE2022100420220506
10611294561 0 024054 764 1792ref5ugcc_-march=native_-mtune=native_-Os_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE2022100420220506
10826684862 0 025490 772 1824ref5ugcc_-march=native_-mtune=native_-O_-fomit-frame-pointer_-fwrapv_-fPIC_-fPIE2022100420220506

Compiler output

Implementation: avx2int
Security model: constbranchindex
Compiler: clang -mcpu=native -O3 -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE
smult.c: smult.c:36:8: error: always_inline function '_mm256_mul_epi32' requires target feature 'avx2', but would be inlined into function 'crypto_scalarmult_kummer_avx2int_constbranchindex_gfe4x_mulconst' that is compiled without support for 'avx2'
smult.c: t0 = _mm256_mul_epi32(a->v[0],*b);
smult.c: ^
smult.c: smult.c:36:8: error: AVX vector argument of type '__m256i' (vector of 4 'long long' values) without 'avx' enabled changes the ABI
smult.c: smult.c:37:8: error: always_inline function '_mm256_mul_epi32' requires target feature 'avx2', but would be inlined into function 'crypto_scalarmult_kummer_avx2int_constbranchindex_gfe4x_mulconst' that is compiled without support for 'avx2'
smult.c: t1 = _mm256_mul_epi32(a->v[1],*b);
smult.c: ^
smult.c: smult.c:37:8: error: AVX vector argument of type '__m256i' (vector of 4 'long long' values) without 'avx' enabled changes the ABI
smult.c: smult.c:38:30: error: always_inline function '_mm256_srli_epi64' requires target feature 'avx2', but would be inlined into function 'crypto_scalarmult_kummer_avx2int_constbranchindex_gfe4x_mulconst' that is compiled without support for 'avx2'
smult.c: t1 = _mm256_add_epi64(t1,_mm256_srli_epi64(t0,26)); t0 &= mask26;
smult.c: ^
smult.c: smult.c:38:30: error: AVX vector argument of type '__m256i' (vector of 4 'long long' values) without 'avx' enabled changes the ABI
smult.c: smult.c:38:10: error: always_inline function '_mm256_add_epi64' requires target feature 'avx2', but would be inlined into function 'crypto_scalarmult_kummer_avx2int_constbranchindex_gfe4x_mulconst' that is compiled without support for 'avx2'
smult.c: t1 = _mm256_add_epi64(t1,_mm256_srli_epi64(t0,26)); t0 &= mask26;
smult.c: ^
smult.c: smult.c:38:10: error: AVX vector argument of type '__m256i' (vector of 4 'long long' values) without 'avx' enabled changes the ABI
smult.c: smult.c:39:8: error: always_inline function '_mm256_mul_epi32' requires target feature 'avx2', but would be inlined into function 'crypto_scalarmult_kummer_avx2int_constbranchindex_gfe4x_mulconst' that is compiled without support for 'avx2'
smult.c: t2 = _mm256_mul_epi32(a->v[2],*b);
smult.c: ^
smult.c: smult.c:39:8: error: AVX vector argument of type '__m256i' (vector of 4 'long long' values) without 'avx' enabled changes the ABI
smult.c: smult.c:40:8: error: always_inline function '_mm256_mul_epi32' requires target feature 'avx2', but would be inlined into function 'crypto_scalarmult_kummer_avx2int_constbranchindex_gfe4x_mulconst' that is compiled without support for 'avx2'
smult.c: t3 = _mm256_mul_epi32(a->v[3],*b);
smult.c: ^
smult.c: smult.c:40:8: error: AVX vector argument of type '__m256i' (vector of 4 'long long' values) without 'avx' enabled changes the ABI
smult.c: smult.c:41:8: error: always_inline function '_mm256_mul_epi32' requires target feature 'avx2', but would be inlined into function 'crypto_scalarmult_kummer_avx2int_constbranchindex_gfe4x_mulconst' that is compiled without support for 'avx2'
smult.c: ...

Number of similar (compiler,implementation) pairs: 1, namely:
CompilerImplementations
clang -mcpu=native -O3 -fomit-frame-pointer -fwrapv -Qunused-arguments -fPIC -fPIE avx2int