Implementation notes: amd64, glyme, crypto_aead/pi16cipher096v1

Computer: glyme
Architecture: amd64
CPU ID: GenuineIntel-00020652-bfebfbff
SUPERCOP version: 20170105
Operation: crypto_aead
Primitive: pi16cipher096v1
Time | Implementation | Compiler | Benchmark date | SUPERCOP version
502332 | optimized_nonSSE | gcc -funroll-loops -m64 -march=barcelona -O3 -fomit-frame-pointer | 20170204 | 20170105
503800 | optimized_nonSSE | gcc -funroll-loops -march=k8 -O3 -fomit-frame-pointer | 20170204 | 20170105
504812 | optimized_nonSSE | gcc -funroll-loops -m64 -O3 -fomit-frame-pointer | 20170204 | 20170105
504824 | optimized_nonSSE | gcc -m64 -march=barcelona -O3 -fomit-frame-pointer | 20170204 | 20170105
505040 | optimized_nonSSE | gcc -m64 -march=core2 -msse4 -O3 -fomit-frame-pointer | 20170204 | 20170105
505644 | optimized_nonSSE | gcc -m64 -march=core2 -msse4.1 -O3 -fomit-frame-pointer | 20170204 | 20170105
505692 | optimized_nonSSE | gcc -m64 -march=corei7 -O3 -fomit-frame-pointer | 20170204 | 20170105
506028 | optimized_nonSSE | gcc -funroll-loops -O3 -fomit-frame-pointer | 20170204 | 20170105
506492 | optimized_nonSSE | gcc -m64 -march=k8 -O3 -fomit-frame-pointer | 20170204 | 20170105
507128 | optimized_nonSSE | gcc -funroll-loops -fno-schedule-insns -O3 -fomit-frame-pointer | 20170204 | 20170105
507152 | optimized_nonSSE | gcc -m64 -O3 -fomit-frame-pointer | 20170204 | 20170105
507176 | optimized_nonSSE | gcc -funroll-loops -march=barcelona -O3 -fomit-frame-pointer | 20170204 | 20170105
507744 | optimized_nonSSE | gcc -funroll-loops -m64 -march=k8 -O3 -fomit-frame-pointer | 20170204 | 20170105
508068 | optimized_nonSSE | gcc -funroll-loops -march=nocona -O3 -fomit-frame-pointer | 20170204 | 20170105
508584 | optimized_nonSSE | gcc -march=barcelona -O3 -fomit-frame-pointer | 20170204 | 20170105
511140 | optimized_nonSSE | gcc -m64 -march=native -mtune=native -O3 -fomit-frame-pointer | 20170204 | 20170105
512048 | optimized_nonSSE | gcc -m64 -march=core2 -O3 -fomit-frame-pointer | 20170204 | 20170105
512400 | optimized_nonSSE | gcc -march=nocona -O3 -fomit-frame-pointer | 20170204 | 20170105
512404 | optimized_nonSSE | gcc -fno-schedule-insns -O3 -fomit-frame-pointer | 20170204 | 20170105
513552 | optimized_nonSSE | gcc -O3 -fomit-frame-pointer | 20170204 | 20170105
513560 | optimized_nonSSE | gcc -funroll-loops -m64 -march=nocona -O3 -fomit-frame-pointer | 20170204 | 20170105
518172 | optimized_nonSSE | gcc -m64 -march=nocona -O3 -fomit-frame-pointer | 20170204 | 20170105
520668 | optimized_nonSSE | gcc -funroll-loops -m64 -march=barcelona -O2 -fomit-frame-pointer | 20170204 | 20170105
520768 | optimized_nonSSE | gcc -funroll-loops -m64 -march=k8 -O2 -fomit-frame-pointer | 20170204 | 20170105
521648 | optimized_nonSSE | gcc -funroll-loops -march=barcelona -O2 -fomit-frame-pointer | 20170204 | 20170105
522584 | optimized_nonSSE | gcc -funroll-loops -fno-schedule-insns -O2 -fomit-frame-pointer | 20170204 | 20170105
523124 | optimized_nonSSE | gcc -funroll-loops -march=k8 -O2 -fomit-frame-pointer | 20170204 | 20170105
525684 | optimized_nonSSE | gcc -funroll-loops -O2 -fomit-frame-pointer | 20170204 | 20170105
525816 | optimized_nonSSE | gcc -funroll-loops -m64 -O2 -fomit-frame-pointer | 20170204 | 20170105
526644 | optimized_nonSSE | gcc -march=k8 -O3 -fomit-frame-pointer | 20170204 | 20170105
527356 | optimized_nonSSE | gcc -funroll-loops -march=nocona -O2 -fomit-frame-pointer | 20170204 | 20170105
527796 | optimized_nonSSE | gcc -funroll-loops -m64 -march=nocona -O2 -fomit-frame-pointer | 20170204 | 20170105
552004 | optimized_nonSSE | gcc -march=native -mtune=native -O3 -fomit-frame-pointer -fwrapv | 20170204 | 20170105
556612 | optimized_nonSSE | gcc -funroll-loops -m64 -march=barcelona -O -fomit-frame-pointer | 20170204 | 20170105
557056 | optimized_nonSSE | gcc -funroll-loops -march=barcelona -O -fomit-frame-pointer | 20170204 | 20170105
557684 | optimized_nonSSE | gcc -funroll-loops -m64 -march=k8 -O -fomit-frame-pointer | 20170204 | 20170105
557684 | optimized_nonSSE | gcc -funroll-loops -march=k8 -O -fomit-frame-pointer | 20170204 | 20170105
558336 | optimized_nonSSE | gcc -funroll-loops -m64 -march=nocona -O -fomit-frame-pointer | 20170204 | 20170105
559028 | optimized_nonSSE | gcc -funroll-loops -fno-schedule-insns -O -fomit-frame-pointer | 20170204 | 20170105
561660 | optimized_nonSSE | gcc -funroll-loops -march=nocona -O -fomit-frame-pointer | 20170204 | 20170105
562824 | optimized_nonSSE | gcc -funroll-loops -O -fomit-frame-pointer | 20170204 | 20170105
567488 | optimized_nonSSE | gcc -funroll-loops -m64 -O -fomit-frame-pointer | 20170204 | 20170105
760652 | optimized_nonSSE | gcc -m64 -march=core2 -msse4.1 -O -fomit-frame-pointer | 20170204 | 20170105
762804 | optimized_nonSSE | gcc -m64 -march=core2 -msse4 -O -fomit-frame-pointer | 20170204 | 20170105
764748 | optimized_nonSSE | gcc -march=native -mtune=native -O -fomit-frame-pointer -fwrapv | 20170204 | 20170105
766192 | optimized_nonSSE | gcc -m64 -march=native -mtune=native -O -fomit-frame-pointer | 20170204 | 20170105
767464 | optimized_nonSSE | gcc -m64 -march=barcelona -O -fomit-frame-pointer | 20170204 | 20170105
767856 | optimized_nonSSE | gcc -O -fomit-frame-pointer | 20170204 | 20170105
768340 | optimized_nonSSE | gcc -march=k8 -O -fomit-frame-pointer | 20170204 | 20170105
770476 | optimized_nonSSE | gcc -m64 -O -fomit-frame-pointer | 20170204 | 20170105
770824 | optimized_nonSSE | gcc -m64 -march=core2 -O -fomit-frame-pointer | 20170204 | 20170105
774448 | optimized_nonSSE | gcc -m64 -march=nocona -O -fomit-frame-pointer | 20170204 | 20170105
776032 | optimized_nonSSE | gcc -m64 -march=corei7 -O -fomit-frame-pointer | 20170204 | 20170105
776148 | optimized_nonSSE | gcc -march=nocona -O -fomit-frame-pointer | 20170204 | 20170105
777904 | optimized_nonSSE | gcc -m64 -march=k8 -O -fomit-frame-pointer | 20170204 | 20170105
781368 | optimized_nonSSE | gcc -march=barcelona -O -fomit-frame-pointer | 20170204 | 20170105
787000 | optimized_nonSSE | gcc -fno-schedule-insns -O -fomit-frame-pointer | 20170204 | 20170105
836564 | optimized_nonSSE | gcc -march=barcelona -O2 -fomit-frame-pointer | 20170204 | 20170105
837428 | optimized_nonSSE | gcc -m64 -march=k8 -O2 -fomit-frame-pointer | 20170204 | 20170105
838024 | optimized_nonSSE | gcc -m64 -march=barcelona -O2 -fomit-frame-pointer | 20170204 | 20170105
838360 | optimized_nonSSE | gcc -march=k8 -O2 -fomit-frame-pointer | 20170204 | 20170105
870604 | optimized_nonSSE | gcc -m64 -march=native -mtune=native -O2 -fomit-frame-pointer | 20170204 | 20170105
870716 | optimized_nonSSE | gcc -march=native -mtune=native -O2 -fomit-frame-pointer -fwrapv | 20170204 | 20170105
872316 | optimized_nonSSE | gcc -m64 -march=corei7 -O2 -fomit-frame-pointer | 20170204 | 20170105
875296 | optimized_nonSSE | gcc -m64 -march=core2 -msse4 -O2 -fomit-frame-pointer | 20170204 | 20170105
876000 | optimized_nonSSE | gcc -m64 -march=core2 -O2 -fomit-frame-pointer | 20170204 | 20170105
876172 | optimized_nonSSE | gcc -O2 -fomit-frame-pointer | 20170204 | 20170105
876356 | optimized_nonSSE | gcc -fno-schedule-insns -O2 -fomit-frame-pointer | 20170204 | 20170105
878804 | optimized_nonSSE | gcc -m64 -O2 -fomit-frame-pointer | 20170204 | 20170105
880852 | optimized_nonSSE | gcc -m64 -march=core2 -msse4.1 -O2 -fomit-frame-pointer | 20170204 | 20170105
896580 | optimized_nonSSE | gcc -march=barcelona -Os -fomit-frame-pointer | 20170204 | 20170105
896844 | optimized_nonSSE | gcc -march=nocona -O2 -fomit-frame-pointer | 20170204 | 20170105
897260 | optimized_nonSSE | gcc -m64 -march=barcelona -Os -fomit-frame-pointer | 20170204 | 20170105
899464 | optimized_nonSSE | gcc -m64 -march=core2 -msse4 -Os -fomit-frame-pointer | 20170204 | 20170105
899644 | optimized_nonSSE | gcc -march=native -mtune=native -Os -fomit-frame-pointer -fwrapv | 20170204 | 20170105
899688 | optimized_nonSSE | gcc -m64 -march=native -mtune=native -Os -fomit-frame-pointer | 20170204 | 20170105
900388 | optimized_nonSSE | gcc -fno-schedule-insns -Os -fomit-frame-pointer | 20170204 | 20170105
900408 | optimized_nonSSE | gcc -march=k8 -Os -fomit-frame-pointer | 20170204 | 20170105
900452 | optimized_nonSSE | gcc -m64 -march=k8 -Os -fomit-frame-pointer | 20170204 | 20170105
901204 | optimized_nonSSE | gcc -m64 -march=nocona -O2 -fomit-frame-pointer | 20170204 | 20170105
902100 | optimized_nonSSE | gcc -m64 -march=core2 -msse4.1 -Os -fomit-frame-pointer | 20170204 | 20170105
902768 | optimized_nonSSE | gcc -m64 -march=core2 -Os -fomit-frame-pointer | 20170204 | 20170105
902860 | optimized_nonSSE | gcc -m64 -march=corei7 -Os -fomit-frame-pointer | 20170204 | 20170105
903408 | optimized_nonSSE | gcc -m64 -Os -fomit-frame-pointer | 20170204 | 20170105
903464 | optimized_nonSSE | gcc -Os -fomit-frame-pointer | 20170204 | 20170105
904956 | optimized_nonSSE | gcc -funroll-loops -m64 -march=k8 -Os -fomit-frame-pointer | 20170204 | 20170105
908644 | optimized_nonSSE | gcc -funroll-loops -march=k8 -Os -fomit-frame-pointer | 20170204 | 20170105
908788 | optimized_nonSSE | gcc -funroll-loops -Os -fomit-frame-pointer | 20170204 | 20170105
909352 | optimized_nonSSE | gcc -funroll-loops -m64 -march=barcelona -Os -fomit-frame-pointer | 20170204 | 20170105
913300 | optimized_nonSSE | gcc -funroll-loops -march=barcelona -Os -fomit-frame-pointer | 20170204 | 20170105
915180 | ref | gcc -march=native -mtune=native -O3 -fomit-frame-pointer -fwrapv | 20170204 | 20170105
916232 | optimized_nonSSE | gcc -funroll-loops -m64 -Os -fomit-frame-pointer | 20170204 | 20170105
918440 | optimized_nonSSE | gcc -funroll-loops -fno-schedule-insns -Os -fomit-frame-pointer | 20170204 | 20170105
921444 | optimized_nonSSE | gcc -funroll-loops -march=nocona -Os -fomit-frame-pointer | 20170204 | 20170105
924292 | optimized_nonSSE | gcc -m64 -march=nocona -Os -fomit-frame-pointer | 20170204 | 20170105
925816 | ref | gcc -funroll-loops -m64 -march=nocona -O3 -fomit-frame-pointer | 20170204 | 20170105
927824 | ref | gcc -funroll-loops -march=nocona -O3 -fomit-frame-pointer | 20170204 | 20170105
928788 | optimized_nonSSE | gcc -march=nocona -Os -fomit-frame-pointer | 20170204 | 20170105
928872 | ref | gcc -m64 -march=corei7 -O3 -fomit-frame-pointer | 20170204 | 20170105
928876 | ref | gcc -m64 -march=native -mtune=native -O3 -fomit-frame-pointer | 20170204 | 20170105
929432 | ref | gcc -m64 -march=core2 -msse4 -O3 -fomit-frame-pointer | 20170204 | 20170105
929456 | ref | gcc -m64 -march=core2 -O3 -fomit-frame-pointer | 20170204 | 20170105
932476 | ref | gcc -m64 -march=nocona -O3 -fomit-frame-pointer | 20170204 | 20170105
932992 | ref | gcc -m64 -march=core2 -msse4.1 -O3 -fomit-frame-pointer | 20170204 | 20170105
933496 | ref | gcc -m64 -O3 -fomit-frame-pointer | 20170204 | 20170105
933500 | ref | gcc -O3 -fomit-frame-pointer | 20170204 | 20170105
933500 | ref | gcc -fno-schedule-insns -O3 -fomit-frame-pointer | 20170204 | 20170105
933664 | optimized_nonSSE | gcc -funroll-loops -m64 -march=nocona -Os -fomit-frame-pointer | 20170204 | 20170105
935820 | ref | gcc -march=nocona -O3 -fomit-frame-pointer | 20170204 | 20170105
935836 | ref | gcc -funroll-loops -m64 -march=barcelona -O3 -fomit-frame-pointer | 20170204 | 20170105
936816 | ref | gcc -funroll-loops -O3 -fomit-frame-pointer | 20170204 | 20170105
936872 | ref | gcc -funroll-loops -fno-schedule-insns -O3 -fomit-frame-pointer | 20170204 | 20170105
938044 | ref | gcc -m64 -march=barcelona -O3 -fomit-frame-pointer | 20170204 | 20170105
938048 | ref | gcc -march=barcelona -O3 -fomit-frame-pointer | 20170204 | 20170105
939628 | ref | gcc -funroll-loops -m64 -O3 -fomit-frame-pointer | 20170204 | 20170105
940124 | ref | gcc -funroll-loops -march=barcelona -O3 -fomit-frame-pointer | 20170204 | 20170105
941052 | ref | gcc -m64 -march=k8 -O3 -fomit-frame-pointer | 20170204 | 20170105
941060 | ref | gcc -march=k8 -O3 -fomit-frame-pointer | 20170204 | 20170105
942124 | ref | gcc -funroll-loops -march=k8 -O3 -fomit-frame-pointer | 20170204 | 20170105
943716 | ref | gcc -funroll-loops -m64 -march=k8 -O3 -fomit-frame-pointer | 20170204 | 20170105
950424 | ref | gcc -funroll-loops -m64 -march=k8 -O -fomit-frame-pointer | 20170204 | 20170105
952324 | ref | gcc -funroll-loops -march=k8 -O -fomit-frame-pointer | 20170204 | 20170105
952956 | ref | gcc -funroll-loops -march=barcelona -O -fomit-frame-pointer | 20170204 | 20170105
954468 | ref | gcc -funroll-loops -m64 -march=nocona -O -fomit-frame-pointer | 20170204 | 20170105
954972 | ref | gcc -funroll-loops -march=nocona -O -fomit-frame-pointer | 20170204 | 20170105
955100 | ref | gcc -funroll-loops -m64 -O -fomit-frame-pointer | 20170204 | 20170105
956360 | ref | gcc -funroll-loops -m64 -march=barcelona -O -fomit-frame-pointer | 20170204 | 20170105
956628 | ref | gcc -funroll-loops -fno-schedule-insns -O -fomit-frame-pointer | 20170204 | 20170105
956768 | ref | gcc -funroll-loops -O -fomit-frame-pointer | 20170204 | 20170105
1003676 | ref | gcc -funroll-loops -m64 -march=nocona -O2 -fomit-frame-pointer | 20170204 | 20170105
1003868 | ref | gcc -funroll-loops -march=nocona -O2 -fomit-frame-pointer | 20170204 | 20170105
1031012 | ref | gcc -funroll-loops -fno-schedule-insns -O2 -fomit-frame-pointer | 20170204 | 20170105
1033636 | ref | gcc -funroll-loops -O2 -fomit-frame-pointer | 20170204 | 20170105
1034972 | ref | gcc -funroll-loops -m64 -O2 -fomit-frame-pointer | 20170204 | 20170105
1043384 | ref | gcc -funroll-loops -m64 -march=k8 -O2 -fomit-frame-pointer | 20170204 | 20170105
1043964 | ref | gcc -funroll-loops -march=barcelona -O2 -fomit-frame-pointer | 20170204 | 20170105
1048020 | ref | gcc -funroll-loops -m64 -march=barcelona -O2 -fomit-frame-pointer | 20170204 | 20170105
1053164 | ref | gcc -funroll-loops -march=k8 -O2 -fomit-frame-pointer | 20170204 | 20170105
1056164 | ref | gcc -m64 -march=native -mtune=native -O -fomit-frame-pointer | 20170204 | 20170105
1056292 | ref | gcc -m64 -march=corei7 -O -fomit-frame-pointer | 20170204 | 20170105
1057012 | ref | gcc -m64 -march=core2 -msse4 -O -fomit-frame-pointer | 20170204 | 20170105
1057016 | ref | gcc -m64 -march=core2 -O -fomit-frame-pointer | 20170204 | 20170105
1057040 | ref | gcc -m64 -march=core2 -msse4.1 -O -fomit-frame-pointer | 20170204 | 20170105
1058256 | ref | gcc -m64 -march=barcelona -O -fomit-frame-pointer | 20170204 | 20170105
1058844 | ref | gcc -O -fomit-frame-pointer | 20170204 | 20170105
1058864 | ref | gcc -fno-schedule-insns -O -fomit-frame-pointer | 20170204 | 20170105
1058880 | ref | gcc -m64 -O -fomit-frame-pointer | 20170204 | 20170105
1058976 | ref | gcc -m64 -march=k8 -O -fomit-frame-pointer | 20170204 | 20170105
1060224 | ref | gcc -m64 -march=nocona -O -fomit-frame-pointer | 20170204 | 20170105
1063072 | ref | gcc -march=barcelona -O -fomit-frame-pointer | 20170204 | 20170105
1070136 | ref | gcc -march=nocona -O -fomit-frame-pointer | 20170204 | 20170105
1071520 | ref | gcc -march=k8 -O -fomit-frame-pointer | 20170204 | 20170105
1084472 | ref | gcc -m64 -march=nocona -O2 -fomit-frame-pointer | 20170204 | 20170105
1084920 | ref | gcc -march=nocona -O2 -fomit-frame-pointer | 20170204 | 20170105
1085468 | ref | gcc -march=native -mtune=native -O2 -fomit-frame-pointer -fwrapv | 20170204 | 20170105
1086268 | ref | gcc -march=native -mtune=native -O -fomit-frame-pointer -fwrapv | 20170204 | 20170105
1100428 | ref | gcc -m64 -march=corei7 -O2 -fomit-frame-pointer | 20170204 | 20170105
1101176 | ref | gcc -m64 -march=core2 -O2 -fomit-frame-pointer | 20170204 | 20170105
1101488 | ref | gcc -m64 -march=native -mtune=native -O2 -fomit-frame-pointer | 20170204 | 20170105
1102476 | ref | gcc -m64 -march=core2 -msse4 -O2 -fomit-frame-pointer | 20170204 | 20170105
1102704 | ref | gcc -m64 -march=core2 -msse4.1 -O2 -fomit-frame-pointer | 20170204 | 20170105
1129588 | ref | gcc -O2 -fomit-frame-pointer | 20170204 | 20170105
1133600 | ref | gcc -m64 -O2 -fomit-frame-pointer | 20170204 | 20170105
1134860 | ref | gcc -fno-schedule-insns -O2 -fomit-frame-pointer | 20170204 | 20170105
1138788 | ref | gcc -march=k8 -O2 -fomit-frame-pointer | 20170204 | 20170105
1143852 | ref | gcc -m64 -march=k8 -O2 -fomit-frame-pointer | 20170204 | 20170105
1160540 | ref | gcc -march=barcelona -O2 -fomit-frame-pointer | 20170204 | 20170105
1167776 | ref | gcc -m64 -march=barcelona -O2 -fomit-frame-pointer | 20170204 | 20170105
1620160 | ref | gcc -m64 -march=core2 -msse4.1 -Os -fomit-frame-pointer | 20170204 | 20170105
1622492 | ref | gcc -m64 -march=core2 -msse4 -Os -fomit-frame-pointer | 20170204 | 20170105
1622928 | ref | gcc -m64 -march=corei7 -Os -fomit-frame-pointer | 20170204 | 20170105
1627476 | ref | gcc -m64 -march=native -mtune=native -Os -fomit-frame-pointer | 20170204 | 20170105
1630672 | ref | gcc -m64 -march=core2 -Os -fomit-frame-pointer | 20170204 | 20170105
1632288 | ref | gcc -m64 -march=barcelona -Os -fomit-frame-pointer | 20170204 | 20170105
1635696 | ref | gcc -funroll-loops -march=barcelona -Os -fomit-frame-pointer | 20170204 | 20170105
1635704 | ref | gcc -funroll-loops -m64 -march=barcelona -Os -fomit-frame-pointer | 20170204 | 20170105
1636540 | ref | gcc -march=barcelona -Os -fomit-frame-pointer | 20170204 | 20170105
1638836 | ref | gcc -funroll-loops -fno-schedule-insns -Os -fomit-frame-pointer | 20170204 | 20170105
1639476 | ref | gcc -funroll-loops -m64 -Os -fomit-frame-pointer | 20170204 | 20170105
1640216 | ref | gcc -funroll-loops -m64 -march=k8 -Os -fomit-frame-pointer | 20170204 | 20170105
1641828 | ref | gcc -funroll-loops -Os -fomit-frame-pointer | 20170204 | 20170105
1642080 | ref | gcc -funroll-loops -march=k8 -Os -fomit-frame-pointer | 20170204 | 20170105
1642516 | ref | gcc -m64 -march=nocona -Os -fomit-frame-pointer | 20170204 | 20170105
1643304 | ref | gcc -funroll-loops -m64 -march=nocona -Os -fomit-frame-pointer | 20170204 | 20170105
1649476 | ref | gcc -m64 -march=k8 -Os -fomit-frame-pointer | 20170204 | 20170105
1650872 | ref | gcc -Os -fomit-frame-pointer | 20170204 | 20170105
1650952 | ref | gcc -march=k8 -Os -fomit-frame-pointer | 20170204 | 20170105
1651304 | ref | gcc -fno-schedule-insns -Os -fomit-frame-pointer | 20170204 | 20170105
1651376 | ref | gcc -m64 -Os -fomit-frame-pointer | 20170204 | 20170105
1651916 | ref | gcc -march=native -mtune=native -Os -fomit-frame-pointer -fwrapv | 20170204 | 20170105
1651972 | ref | gcc -funroll-loops -march=nocona -Os -fomit-frame-pointer | 20170204 | 20170105
1653504 | ref | gcc -march=nocona -Os -fomit-frame-pointer | 20170204 | 20170105
3089672 | optimized_nonSSE | gcc | 20170204 | 20170105
3100920 | optimized_nonSSE | gcc -funroll-loops | 20170204 | 20170105
3735032 | ref | gcc -funroll-loops | 20170204 | 20170105
3749668 | ref | gcc | 20170204 | 20170105

Test failure

Implementation: crypto_aead/pi16cipher096v1/optimized_nonSSE
Compiler: cc
error 111
crypto_aead_decrypt returns nonzero

Number of similar (compiler,implementation) pairs: 14, namely:
Compiler | Implementations
cc | optimized_nonSSE ref
clang -O3 -fomit-frame-pointer -Qunused-arguments | optimized_nonSSE ref
clang -O3 -fwrapv -march=native -fomit-frame-pointer -Qunused-arguments | optimized_nonSSE ref
clang -march=native -O3 -fomit-frame-pointer -fwrapv -Qunused-arguments | optimized_nonSSE ref
clang -mcpu=cortex-a8 -mfpu=neon -O3 -fomit-frame-pointer -fwrapv -Qunused-arguments | optimized_nonSSE ref
clang -mcpu=cortex-a9 -mfpu=neon -O3 -fomit-frame-pointer -fwrapv -Qunused-arguments | optimized_nonSSE ref
clang -mcpu=native -mfpu=neon -O3 -fomit-frame-pointer -fwrapv -Qunused-arguments | optimized_nonSSE ref

Compiler output

Implementation: crypto_aead/pi16cipher096v1/ref
Compiler: cc
encrypt.c: encrypt.c:248:42: warning: unsequenced modification and access to 'i1' [-Wunsequenced]
encrypt.c: InternalState8[i1] = InternalState8[i1++] ^ ad[b+i];
encrypt.c: ~~ ^
encrypt.c: encrypt.c:374:68: warning: unsequenced modification and access to 'i1' [-Wunsequenced]
encrypt.c: c[CRYPTO_NSECBYTES+b+i] = InternalState8[i1] = InternalState8[i1++] ^ m[b+i];
encrypt.c: ~~ ^
encrypt.c: encrypt.c:536:42: warning: unsequenced modification and access to 'i1' [-Wunsequenced]
encrypt.c: InternalState8[i1] = InternalState8[i1++] ^ ad[b+i];
encrypt.c: ~~ ^
encrypt.c: 3 warnings generated.

Number of similar (compiler,implementation) pairs: 7, namely:
Compiler | Implementations
cc | ref
clang -O3 -fomit-frame-pointer -Qunused-arguments | ref
clang -O3 -fwrapv -march=native -fomit-frame-pointer -Qunused-arguments | ref
clang -march=native -O3 -fomit-frame-pointer -fwrapv -Qunused-arguments | ref
clang -mcpu=cortex-a8 -mfpu=neon -O3 -fomit-frame-pointer -fwrapv -Qunused-arguments | ref
clang -mcpu=cortex-a9 -mfpu=neon -O3 -fomit-frame-pointer -fwrapv -Qunused-arguments | ref
clang -mcpu=native -mfpu=neon -O3 -fomit-frame-pointer -fwrapv -Qunused-arguments | ref

Compiler output

Implementation: crypto_aead/pi16cipher096v1/optimized_nonSSE
Compiler: cc
encrypt.c: encrypt.c:362:42: warning: unsequenced modification and access to 'i1' [-Wunsequenced]
encrypt.c: InternalState8[i1] = InternalState8[i1++] ^ ad[b+i];
encrypt.c: ~~ ^
encrypt.c: encrypt.c:488:68: warning: unsequenced modification and access to 'i1' [-Wunsequenced]
encrypt.c: c[CRYPTO_NSECBYTES+b+i] = InternalState8[i1] = InternalState8[i1++] ^ m[b+i];
encrypt.c: ~~ ^
encrypt.c: encrypt.c:650:42: warning: unsequenced modification and access to 'i1' [-Wunsequenced]
encrypt.c: InternalState8[i1] = InternalState8[i1++] ^ ad[b+i];
encrypt.c: ~~ ^
encrypt.c: 3 warnings generated.

Number of similar (compiler,implementation) pairs: 7, namely:
Compiler | Implementations
cc | optimized_nonSSE
clang -O3 -fomit-frame-pointer -Qunused-arguments | optimized_nonSSE
clang -O3 -fwrapv -march=native -fomit-frame-pointer -Qunused-arguments | optimized_nonSSE
clang -march=native -O3 -fomit-frame-pointer -fwrapv -Qunused-arguments | optimized_nonSSE
clang -mcpu=cortex-a8 -mfpu=neon -O3 -fomit-frame-pointer -fwrapv -Qunused-arguments | optimized_nonSSE
clang -mcpu=cortex-a9 -mfpu=neon -O3 -fomit-frame-pointer -fwrapv -Qunused-arguments | optimized_nonSSE
clang -mcpu=native -mfpu=neon -O3 -fomit-frame-pointer -fwrapv -Qunused-arguments | optimized_nonSSE
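
Note on the -Wunsequenced warnings

The flagged statements in encrypt.c have the form InternalState8[i1] = InternalState8[i1++] ^ x. The index i1 is read on the left-hand side and modified on the right-hand side with no sequence point between the two, which is undefined behavior in C, so different compilers may legitimately produce different results. The cc and clang configurations that emit these warnings are the same ones listed under "Test failure", which suggests (but does not prove) a connection. The sketch below is not the author's code: the function names are hypothetical, and which of the two orderings the cipher actually intends cannot be determined from this report. It only shows how each possible reading can be written with well-defined sequencing.

```c
#include <stddef.h>

/* Hypothetical helper, reading A:
 * XOR the input byte into the current slot, then advance the index. */
static void absorb_in_place(unsigned char *state, size_t *idx, unsigned char in)
{
    state[*idx] ^= in;
    (*idx)++;
}

/* Hypothetical helper, reading B:
 * combine the current slot with the input byte, advance the index,
 * then store the result into the slot the index now points at.
 * The caller must ensure *idx + 1 is still inside the state buffer. */
static void absorb_into_next(unsigned char *state, size_t *idx, unsigned char in)
{
    unsigned char t = (unsigned char)(state[*idx] ^ in);
    (*idx)++;
    state[*idx] = t;
}
```

Starting from a state of {0x10, 0x20, ...} with index 0 and input byte 0x01, reading A leaves {0x11, 0x20, ...} while reading B leaves {0x10, 0x11, ...}; both advance the index to 1. The two readings diverge after a single byte, so a build that picks one ordering will not interoperate with a build that picks the other.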