Implementation notes: amd64, glyme, crypto_aead/pi16cipher128v1

Computer: glyme
Architecture: amd64
CPU ID: GenuineIntel-00020652-bfebfbff
SUPERCOP version: 20170105
Operation: crypto_aead
Primitive: pi16cipher128v1
Time Implementation Compiler Benchmark date SUPERCOP version
502016 optimized_nonSSE gcc -funroll-loops -m64 -march=barcelona -O3 -fomit-frame-pointer 20170204 20170105
502300 optimized_nonSSE gcc -funroll-loops -m64 -march=k8 -O3 -fomit-frame-pointer 20170204 20170105
503436 optimized_nonSSE gcc -funroll-loops -march=barcelona -O3 -fomit-frame-pointer 20170204 20170105
504476 optimized_nonSSE gcc -funroll-loops -O3 -fomit-frame-pointer 20170204 20170105
504696 optimized_nonSSE gcc -m64 -march=core2 -O3 -fomit-frame-pointer 20170204 20170105
504768 optimized_nonSSE gcc -m64 -march=native -mtune=native -O3 -fomit-frame-pointer 20170204 20170105
504772 optimized_nonSSE gcc -funroll-loops -march=k8 -O3 -fomit-frame-pointer 20170204 20170105
505096 optimized_nonSSE gcc -m64 -march=core2 -msse4.1 -O3 -fomit-frame-pointer 20170204 20170105
505368 optimized_nonSSE gcc -march=barcelona -O3 -fomit-frame-pointer 20170204 20170105
506092 optimized_nonSSE gcc -m64 -march=k8 -O3 -fomit-frame-pointer 20170204 20170105
506824 optimized_nonSSE gcc -m64 -march=barcelona -O3 -fomit-frame-pointer 20170204 20170105
506924 optimized_nonSSE gcc -fno-schedule-insns -O3 -fomit-frame-pointer 20170204 20170105
506948 optimized_nonSSE gcc -march=k8 -O3 -fomit-frame-pointer 20170204 20170105
507436 optimized_nonSSE gcc -O3 -fomit-frame-pointer 20170204 20170105
507556 optimized_nonSSE gcc -funroll-loops -march=nocona -O3 -fomit-frame-pointer 20170204 20170105
508576 optimized_nonSSE gcc -funroll-loops -fno-schedule-insns -O3 -fomit-frame-pointer 20170204 20170105
509940 optimized_nonSSE gcc -m64 -march=corei7 -O3 -fomit-frame-pointer 20170204 20170105
510236 optimized_nonSSE gcc -m64 -O3 -fomit-frame-pointer 20170204 20170105
511584 optimized_nonSSE gcc -m64 -march=core2 -msse4 -O3 -fomit-frame-pointer 20170204 20170105
511676 optimized_nonSSE gcc -funroll-loops -m64 -O3 -fomit-frame-pointer 20170204 20170105
512360 optimized_nonSSE gcc -march=nocona -O3 -fomit-frame-pointer 20170204 20170105
513548 optimized_nonSSE gcc -funroll-loops -m64 -march=nocona -O3 -fomit-frame-pointer 20170204 20170105
515484 optimized_nonSSE gcc -m64 -march=nocona -O3 -fomit-frame-pointer 20170204 20170105
519836 optimized_nonSSE gcc -funroll-loops -m64 -march=barcelona -O2 -fomit-frame-pointer 20170204 20170105
521040 optimized_nonSSE gcc -funroll-loops -m64 -march=k8 -O2 -fomit-frame-pointer 20170204 20170105
522064 optimized_nonSSE gcc -funroll-loops -march=k8 -O2 -fomit-frame-pointer 20170204 20170105
522520 optimized_nonSSE gcc -funroll-loops -march=barcelona -O2 -fomit-frame-pointer 20170204 20170105
524664 optimized_nonSSE gcc -funroll-loops -m64 -O2 -fomit-frame-pointer 20170204 20170105
525268 optimized_nonSSE gcc -funroll-loops -O2 -fomit-frame-pointer 20170204 20170105
526080 optimized_nonSSE gcc -funroll-loops -fno-schedule-insns -O2 -fomit-frame-pointer 20170204 20170105
529560 optimized_nonSSE gcc -funroll-loops -m64 -march=nocona -O2 -fomit-frame-pointer 20170204 20170105
533764 optimized_nonSSE gcc -funroll-loops -march=nocona -O2 -fomit-frame-pointer 20170204 20170105
554048 optimized_nonSSE gcc -march=native -mtune=native -O3 -fomit-frame-pointer -fwrapv 20170204 20170105
557680 optimized_nonSSE gcc -funroll-loops -march=k8 -O -fomit-frame-pointer 20170204 20170105
557920 optimized_nonSSE gcc -funroll-loops -march=barcelona -O -fomit-frame-pointer 20170204 20170105
557996 optimized_nonSSE gcc -funroll-loops -m64 -march=k8 -O -fomit-frame-pointer 20170204 20170105
558676 optimized_nonSSE gcc -funroll-loops -fno-schedule-insns -O -fomit-frame-pointer 20170204 20170105
559344 optimized_nonSSE gcc -funroll-loops -march=nocona -O -fomit-frame-pointer 20170204 20170105
560916 optimized_nonSSE gcc -funroll-loops -m64 -march=nocona -O -fomit-frame-pointer 20170204 20170105
562676 optimized_nonSSE gcc -funroll-loops -m64 -march=barcelona -O -fomit-frame-pointer 20170204 20170105
563900 optimized_nonSSE gcc -funroll-loops -m64 -O -fomit-frame-pointer 20170204 20170105
579212 optimized_nonSSE gcc -funroll-loops -O -fomit-frame-pointer 20170204 20170105
763856 optimized_nonSSE gcc -m64 -O -fomit-frame-pointer 20170204 20170105
764392 optimized_nonSSE gcc -m64 -march=core2 -msse4.1 -O -fomit-frame-pointer 20170204 20170105
764772 optimized_nonSSE gcc -march=native -mtune=native -O -fomit-frame-pointer -fwrapv 20170204 20170105
765380 optimized_nonSSE gcc -O -fomit-frame-pointer 20170204 20170105
769604 optimized_nonSSE gcc -march=k8 -O -fomit-frame-pointer 20170204 20170105
770996 optimized_nonSSE gcc -m64 -march=native -mtune=native -O -fomit-frame-pointer 20170204 20170105
771772 optimized_nonSSE gcc -m64 -march=core2 -O -fomit-frame-pointer 20170204 20170105
772768 optimized_nonSSE gcc -march=nocona -O -fomit-frame-pointer 20170204 20170105
773932 optimized_nonSSE gcc -m64 -march=nocona -O -fomit-frame-pointer 20170204 20170105
774372 optimized_nonSSE gcc -m64 -march=corei7 -O -fomit-frame-pointer 20170204 20170105
775472 optimized_nonSSE gcc -m64 -march=core2 -msse4 -O -fomit-frame-pointer 20170204 20170105
775716 optimized_nonSSE gcc -m64 -march=barcelona -O -fomit-frame-pointer 20170204 20170105
779984 optimized_nonSSE gcc -m64 -march=k8 -O -fomit-frame-pointer 20170204 20170105
780128 optimized_nonSSE gcc -fno-schedule-insns -O -fomit-frame-pointer 20170204 20170105
783832 optimized_nonSSE gcc -march=barcelona -O -fomit-frame-pointer 20170204 20170105
836536 optimized_nonSSE gcc -m64 -march=barcelona -O2 -fomit-frame-pointer 20170204 20170105
836572 optimized_nonSSE gcc -march=barcelona -O2 -fomit-frame-pointer 20170204 20170105
838464 optimized_nonSSE gcc -m64 -march=k8 -O2 -fomit-frame-pointer 20170204 20170105
838860 optimized_nonSSE gcc -march=k8 -O2 -fomit-frame-pointer 20170204 20170105
870732 optimized_nonSSE gcc -march=native -mtune=native -O2 -fomit-frame-pointer -fwrapv 20170204 20170105
870832 optimized_nonSSE gcc -m64 -march=native -mtune=native -O2 -fomit-frame-pointer 20170204 20170105
871292 optimized_nonSSE gcc -m64 -march=corei7 -O2 -fomit-frame-pointer 20170204 20170105
875376 optimized_nonSSE gcc -m64 -march=core2 -msse4.1 -O2 -fomit-frame-pointer 20170204 20170105
875388 optimized_nonSSE gcc -m64 -march=core2 -msse4 -O2 -fomit-frame-pointer 20170204 20170105
876136 optimized_nonSSE gcc -O2 -fomit-frame-pointer 20170204 20170105
876168 optimized_nonSSE gcc -fno-schedule-insns -O2 -fomit-frame-pointer 20170204 20170105
877384 optimized_nonSSE gcc -m64 -O2 -fomit-frame-pointer 20170204 20170105
878348 optimized_nonSSE gcc -m64 -march=core2 -O2 -fomit-frame-pointer 20170204 20170105
891376 optimized_nonSSE gcc -march=barcelona -Os -fomit-frame-pointer 20170204 20170105
891420 optimized_nonSSE gcc -m64 -march=barcelona -Os -fomit-frame-pointer 20170204 20170105
894520 optimized_nonSSE gcc -m64 -march=core2 -msse4 -Os -fomit-frame-pointer 20170204 20170105
894836 optimized_nonSSE gcc -m64 -march=corei7 -Os -fomit-frame-pointer 20170204 20170105
895800 optimized_nonSSE gcc -m64 -march=nocona -O2 -fomit-frame-pointer 20170204 20170105
896908 optimized_nonSSE gcc -fno-schedule-insns -Os -fomit-frame-pointer 20170204 20170105
896944 optimized_nonSSE gcc -m64 -Os -fomit-frame-pointer 20170204 20170105
898640 optimized_nonSSE gcc -march=k8 -Os -fomit-frame-pointer 20170204 20170105
899044 optimized_nonSSE gcc -m64 -march=native -mtune=native -Os -fomit-frame-pointer 20170204 20170105
900440 optimized_nonSSE gcc -march=nocona -O2 -fomit-frame-pointer 20170204 20170105
902348 optimized_nonSSE gcc -m64 -march=core2 -msse4.1 -Os -fomit-frame-pointer 20170204 20170105
902476 optimized_nonSSE gcc -funroll-loops -m64 -march=barcelona -Os -fomit-frame-pointer 20170204 20170105
906304 optimized_nonSSE gcc -funroll-loops -m64 -march=k8 -Os -fomit-frame-pointer 20170204 20170105
906932 optimized_nonSSE gcc -Os -fomit-frame-pointer 20170204 20170105
907880 optimized_nonSSE gcc -march=native -mtune=native -Os -fomit-frame-pointer -fwrapv 20170204 20170105
907936 optimized_nonSSE gcc -m64 -march=k8 -Os -fomit-frame-pointer 20170204 20170105
908292 optimized_nonSSE gcc -funroll-loops -march=k8 -Os -fomit-frame-pointer 20170204 20170105
909116 optimized_nonSSE gcc -funroll-loops -Os -fomit-frame-pointer 20170204 20170105
909616 optimized_nonSSE gcc -funroll-loops -fno-schedule-insns -Os -fomit-frame-pointer 20170204 20170105
913248 ref gcc -march=native -mtune=native -O3 -fomit-frame-pointer -fwrapv 20170204 20170105
913524 optimized_nonSSE gcc -m64 -march=core2 -Os -fomit-frame-pointer 20170204 20170105
918596 optimized_nonSSE gcc -funroll-loops -march=barcelona -Os -fomit-frame-pointer 20170204 20170105
920472 optimized_nonSSE gcc -m64 -march=nocona -Os -fomit-frame-pointer 20170204 20170105
922000 optimized_nonSSE gcc -funroll-loops -march=nocona -Os -fomit-frame-pointer 20170204 20170105
924392 optimized_nonSSE gcc -funroll-loops -m64 -Os -fomit-frame-pointer 20170204 20170105
925668 ref gcc -funroll-loops -m64 -march=nocona -O3 -fomit-frame-pointer 20170204 20170105
926392 optimized_nonSSE gcc -march=nocona -Os -fomit-frame-pointer 20170204 20170105
927080 ref gcc -funroll-loops -march=nocona -O3 -fomit-frame-pointer 20170204 20170105
928864 ref gcc -m64 -march=core2 -O3 -fomit-frame-pointer 20170204 20170105
929420 ref gcc -m64 -march=corei7 -O3 -fomit-frame-pointer 20170204 20170105
929440 ref gcc -m64 -march=native -mtune=native -O3 -fomit-frame-pointer 20170204 20170105
930528 ref gcc -m64 -march=core2 -msse4.1 -O3 -fomit-frame-pointer 20170204 20170105
930572 ref gcc -funroll-loops -m64 -march=barcelona -O3 -fomit-frame-pointer 20170204 20170105
930692 ref gcc -m64 -march=core2 -msse4 -O3 -fomit-frame-pointer 20170204 20170105
932892 ref gcc -funroll-loops -march=barcelona -O3 -fomit-frame-pointer 20170204 20170105
932956 ref gcc -m64 -march=nocona -O3 -fomit-frame-pointer 20170204 20170105
933420 ref gcc -O3 -fomit-frame-pointer 20170204 20170105
933424 ref gcc -fno-schedule-insns -O3 -fomit-frame-pointer 20170204 20170105
934096 ref gcc -march=nocona -O3 -fomit-frame-pointer 20170204 20170105
935688 ref gcc -funroll-loops -O3 -fomit-frame-pointer 20170204 20170105
936080 ref gcc -funroll-loops -fno-schedule-insns -O3 -fomit-frame-pointer 20170204 20170105
938684 ref gcc -m64 -march=barcelona -O3 -fomit-frame-pointer 20170204 20170105
938688 ref gcc -march=barcelona -O3 -fomit-frame-pointer 20170204 20170105
939540 ref gcc -funroll-loops -m64 -march=k8 -O3 -fomit-frame-pointer 20170204 20170105
939556 ref gcc -funroll-loops -march=k8 -O3 -fomit-frame-pointer 20170204 20170105
940204 ref gcc -m64 -O3 -fomit-frame-pointer 20170204 20170105
940300 ref gcc -funroll-loops -m64 -O3 -fomit-frame-pointer 20170204 20170105
941348 ref gcc -m64 -march=k8 -O3 -fomit-frame-pointer 20170204 20170105
941352 ref gcc -march=k8 -O3 -fomit-frame-pointer 20170204 20170105
944740 optimized_nonSSE gcc -funroll-loops -m64 -march=nocona -Os -fomit-frame-pointer 20170204 20170105
948424 ref gcc -funroll-loops -O -fomit-frame-pointer 20170204 20170105
949280 ref gcc -funroll-loops -m64 -O -fomit-frame-pointer 20170204 20170105
951100 ref gcc -funroll-loops -m64 -march=k8 -O -fomit-frame-pointer 20170204 20170105
951424 ref gcc -funroll-loops -march=nocona -O -fomit-frame-pointer 20170204 20170105
952212 ref gcc -funroll-loops -march=barcelona -O -fomit-frame-pointer 20170204 20170105
952264 ref gcc -funroll-loops -fno-schedule-insns -O -fomit-frame-pointer 20170204 20170105
955244 ref gcc -funroll-loops -march=k8 -O -fomit-frame-pointer 20170204 20170105
955996 ref gcc -funroll-loops -m64 -march=barcelona -O -fomit-frame-pointer 20170204 20170105
963284 ref gcc -funroll-loops -m64 -march=nocona -O -fomit-frame-pointer 20170204 20170105
1003852 ref gcc -funroll-loops -m64 -march=nocona -O2 -fomit-frame-pointer 20170204 20170105
1003964 ref gcc -funroll-loops -march=nocona -O2 -fomit-frame-pointer 20170204 20170105
1024352 ref gcc -funroll-loops -fno-schedule-insns -O2 -fomit-frame-pointer 20170204 20170105
1025968 ref gcc -funroll-loops -m64 -O2 -fomit-frame-pointer 20170204 20170105
1028020 ref gcc -funroll-loops -O2 -fomit-frame-pointer 20170204 20170105
1034092 ref gcc -funroll-loops -m64 -march=barcelona -O2 -fomit-frame-pointer 20170204 20170105
1037336 ref gcc -funroll-loops -march=barcelona -O2 -fomit-frame-pointer 20170204 20170105
1043920 ref gcc -funroll-loops -march=k8 -O2 -fomit-frame-pointer 20170204 20170105
1047444 ref gcc -funroll-loops -m64 -march=k8 -O2 -fomit-frame-pointer 20170204 20170105
1054932 ref gcc -m64 -march=corei7 -O -fomit-frame-pointer 20170204 20170105
1054996 ref gcc -m64 -march=native -mtune=native -O -fomit-frame-pointer 20170204 20170105
1057052 ref gcc -m64 -march=nocona -O -fomit-frame-pointer 20170204 20170105
1058060 ref gcc -m64 -march=core2 -O -fomit-frame-pointer 20170204 20170105
1058080 ref gcc -m64 -march=core2 -msse4.1 -O -fomit-frame-pointer 20170204 20170105
1058196 ref gcc -m64 -march=core2 -msse4 -O -fomit-frame-pointer 20170204 20170105
1058536 ref gcc -m64 -march=barcelona -O -fomit-frame-pointer 20170204 20170105
1059032 ref gcc -m64 -march=k8 -O -fomit-frame-pointer 20170204 20170105
1059200 ref gcc -march=barcelona -O -fomit-frame-pointer 20170204 20170105
1061648 ref gcc -fno-schedule-insns -O -fomit-frame-pointer 20170204 20170105
1061676 ref gcc -O -fomit-frame-pointer 20170204 20170105
1061696 ref gcc -m64 -O -fomit-frame-pointer 20170204 20170105
1063576 ref gcc -march=nocona -O -fomit-frame-pointer 20170204 20170105
1067488 ref gcc -march=k8 -O -fomit-frame-pointer 20170204 20170105
1078996 ref gcc -march=nocona -O2 -fomit-frame-pointer 20170204 20170105
1082900 ref gcc -m64 -march=nocona -O2 -fomit-frame-pointer 20170204 20170105
1086420 ref gcc -march=native -mtune=native -O -fomit-frame-pointer -fwrapv 20170204 20170105
1088176 ref gcc -march=native -mtune=native -O2 -fomit-frame-pointer -fwrapv 20170204 20170105
1097960 ref gcc -m64 -march=core2 -O2 -fomit-frame-pointer 20170204 20170105
1101080 ref gcc -m64 -march=core2 -msse4 -O2 -fomit-frame-pointer 20170204 20170105
1101128 ref gcc -m64 -march=corei7 -O2 -fomit-frame-pointer 20170204 20170105
1103876 ref gcc -m64 -march=native -mtune=native -O2 -fomit-frame-pointer 20170204 20170105
1104992 ref gcc -m64 -march=core2 -msse4.1 -O2 -fomit-frame-pointer 20170204 20170105
1135184 ref gcc -O2 -fomit-frame-pointer 20170204 20170105
1136228 ref gcc -m64 -O2 -fomit-frame-pointer 20170204 20170105
1139824 ref gcc -fno-schedule-insns -O2 -fomit-frame-pointer 20170204 20170105
1141448 ref gcc -m64 -march=k8 -O2 -fomit-frame-pointer 20170204 20170105
1141608 ref gcc -march=k8 -O2 -fomit-frame-pointer 20170204 20170105
1163292 ref gcc -march=barcelona -O2 -fomit-frame-pointer 20170204 20170105
1165356 ref gcc -m64 -march=barcelona -O2 -fomit-frame-pointer 20170204 20170105
1621848 ref gcc -m64 -march=core2 -msse4 -Os -fomit-frame-pointer 20170204 20170105
1621888 ref gcc -m64 -march=core2 -msse4.1 -Os -fomit-frame-pointer 20170204 20170105
1623864 ref gcc -m64 -march=core2 -Os -fomit-frame-pointer 20170204 20170105
1624912 ref gcc -m64 -march=corei7 -Os -fomit-frame-pointer 20170204 20170105
1627072 ref gcc -m64 -march=barcelona -Os -fomit-frame-pointer 20170204 20170105
1627876 ref gcc -m64 -march=native -mtune=native -Os -fomit-frame-pointer 20170204 20170105
1631356 ref gcc -funroll-loops -march=barcelona -Os -fomit-frame-pointer 20170204 20170105
1633368 ref gcc -funroll-loops -m64 -march=barcelona -Os -fomit-frame-pointer 20170204 20170105
1637092 ref gcc -march=barcelona -Os -fomit-frame-pointer 20170204 20170105
1639192 ref gcc -funroll-loops -m64 -Os -fomit-frame-pointer 20170204 20170105
1639292 ref gcc -funroll-loops -march=k8 -Os -fomit-frame-pointer 20170204 20170105
1639768 ref gcc -funroll-loops -fno-schedule-insns -Os -fomit-frame-pointer 20170204 20170105
1639848 ref gcc -funroll-loops -m64 -march=nocona -Os -fomit-frame-pointer 20170204 20170105
1640832 ref gcc -funroll-loops -march=nocona -Os -fomit-frame-pointer 20170204 20170105
1641908 ref gcc -march=nocona -Os -fomit-frame-pointer 20170204 20170105
1643748 ref gcc -m64 -march=nocona -Os -fomit-frame-pointer 20170204 20170105
1645060 ref gcc -m64 -Os -fomit-frame-pointer 20170204 20170105
1646832 ref gcc -funroll-loops -Os -fomit-frame-pointer 20170204 20170105
1647208 ref gcc -march=native -mtune=native -Os -fomit-frame-pointer -fwrapv 20170204 20170105
1647700 ref gcc -m64 -march=k8 -Os -fomit-frame-pointer 20170204 20170105
1649116 ref gcc -fno-schedule-insns -Os -fomit-frame-pointer 20170204 20170105
1649676 ref gcc -funroll-loops -m64 -march=k8 -Os -fomit-frame-pointer 20170204 20170105
1649840 ref gcc -march=k8 -Os -fomit-frame-pointer 20170204 20170105
1651560 ref gcc -Os -fomit-frame-pointer 20170204 20170105
3101028 optimized_nonSSE gcc -funroll-loops 20170204 20170105
3104344 optimized_nonSSE gcc 20170204 20170105
3737996 ref gcc -funroll-loops 20170204 20170105
3757232 ref gcc 20170204 20170105

Test failure

Implementation: crypto_aead/pi16cipher128v1/optimized_nonSSE
Compiler: cc
error 111
crypto_aead_decrypt returns nonzero

Number of similar (compiler,implementation) pairs: 14, namely:
Compiler Implementations
cc optimized_nonSSE ref
clang -O3 -fomit-frame-pointer -Qunused-arguments optimized_nonSSE ref
clang -O3 -fwrapv -march=native -fomit-frame-pointer -Qunused-arguments optimized_nonSSE ref
clang -march=native -O3 -fomit-frame-pointer -fwrapv -Qunused-arguments optimized_nonSSE ref
clang -mcpu=cortex-a8 -mfpu=neon -O3 -fomit-frame-pointer -fwrapv -Qunused-arguments optimized_nonSSE ref
clang -mcpu=cortex-a9 -mfpu=neon -O3 -fomit-frame-pointer -fwrapv -Qunused-arguments optimized_nonSSE ref
clang -mcpu=native -mfpu=neon -O3 -fomit-frame-pointer -fwrapv -Qunused-arguments optimized_nonSSE ref

Compiler output

Implementation: crypto_aead/pi16cipher128v1/ref
Compiler: cc
encrypt.c: encrypt.c:248:42: warning: unsequenced modification and access to 'i1' [-Wunsequenced]
encrypt.c: InternalState8[i1] = InternalState8[i1++] ^ ad[b+i];
encrypt.c: ~~ ^
encrypt.c: encrypt.c:374:68: warning: unsequenced modification and access to 'i1' [-Wunsequenced]
encrypt.c: c[CRYPTO_NSECBYTES+b+i] = InternalState8[i1] = InternalState8[i1++] ^ m[b+i];
encrypt.c: ~~ ^
encrypt.c: encrypt.c:536:42: warning: unsequenced modification and access to 'i1' [-Wunsequenced]
encrypt.c: InternalState8[i1] = InternalState8[i1++] ^ ad[b+i];
encrypt.c: ~~ ^
encrypt.c: 3 warnings generated.
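
Note on the warnings above: in `InternalState8[i1] = InternalState8[i1++] ^ ad[b+i];` the variable `i1` is both modified and read with no sequencing between the two, which is undefined behavior in C, so compilers are free to disagree about which array element is written. This may be related to the test failure reported for cc and clang, although the log itself does not confirm that. A minimal sketch of a well-defined form follows, assuming the intended meaning is "XOR the input byte into the state byte at `i1`, then advance `i1`". The function name `xor_into_state` and its parameters are hypothetical and do not appear in encrypt.c.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper, not taken from encrypt.c.
 * XORs `len` input bytes into `state`, starting at index `start`, and
 * returns the index that follows the last byte written.
 * Assumption: the caller guarantees start + len fits within `state`. */
size_t xor_into_state(uint8_t *state, size_t start,
                      const uint8_t *in, size_t len)
{
    size_t i1 = start;
    for (size_t i = 0; i < len; i++) {
        state[i1] ^= in[i]; /* the index is only read in this statement */
        i1++;               /* the increment is a separate statement */
    }
    return i1;
}
```

Splitting the update and the increment into two statements gives every compiler the same evaluation order, so the result no longer depends on how a particular compiler happens to treat the unsequenced expression.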

Number of similar (compiler,implementation) pairs: 7, namely:
Compiler Implementations
cc ref
clang -O3 -fomit-frame-pointer -Qunused-arguments ref
clang -O3 -fwrapv -march=native -fomit-frame-pointer -Qunused-arguments ref
clang -march=native -O3 -fomit-frame-pointer -fwrapv -Qunused-arguments ref
clang -mcpu=cortex-a8 -mfpu=neon -O3 -fomit-frame-pointer -fwrapv -Qunused-arguments ref
clang -mcpu=cortex-a9 -mfpu=neon -O3 -fomit-frame-pointer -fwrapv -Qunused-arguments ref
clang -mcpu=native -mfpu=neon -O3 -fomit-frame-pointer -fwrapv -Qunused-arguments ref

Compiler output

Implementation: crypto_aead/pi16cipher128v1/optimized_nonSSE
Compiler: cc
encrypt.c: encrypt.c:362:42: warning: unsequenced modification and access to 'i1' [-Wunsequenced]
encrypt.c: InternalState8[i1] = InternalState8[i1++] ^ ad[b+i];
encrypt.c: ~~ ^
encrypt.c: encrypt.c:488:68: warning: unsequenced modification and access to 'i1' [-Wunsequenced]
encrypt.c: c[CRYPTO_NSECBYTES+b+i] = InternalState8[i1] = InternalState8[i1++] ^ m[b+i];
encrypt.c: ~~ ^
encrypt.c: encrypt.c:650:42: warning: unsequenced modification and access to 'i1' [-Wunsequenced]
encrypt.c: InternalState8[i1] = InternalState8[i1++] ^ ad[b+i];
encrypt.c: ~~ ^
encrypt.c: 3 warnings generated.

Number of similar (compiler,implementation) pairs: 7, namely:
Compiler Implementations
cc optimized_nonSSE
clang -O3 -fomit-frame-pointer -Qunused-arguments optimized_nonSSE
clang -O3 -fwrapv -march=native -fomit-frame-pointer -Qunused-arguments optimized_nonSSE
clang -march=native -O3 -fomit-frame-pointer -fwrapv -Qunused-arguments optimized_nonSSE
clang -mcpu=cortex-a8 -mfpu=neon -O3 -fomit-frame-pointer -fwrapv -Qunused-arguments optimized_nonSSE
clang -mcpu=cortex-a9 -mfpu=neon -O3 -fomit-frame-pointer -fwrapv -Qunused-arguments optimized_nonSSE
clang -mcpu=native -mfpu=neon -O3 -fomit-frame-pointer -fwrapv -Qunused-arguments optimized_nonSSE