| Time (cycles) | Implementation | Compiler | Benchmark date | SUPERCOP version |
|---|---|---|---|---|
| 4901263 | neon | gcc -mcpu=native -mfpu=neon-vfpv4 -O3 | 20161211 | 20161026 |
| 5045070 | neon | gcc -funroll-loops -mcpu=native -mfpu=neon-vfpv4 -O3 | 20161211 | 20161026 |
| 5315105 | neon | gcc -mcpu=native -mfpu=neon-vfpv4 -O2 | 20161211 | 20161026 |
| 5326462 | neon | gcc -mcpu=native -mfpu=neon-vfpv4 -Os | 20161211 | 20161026 |
| 5424707 | neon | gcc -funroll-loops -mcpu=native -mfpu=neon-vfpv4 -Os | 20161211 | 20161026 |
| 5679923 | neon | gcc -funroll-loops -mcpu=native -mfpu=neon-vfpv4 -O2 | 20161211 | 20161026 |
| 13269266 | arm32 | gcc -funroll-loops -mcpu=native -mfpu=neon-vfpv4 -O3 | 20161211 | 20161026 |
| 13972750 | arm32 | gcc -mcpu=native -mfpu=neon-vfpv4 -O3 | 20161211 | 20161026 |
| 14272054 | arm32 | gcc -funroll-loops -mcpu=native -mfpu=neon-vfpv4 -O2 | 20161211 | 20161026 |
| 14506768 | arm32 | gcc -funroll-loops -mcpu=native -mfpu=neon-vfpv4 -Os | 20161211 | 20161026 |
| 15118324 | 32 | gcc -mcpu=native -mfpu=neon-vfpv4 -O2 | 20161211 | 20161026 |
| 15339238 | arm32 | gcc -mcpu=native -mfpu=neon-vfpv4 -Os | 20161211 | 20161026 |
| 15361273 | arm32 | gcc -mcpu=native -mfpu=neon-vfpv4 -O2 | 20161211 | 20161026 |
| 16361345 | 32 | gcc -funroll-loops -mcpu=native -mfpu=neon-vfpv4 -O2 | 20161211 | 20161026 |
| 16963821 | 32 | gcc -mcpu=native -mfpu=neon-vfpv4 -O3 | 20161211 | 20161026 |
| 18151467 | 32 | gcc -funroll-loops -mcpu=native -mfpu=neon-vfpv4 -O3 | 20161211 | 20161026 |
| 19218116 | 32 | gcc -funroll-loops -mcpu=native -mfpu=neon-vfpv4 -Os | 20161211 | 20161026 |
| 21809748 | 32 | gcc -mcpu=native -mfpu=neon-vfpv4 -Os | 20161211 | 20161026 |