| Time (cycles) | Implementation | Compiler | Benchmark date | SUPERCOP version |
| --- | --- | --- | --- | --- |
| 876582 | optimized_nonSSE | gcc -mcpu=native -mfpu=neon-vfpv4 -O3 | 20161223 | 20161026 |
| 889207 | optimized_nonSSE | gcc -funroll-loops -mcpu=native -mfpu=neon-vfpv4 -O3 | 20161223 | 20161026 |
| 910298 | optimized_nonSSE | gcc -funroll-loops -mcpu=native -mfpu=neon-vfpv4 -O2 | 20161223 | 20161026 |
| 1423999 | optimized_nonSSE | gcc -mcpu=native -mfpu=neon-vfpv4 -O2 | 20161223 | 20161026 |
| 1575604 | optimized_nonSSE | gcc -mcpu=native -mfpu=neon-vfpv4 -Os | 20161223 | 20161026 |
| 1584648 | optimized_nonSSE | gcc -funroll-loops -mcpu=native -mfpu=neon-vfpv4 -Os | 20161223 | 20161026 |
| 1600282 | ref | gcc -funroll-loops -mcpu=native -mfpu=neon-vfpv4 -O3 | 20161223 | 20161026 |
| 1609899 | ref | gcc -mcpu=native -mfpu=neon-vfpv4 -O3 | 20161223 | 20161026 |
| 1737697 | ref | gcc -mcpu=native -mfpu=neon-vfpv4 -O2 | 20161223 | 20161026 |
| 1810292 | ref | gcc -funroll-loops -mcpu=native -mfpu=neon-vfpv4 -O2 | 20161223 | 20161026 |
| 1911137 | ref | gcc -mcpu=native -mfpu=neon-vfpv4 -Os | 20161223 | 20161026 |
| 1986952 | ref | gcc -funroll-loops -mcpu=native -mfpu=neon-vfpv4 -Os | 20161223 | 20161026 |