| Time | Implementation | Compiler | Benchmark date | SUPERCOP version |
| --- | --- | --- | --- | --- |
| 529461 | optimized_nonSSE | gcc -mcpu=native -mfpu=neon-vfpv4 -O3 | 20161211 | 20161026 |
| 589407 | optimized_nonSSE | gcc -funroll-loops -mcpu=native -mfpu=neon-vfpv4 -O3 | 20161211 | 20161026 |
| 598517 | optimized_nonSSE | gcc -funroll-loops -mcpu=native -mfpu=neon-vfpv4 -O2 | 20161211 | 20161026 |
| 720160 | optimized_nonSSE | gcc -mcpu=native -mfpu=neon-vfpv4 -O2 | 20161211 | 20161026 |
| 792274 | ref | gcc -mcpu=native -mfpu=neon-vfpv4 -O3 | 20161211 | 20161026 |
| 820175 | optimized_nonSSE | gcc -funroll-loops -mcpu=native -mfpu=neon-vfpv4 -Os | 20161211 | 20161026 |
| 844100 | ref | gcc -funroll-loops -mcpu=native -mfpu=neon-vfpv4 -O3 | 20161211 | 20161026 |
| 894461 | ref | gcc -funroll-loops -mcpu=native -mfpu=neon-vfpv4 -O2 | 20161211 | 20161026 |
| 912961 | optimized_nonSSE | gcc -mcpu=native -mfpu=neon-vfpv4 -Os | 20161211 | 20161026 |
| 1011378 | ref | gcc -mcpu=native -mfpu=neon-vfpv4 -O2 | 20161211 | 20161026 |
| 1048371 | ref | gcc -funroll-loops -mcpu=native -mfpu=neon-vfpv4 -Os | 20161211 | 20161026 |
| 1060802 | ref | gcc -mcpu=native -mfpu=neon-vfpv4 -Os | 20161211 | 20161026 |