| Time (cycles) | Implementation | Compiler | Benchmark date | SUPERCOP version |
| --- | --- | --- | --- | --- |
| 42066 | faster | gcc -funroll-loops -mcpu=native -mfpu=neon-vfpv4 -O3 | 20161215 | 20161026 |
| 42504 | faster | gcc -mcpu=native -mfpu=neon-vfpv4 -O3 | 20161215 | 20161026 |
| 42808 | faster | gcc -mcpu=native -mfpu=neon-vfpv4 -Os | 20161215 | 20161026 |
| 43017 | faster | gcc -funroll-loops -mcpu=native -mfpu=neon-vfpv4 -Os | 20161215 | 20161026 |
| 45335 | faster | gcc -mcpu=native -mfpu=neon-vfpv4 -O2 | 20161215 | 20161026 |
| 45801 | faster | gcc -funroll-loops -mcpu=native -mfpu=neon-vfpv4 -O2 | 20161215 | 20161026 |
| 96390 | ref | gcc -funroll-loops -mcpu=native -mfpu=neon-vfpv4 -O3 | 20161215 | 20161026 |
| 99176 | ref | gcc -funroll-loops -mcpu=native -mfpu=neon-vfpv4 -O2 | 20161215 | 20161026 |
| 100067 | ref | gcc -mcpu=native -mfpu=neon-vfpv4 -O3 | 20161215 | 20161026 |
| 124371 | ref | gcc -mcpu=native -mfpu=neon-vfpv4 -O2 | 20161215 | 20161026 |
| 130580 | ref | gcc -funroll-loops -mcpu=native -mfpu=neon-vfpv4 -Os | 20161215 | 20161026 |
| 132445 | ref | gcc -mcpu=native -mfpu=neon-vfpv4 -Os | 20161215 | 20161026 |
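
The rows above can be summarized by taking the fastest measurement for each implementation and comparing the two. The following is a minimal sketch of that arithmetic, not part of the benchmark tooling: the names `TABLE`, `parse_rows`, and `best_per_implementation` are hypothetical, and only the first three columns of each row are copied in.

```python
# First three columns of each table row: time, implementation, compiler options.
TABLE = """\
| 42066 | faster | gcc -funroll-loops -mcpu=native -mfpu=neon-vfpv4 -O3 |
| 42504 | faster | gcc -mcpu=native -mfpu=neon-vfpv4 -O3 |
| 42808 | faster | gcc -mcpu=native -mfpu=neon-vfpv4 -Os |
| 43017 | faster | gcc -funroll-loops -mcpu=native -mfpu=neon-vfpv4 -Os |
| 45335 | faster | gcc -mcpu=native -mfpu=neon-vfpv4 -O2 |
| 45801 | faster | gcc -funroll-loops -mcpu=native -mfpu=neon-vfpv4 -O2 |
| 96390 | ref | gcc -funroll-loops -mcpu=native -mfpu=neon-vfpv4 -O3 |
| 99176 | ref | gcc -funroll-loops -mcpu=native -mfpu=neon-vfpv4 -O2 |
| 100067 | ref | gcc -mcpu=native -mfpu=neon-vfpv4 -O3 |
| 124371 | ref | gcc -mcpu=native -mfpu=neon-vfpv4 -O2 |
| 130580 | ref | gcc -funroll-loops -mcpu=native -mfpu=neon-vfpv4 -Os |
| 132445 | ref | gcc -mcpu=native -mfpu=neon-vfpv4 -Os |
"""


def parse_rows(text):
    """Return (time, implementation, compiler) tuples from pipe-delimited rows."""
    rows = []
    for line in text.splitlines():
        cells = [c.strip() for c in line.strip().strip("|").split("|")]
        # Skip header, separator, or malformed lines.
        if len(cells) < 3 or not cells[0].isdigit():
            continue
        rows.append((int(cells[0]), cells[1], cells[2]))
    return rows


def best_per_implementation(rows):
    """Map each implementation name to its lowest (time, compiler) pair."""
    best = {}
    for time, impl, compiler in rows:
        if impl not in best or time < best[impl][0]:
            best[impl] = (time, compiler)
    return best


rows = parse_rows(TABLE)
best = best_per_implementation(rows)
# Ratio of the best "ref" time to the best "faster" time.
speedup = best["ref"][0] / best["faster"][0]

for impl, (time, compiler) in sorted(best.items()):
    print(f"{impl}: {time} with {compiler}")
print(f"ref / faster = {speedup:.2f}")
```

On these rows both implementations are fastest with `-funroll-loops -O3`, and the best `ref` time is roughly 2.3 times the best `faster` time.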