| Time | Implementation | Compiler | Benchmark date | SUPERCOP version |
| --- | --- | --- | --- | --- |
| 1316458 | neon | gcc -funroll-loops -mcpu=native -mfpu=neon-vfpv4 -O3 | 20161211 | 20161026 |
| 1338258 | neon | gcc -funroll-loops -mcpu=native -mfpu=neon-vfpv4 -O2 | 20161211 | 20161026 |
| 1375245 | neon | gcc -mcpu=native -mfpu=neon-vfpv4 -O3 | 20161211 | 20161026 |
| 1375330 | neon | gcc -mcpu=native -mfpu=neon-vfpv4 -O2 | 20161211 | 20161026 |
| 1384186 | neon | gcc -funroll-loops -mcpu=native -mfpu=neon-vfpv4 -Os | 20161211 | 20161026 |
| 1560837 | neon | gcc -mcpu=native -mfpu=neon-vfpv4 -Os | 20161211 | 20161026 |
| 1716059 | ref | gcc -funroll-loops -mcpu=native -mfpu=neon-vfpv4 -Os | 20161211 | 20161026 |
| 1920938 | ref | gcc -mcpu=native -mfpu=neon-vfpv4 -Os | 20161211 | 20161026 |
| 2253349 | ref | gcc -funroll-loops -mcpu=native -mfpu=neon-vfpv4 -O2 | 20161211 | 20161026 |
| 2459839 | ref | gcc -mcpu=native -mfpu=neon-vfpv4 -O2 | 20161211 | 20161026 |
| 2708648 | ref | gcc -mcpu=native -mfpu=neon-vfpv4 -O3 | 20161211 | 20161026 |
| 3151293 | ref | gcc -funroll-loops -mcpu=native -mfpu=neon-vfpv4 -O3 | 20161211 | 20161026 |
| 3537222 | smaller | gcc -funroll-loops -mcpu=native -mfpu=neon-vfpv4 -Os | 20161211 | 20161026 |
| 3541557 | smaller | gcc -mcpu=native -mfpu=neon-vfpv4 -Os | 20161211 | 20161026 |
| 3893477 | smaller | gcc -mcpu=native -mfpu=neon-vfpv4 -O3 | 20161211 | 20161026 |
| 3945235 | smaller | gcc -mcpu=native -mfpu=neon-vfpv4 -O2 | 20161211 | 20161026 |
| 3986138 | smaller | gcc -funroll-loops -mcpu=native -mfpu=neon-vfpv4 -O3 | 20161211 | 20161026 |
| 4043383 | smaller | gcc -funroll-loops -mcpu=native -mfpu=neon-vfpv4 -O2 | 20161211 | 20161026 |
| 6283430 | bitslice | gcc -funroll-loops -mcpu=native -mfpu=neon-vfpv4 -Os | 20161211 | 20161026 |
| 6488080 | bitslice | gcc -mcpu=native -mfpu=neon-vfpv4 -Os | 20161211 | 20161026 |
| 7923317 | bitslice | gcc -mcpu=native -mfpu=neon-vfpv4 -O2 | 20161211 | 20161026 |
| 8011919 | bitslice | gcc -mcpu=native -mfpu=neon-vfpv4 -O3 | 20161211 | 20161026 |
| 8418825 | bitslice | gcc -funroll-loops -mcpu=native -mfpu=neon-vfpv4 -O2 | 20161211 | 20161026 |
| 9206842 | bitslice | gcc -funroll-loops -mcpu=native -mfpu=neon-vfpv4 -O3 | 20161211 | 20161026 |
| 11725588 | 8bit | gcc -mcpu=native -mfpu=neon-vfpv4 -O3 | 20161211 | 20161026 |
| 11757515 | 8bit | gcc -funroll-loops -mcpu=native -mfpu=neon-vfpv4 -O3 | 20161211 | 20161026 |
| 11978035 | 8bit | gcc -funroll-loops -mcpu=native -mfpu=neon-vfpv4 -O2 | 20161211 | 20161026 |
| 12640291 | 8bit | gcc -mcpu=native -mfpu=neon-vfpv4 -O2 | 20161211 | 20161026 |
| 12983539 | 8bit | gcc -funroll-loops -mcpu=native -mfpu=neon-vfpv4 -Os | 20161211 | 20161026 |
| 14694499 | 8bit | gcc -mcpu=native -mfpu=neon-vfpv4 -Os | 20161211 | 20161026 |
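
The table lists every implementation and compiler setting in one flat ranking, so it can help to reduce it to the fastest row per implementation. The following is a minimal sketch of that reduction, not part of SUPERCOP itself: the function names `parse_rows` and `fastest_per_implementation` are hypothetical, and `EXCERPT` holds only four rows copied from the table above to keep the example short.

```python
def parse_rows(table_text):
    """Parse pipe-delimited rows into dicts; header and separator rows are skipped."""
    rows = []
    for line in table_text.strip().splitlines():
        cells = [c.strip() for c in line.strip().strip("|").split("|")]
        # Data rows have five cells and a purely numeric first cell.
        if len(cells) != 5 or not cells[0].isdigit():
            continue
        rows.append({
            "time": int(cells[0]),
            "impl": cells[1],
            "compiler": cells[2],
        })
    return rows


def fastest_per_implementation(rows):
    """Return a dict mapping implementation name to its lowest-time row."""
    best = {}
    for row in rows:
        current = best.get(row["impl"])
        if current is None or row["time"] < current["time"]:
            best[row["impl"]] = row
    return best


# Four rows copied from the table above (an excerpt, not the full data set).
EXCERPT = """
| Time | Implementation | Compiler | Benchmark date | SUPERCOP version |
| 1316458 | neon | gcc -funroll-loops -mcpu=native -mfpu=neon-vfpv4 -O3 | 20161211 | 20161026 |
| 1560837 | neon | gcc -mcpu=native -mfpu=neon-vfpv4 -Os | 20161211 | 20161026 |
| 1716059 | ref | gcc -funroll-loops -mcpu=native -mfpu=neon-vfpv4 -Os | 20161211 | 20161026 |
| 3151293 | ref | gcc -funroll-loops -mcpu=native -mfpu=neon-vfpv4 -O3 | 20161211 | 20161026 |
"""

best = fastest_per_implementation(parse_rows(EXCERPT))
# Print implementations from fastest to slowest best-case time.
for impl, row in sorted(best.items(), key=lambda kv: kv[1]["time"]):
    print(impl, row["time"], row["compiler"])
```

Run against the full table, the same reduction would surface a pattern already visible in the rows: `neon` is fastest with `-funroll-loops -O3`, whereas `ref` is fastest with `-funroll-loops -Os` and slowest with `-funroll-loops -O3`.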