| Time | Implementation | Compiler | Benchmark date | SUPERCOP version |
| --- | --- | --- | --- | --- |
| 357752 | optimized_nonSSE | gcc -mcpu=marvell-pj4 -O3 | 20161216 | 20161026 |
| 369017 | optimized_nonSSE | gcc -funroll-loops -mcpu=marvell-pj4 -O3 | 20161216 | 20161026 |
| 388462 | optimized_nonSSE | gcc -funroll-loops -mcpu=marvell-pj4 -O2 | 20161216 | 20161026 |
| 479904 | ref | gcc -mcpu=marvell-pj4 -O3 | 20161216 | 20161026 |
| 492274 | ref | gcc -funroll-loops -mcpu=marvell-pj4 -O3 | 20161216 | 20161026 |
| 514782 | optimized_nonSSE | gcc -mcpu=marvell-pj4 -O2 | 20161216 | 20161026 |
| 532525 | ref | gcc -funroll-loops -mcpu=marvell-pj4 -O2 | 20161216 | 20161026 |
| 579216 | optimized_nonSSE | gcc -mcpu=marvell-pj4 -Os | 20161216 | 20161026 |
| 587204 | optimized_nonSSE | gcc -funroll-loops -mcpu=marvell-pj4 -Os | 20161216 | 20161026 |
| 670362 | ref | gcc -mcpu=marvell-pj4 -O2 | 20161216 | 20161026 |
| 808366 | ref | gcc -mcpu=marvell-pj4 -Os | 20161216 | 20161026 |
| 852455 | ref | gcc -funroll-loops -mcpu=marvell-pj4 -Os | 20161216 | 20161026 |